Scott & Tribble (2006) on discovering potential patterns

The second aspect is summarised in that phrase “potential patterns”. How so? The process operates in two stages. First, all the effort of a concordancer or a word-listing application goes into reducing a vast and complex object to a much simpler shape. That is, a set of 100 million words on a confusing wealth of topics in a variety of styles and produced by innumerable people for a lot of different reasons gets reduced to a mere list in alphabetical order. A rich chaos of language is reduced, it is “boiled down” to a simpler set. In the vapours that have steamed off are all the facts about who wrote the texts and what they meant. We have therefore lost a great deal in that process, and if it damaged the original texts we would never dare do it.
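As an illustration of this first stage (not part of the quoted passage), the reduction of a set of texts to an alphabetical word list can be sketched in a few lines of Python. The function name word_list, the crude letters-and-apostrophes tokenizer, and the two sample sentences are all assumptions made for the example; real word-listing tools make more careful choices about tokenization and case.

```python
import re
from collections import Counter

def word_list(texts):
    """Reduce an iterable of texts to an alphabetical list of (word, frequency) pairs."""
    counts = Counter()
    for text in texts:
        # Assumed tokenizer: lowercase, then keep runs of letters and apostrophes.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    # Sorting the (word, count) pairs orders them alphabetically by word.
    return sorted(counts.items())

# Hypothetical two-sentence "corpus" standing in for millions of words.
sample = ["The cat sat on the mat.", "A cat is a cat."]
for word, freq in word_list(sample):
    print(word, freq)
```

Everything about authorship, topic and intent is absent from the resulting list, which is the loss the passage describes; the original texts are left untouched.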

The advantage comes in the second stage where one examines the boiled down extract, the list of words, the concordance. It is here that something not far different from the sometimes-scorned “intuition” comes in. This is imagination. Insight. Human beings are unable to see shapes, lists, displays, or sets without insight, without seeing in them “patterns”. It seems to be a characteristic of the homo sapiens mind that it is often unable to see things “as they are” but imposes on them a tendency, a trend, a pattern. From the earliest times, the very stars in the sky have been perceived as belonging in “constellations”. This capability can come at a cost, of course: it may be easy to spot a pattern in a cloud or in a constellation and thereby build up a mistaken theory; but the point is that it is this ordinary imaginative capacity, that of seeing a pattern, which is there in all of us and which makes it possible for corpus-based methods to make a relatively large impact on language theory. For with these twin resources, namely the tools to manipulate a lot of data in many different ways and without wasting much time, combined with the power of imagination and pattern-recognition, it becomes possible to chase up patterns that seem to be there and come up with insights affecting linguistic theory itself. The tools we use generate patterns (lists, plots, colour arrangements) and it is when we see these that in some cases the pattern “jumps out” at us. In other cases we may need training to see the patterns but the endeavour is itself largely a search for pattern.
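The kind of display in which a pattern may “jump out” can likewise be sketched, again purely as an illustration outside the quoted text. The following is a minimal keyword-in-context (KWIC) concordance; the function name kwic, the width parameter, and the sample sentence are hypothetical, and the tokenizer is deliberately simple.

```python
import re

def kwic(text, node, width=3):
    """Return (left context, node word, right context) for each hit of `node`."""
    # Assumed tokenizer: runs of letters and apostrophes, punctuation dropped.
    tokens = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == node.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines

# Hypothetical sample text; a real concordance would run over a whole corpus.
sample = "The cat sat on the mat. A cat is a cat."
for left, tok, right in kwic(sample, "cat", width=2):
    # Right-align the left context so the node words line up in a column.
    print(f"{left:>15}  {tok}  {right}")
```

Aligning the node word in a single column is what lets the eye scan the contexts on either side; whether a recurring neighbour amounts to a genuine pattern remains a matter for the analyst's judgement.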

Scott, M. & Tribble, C. 2006. Textual Patterns: Key words and corpus analysis in language education. Amsterdam: Benjamins. (pp. 5-6).