Constructions in Applied Linguistics: Innovation and Application of Corpus-based Construction Grammar
Sun, March 25, 8:00 to 11:15am, Sheraton Grand Chicago, Colorado Room
Ute Roemer, Georgia State University
This paper presents findings from a large-scale corpus study on the development of verb patterns in second language (L2) learners of English. It follows the lead of existing usage-based studies of L2 construction acquisition while considerably expanding their scope to hundreds of constructions and over 700,000 verb tokens. Using methods from Corpus Linguistics and Natural Language Processing, the study focuses on verb-argument constructions (VACs, e.g. the ‘V n n’ or ditransitive construction) and addresses the following research questions:
1. What are the first VACs acquired by beginning L2 learners of English?
2. How does the VAC repertoire of learners develop across proficiency levels?
3. How does the distribution of verbs in VACs in learner production develop across proficiency levels?
4. What role do formulaic sequences play in the L2 acquisition of VACs?
To address these questions, data on verbs and the constructions they occur in was exhaustively extracted from a dependency-parsed cross-sectional corpus of L2 writing. The corpus is a 6-million word subset of EFCAMDAT, the Education First-Cambridge Open Language Database, consisting of over 68,000 texts produced by L1 German and L1 Spanish learners at CEFR levels A1 through C1. Using a customized Python script, we generated frequency-sorted VAC and verb-VAC lists for each level and L1 (e.g., German_A1). We also extracted recurring multi-word clusters (spans 3, 4, and 5) around the 50 most frequent verbs in EFCAMDAT, together with information on frequency and cluster association strength (Mutual Information).
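The list-generation step described above might look roughly like the following. This is a minimal sketch and not the authors' script: the record format, the sample data, and the function name `frequency_lists` are assumptions made for illustration, and in the study the records would come from the dependency-parsed EFCAMDAT subset.

```python
from collections import Counter, defaultdict

# Invented records for illustration: (L1, CEFR level, verb lemma, VAC label).
records = [
    ("German", "A1", "give", "V n n"),
    ("German", "A1", "like", "V n"),
    ("German", "A1", "like", "V n"),
    ("German", "A1", "go", "V to n"),
    ("Spanish", "A1", "like", "V n"),
    ("Spanish", "B1", "give", "V n n"),
]

def frequency_lists(records):
    """Count VACs and verb-VAC pairs per subcorpus, keyed like 'German_A1'."""
    vac_counts = defaultdict(Counter)
    verb_vac_counts = defaultdict(Counter)
    for l1, level, verb, vac in records:
        key = f"{l1}_{level}"
        vac_counts[key][vac] += 1
        verb_vac_counts[key][(verb, vac)] += 1
    return {key: (vac_counts[key], verb_vac_counts[key]) for key in vac_counts}

lists = frequency_lists(records)
vacs, verb_vacs = lists["German_A1"]
# most_common() returns entries sorted by descending frequency
print(vacs.most_common())
print(verb_vacs.most_common())
```

Keeping one counter per level and L1 makes it straightforward to compare VAC type counts and verb distributions across proficiency levels.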
We will share selected results on verb construction development across learner proficiency levels. We expect to find an increase in VAC types, growth in VAC productivity and complexity, and a development from predominantly fixed sequences to more flexible and productive ones. The resulting findings help to expand our understanding of the processes that underlie construction acquisition in an L2 context.
Nicholas Groom, University of Birmingham, UK
Construction grammar is most strongly associated with cognitive linguistic theory and with research into language acquisition. In this paper, however, I demonstrate that construction grammar offers equally exciting opportunities to more socioculturally-oriented researchers, particularly those whose work focuses on identifying and analysing the meanings and values associated with particular discourse communities.
The potential power of construction-based approaches to sociocultural analysis was first demonstrated by Wulff et al. (2007), who identified statistically significant differences in the ‘into-causative’ construction in American and British English. The paper asks why Wulff et al.’s call for further research along similar lines has gone largely unheeded. It is proposed that a key reason may be that most current construction-based approaches are deductive in nature (i.e. the researcher decides which construction(s) to study), whereas socioculturally-oriented research is often exploratory in nature and thus more suited to inductive methods (i.e. where the aim is to discover which constructions are associated with a particular language variety or discourse community). The paper then proposes an adaptation of closed-class keywords analysis as a viable methodology for the inductive analysis of variety/discourse-specific constructions in large corpora. The remainder of the paper will provide a practical illustration of this approach, showing how corpus-based construction grammar can yield new insights into the relationship between phraseology (defined as preferred ways of saying) and epistemology (defined as preferred ways of knowing) in the specialized discourses of academic disciplines. The main empirical focus of the paper will be on a newly identified construction, the ‘WAY IN WHICH’ construction (as in ‘This may have affected the way in which religious ideas were disseminated’), and will draw on examples of this construction as it occurs in a large-scale corpus-based analysis of professional academic writing in the disciplinary discourses of history and literary criticism.
Closed-class keyword analysis (Groom 2010): using closed-class keywords leads the researcher to constructions of interest.
Phraseology can be repositioned as a discoursal rather than a lexicogrammatical phenomenon.
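To make the idea of closed-class keywords analysis concrete, here is a minimal sketch of how closed-class words that are unusually frequent in a target corpus could be ranked. It is an illustration under stated assumptions, not the procedure used in the paper: the keyness statistic (log-likelihood), the tiny `CLOSED_CLASS` word list, and the toy token lists are all assumptions.

```python
import math
from collections import Counter

# Small illustrative closed-class list (assumption; a real list would be much longer).
CLOSED_CLASS = {"the", "of", "in", "which", "that", "as", "by", "to", "a", "and"}

def log_likelihood(a, b, n1, n2):
    """Log-likelihood keyness for a word seen a times in n1 tokens and b times in n2 tokens."""
    e1 = n1 * (a + b) / (n1 + n2)  # expected frequency in the target corpus
    e2 = n2 * (a + b) / (n1 + n2)  # expected frequency in the reference corpus
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2 * g2

def closed_class_keywords(target_tokens, reference_tokens):
    """Rank closed-class words that are relatively more frequent in the target corpus."""
    t, r = Counter(target_tokens), Counter(reference_tokens)
    n1, n2 = len(target_tokens), len(reference_tokens)
    scores = {}
    for word in CLOSED_CLASS:
        a, b = t[word], r[word]
        if a + b > 0 and a / n1 > b / n2:
            scores[word] = log_likelihood(a, b, n1, n2)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

target = "the way in which ideas were spread in which way".split()
reference = "the cat sat on the mat and the dog".split()
keywords = closed_class_keywords(target, reference)
print(keywords)
```

In an inductive workflow, the top-ranked closed-class keywords would then be inspected in concordance lines to see which recurrent constructions they take part in.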
Florent Perek, University of Birmingham, UK
Identifying units of language that unite lexis and grammar as well as form and meaning offers substantial opportunities for developing resources for language learners. One such resource would be a ‘constructicon’: a listing of constructions adjusted to learner proficiency level. This paper argues for the use of two existing corpus-based descriptions of English that could be combined to form a constructicon: the grammar patterns identified as part of the COBUILD dictionary project, and the frames identified in the FrameNet project.
Grammar patterns focus on the complementation patterns commonly used with verbs, nouns and adjectives. FrameNet focuses on the roles associated with semantic frames. Although they differ in their scope and approach, both FrameNet and grammar patterns provide valency information as part of their output, but so far no attempt has been made to systematically compare and match this information. The present study fills this gap, focusing in particular on verbs. All argument realization information of verbs was extracted from FrameNet, resulting in a list of triplets of verb, semantic frame, and syntactic pattern. The FrameNet patterns were matched to the verb patterns listed in Francis et al. (1996) and the level of agreement quantified.
The paper demonstrates the use of FrameNet frames to add a semantic dimension to grammar patterns. Conversely, some verb classes defined by their occurrence with grammar patterns can help highlight relations between frames that are not recorded in FrameNet. We argue that matching FrameNet and grammar patterns can build a database of constructions, since semantically coherent sets of frames paired with the syntactic realization of frame elements qualify as form-meaning pairs. This would complement the FrameNet Constructicon project (Fillmore et al. 2012) and make it more useful for learners, by focusing on frequent constructions rather than on idiosyncratic ones.
Fewer than 50% of the COBUILD patterns are found in FrameNet.
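The matching step can be pictured with a small sketch. Everything here is a simplifying assumption made for illustration: the conversion rule in `to_cobuild`, the invented sample triplets, and the toy COBUILD-style inventory. The actual study may have matched patterns quite differently.

```python
import re

def to_cobuild(phrase_types):
    """Convert non-subject phrase types (e.g. 'NP', 'PP[for]') to a COBUILD-style pattern.

    Returns None when a phrase type has no mapping in this simplified scheme.
    """
    parts = ["V"]
    for pt in phrase_types:
        pp = re.fullmatch(r"PP\[(\w+)\]", pt)
        if pt == "NP":
            parts.append("n")
        elif pp:
            parts.append(f"{pp.group(1)} n")
        else:
            return None
    return " ".join(parts)

# Invented triplets of (verb, frame, non-subject phrase types), for illustration only.
framenet_triplets = [
    ("blame", "Judgment", ["NP", "PP[for]"]),
    ("give", "Giving", ["NP", "NP"]),
    ("give", "Giving", ["NP", "PP[to]"]),
    ("give", "Giving", ["Sfin"]),  # unmapped in this sketch
]

# Toy inventory of (verb, pattern) pairs in COBUILD-style notation (assumption).
cobuild_patterns = {
    ("blame", "V n for n"),
    ("blame", "V n"),
    ("give", "V n n"),
    ("give", "V n to n"),
    ("thank", "V n for n"),
}

converted = {
    (verb, to_cobuild(pts))
    for verb, _frame, pts in framenet_triplets
    if to_cobuild(pts) is not None
}
matched = cobuild_patterns & converted
coverage = len(matched) / len(cobuild_patterns)
print(f"{coverage:.0%} of the toy COBUILD pairs have a FrameNet counterpart")
```

A coverage figure of this kind is one simple way to quantify agreement; it says nothing yet about whether the matched pairs share a meaning, which is where the frame labels come in.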
Stefan Th. Gries, U. California, Santa Barbara
Fifteen years ago, Stefanowitsch and Gries introduced methods of measuring the co-occurrence of lexis and construction, identifying what are now called collostructions. Typically, these measures are based on a comparison of (i) the observed co-occurrence frequency of words with other words or constructions and (ii) the corresponding co-occurrence frequencies one would expect from a chance distribution. Examples of such measures include the log-likelihood ratio, MI, or the well-known chi-squared test. Researchers have thus studied the degree to which lexical items ‘like’ to occur in a specific construction (e.g., give, tell, show can be shown to be strongly attracted to the ditransitive construction) or which of two or more functionally similar constructions lexical items prefer (e.g., the will-future is associated more with low-dynamicity and general actions such as see, find, or know whereas the going-to future is associated more with dynamic and specific actions such as do, happen, or go). The observation of such preferred co-occurrences promises considerable improvements to the information made available to language learners and teachers, and potentially to the modelling of language acquisition processes.
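The comparison of observed and expected co-occurrence frequencies can be sketched for a single word and construction using a 2×2 table. The cell layout, the function name `association`, and the example counts are assumptions for illustration; the formulas are the standard ones for pointwise MI, the log-likelihood ratio, and the chi-squared statistic.

```python
import math

def association(a, b, c, d):
    """Association scores for a 2x2 table.

    a: word inside the construction      b: word outside the construction
    c: other words inside it             d: other words outside it
    Assumes all four marginal totals are greater than zero.
    """
    n = a + b + c + d
    observed = [a, b, c, d]
    # Expected cell frequencies under independence: row total * column total / n
    expected = [
        (a + b) * (a + c) / n,
        (a + b) * (b + d) / n,
        (c + d) * (a + c) / n,
        (c + d) * (b + d) / n,
    ]
    mi = math.log2(a / expected[0]) if a > 0 else float("-inf")
    g2 = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return {"expected": expected[0], "MI": mi, "G2": g2, "chi2": chi2}

# Invented counts: a word occurring 50 times in a construction and 50 times elsewhere.
attracted = association(50, 50, 50, 850)
independent = association(10, 90, 90, 810)
print(attracted)
print(independent)
```

When the observed frequency equals the expected one, all three scores are zero; the further the observed count exceeds expectation, the stronger the measured attraction.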
While this approach has been very widely used and quite successful, its reliance on traditional association measures presents problems. First, these measures, unlike learning and comprehension, are bidirectional. Second, they confuse the potentially different effects of frequency and association. Third, the dispersion of co-occurrences in the corpus is neglected, with the risk that seemingly high frequencies of underdispersed expressions skew the results.
In this paper, I outline remedies to these problems. I exemplify how unidirectional association measures (Delta P and the KL-divergence) better identify collostructions. I also include measures of dispersion of the collostructions, where corpus parts can be defined in terms of files or of (sub)registers. I exemplify the results for both rarer and more frequent constructions, specifically the ditransitive, the passive, and the will-future.
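The following sketch illustrates the three ingredients named above: a directional association measure (Delta P), the KL-divergence, and a dispersion measure. It is an illustration under assumptions rather than the analysis presented in the talk: the table layout, the example counts, and the choice of the deviation-of-proportions statistic for dispersion are mine.

```python
import math

def delta_p(a, b, c, d):
    """Directional Delta P for a 2x2 table.

    a: word in construction, b: word elsewhere,
    c: other words in construction, d: other words elsewhere.
    """
    word_to_cx = a / (a + b) - c / (c + d)  # P(cx | word) - P(cx | other words)
    cx_to_word = a / (a + c) - b / (b + d)  # P(word | cx) - P(word | other contexts)
    return word_to_cx, cx_to_word

def kl_divergence(p, q):
    """KL-divergence of p from q in bits; terms with p_i == 0 contribute 0; q_i must be > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def deviation_of_proportions(counts, part_sizes):
    """Dispersion: 0 means spread in proportion to part sizes; higher values mean clumpier."""
    total, size = sum(counts), sum(part_sizes)
    return 0.5 * sum(abs(v / total - s / size) for v, s in zip(counts, part_sizes))

# Invented counts: the construction predicts the word better than the word predicts it.
word_to_cx, cx_to_word = delta_p(50, 450, 50, 9450)
print(word_to_cx, cx_to_word)

# An expression occurring 30 times, all in one of three equally sized corpus parts.
clumped = deviation_of_proportions([30, 0, 0], [100, 100, 100])
even = deviation_of_proportions([10, 10, 10], [100, 100, 100])
print(clumped, even, kl_divergence([1, 0, 0], [1 / 3, 1 / 3, 1 / 3]))
```

The two Delta P values differ for the same table, which is exactly the directional information that a single bidirectional score cannot express. Corpus parts here could be files or (sub)registers.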
Towards a tuple-lization of corpus linguistics.
Frequency + association + dispersion
The speaker's internal model of language is driven by usage
Language is highly patterned
Units of meaning
Textual colligation / Grammar pattern / Local Grammar
-A unified theory of language
-Examples are easy to find but difficult to systematise
-Can the identification of constructions be systematised?
Applications: Measuring learner progress and others
Are patterns and constructions interchangeable? Hunston doesn't think so. One pattern may represent different constructions, since different meanings are possible, e.g. ‘V n for n’.
www.collinsdictionary.com with the original grammar patterns will be online this week