Measuring linguistic complexity: A multidisciplinary perspective

Update

 All presentations here

The Linguistics Research Unit of the Institute of Language and Communication hosted a workshop on ‘Measuring linguistic complexity: A multidisciplinary perspective’ on Friday 24 April, 2015. 

The main objective of the workshop was to bring together specialists from a number of different but related fields to discuss the construct of linguistic complexity and how it is typically measured in their respective research fields.

The event was structured around keynote presentations by five distinguished scholars:

  • Philippe Blache (CNRS & Université d’Aix-Marseille, France): Evaluating complexity in syntax: a computational model for a cognitive architecture
  • Alex Housen (Vrije Universiteit Brussel, Belgium): L2 complexity – A Difficult(y) Matter
  • Frederick J. Newmeyer (University of Washington, University of British Columbia, Simon Fraser University): The question of linguistic complexity: historical perspective
  • Advaith Siddharthan (University of Aberdeen, UK): Automatic Text Simplification and Linguistic Complexity Measurements
  • Benedikt Szmrecsanyi (KU Leuven, Belgium): Measuring complexity in contrastive linguistics and contrastive dialectology

A round table closed the workshop.

Details about the event are available on the workshop website: http://www.uclouvain.be/en-linguistic-complexity.html

The number of participants is limited. Participation is free of charge but registration is required before Friday 3rd April (via our registration form at http://www.uclouvain.be/en-505315.html). 

Thomas François (Centre de traitement automatique du langage) & Magali Paquot (Centre for English Corpus Linguistics)

Conclusions

A multidimensional construct: Bulté & Housen (2012:23)

Shared challenges, shared opportunities

Where is the place of theory here?

Do we need new measures? Do we need to validate existing ones?

The many facets of complexity.

Formal linguistics may be a good starting point but doesn’t have much to offer.

Building a research community?

 

Digital natives and corpora in language learning #corpuslinguistics

For digital natives, “research” is more likely to mean a Google search than a trip to the library […] it remains to be seen how corpus resources co-exist with online services like Google and online dictionaries and how learners’ search habits behave in both contexts (Pérez-Paredes et al. 2012:484).

Pérez-Paredes, P., Sánchez-Tornel, M., & Alcaraz Calero, J. M. (2012). Learners’ search patterns during corpus-based focus-on-form activities. International Journal of Corpus Linguistics, 17(4), 483-516.

LINDSEI: the Turkish component

LINDSEI-TR: A New Spoken Corpus of Advanced Learners of English
By Abdurrahman Kilimci

Cukurova University, Faculty of Education, English Language Teaching Department, Balcalı, Adana, Turkey

Abstract

The aim of the present study is to describe the LINDSEI-TR, the Turkish component of the LINDSEI (the Louvain International Database of Spoken English), which was initiated to compile a corpus of spoken data produced by learners from varied mother tongues (Gilquin et al., 2010). In this respect, the main objective of the study is to present the aim, development, and the design criteria of the corpus along with its quantitative and qualitative characteristics. The corpus is considered to be of value to researchers in terms of delineating the features of learners’ spoken interlanguage and designing teaching materials to improve second language teaching and learning.

Keywords: Corpus linguistics, spoken corpus, interlanguage, second language teaching and learning

CFP From data to evidence in English language research: Big data, rich data, uncharted data 19-22 October 2015

From data to evidence in English language research: Big data, rich data, uncharted data

***Conference in Helsinki, Finland, 19-22 October 2015***

To diversify the discussion of data explosion in the humanities, the Research Unit for Variation, Contacts and Change in English (VARIENG) is organising an academic conference that addresses the use of new data sources, historical and modern, in English language research. We are particularly interested in papers discussing the advantages and disadvantages of the following three kinds of data:

Big data

In recent years, mega-corpora and other large text collections have become increasingly available to linguists. These databases open new opportunities for linguistic research, but they may be problematic in terms of representativeness and contextualisation, and the sheer amount of data may also pose practical problems. We welcome papers drawing on big data, including large corpora representing different genres and varieties (e.g. COCA, GloWbE), databases (e.g. EEBO, ECCO) and corpora created by web crawling (e.g. EnTenTen, UKWaC).

Rich data

Rich data contains more than just the texts, including representations of spacing, graphical elements, choice of typeface, prosody, or gestures. This is further supplemented by analytic and descriptive metadata linked to either entire texts or individual textual elements. The benefit of rich data is that it can provide new kinds of evidence about pragmatic, sociolinguistic and even syntactic aspects of linguistic events. Yet the creation and use of rich data bring great challenges. We invite papers on the representation, query, analysis, and visualisation of data consisting of more than linear text.

Uncharted data

Uncharted data comprises material which has not yet been systematically mapped, surveyed or investigated. We wish to draw attention to texts and language varieties which are marginally represented in current corpora, to data sources that exist on the internet or in manuscript form alone, and to material compiled for purposes other than linguistic research. We welcome papers discussing the innovative research prospects offered by new and previously unused or even unidentified material for the study of English in various contexts ranging from communities and networks to social groups and individuals.

Abstracts are invited by 15 February 2015 for 30-minute presentations including discussion as well as for posters and corpus and software demonstrations.

The following invited speakers have confirmed their participation:

Professor Mark Davies (Brigham Young University)
Professor Tony McEnery (Lancaster University)
Professor Päivi Pahta (University of Tampere)
Dr Jane Winters (Institute of Historical Research, University of London)

The conference forms part of the programme celebrating the 375th anniversary of the University of Helsinki in 2015 and will be held in the Main Building of the University.

More information on the conference will be available on the conference home page at: http://www.helsinki.fi/varieng/d2e/. Please address any queries to: d2e-conference@helsinki.fi.

CFP Research in Corpus Linguistics (RiCL)


Call for papers

(official journal of the Spanish Association for Corpus Linguistics AELINCO, published by Academy Publisher)

Research in Corpus Linguistics (RiCL, ISSN 2243-4712) is a peer-reviewed international scientific journal published annually, aiming at the publication of original research based on corpus data from different languages and language families and from different theoretical perspectives and frameworks, with the goal of improving our knowledge about the grammar and the linguistic theoretical background of a language, a language family or any type of cross-linguistic phenomena/constructions/assumptions.

RiCL invites previously unpublished research articles and book reviews in the field of corpus linguistics. Specific areas of interest include corpus design, compilation and typology; discourse, literary analysis and corpora; corpus-based grammatical studies; corpus-based lexicology and lexicography; corpora, contrastive studies and translation; corpus and linguistic variation; corpus-based computational linguistics; corpora, language acquisition and teaching; and special uses of corpus linguistics. The journal also publishes special issues on specific topics, with leading specialists in the field of corpus linguistics as guest editors.

Editors: Javier Pérez-Guerra (University of Vigo) and María José López-Couso (University of Santiago de Compostela)

Links:
– journal website: http://www.academypublisher.com/ricl/
– past issues: http://ojs.academypublisher.com/index.php/ricl/issue/archive
– instructions for authors: http://www.academypublisher.com/ricl/authorguide.html