Register now! 14th Corpus Linguistics in the South, 04/03/2017, Birkbeck Uni

 

The 14th meeting of Corpus Linguistics in the South will be held on Saturday March 4, 2017 at Birkbeck, University of London.

To register for this event, please send an email to Rachelle Vessey (r.vessey@bbk.ac.uk) and include the following details:
Name
Affiliation
Email address

Please also indicate if you would like to join in the group lunch, which will be at Byron (https://www.byronhamburgers.com/store-street/).

Spaces for this event are limited, as is availability for lunch. Registration and lunch reservations will be allocated on a first-come, first-served basis.

 

Practical information

In the tradition of all Corpus Linguistics in the South events, there will be no charge for participation or attendance. Coffee and refreshments will be provided and participants will be welcome to attend an optional lunch (cost approximately £15). Please note we do not have any funds from which to assist with transport or accommodation. Birkbeck, University of London is located in the heart of Bloomsbury and is easily accessible by public transportation. More details on our central London campus can be found here.

This event is being organised by Rachelle Vessey.

TAALES 2.2 is out: automatic analysis of lexical sophistication, Windows and Mac

From the TAALES website:

Kyle, K. & Crossley, S. A. (2015). Automatically assessing lexical sophistication: Indices, tools, findings, and application. TESOL Quarterly 49(4), pp. 757-786. doi: 10.1002/tesq.194

TAALES is a tool that measures over 400 classic and new indices of lexical sophistication, and includes indices related to a wide range of sub-constructs. TAALES indices have been used to inform models of second language (L2) speaking proficiency, first language (L1) and L2 writing proficiency, spoken and written lexical proficiency, genre differences, and satirical language.

Starting with version 2.2, TAALES provides comprehensive index diagnostics, including text-level coverage output (i.e., the percent of words/bigrams/trigrams in a text covered by the index) AND individual word/bigram/trigram index coverage information.

TAALES takes plain text files as input (it will process all plain text files in a particular folder) and produces a comma separated values (.csv) spreadsheet that is easily read by any spreadsheet software.
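
The workflow described above (plain texts in, a .csv report with text-level coverage out) can be pictured with a minimal Python sketch. This is not TAALES code: the function names, the toy word list and the whitespace tokenisation are all assumptions made for illustration, and the real tool's indices and output columns will differ.

```python
import csv
import io


def coverage_percent(tokens, index_words):
    """Percent of tokens found in a (hypothetical) index word list."""
    if not tokens:
        return 0.0
    covered = sum(1 for tok in tokens if tok.lower() in index_words)
    return 100.0 * covered / len(tokens)


def coverage_report(texts, index_words):
    """Return CSV text with one row per input text.

    `texts` maps a file name to its plain-text content; the column
    names below are illustrative, not the columns TAALES writes.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["filename", "n_tokens", "coverage_percent"])
    for name, text in texts.items():
        tokens = text.split()  # naive whitespace tokenisation (assumption)
        pct = round(coverage_percent(tokens, index_words), 1)
        writer.writerow([name, len(tokens), pct])
    return buffer.getvalue()


if __name__ == "__main__":
    toy_index = {"the", "cat", "sat"}  # stand-in for a real index word list
    toy_texts = {"a.txt": "The cat sat on the mat"}
    print(coverage_report(toy_texts, toy_index))
```

A low coverage value in such a report would suggest that an index score for that text rests on only a small share of its words, which is the kind of diagnostic the coverage output is meant to support.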

 

You can find all the info here. Windows and Mac versions available for free.

4th Learner Corpus Research Conference, Bolzano, Italy, 5‐7 October 2017

4th Learner Corpus Research Conference
Bolzano/Bozen, Italy, 5‐7 October 2017

Abstracts should be submitted through EasyChair by Sunday 15 January 2017.

Notification of the outcome of the review process will be sent by 31 March 2017.

Call for Papers

Following the successful conferences in Louvain‐la‐Neuve (Belgium) in 2011, Bergen (Norway) in 2013 and Nijmegen (the Netherlands) in 2015, the 4th Learner Corpus Research Conference will be hosted by the Institute for Specialised Communication and Multilingualism at EURAC Research, Bolzano/Bozen, Italy. The conference, organized under the aegis of the Learner Corpus Association, aims to be a showcase for the latest developments in the field and will feature full paper presentations, work in progress reports, poster presentations, software demos and a book exhibition.

The theme of LCR 2017 is “Widening the Scope of Learner Corpus Research”.

Conference Venue: European Academy Bozen/Bolzano – EURAC Research

Confirmed keynote speakers:

Philip Durrant (University of Exeter, United Kingdom)
Stefan Th. Gries (University of California, Santa Barbara, U.S.A.)
Stefania Spina (Università per Stranieri Perugia, Italy)

The keynote speakers will address the theme of LCR 2017 in their respective lectures on L1 writing development and Learner Corpus Research, quantitative methods in Learner Corpus Research, and Learner Corpus Research and Italian as L2. We welcome papers that address all aspects of Learner Corpus Research, in particular the following:

* Corpora as pedagogical resources
* Corpus‐based transfer studies
* Data mining and other explorative approaches to learner corpora
* English as a Lingua Franca
* Error detection and correction of learner language
* Extracting language features from learner corpora
* Innovative annotations in learner corpora
* Language for academic/specific purposes
* Learner varieties
* Learner corpora for less commonly taught languages
* Learner Corpus Research and the Common European Framework of Reference for Languages (CEFR)
* Learner Corpus Research and Natural Language Processing
* Links between Learner Corpus Research and other research methodologies (e.g. experimental methods)
* Search engines for learner corpora
* Statistical methods in learner corpus studies
* Task and learner variables

There will be four different categories of presentation:

* Full paper (20 minutes + 10 minutes for discussion)
* Work in Progress (WiP) report (10 minutes + 5 minutes for discussion)
* Corpus/software demonstration
* Poster

The Work in Progress reports and posters are intended to present research that is still at a preliminary stage and on which researchers would like to get feedback.

The language of the conference is English.

Abstracts
Your abstract should be between 600 and 700 words (excluding the list of references). Abstracts should provide the following:
* clearly articulated research question(s) and its/their relevance;
* the most important details about research approach, data and methods;
* the main results and their interpretation.

Abstracts should be submitted through EasyChair (https://easychair.org/conferences/?conf=lcr2017) by Sunday 15 January 2017. Please follow instructions provided on the conference website (http://lcr2017.eurac.edu).
Abstracts will be reviewed anonymously by the scientific committee. Notification of the outcome of the review process will be sent by 31 March 2017.

The LCR 2017 organising committee
Andrea Abel (EURAC Research)
María Belén Díez‐Bedmar (Universidad de Jaén)
Daniela Gasser (EURAC Research)
Aivars Glaznieks (EURAC Research)
Verena Lyding (EURAC Research)
Lionel Nicolas (EURAC Research)

The LCR 2017 scientific committee
Andrea Abel (EURAC Research)
Katherine Ackerley (Università degli Studi di Padova)
Annelie Ädel (Dalarna University)
Nicolas Ballier (Université Paris Diderot – Paris 7)
María Belén Díez‐Bedmar (Universidad de Jaén)
Marcus Callies (Universität Bremen)
Erik Castello (Università degli Studi di Padova)
Francesca Coccetta (Università Ca’ Foscari Venezia)
Pieter de Haan (Radboud Universiteit Nijmegen)
Hilde Hasselgård (Universitetet i Oslo)
Sandra Deshors (New Mexico State University)
Ana Diaz‐Negrillo (Universidad de Granada)
Michael Flor (ETS)
John Flowerdew (City University of Hong Kong)
Lynne Flowerdew (independent researcher)
Fanny Forsberg Lundell (Stockholm University)
Gaëtanelle Gilquin (University of Louvain)
Sandra Götz (Justus Liebig Universität Gießen)
Solveig Granath (Karlstad University)
Sylviane Granger (Université catholique de Louvain)
Nicholas Groom (University of Birmingham)
Jirka Hana (Charles University Prague)
Shin’ichiro Ishikawa (Kobe University)
Jarmo Harri Jantunen (University of Jyväskylä)
Scott Jarvis (Ohio University)
Marie Källkvist (Lund University Sweden)
Agnieszka Lenko‐Szymanska (University of Warsaw)
Anke Lüdeling (Humboldt‐Universität Berlin)
Carla Marello (Università degli Studi di Torino)
Fanny Meunier (Université catholique de Louvain)
Detmar Meurers (Universität Tübingen)
Florence Myles (University of Essex)
Susan Nacey (Hedmark University College)
Lionel Nicolas (EURAC Research)
Michael O’Donnell (Universidad Autónoma de Madrid)
Signe Oksefjell Ebeling (Universitetet i Oslo)
Magali Paquot (Université catholique de Louvain/FNRS)
Pascual Pérez‐Paredes (University of Cambridge)
Tom Rankin (Vienna University of Economics and Business)
Paul Rayson (UCREL, Lancaster University)
Ute Römer (University of Michigan)
Anna Siyanova‐Chanturia (Victoria University of Wellington)
Jennifer Thewissen (Universiteit Antwerpen)
Yukio Tono (Tokyo University of Foreign Studies)
Nina Vyatkina (University of Kansas)
Heike Zinsmeister (Universität Hamburg)

For inquiries, contact Andrea Abel: Andrea . Abel @ eurac . edu

eLex 2017: Lexicography from Scratch submission deadline 1 Feb 2017

The fifth biennial conference on electronic lexicography, eLex 2017, will take place at the Holiday Inn Leiden, Netherlands, from 19 to 21 September 2017.

The conference aims to investigate state-of-the-art technologies and methods for automating the creation of dictionaries. Over the past two decades, advances in NLP techniques have enabled the automatic extraction of different kinds of lexicographic information from corpora and other (digital) resources. As a result, key lexicographic tasks, such as finding collocations, definitions, example sentences and translations, are increasingly being transferred from humans to machines. Automating the creation of dictionaries is highly relevant, especially for under-resourced languages, where dictionaries need to be compiled from scratch and where users cannot wait for years, often decades, for the dictionary to be “completed”. Key questions to be discussed are: What are the best practices for automatic data extraction, crowdsourcing and data visualisation? How far can we get with Lexicography from scratch, and what is the role of the lexicographer in this process?

Important dates

February 1st, 2017: abstract submissions
March 15th, 2017: reviews of abstracts
May 15th, 2017: submission of full papers
June 15th, 2017: reviews of full papers
June 25th, 2017: camera-ready copies submissions

Call for papers here: https://elex.link/elex2017/call-for-papers/

Release of BabelNet 3.7

 
Through the Corpora list

We are happy to announce the release of a new version of BabelNet, a project recently featured in a TIME magazine article. BabelNet (http://babelnet.org) is the largest multilingual encyclopedic dictionary and semantic network, created by means of the seamless integration of the largest multilingual Web encyclopedia (i.e., Wikipedia) with the most popular computational lexicon of English (i.e., WordNet) and other lexical resources such as Wiktionary, OmegaWiki, Wikidata, Open Multilingual WordNet, Wikiquote, VerbNet, Microsoft Terminology, GeoNames, WoNeF, ImageNet, ItalWordNet, Open Dutch WordNet and FrameNet. The integration is performed via a high-performance linking algorithm and by filling in lexical gaps with the aid of Machine Translation. The result is an encyclopedic dictionary that provides Babel synsets, i.e., concepts and named entities lexicalized in many languages and connected by a large number of semantic relations.

Version 3.7 comes with the following new features:

  • New resource integrated: FrameNet (lexical units)
  • More than 2500 Babel synsets identified as key concepts
  • Mappings with several versions of WordNet now integrated (from 1.6 to 3.0)
  • More than 2.6 million Babel synsets labeled with domains (was 1,558,806 in v3.6)

More statistics are available at: http://babelnet.org/stats

BabelNet was part of the MultiJEDI project originally funded by the European Research Council and headed by Prof. Roberto Navigli at the Linguistic Computing Laboratory of the Sapienza University of Rome. BabelNet is now a self-sustained project. It is, and always will be, free for research purposes, including download. Babelscape, a Sapienza startup company, is BabelNet’s commercial support arm, thanks to which the project will be continued and improved over time.

Contact:

The BabelNet group

=====================================
Roberto Navigli
Dipartimento di Informatica
Sapienza University of Rome
Viale Regina Elena 295b (building G, second floor)
00161 Roma Italy
Phone: +39 0649255161 – Fax: +39 06 49918301
Home Page: http://wwwusers.di.uniroma1.it/~navigli

The Conference on #NLP KONVENS: new deadline

KONVENS 2016
http://www.linguistics.rub.de/konvens16/

The Conference on Natural Language Processing (“Konferenz zur Verarbeitung natürlicher Sprache”, KONVENS) aims at offering a broad perspective on current research and developments within the interdisciplinary field of natural language processing. It allows researchers from all disciplines relevant to this field of research to present their work. The conference will take place September 19–21, 2016 in Bochum (Germany). We are pleased to announce that John Nerbonne and Barbara Plank will give invited talks at the conference.

Call for Papers

We welcome original, unpublished contributions on research, development, applications and evaluation, covering all areas of natural language processing, ranging from basic questions to practical implementations of natural language resources, components and systems.

The special theme of the 13th KONVENS is: “Processing non-standard data — commonalities and differences”.

A wide range of data can be considered “non-standard” because it deviates in one way or another from standard written data such as newspaper texts. Examples include:
* data produced by language learners
* historical data
* data from social media
* (transcriptions of) spoken data

We especially encourage the submission of contributions comparing different types of non-standard data and their properties, focussing on their impact on natural language processing. For example, a feature common to many types of non-standard data is the use of non-standard spelling. However, spelling variation in learner data, as compared to historical data, has very different causes and most likely results in very different types of non-standard spellings.

Topics that we would like to see addressed include:
* Common properties of (many) non-standard data, e.g. non-standard spelling, data sparseness, features of orality
* Impact of the commonalities and differences of non-standard data on the methods and tools that are applied to the data, e.g. normalization vs. tool adaptation, evaluation without gold standard, etc.

Important Dates
NEW: June 7, 2016  Paper submissions due
NEW: July 18, 2016 Notification of acceptance
August 15, 2016    Camera-ready copy due
September 19–21, 2016  Conference

Formats

We welcome two types of contributions:
* Full papers for oral presentation (8 pages plus references)
* Short papers for presentation as posters (4 pages plus references)

Short papers/posters can be combined with a system demonstration. Reviews will be anonymous. Accepted full and short papers will be published in the conference proceedings.

Submissions must conform to the formatting guidelines, and must be made electronically through the conference website (see https://www.linguistics.ruhr-uni-bochum.de/konvens16/call/index.html#formatting-guidelines).

The conference languages are English and German. We encourage the submission of contributions in English.