CFP: Special Issue of the Language Resources and Evaluation Journal, entitled “Under-resourced Languages, Collaborative Approaches and Linked Open Data: Resources, Methods and Applications”

Call for submissions for a Special Issue of the Language Resources and Evaluation Journal, entitled “Under-resourced Languages, Collaborative Approaches and Linked Open Data: Resources, Methods and Applications”.

Important: More detailed information will be made available in September 2014. For more information, please contact the guest editors.


Under-resourced languages are generally described as languages that suffer from a chronic lack of available resources, from human, financial, and time resources to linguistic ones (language data and language technology), and often also experience the fragmentation of efforts in resource development. This situation is exacerbated by the realization that as technology progresses and the demand for localised language services over digital devices increases, the divide between adequately- and under-resourced languages keeps widening. Given that most of the world’s almost 7000 languages are not adequately resourced, much work needs to be done in order to support their existence in the digital age.

Although the destiny of a language is primarily determined by its native speakers and broader cultural context, the technological development of an under-resourced language offers such a language a strategic opportunity to have the same “digital dignity”, “digital identity” and “digital longevity” as large, well-developed languages on the Web.

The Linked (Open) Data framework and the emerging Linguistic Linked (Open) Data infrastructure offer novel opportunities for under-resourced languages. On the one hand, Linked Data offers ways of exposing existing high-quality, albeit small, language resources in the Semantic Web; on the other hand, it allows for the development of new state-of-the-art resources without necessarily having to rely on the availability of sophisticated language processing support.

This special issue arises from the imperative to maintain cultural and language diversity and from the basic right of all communities, languages, and cultures to be “first class citizens” in an age driven by information, knowledge and understanding. In this spirit, this special issue focuses on three strategic approaches to augment the development of resources for under-resourced languages to achieve a level potentially comparable to well-resourced, technologically advanced languages, viz. a) using the crowd and collaborative platforms; b) using technologies of interoperability with well-developed languages; and c) using Semantic Web technologies and, more specifically, Linked Data.

We invite original contributions, not published before and not under consideration for publication elsewhere, that address one or more of the following questions by means of one or more of the three approaches mentioned above:

• How can collaborative approaches and technologies be fruitfully applied to the accelerated development and sharing of high-quality resources for under-resourced languages?

• How can such resources be best stored, exposed and accessed by end users and applications?

• How can small language resources be re-used efficiently and effectively, reach larger audiences and be integrated into applications?

• How can multilingual and cross-lingual interoperability of language resources, methods and applications be supported, also between languages that belong to different language families?

• How can existing language resource infrastructures be scaled to thousands of languages?

• How can research on and resource development for under-resourced languages benefit from current advances in semantic and Semantic Web technologies, and specifically the Linked Data framework?

Guest editors:
Laurette Pretorius – University of South Africa, South Africa (pretol AT unisa DOT ac DOT za)
Claudia Soria – CNR-ILC, Italy (claudia.soria AT ilc DOT cnr DOT it)

Sabine Bartsch, Technische Universität Darmstadt, Germany
Delphine Bernhard, LILPA, Strasbourg University, France
Peter Bouda, CIDLeS – Interdisciplinary Centre for Social and Language Documentation, Portugal
Paul Buitelaar, Insight Centre for Data Analytics, NUIG, Ireland
Steve Cassidy, Macquarie University, Australia
Christian Chiarcos, Frankfurt University, Germany
Thierry Declerck, DFKI GmbH, Language Technology Lab, Germany
Mikel Forcada, University of Alicante, Spain
Dafydd Gibbon, Bielefeld University, Germany
Yoshihiko Hayashi, Graduate School of Language and Culture, Osaka University, Japan
Sebastian Hellmann, Leipzig University, Germany
Simon Krek, Jožef Stefan Institute, Slovenia
Tobias Kuhn, ETH, Zurich, Switzerland
Joseph Mariani, LIMSI-CNRS & IMMI, France
John McCrae, Bielefeld University, Germany
Steven Moran, Universität Zürich, Switzerland
Kellen Parker, National Tsing Hua University, China
Patrick Paroubek, LIMSI-CNRS, France
Taher Pilehvar, “La Sapienza” Rome University, Italy
Maria Pilar Perea i Sabater, Universitat de Barcelona, Spain
Laurette Pretorius, University of South Africa, South Africa
Leonel Ruiz Miyares, Centro de Linguistica Aplicada (CLA), Cuba
Kevin Scannell, St. Louis University, USA
Ulrich Schäfer, Technical University of Applied Sciences Amberg-Weiden, Bavaria, Germany
Claudia Soria, CNR-ILC, Italy
Nick Thieberger, University of Melbourne, Australia
Eveline Wandl-Vogt, Austrian Academy of Sciences, ICLTT, Austria
Michael Zock, LIF-CNRS, France