“Culture & Technology” – European Summer University in Digital Humanities

Through the corpora list

:::::::::::::::::::::::::::::::::::

“Culture & Technology” – European Summer University in Digital Humanities (ESU DH C & T) 28th of July – 07th of August 2015, University of Leipzig http://www.culingtec.uni-leipzig.de/ESU_C_T/

As the application phase closes soon (31st of May 2015) we would like to draw your attention (again) to the various types of support which are available for participants of the European Summer School (see: http://www.culingtec.uni-leipzig.de/ESU_C_T/node/480):
The German Academic Exchange Service (DAAD) offers very generous support to up to 17 alumni / alumnae of German universities. Former Erasmus students and students / researchers of Universities of Applied Science, Art or Music Schools also qualify as alumni / alumnae, as long as they have spent a total of 3 months at academic institutions in Germany.
The University of Leipzig through its International Centre makes available up to 10 bursaries for members of its Eastern European partner universities.
CLARIN-DE makes available up to 13 fellowships which cover tuition fees. If funding allows, an allowance of up to € 200 will be granted to cover costs of living.
The Electronic Textual Cultures Lab at the University of Victoria (ETCL), in conjunction with the Digital Humanities Summer Institute, offers up to 5 tuition fellowships for international graduate students and postdoctoral fellows.
As ESU DH C & T is a member of the International Digital Humanities Training Network, courses taken at the Summer University are eligible for transfer credit towards the University of Victoria Graduate Certificate in DH (http://english.uvic.ca/graduate/digital_humanities.html).

The Summer University takes place across 11 whole days. The intensive programme consists of workshops, public lectures, regular project presentations, a poster session and a panel discussion. The workshop programme is composed of the following thematic strands:
XML-TEI encoding, structuring and rendering
Methods and Tools for the Corpus Annotation of Historical and Contemporary Written Texts
Comparing Corpora
Spoken Language and Multimodal Corpora
Python
Basic Statistics and Visualization with R
Stylometry
Open Greek and Latin
Digital Editions and Editorial Theory: Historical Texts and Documents
Spatial Analysis in the Humanities
Building Thematic Research Collections with Drupal
Introduction to Project Management
Each workshop consists of a total of 16 sessions or 32 week-hours. The number of participants in each workshop is limited to 10. Workshops are structured in such a way that participants can either take the two blocks of one workshop or two blocks from different workshops.

Descriptions of all workshops can be found in at least two languages at http://www.culingtec.uni-leipzig.de/ESU_C_T/node/481. Short bios of most workshop leaders are available in at least two languages at http://www.culingtec.uni-leipzig.de/ESU_C_T/node/488.

Applications are considered on a rolling basis. The selection of participants is made by the Scientific Committee together with the experts who lead the workshops.

Participation fees are the same as last year.

The Summer University is directed at 60 participants from all over Europe and beyond. It aims to bring together (doctoral) students, young scholars and academics from the Arts and Humanities, Library Sciences, Social Sciences, Engineering and Computer Sciences as equal partners in an interdisciplinary exchange of knowledge and experience in a multilingual and multicultural context, and thus to create the conditions for future project-based cooperation and network-building across the borders of disciplines, countries and cultures.

The Summer University seeks to offer a space for the discussion and acquisition of new knowledge, skills and competences in those computer technologies which play a central role in Humanities Computing and which increasingly shape the work done in the Humanities and Cultural Sciences, as well as in publishing, libraries, and archives, to name only some of the most important areas. The Summer University aims at integrating these activities into the broader context of the Digital Humanities, which pose questions about the consequences and implications of the application of computational methods and tools to cultural artefacts of all kinds.

In all this the Summer University aims at confronting the so-called Gender Divide, i.e. the under-representation of women in the domain of Information and Communication Technologies (ICT) in Germany and Europe. But, instead of strengthening the hard sciences as such by following the way taken by so many measures which focus on the so-called STEM disciplines and try to convince women of the attractiveness and importance of Computer Science or Engineering, the Summer University relies on the challenges that the Humanities with their complex data and their wealth of women represent for Computer Science and Engineering and the further development of the latter, on the overcoming of the borders between the so-called hard and soft sciences and on the integration of Humanities, Computer Science and Engineering.

As the Summer University is dedicated not only to the acquisition of knowledge and skills, but also wants to foster community building and networking across disciplines, languages and cultures, countries and continents, the programme of the Summer School also features communal coffee breaks, communal lunches in the refectory of the university, and a rich cultural programme (thematic guided tours, visits to archives, museums and exhibitions, and communal dinners in different parts of Leipzig).

For all relevant information please consult the Web-Portal of the European Summer School in Digital Humanities “Culture & Technology”: http://www.culingtec.uni-leipzig.de/ESU_C_T/ which will be continually updated and integrated with more information as soon as it becomes available.

With best regards, Elisabeth Burr

Prof. Dr. Elisabeth Burr
Französische / frankophone und italienische Sprachwissenschaft
Institut für Romanistik
Universität Leipzig
Beethovenstr. 15
D-04107 Leipzig
http://www.uni-leipzig.de/~burr

eResearch days on Social Media and CMC Corpora for the eHumanities

Call for papers (abstract deadline 15th May)

The first international research days (IRDs) on Social Media and CMC Corpora for the eHumanities will be held in Rennes, France on 23-24th October 2015 and will focus on communication and interactions stemming from networks such as the Internet or telecommunications, as well as mono- and multimodal, synchronous and asynchronous communications. The focus of the IRDs will encompass different CMC genres. These include, but are not limited to, discussion forums, blogs, newsgroups, emails, SMS and WhatsApp, text chats, wiki discussions, social network exchanges (such as Facebook, Twitter, LinkedIn, wikis (Wikipedia type)), discussions in multimodal and/or 3D environments.
The aim of the IRDs is to bring together researchers who have collected CMC data and who wish to organize and share them for research purposes. The IRDs will focus on the process of building CMC corpora including annotation and analysis processes as well as on questions of ethics and rights raised by publishing CMC corpora as open data. We invite researchers who are concerned with the analysis of various types of CMC data and corpora for linguistic or applied linguistic purposes to submit paper presentations.

Topics of interest (not limited to)
******************
The IRDs will have three thematic streams:
– 1) Development of CMC corpora
o Building CMC corpora: from data collection to publication
o Open data for research on CMC: questions of ethics and rights
o One or several models of CMC genres (e.g. extension of the TEI model, etc.)
o Multimodal corpora
– 2) Annotations and analysis
o Discourse and dialog analysis of online discussions: chat, forums, SMS, Wikipedia discussions, social network exchanges, blogs, newsgroups, etc.
o Study of social networks through their communication: informal, professional, learning or other communities
o Contrastive analyses of specific CMC genres between several language communities (e.g. languages in contact)
o Interaction analysis in online learning situations
o Multimodality in interactions
– 3) Natural Language Processing (NLP) applied to CMC
o Tagging and Parsing CMC texts
o Dealing with abbreviations and typos
o Dealing with morphosyntactic, lexical, … variation (e.g. in corpora produced by deaf writers)

Presentation categories
************************
Colleagues are invited to submit abstracts for paper presentations, each consisting of a 20-minute talk followed by 10 minutes for questions and discussion.
Please note there will not be a poster session during these IRDs.

How to submit your proposal
************************
Papers can be submitted in either English or French. The language in which you submit your abstract should be the language in which you will present if your paper is accepted. All abstracts will be peer-reviewed by the conference programme committee.
Abstracts should be between 500 and 1000 words in length (excluding references). They should be submitted at http://ird-cmc-rennes.sciencesconf.org/user/submit before 15th May 2015. Authors will be notified of outcomes by 15th July 2015.
Please note that, when submitting your abstract, you will have to select from a list of three strands and provide three or four keywords that will help place your abstract into the appropriate category. You can type your abstract directly into the online form or paste a previously edited text. Plain text should be used. If you want to include formatting (bold, italics, etc.) or charts, tables, etc., please use a file attachment in PDF format. When entering author information, please include all authors and do not simply list the corresponding author.

Important dates
************************
15 May 2015: Paper submission deadline
15 July 2015: Authors notified of outcomes
23-24 October 2015: International Research Days, Rennes

Invited speakers
*********************
– Egon W. Stemle : DiDi project (http://www.eurac.edu/de/research/autonomies/commul/staff/Pages/staffdetails.aspx?persId=27143)
– Angelika Storrer : Wikipedia as a resource for linguistic research (http://germanistik.uni-mannheim.de/abteilungen/germanistische_linguistik/prof_dr_angelika_storrer/index.html)
– Pascal Vaillant : Clapoty project (http://clapoty.vjf.cnrs.fr/)

Programme Committee (not yet complete)
******************
Georges Antoniadis (U. Grenoble, France)
Valerie Beaudouin (Telecom, ParisTech, France)
Michael Beisswenger (U. Dortmund, Germany)
Thierry Chanier (U. Blaise Pascal, France)
Isabella Chiari (U. Sapienza, Italy)
Linda Hriba (U. Orleans, France)
Gudrun Ledegen (U. Rennes 2, France)
Julien Longhi (U. Cergy-Pontoise, France)
Jean-Philippe Mague (ENS Lyon, France)
Amanda Potts (Lancaster University, United Kingdom)
Celine Poudat (U. Nice, France)
Ciara R. Wigham (U. Lyon2, France – chair)


Thierry Chanier
Laboratoire de Recherche sur le Langage (LRL)
Département de Linguistique
Université Blaise Pascal (Clermont 2)
thierry.chanier@univ-bpclermont.fr
Tel : +33 3 4 73 34 68 39

Address: Université Blaise Pascal,
Maison des Sciences de l’Homme – LRL
4 rue Ledru
63057 Clermont-Ferrand cedex 1
http://lrlweb.univ-bpclermont.fr/

Text simplification and complexity

Presentation by Advaith Siddharthan, University of Aberdeen

Publications

Aim: to reduce linguistic complexity along several dimensions: lexis, syntax, text length, information order, cohesion (cohesive texts are easier to follow), numerical simplification, text quality (texts with errors are difficult to read, hence spelling and grammar checks), and how engaging a text is (personal narratives, humour).

Monolingual translation: one language into the same language.

Motherese talk as a way to simplify

Controlled language: O’Brien (2003): user manuals that are easy to translate

Peterson & Ostendorf (2007) for L2 learners

Who is the target reader of a simplified text system?

http://www.breakingnewsenglish.com/

 

Evaluating complexity in syntax

Presentation by Philippe Blache, Laboratoire Parole et Langage, CNRS & Aix-Marseille Université

Part of the workshop “Measuring linguistic complexity: A multidisciplinary perspective” at UCL, Belgium, 24 April 2015

Complexity means different things to different people

System vs Structural complexity (Dahl, 2004)

Existing models: incomplete dependency hypothesis, dependency locality theory, early immediate constituents principle, activation. However, they all fail to describe language in its natural environment.

Challenges: dealing with natural data and dealing with language in its context, esp. spoken language and natural interaction.

Hypothesis: difficulty depends on the size of the search space. The larger the search space, the greater the difficulty.

The more properties, the smaller the search space. Maximize On-line Processing principle (Hawkins, 2004).

Generative grammar is a very restrictive view.

Property grammars: linguistic statements as constraints (filtering + instantiating)

Basics: constraints are independent, linear precedence.

Constraint violation is possible.

 

L2 complexity, a difficult matter

Presentation by Alex Housen

Part of the workshop “Measuring linguistic complexity: A multidisciplinary perspective” at UCL, Belgium, 24 April 2015

Complexity in SLA research

Early days: 70s and 80s; simplification vs complexification

1990-2010s: rarely investigated for its own sake

As independent variable: task complexity, complexity of the L2 target feature

As dependent variable: descriptor of L2 performance, indicator of L2 proficiency, index of L2 development in studies dealing with the effects of age, learning context, learner variables, etc.

So far it has yielded inconclusive results.

Spada & Tomita (2010): meta-analysis of the effect of target feature complexity on the effectiveness of L2 instruction

Definitions are varied and not consistent

Most studies give no construct definition, so complexity is often not explicitly defined as a theoretical construct. Instead it is usually operationalised as a statistical construct, either by means of raters or by quantitative measures of selected features.

Bulté & Housen 2012: review of 40 L2 studies

Grammatical and lexical complexity

Only 3 measures are used in more than 5 studies (MLT-unit + MLS, D + Guiraud)

Motivation for the choice of complexity measures is not specified.

What units are being used?

-Frequency count

-Ratio measures

-Complex measures (indices)
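
To make the difference between a frequency count and a ratio measure concrete, here is a minimal Python sketch. It is an illustration only, not the procedure used in any of the reviewed studies: the function names, the naive sentence splitter and the letters-only tokenizer are all assumptions, and the ratio shown is mean length of sentence (MLS), one of the measures mentioned above.

```python
import re

def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokenize(text):
    """Lowercased word tokens; letters and apostrophes only (assumption)."""
    return re.findall(r"[a-z']+", text.lower())

def frequency_count(text, word):
    """Frequency count: raw number of occurrences of one word form."""
    return tokenize(text).count(word.lower())

def mean_length_of_sentence(text):
    """Ratio measure: total word tokens divided by number of sentences."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    return sum(len(tokenize(s)) for s in sentences) / len(sentences)

sample = "The cat sat. The dog barked loudly at the cat."
print(frequency_count(sample, "the"))
print(mean_length_of_sentence(sample))
```

A raw count grows with text length, whereas a ratio normalises by some unit (here the sentence), which is one reason the choice of unit matters when comparing learner texts.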

Norris & Ortega (2009): Towards an Organic Approach to Investigating CAF in Instructed SLA: The Case of Complexity

The challenge is to bring more clarity and identify the meanings and conceptions of L2 complexity that are relevant for investigating and understanding the nature of L2 structures and L2 systems.

Structural L2 complexity vs Cognitive L2 difficulty: may lead to a property theory in SLA

Complexity as diversity

Jarvis 2013 multidimensional model of lexical diversity

-size/volume

-richness/abundance

-effective number of types/variety

and many others

Abstract:

The range, variety, or diversity of words found in learners’ language use is believed to reflect the complexity of their vocabulary knowledge as well as the level of their language proficiency. Many indices of lexical diversity have been proposed, most of which involve statistical relationships between types and tokens, and which ultimately reflect the rate of word repetition. These indices have generally been validated in accordance with how well they overcome sample-size effects and/or how well they predict language knowledge or behavior, rather than in accordance with how well they actually measure the construct of lexical diversity. In this article, I review developments that have taken place in lexical diversity research, and also describe obstacles that have prevented it from advancing further. I compare these developments with parallel research on biodiversity in the field of ecology, and show what language researchers can learn from ecology regarding the modeling and measurement of diversity as a multidimensional construct of compositional complexity.
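
The abstract refers to indices built on the statistical relationship between types and tokens. As a rough illustration, the sketch below computes two common ones: the plain type-token ratio and Guiraud's index (types divided by the square root of tokens). The function names and the simple tokenizer are assumptions made for this example; this is not the multidimensional model Jarvis proposes.

```python
import math
import re

def tokenize(text):
    """Lowercased word tokens; letters and apostrophes only (assumption)."""
    return re.findall(r"[a-z']+", text.lower())

def type_token_ratio(tokens):
    """Number of distinct word forms (types) divided by number of tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def guiraud_index(tokens):
    """Types divided by the square root of tokens (root type-token ratio)."""
    return len(set(tokens)) / math.sqrt(len(tokens)) if tokens else 0.0

tokens = tokenize("The cat sat on the mat")
print(type_token_ratio(tokens))
print(guiraud_index(tokens))
```

Both indices ultimately reflect how often words are repeated. The plain ratio tends to drop as a text gets longer, which is the kind of sample-size effect the abstract mentions; the square root in Guiraud's index is one simple attempt to dampen it.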

..

Morphological Complexity Index (MCI)

Available tool: Pallotti & Brezina

corpora.lancs.ac.uk/vocab/analyse/

 

Syntactic Diversity Index (SDI)

7 clause and sentence categories