Let’s talk, Hawking says

 

Mankind’s greatest achievements have come about by talking, and its greatest failures by not talking. It doesn’t have to be like this. Our greatest hopes could become reality in the future. With the technology at our disposal, the possibilities are unbounded. All we need to do is make sure we keep talking.

Stephen Hawking

Free ngram databases from COW14 web corpora

From the corpora list

::::::::::::::::::::::::::::::

We are pleased to announce the release of the first very large ngram databases derived from the giga-token COW14 web corpora. They are completely free (CC-BY) and can be downloaded without registration. We have applied no frequency thresholds whatsoever. In addition to the counted ngram lists, we offer raw versions so that everyone can create their own version. The raw ngrams also contain additional information (crawl year, top-level domain, country geolocation).

There are also English dependency bigrams (based on Malt parses) containing words, their heads, and the dependency relation between them.

For end-users, there are also word and lemma frequency lists with some convenient frequency measures, optionally with a frequency threshold of 10 (smaller files, easier handling).

------------------------------

LICENSE AND REFERENCES

License Creative Commons Attribution 4.0 International
References http://corporafromtheweb.org/category/cow-citation/

Please tell us whenever you publish work based on COW:
https://webcorpora.org/publication/

DOWNLOAD

http://hpsg.fu-berlin.de/cow/ngrams/
http://hpsg.fu-berlin.de/cow/frequencies/

ORIGIN AND ORIGINAL CORPUS SIZES

The ngrams are derived from the COW14AX sentence-shuffled corpora.

Information http://corporafromtheweb.org/category/corpora/
Interface https://webcorpora.org/

English 9,578,828,861 tokens (International)
German 11,660,894,000 tokens (AT, CH, DE)
Spanish 3,680,794,644 tokens (International)
Swedish 4,842,753,707 tokens (FI, SV)

FREQUENCY LISTS

Languages English, German, Spanish, Swedish
Versions Lemma, Lemma + POS, Word, Word + POS
Thresholds no threshold; raw frequency > 9
Measures raw frequency, absolute rank, frequency per million, log-frequency per million, frequency band
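
As a rough illustration of how measures like these relate to raw counts, here is a minimal Python sketch. It is not the COW authors' code: the function name, the tie-breaking rule for ranks, the base-10 logarithm, and the choice to compute per-million values against the full (unthresholded) token total are all assumptions made for the example. The frequency band is left out because its definition is not given in this announcement.

```python
import math
from collections import Counter


def frequency_table(counts, min_freq=0):
    """Build rows of (item, raw, rank, per_million, log10_per_million).

    Hypothetical helper, not part of any COW tooling.
    `counts` maps a word or lemma to its raw frequency.
    Items with raw frequency below `min_freq` are dropped, but ranks and
    per-million values are still computed against the full list.
    """
    total = sum(counts.values())  # corpus size in tokens
    # Assumption: ties are broken alphabetically so ranks are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    rows = []
    for rank, (item, raw) in enumerate(ranked, start=1):
        if raw < min_freq:
            continue
        per_million = raw * 1_000_000 / total
        rows.append((item, raw, rank, per_million, math.log10(per_million)))
    return rows


# Toy data standing in for a real frequency list.
toy = Counter("the cat sat on the mat and the dog sat".split())
for row in frequency_table(toy, min_freq=2):
    print(row)
```

With the real lists, `min_freq=10` would correspond to the "raw frequency > 9" threshold mentioned above.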

NGRAMS

N 1 .. 5
Languages English, German, Spanish, Swedish
Versions Raw, Word, Word + POS, Lemma (except Swedish)

DEPENDENCY BIGRAMS

Languages English (German soon, maybe Swedish)
Versions Raw, Word, Word + POS, Lemma, Lemma + POS

CFP International Journal of Bilingual Education and Bilingualism: Special Issue 2017

Through the AESLA mail-list

::::::::::::::::::::::::::::::::::::::::::

 

CALL FOR PAPERS

International Journal of Bilingual Education and Bilingualism: Special Issue 2017

As guest editors (Yolanda Ruiz de Zarobe and Roy Lyster) of a Special Issue of the International Journal of Bilingual Education and Bilingualism, we invite you to submit proposals on the following topic:

Instructional practices and teacher development in Content and Language Integrated Learning (CLIL)

The aim of this Journal is to be thoroughly international in nature. It disseminates high-quality research, theoretical advances, and international developments related to initiatives in bilingualism and bilingual education. Each year the International Journal of Bilingual Education and Bilingualism devotes two of its issues to Special Issues.

Previous Special Issues have tended to receive remarkable praise, particularly as they focus on one issue and often provide a major step forward in the study of a particular

This Special Issue on CLIL seeks:

• To promote theoretical and applied research conducted in the context of CLIL and other content-based programs such as immersion.

• To disseminate information about best practices in content-based instruction.

• To provide a truly international exchange on how CLIL pedagogy is applied in a wide

Authors are invited to submit proposals focusing on instructional practices and teacher development in CLIL at any educational level and in any educational setting. Both state-of-the-art articles and empirical studies are welcomed. Manuscripts submitted should be original, not under review by any other publication and not published

– Deadline for 200-250 word abstracts: 15th September 2015. Proposals should be submitted by email attachment to the co-editors at yolanda.ruizdezarobe@ehu.es and

They should contain the author’s name, affiliation and e-mail address.

– Notification of acceptance/rejection: 1st November 2015. Please note that selection of the proposal does not always guarantee publication.

– Deadline for full papers (no longer than 7,000 words including notes and references): 15th February 2016. Each article will receive two independent and anonymous

 

For further information on the journal’s submission guidelines please visit:

http://www.tandf.co.uk/journals/rbeb

 

CFP Posters on late-breaking results June 15 deadline

Through the corpora list

:::::::::::::::::::::::::::::::::
CORPUS LINGUISTICS 2015

The CL2015 organising committee is pleased to issue a call for posters on late-breaking results on any of the topics in the conference’s scope. By “late-breaking” we mean research which was not at a sufficiently advanced stage for an abstract submission to be made in the main submission cycle, but which has now reached that point.

We anticipate that the research in question will still be in its earliest phases. “Late-breaking results” include – but are not necessarily limited to – pilot study results, corpus creation activities currently in hand, newly-developed software, and so on.

· Abstracts should be 400-750 words in length. They must be formatted using the conference stylesheet (available to download from http://ucrel.lancs.ac.uk/cl2015/call.php )

· We especially encourage submission of abstracts from early-career researchers, including postgraduate research students and postdoctoral researchers.

· Abstracts which were previously submitted for the January deadline, and not accepted, are NOT eligible to be resubmitted.

· Abstracts should be submitted by email to cl2015@lancaster.ac.uk by 15th June 2015.

· As with all presentations, at least one author of any late-submission poster must attend the conference.
For more details see http://ucrel.lancs.ac.uk/cl2015

An archive copy of the previously-circulated CL2015 Call for Participation may be found here: http://ucrel.lancs.ac.uk/cl2015/doc/CL2015-CallParticipation.pdf

Andrew Hardie, Tony McEnery, Amanda Potts, Vaclav Brezina, and Paul Rayson
The CL2015 Organising Committee