Language is never, ever, ever random

“Language is never, ever, ever random” (Kilgarriff, 2005), not in its usage, not in its acquisition, and not in its processing. (Nick C. Ellis, 2017, p. 41)

Nick C. Ellis (2017). Cognition, Corpora, and Computing: Triangulating Research in Usage-Based Language Learning. Language Learning 67(S1), pp. 40–65

Corpus of North American Spoken English (CoNASE)

The Corpus of North American Spoken English (CoNASE), a 1.25-billion-word corpus of geolocated automatic speech-to-text transcripts, is now available in a beta version.

See http://cc.oulu.fi/~scoats/CoNASE.html for more information.

The corpus was created from 301,847 ASR transcripts from 2,572 YouTube channels, corresponding to 154,041 hours of video. The size of the corpus is 1,252,066,371 word tokens.

The channels sampled in the corpus are associated with local government entities such as town, city, or county boards and councils, school or utility districts, regional authorities such as provincial or territorial governments, or other governmental organizations.

The transcripts are primarily of recordings of public meetings, although other genres are also present. Video transcripts have been assigned exact latitude-longitude coordinates using a geocoding script.

This information was distributed through the Corpora-List by Steven Coats, University of Oulu, Finland.

To cite the corpus, please use:

Coats, Steven. 2021. Corpus of North American Spoken English (CoNASE). http://cc.oulu.fi/~scoats/CoNASE.html.

Online dissemination event for the Nutcracker research project, 24-25 June 2021

NUTCRACKER: Sistema de detección, rastreo, monitorización y análisis del discurso terrorista en la Red. Funded by: MINECO, 2017-2020. FFI2016-79748-R

R&D&I Projects – State Programme for Research, Development and Innovation Oriented to the Challenges of Society.

“Nutcracker: System for Detection, Tracking, Monitoring and Analysis of the Discourse of Terror on the Net”

LINK 24 JUNE

https://oficinavirtual.ugr.es/redes/SOR/SALVEUGR/accesosala.jsp?IDSALA=22980794
Password: 657396

LINK 25 JUNE

https://oficinavirtual.ugr.es/redes/SOR/SALVEUGR/accesosala.jsp?IDSALA=22980797
Password: 466561

PIs: Prof Encarnación Hidalgo Tenorio and Prof Juan Luis Castro Peña, Universidad de Granada

Where is home? EU citizens as migrants.

Approaches to migration, language and identity 2020 AMLI Conference

University of Sussex, Wednesday 9 – Friday 11 June 2021

Book of abstracts.


Pascual Pérez-Paredes & Elena Remigi
Universidad de Murcia / The In Limbo Project

When?

Thursday June 10, Panel A: Foregrounding migrant perspectives 11:25 UK time


Abstract

Since January 2021, UK and EU citizens can no longer exercise freedom of movement between the two areas. EU, EEA or Swiss citizens living in the UK before 31 December 2020 have been forced to apply to the EU Settlement Scheme to continue living in the UK. In practical terms, EU citizens have become a new migrant community. The 2016 Brexit referendum started a period of uncertainty, agony and frustration for both EU citizens in the UK and UK citizens in the EU that ended with the trade deal that the EU and the UK made public on 24 December 2020. The anger, the sense of betrayal (Bueltmann, 2020) and various mental health issues (Reimer, 2018; Bueltmann, 2020), however, linger on. This study uses a corpus of 200 testimonies from EU citizens in the UK to explore their feelings and reactions to Brexit and the hostile environment (Leudar et al., 2008) that emerged soon after the referendum. The In Limbo corpus of testimonies contains personal accounts by EU citizens living in Britain from 2017 until 2020. It has 81,000 tokens and 7,600 types. The collection of the data was organised by volunteers on a not-for-profit basis. The testimonies in Remigi, Martin & Sykes (2020) were chosen as the basis of our corpus.


We used keyword (Baker, 2006; Baker et al., 2008) and collocation (Baker, 2006; Pérez-Paredes, Aguado & Sánchez, 2017; Pérez-Paredes, 2020) analyses to explore the self-representation of EU citizens across four emerging areas of interest: family life, loss of identity, feeling unwelcome and representations of post-Brexit Britain, including discourses about settled status and Britishness. In order to moderate the impact of Brexit-as-a-topic in the analysis of the narratives, we used two reference corpora in our study: the Brexit corpus and the enTenTen 2015, both provided through Sketch Engine. We used Wodak’s (2001) framework of analysis of representation strategies to pin down our discussion of the discourses emerging in the testimonies. Two strategies appear to be relevant in the context of our data: predication and perspectivation. The former is used mainly when expressing feelings about the UK, while the latter is crucial to delivering the narratives discursively. While our research confirms some of the conclusions in the survey conducted by Bueltmann (2020), the combination of corpus-based CDA methods and the rich data provided through these narratives opens up further understanding of the discursive strategies used by EU citizens when resisting the anti-EU environment that was unleashed in the wake of Brexit. Our analysis provides an alternative representation of the consequences and impact of Brexit on EU migrants, one that contrasts with the recent triumphalist discourse of the Tory government, which misrepresents EU citizens as happily embracing the settled status scheme.

Keywords: Brexit, EU citizens, migrants, keyword analysis, representation strategies

Download the top 100 multiword key terms from the In Limbo corpus.
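For readers unfamiliar with keyword analysis, here is a minimal Python sketch of how a keyness score can be computed against a reference corpus. It assumes the "simple maths" score that Sketch Engine documents as its keyness measure, (fpm_focus + N) / (fpm_ref + N), with N = 1; the abstract does not say which settings were used in the study. All function names, the reference corpus size and the word counts are invented for illustration; only the 81,000-token size of the focus corpus comes from the abstract.

```python
def per_million(count: int, corpus_size: int) -> float:
    """Normalise a raw frequency to a frequency per million tokens."""
    return count * 1_000_000 / corpus_size


def simple_maths_keyness(focus_count: int, focus_size: int,
                         ref_count: int, ref_size: int,
                         smoothing: float = 1.0) -> float:
    """Keyness as (fpm_focus + N) / (fpm_ref + N).

    The smoothing constant N avoids division by zero for words that are
    absent from the reference corpus. N = 1 is an assumption here.
    """
    fpm_focus = per_million(focus_count, focus_size)
    fpm_ref = per_million(ref_count, ref_size)
    return (fpm_focus + smoothing) / (fpm_ref + smoothing)


def rank_keywords(focus_counts: dict, focus_size: int,
                  ref_counts: dict, ref_size: int,
                  smoothing: float = 1.0) -> list:
    """Return (word, score) pairs sorted from most to least key."""
    scores = {
        word: simple_maths_keyness(count, focus_size,
                                   ref_counts.get(word, 0), ref_size,
                                   smoothing)
        for word, count in focus_counts.items()
    }
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: these counts are NOT taken from the In Limbo corpus.
FOCUS_SIZE = 81_000      # focus corpus size reported in the abstract
REF_SIZE = 1_000_000     # invented reference corpus size
focus_counts = {"home": 162, "the": 4050, "unwelcome": 81}
ref_counts = {"home": 500, "the": 50_000, "unwelcome": 2}

ranked = rank_keywords(focus_counts, FOCUS_SIZE, ref_counts, REF_SIZE)
for word, score in ranked:
    print(f"{word}\t{score:.2f}")
```

Words that are proportionally much more frequent in the focus corpus than in the reference corpus rise to the top, while function words with similar relative frequencies in both score close to 1. Comparing against a topic-matched reference corpus, as the study does with the Brexit corpus, keeps topic vocabulary from dominating the list.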

How learners are using corpora in EMI contexts

This talk was part of Cambridge University Press ELS Insights on Demand.

You can download my presentation slides here.

Here's a list of the references I used in this presentation:

Biber, D. (2019). Text-linguistic approaches to register variation. Register Studies, 1(1), 42-75.

Biber, D., & Conrad, S. (2009). Register, genre, and style. Cambridge University Press.

Brian, A. (2020). A case study of corpus-informed ESP language learning materials for EMI psychology students at the University of Padova.

Curry, N., & Pérez-Paredes, P. (2021). Understanding lecturers’ practices and processes: A qualitative investigation of English-medium education in a Spanish multilingual university. In M. L. Carrió-Pastor & B. Bellés Fortuño (Eds.), Teaching language and content in multicultural and multilingual classrooms. Palgrave Macmillan.

Dafouz, E., & Smit, U. (2016). Towards a dynamic conceptual framework for English-medium education in multilingual university settings. Applied Linguistics, 37(3), 397-415.

Dafouz, E., & Smit, U. (2020). ROAD-MAPPING English medium education in the internationalised university. London: Palgrave Macmillan.

Dushku, S. & Thompson, P. (2020). CAMPUS TALK. Edinburgh University Press.

Jablonkai, R. R. (2019). Corpus linguistic methods in EMI research: A missed opportunity? In Research methods in EMI. Routledge.

Kırkgöz, Y., & Dikilitaş, K. (2018). Recent developments in ESP/EAP/EMI contexts. In Key issues in English for specific purposes in higher education (pp. 1-10). Springer, Cham.

Kunioshi, N., Noguchi, J., Tojo, K., & Hayashi, H. (2016). Supporting English-medium pedagogy through an online corpus of science and engineering lectures. European Journal of Engineering Education, 41(3), 293-303.

O’Keeffe, A., McCarthy, M., & Carter, R. (2007). From corpus to classroom: Language use and language teaching. Cambridge University Press.

Street, B. (2004). Academic literacies and the ‘new orders’: Implications for research and practice in student writing in higher education. Learning & Teaching in the Social Sciences, 1(1).

Timmis, I. (2015). Corpus linguistics for ELT: Research and practice. Routledge.