Read the piece here.
One of the most important goals of formal schooling is teaching text varieties that might not be acquired outside of school […] Early in school, children learn to read books of many different types, including fictional stories, historical accounts of past events, and descriptions of natural phenomena. These varieties rely on different linguistic structures and patterns, and students must learn how to recognize and interpret those differences. At the same time, students must learn how to produce some of these different varieties, for example writing a narrative essay on what they did during summer vacation versus a persuasive essay on whether the school cafeteria should sell candy. The amount of explicit instruction in different text varieties varies across teachers, schools, and countries, but even at a young age, students must somehow learn to control and interpret the language of different varieties, or they will not succeed at school.
Biber & Conrad (2009:3)
Biber, D., & Conrad, S. (2009). Register, genre, and style (Cambridge Textbooks in Linguistics). Cambridge University Press.
From the TAALES website:
Kyle, K. & Crossley, S. A. (2015). Automatically assessing lexical sophistication: Indices, tools, findings, and application. TESOL Quarterly 49(4), pp. 757-786. doi: 10.1002/tesq.194
TAALES is a tool that measures over 400 classic and new indices of lexical sophistication, and includes indices related to a wide range of sub-constructs. TAALES indices have been used to inform models of second language (L2) speaking proficiency, first language (L1) and L2 writing proficiency, spoken and written lexical proficiency, genre differences, and satirical language.
Starting with version 2.2, TAALES provides comprehensive index diagnostics, including text-level coverage output (i.e., the percent of words/bigrams/trigrams in a text covered by the index) AND individual word/bigram/trigram index coverage information.
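To make the idea of text-level coverage concrete, here is a minimal sketch of how such a percentage could be computed. It is not TAALES's own code: the whitespace tokenizer, the `toy_index` dictionary and both function names are assumptions made for illustration only.

```python
def ngrams(tokens, n):
    """Return the list of n-grams (joined with spaces) found in tokens."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def coverage_percent(items, index):
    """Return the percentage of items (words or n-grams) that have an entry in index."""
    if not items:
        return 0.0
    covered = sum(1 for item in items if item in index)
    return 100.0 * covered / len(items)


# Hypothetical toy index: word -> score. Real indices are far larger.
toy_index = {"the": 6.0, "cat": 4.2, "sat": 3.9}

# Naive tokenization by whitespace, an assumption for this sketch.
tokens = "the cat sat on the mat".lower().split()

word_coverage = coverage_percent(tokens, toy_index)
bigram_coverage = coverage_percent(ngrams(tokens, 2), toy_index)
```

Here four of the six word tokens appear in the toy index, so word coverage is about two thirds, while none of the bigrams are covered because the toy index holds single words only.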
TAALES takes plain text files as input (it will process all plain text files in a particular folder) and produces a comma separated values (.csv) spreadsheet that is easily read by any spreadsheet software.
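Since the output is an ordinary .csv file, it can also be read from a script rather than a spreadsheet. The sketch below uses only Python's standard `csv` module. The function names are made up for this example, and the column names in the output depend on which indices were selected, so none are assumed here.

```python
import csv


def load_index_table(csv_path):
    """Read a comma-separated results file into a list of row dictionaries."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def column_mean(rows, column):
    """Average one column over all rows, skipping missing or non-numeric cells."""
    values = []
    for row in rows:
        try:
            values.append(float(row[column]))
        except (KeyError, TypeError, ValueError):
            continue  # cell is absent, empty, or not a number
    return sum(values) / len(values) if values else None
```

Calling `load_index_table` on a results file gives one dictionary per input text, keyed by whatever header row the file contains. `column_mean` then averages any numeric column by name.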
You can find all the info here. Windows and Mac versions are available for free.
Page on Nordic languages by the same author: http://www.sssscomic.com/comic.php?page=195