Categories
corpus corpus linguistics English English Language learner corpus learner language learning Manipulating text MOOC Recursos research researching corpus use resources Subjects at UMU text analysis text tools text-analytics universidad www resources

#corpusMOOC Corpus Linguistics: Method, Analysis, Interpretation starts Sept 29

This free MOOC offers a practical introduction to the methodology of corpus linguistics for researchers in the social sciences and humanities. It is an 8-week course run by Lancaster University.

More information here.

Categories
language analysis NLP programming languages text tools

Language Processing with Perl and Prolog, 2nd edition @Springer

Language Processing with Perl and Prolog, 2nd edition
By Pierre Nugues
Published by Springer

This book has a companion website at http://ilppp.cs.lth.se/
and can be ordered from Springer: http://www.springer.com/978-3-642-41463-3
or Amazon: http://www.amazon.com/Language-Processing-Perl-Prolog-Implementation/dp/364241463X/

Overview:
This book teaches the principles of natural language processing. It first covers practical linguistic issues such as encoding and annotation schemes, defining words, tokens and parts of speech, and morphology, as well as key concepts in machine learning, such as entropy, regression, and classification, which are used throughout the book. It then details the language-processing functions involved: part-of-speech tagging using rules and stochastic techniques; using Prolog to write phrase-structure grammars; syntactic formalisms and constituent and dependency parsing techniques; semantics, predicate logic, and lexical semantics; and analysis of discourse and applications in dialogue systems. A key feature of the book is the author’s hands-on approach throughout, with sample code in Prolog and Perl, extensive exercises, and a detailed introduction to Prolog. The reader is supported with a companion website that contains teaching slides, programs, and additional material.

The second edition is a complete revision of the techniques presented in the first edition, reflecting advances in the field. The author redesigned or updated all the chapters, added two new ones, and considerably expanded the sections on machine-learning techniques.

Contents:
1 An Overview of Language Processing
2 Corpus Processing Tools
3 Encoding and Annotation Schemes
4 Topics in Information Theory and Machine Learning
5 Counting Words
6 Words, Parts of Speech, and Morphology
7 Part-of-Speech Tagging Using Rules
8 Part-of-Speech Tagging Using Statistical Techniques
9 Phrase-Structure Grammars in Prolog
10 Partial Parsing
11 Syntactic Formalisms
12 Constituent Parsing
13 Dependency Parsing
14 Semantics and Predicate Logic
15 Lexical Semantics
16 Discourse
17 Dialogue
Appendix: An Introduction to Prolog
Index
References

Categories
hashtags sentiment analysis text analysis text tools text-analytics twitter

Hashtags and text mining

Twitter: How to archive event hashtags and create an interactive visualization of the conversation (Link)

TAGSExplorer Beta (Link)

TAGS v3 (Link)

AYLIEN Text analysis API demo (Link)

Categories
MAC text tools

OS X Find and Replace Using Sed
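
As a minimal sketch of the kind of find and replace the title refers to, the commands below substitute one word for another inside a text file. The file name `sample.txt` and the words being swapped are made up for illustration. The `-i.bak` form is used because the BSD sed that ships with OS X expects a backup suffix with `-i`, and attaching it directly also works with GNU sed.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d "${TMPDIR:-/tmp}/sed-demo.XXXXXX")
cd "$workdir" || exit 1

# Hypothetical sample file for illustration.
printf 'colour is a colour word\nno match here\n' > sample.txt

# Replace every "colour" with "color", editing the file in place.
# -i.bak keeps the untouched original as sample.txt.bak.
# The trailing g replaces all matches on a line, not just the first.
sed -i.bak 's/colour/color/g' sample.txt

cat sample.txt
```

Once the result has been checked, the `.bak` copy can be deleted.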

Categories
corpus linguistics MAC text tools Tools

Split & Concat for Mac OS

Split & Concat Version 3.0 for Mac OS 10.5 and 10.6 “Snow Leopard”
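
For readers who prefer the Terminal, a rough command-line equivalent can be sketched with the standard `split` and `cat` utilities. This is not how Split & Concat itself works internally; it only assumes, from the tool's name, that the task is cutting a large file into pieces and joining them again. The file names and the 4-line piece size are hypothetical.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d "${TMPDIR:-/tmp}/split-demo.XXXXXX")
cd "$workdir" || exit 1

# Hypothetical sample corpus: 10 numbered lines.
i=1
while [ "$i" -le 10 ]; do
  echo "line $i"
  i=$((i + 1))
done > corpus.txt

# Split into pieces of at most 4 lines: part_aa, part_ab, part_ac.
split -l 4 corpus.txt part_

# Concatenate the pieces; the shell expands part_* in sorted order,
# which matches the order in which split wrote them.
cat part_* > rejoined.txt

# cmp exits with status 0 when the two files are identical.
cmp corpus.txt rejoined.txt && echo "round trip OK"
```

For splitting by size rather than by line count, `split -b` takes a byte count instead.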
