big data Archives - Pérez-Paredes

R for data science

This website is (and will always be) free to use, and is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License. If you’d like a physical copy of the book, you can order it from Amazon; it was published by O’Reilly in January 2017.

URL resource: https://r4ds.had.co.nz/

My data science resources page. (URL)

1st Intl. NLP for Informal Text - Deadline 17/4

The 1st International Workshop on Natural Language Processing for Informal Text (NLPIT 2015)
In conjunction with The International Conference on Web Engineering (ICWE 2015)
June 23, 2015, Rotterdam, The Netherlands
http://wwwhome.cs.utwente.nl/~badiehm/nlpit2015/

Overview
The rapid growth of Internet usage over the last two decades has created new challenges in understanding the informal user-generated content (UGC) on the Internet. Textual UGC refers to textual posts on social media, blogs, emails, chat conversations, instant messages, forums, reviews, or advertisements that are created by end-users of an online system. A large portion of the language used in textual UGC is informal. Informal text is a style of writing that disregards grammatical rules and uses a mixture of abbreviations and context-dependent terms. The straightforward application of state-of-the-art Natural Language Processing approaches to informal text typically results in significantly degraded performance for the following reasons: the lack of sentence structure; the lack of sufficient context; the rarely occurring entities involved; the noisy, sparse content of users’ contributions; and the untrusted facts contained. The aim of this workshop is to bring the attention of researchers to the opportunities and challenges involved in informal text processing. In particular, we are interested in discussing informal text modeling, normalization, mining, and understanding, in addition to various application areas in which UGC is involved.

Topics

We invite submissions on topics that include, but are not limited to, the following core NLP approaches for informal UGC: language identification, classification, clustering, filtering, summarization, tokenization, segmentation, morphological analysis, POS tagging, parsing, named entity extraction, named entity disambiguation, relation/fact extraction, semantic annotation, sentiment analysis, language normalization, informality modeling and measuring, language generation, handling uncertainties, machine translation, ontology construction, dictionary construction, etc.

Submission

Authors are invited to submit original work that has not been submitted to another conference or workshop. Workshop submissions can be full papers or short papers. Paper length should not exceed 12 pages for full papers and 6 pages for short papers. All papers should follow Springer’s LNCS format. Papers in PDF can be submitted via the EasyChair Conference System at https://easychair.org/conferences/?conf=nlpit2015. Each submission will receive, in addition to a meta-review, at least two double-blind peer reviews. Each full paper will get 25 minutes of presentation time. Short papers will get 5 minutes of presentation time in addition to a poster. Besides papers, we also plan to have an invited talk by a renowned scientist on a topic relevant to the workshop. Workshop proceedings will be published as part of the ICWE 2015 workshop proceedings. To contact the NLPIT 2015 organization team, please send an e-mail to: nlpit2015@easychair.org.

Deadlines

– Submission deadline: April 17, 2015
– Notification deadline: May 17, 2015
– Camera-ready version: May 24, 2015
– Workshop date: June 23, 2015

Msg. distributed through the corpora list

CFP From data to evidence in English language research: Big data, rich data, uncharted data 19-22 October 2015

From data to evidence in English language research: Big data, rich data, uncharted data

***Conference in Helsinki, Finland, 19-22 October 2015***

To diversify the discussion of data explosion in the humanities, the Research Unit for Variation, Contacts and Change in English (VARIENG) is organising an academic conference that addresses the use of new data sources, historical and modern, in English language research. We are particularly interested in papers discussing the advantages and disadvantages of the following three kinds of data:

Big data

In recent years, mega-corpora and other large text collections have become increasingly available to linguists. These databases open new opportunities for linguistic research, but they may be problematic in terms of representativeness and contextualisation, and the sheer amount of data may also pose practical problems. We welcome papers drawing on big data, including large corpora representing different genres and varieties (e.g. COCA, GloWbE), databases (e.g. EEBO, ECCO) and corpora created by web crawling (e.g. EnTenTen, UKWaC).

Rich data

Rich data contains more than just the texts, including representations of spacing, graphical elements, choice of typeface, prosody, or gestures. This is further supplemented by analytic and descriptive metadata linked to either entire texts or individual textual elements. The benefit of rich data is that it can provide new kinds of evidence about pragmatic, sociolinguistic and even syntactic aspects of linguistic events. Yet the creation and use of rich data bring great challenges. We invite papers on the representation, query, analysis, and visualisation of data consisting of more than linear text.

Uncharted data

Uncharted data comprises material which has not yet been systematically mapped, surveyed or investigated. We wish to draw attention to texts and language varieties which are marginally represented in current corpora, to data sources that exist on the internet or in manuscript form alone, and to material compiled for purposes other than linguistic research. We welcome papers discussing the innovative research prospects offered by new and previously unused or even unidentified material for the study of English in various contexts ranging from communities and networks to social groups and individuals.

Abstracts are invited by 15 February 2015 for 30-minute presentations including discussion as well as for posters and corpus and software demonstrations.

The following invited speakers have confirmed their participation:

Professor Mark Davies (Brigham Young University)
Professor Tony McEnery (Lancaster University)
Professor Päivi Pahta (University of Tampere)
Dr Jane Winters (Institute of Historical Research, University of London)

The conference forms part of the programme celebrating the 375th anniversary of the University of Helsinki in 2015 and will be held in the Main Building of the University.

More information on the conference will be available on the conference home page at: http://www.helsinki.fi/varieng/d2e/. Please address any queries to: d2e-conference@helsinki.fi.

#CFP #BigData in a Transdisciplinary Perspective

(Through the Corpora-List)

Herrenhausen Conference, March 25-27, 2015, Hanover, Germany

Big Data in a Transdisciplinary Perspective

Large amounts of data, a variety of sources, high speed production, but also high speed processing – these are the basic characteristics of Big Data. The amount of data that is generated and collected in each second grows exponentially. The management of Big Data, the intelligent use of large, heterogeneous data sets, is becoming increasingly important for competition. It is affecting all sectors – industry and academia but also the public sector. While the economy is exploring Big Data as a new gold mine, politicians are fighting over the problem of data capitalism, whereas science tackles the question of cross-disciplinary benefits, as well as the challenges and the likely consequences for technology, innovation, and society.

The focus of the Herrenhausen Conference lies on open questions, unsolved problems, and future perspectives. The conference on Big Data therefore will not focus on a particular discipline but provide a transdisciplinary forum for Big Data researchers. We would like to discuss the challenges and consequences of Big Data research for society as well as innovation and technology, address the influence on economics as well as the legal framework and close on the challenges for research and research funding in the field of Big Data. Our goal is to create an inspiring setting for the discussion of new ideas.

We invite all researchers and experts working in this field.

There is no fee for attendance, but registration is essential.

Travel Grants
-------------
The Volkswagen Foundation offers up to 30 Travel Grants for young researchers who wish to attend the conference. For more information about the program please visit http://www.volkswagenstiftung.de/bigdata

We invite Ph.D. students or early postdocs (max. 5 years since Ph.D.) working on independent and challenging projects in the field of big data, or in a field for which big data is crucial, to apply for a travel grant.

Applicants are required to apply by *September 30, 2014* using the application form at http://form.jotformeu.com/form/41901708397359

Please note that we are not able to consider applications after this deadline. The travel grant covers attendance, accommodation and travel expenses (excl. cab fares, parking, food and beverages while travelling as well as poster printing costs).

Recipients of the travel grants are required to present their project in a three-minute “Lightning Talk” as well as on a poster to be shown and explained in a poster session. PowerPoint, etc. can be used, but please limit yourself to max. 3 slides. Every presenter has three minutes to present his/her project.

Your application should contain the following:
1. A short tabular C.V. of max. 1,000 characters (including spaces)
2. A short description of your research focus (max. 600 characters including spaces) that explains how your approach tackles the challenges emerging from the interdisciplinary field of big data research and argues why we should select your project
3. An abstract of the project you want to present (max. 2,200 characters including spaces)
4. Your publications (max. 5).

Participants will be selected by a steering committee of interdisciplinary researchers from different fields of expertise. Acceptance will be based on the qualifications of the applicant as well as the originality and potential of the research project. We will inform the applicants about the results by mid-November 2014.

If your application has been accepted, the Volkswagen Foundation will book a room for you and send you all necessary information regarding travel, accommodation, poster size and visa. Please register for the conference beforehand only if you plan to attend regardless of whether your application is accepted.

For inquiries, please contact Anorthe Kremers at the Volkswagen Foundation at tel. +49/511/8183-260, or bigdata@volkswagenstiftung.de

Education and big data: presentation on SlideShare

Thanks to Alfredo Vela for the link.

Educación y datos masivos (Big Data) from Fernando Santamaria