SyntagNet

This information comes from corpora-bounces@uib.no

SyntagNet, a resource with 88,000 lexical-semantic combinations, is now out!

We are proud to announce that SyntagNet 1.0 (http://syntagnet.org) is now available for download at http://syntagnet.org/download. Developed by the Sapienza NLP group (http://nlp.uniroma1.it), the multilingual Natural Language Processing group at the Sapienza University of Rome, SyntagNet is a manually curated, large-scale database of lexical-semantic combinations which associates pairs of concepts with pairs of co-occurring words. The goal of SyntagNet is to capture sense distinctions evoked by syntagmatic relations (e.g. mouse.n.1 and squeak.v.1 vs mouse.n.2 and click.n.4), thereby providing information which complements the essentially paradigmatic knowledge shared by currently available Lexical Knowledge Bases such as WordNet. Its main features are:

  • Wide coverage, with 78,000 noun-verb and noun-noun lexical combinations extracted from the English Wikipedia and the British National Corpus.
  • High-quality, fully manual disambiguation for all of the lexical combinations, according to the WordNet 3.0 sense inventory.
  • A resulting Lexical Knowledge Base made up of 88,019 semantic combinations linking 20,626 WordNet 3.0 unique synsets with a relation edge.
  • A user-friendly web interface for looking up terms and their lexical-semantic combinations, with complete linkage to BabelNet 4.0.
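
As a rough illustration of the structure described above, pairs of sense labels linked by a relation edge can be held as a small undirected graph. The sketch below is a minimal Python example under stated assumptions: the `SenseGraph` class and its method names are hypothetical, the two edges are simply the examples quoted in this announcement, and the in-memory layout says nothing about the actual format of the SyntagNet download.

```python
from collections import defaultdict


class SenseGraph:
    """Undirected graph whose nodes are sense labels such as 'mouse.n.1'.

    Hypothetical helper for illustration; not part of SyntagNet itself.
    """

    def __init__(self):
        self._neighbours = defaultdict(set)

    def add_combination(self, sense_a, sense_b):
        # A semantic combination is stored as one edge between two senses.
        self._neighbours[sense_a].add(sense_b)
        self._neighbours[sense_b].add(sense_a)

    def combinations_of(self, sense):
        # Senses linked to `sense`, sorted for stable output.
        return sorted(self._neighbours.get(sense, ()))

    def senses_of(self, lemma, pos):
        # All sense labels in the graph for a given lemma and part of speech.
        prefix = f"{lemma}.{pos}."
        return sorted(s for s in self._neighbours if s.startswith(prefix))


graph = SenseGraph()
# The two example combinations quoted in the announcement.
graph.add_combination("mouse.n.1", "squeak.v.1")
graph.add_combination("mouse.n.2", "click.n.4")
```

With this layout, `graph.senses_of("mouse", "n")` lists both senses of the noun, and `graph.combinations_of(...)` shows which co-occurring sense goes with each one, which is the kind of syntagmatic evidence the resource is meant to provide.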

And much more! Please check out our EMNLP 2019 paper:

M. Maru, F. Scozzafava, F. Martelli, R. Navigli. SyntagNet: Challenging Supervised Word Sense Disambiguation with Lexical-Semantic Combinations, Proc. of EMNLP-IJCNLP 2019

or http://syntagnet.org for more details!

SyntagRank, a state-of-the-art knowledge-based Word Sense Disambiguation system which uses SyntagNet to perform disambiguation in five languages (English, French, German, Italian and Spanish), is also available from the same website and will be demoed at ACL 2020!

SyntagNet is an output of the MOUSSE ERC Consolidator Grant No. 726487 and of the ELEXIS project No. 731015 under the European Union’s Horizon 2020 research and innovation programme. Babelscape proudly developed the online interface and API, and provides the infrastructure for maintaining the service. 

The Sapienza NLP group

CFP: Grammar of genres and styles: which approaches to prefer? 16 Jan 2015

Grammar of genres and styles: which approaches to prefer?

ConSciLa (Confrontations en Sciences du Langage),
Paris, France,
Friday 16 January 2015
(the venue will be announced later)

Organization
------------
Thierry Charnois (University of Paris 13, LIPN),
Sascha Diwersy (Universität zu Köln),
Meri Larjavaara (Åbo Akademi),
Dominique Legallois (University of Caen, Crisco)

Call for participation
----------------------

Modern syntactic research generally consists of studies oriented towards the formal properties of sentences. Sentences are thus analyzed independently of any utterer-based perspective or generic textual features.

As a result, grammatical variation is not viewed as central, and performance-related features are not considered pertinent to the field of syntax. Similarly, textual studies (in the tradition of textometrics and discourse analysis) rarely focus on the syntactic features of the genres under scrutiny, concentrating instead on lexical and utterer-based ones. Consequently, textual genre is rarely characterized by its syntactic features. Although stylistics would appear best suited to the study of such linguistic features, its practice is hampered by its heavily academic nature and its lack of formal tools, which restrict analyses to pertinent yet isolated units of text.

In recent years, automatic text analysis has enabled a more accurate identification of the lexical and grammatical features of texts and genres. There are two main approaches, the first being more widespread than the second:

The paradigmatic approach rests upon the quantification of morpho-syntactic categories. For instance, in his work on oral discourse in the academic community, Biber (2006) uncovers the over-use (in comparison with written discourse) of first person pronouns, evaluative expressions (“mental” verbs, modal adverbs, etc.), WH-questions, etc. By means of factor analysis, it is possible to determine a set of properties particular to a specific genre.
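
The counting step behind such a comparison can be sketched in a few lines of Python. This is only a minimal illustration, not Biber's procedure (which relies on multivariate statistics over many features): the function names, the tag labels such as `PRON1`, and the two tiny tag sequences are all invented for the example.

```python
from collections import Counter


def relative_frequencies(tags):
    """Map each category to its share of all tokens in the sample."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}


def overuse_ratio(target_tags, reference_tags, category):
    """Relative frequency of a category in the target over the reference.

    Values above 1 indicate over-use in the target sample.
    """
    target = relative_frequencies(target_tags).get(category, 0.0)
    reference = relative_frequencies(reference_tags).get(category, 0.0)
    if reference == 0.0:
        raise ValueError(f"{category!r} is absent from the reference sample")
    return target / reference


# Toy, invented tag sequences; real studies use large tagged corpora.
spoken = ["PRON1", "VERB", "PRON1", "ADV", "VERB", "NOUN", "PRON1", "VERB"]
written = ["NOUN", "VERB", "ADJ", "NOUN", "PRON1", "VERB", "NOUN", "DET"]

ratio = overuse_ratio(spoken, written, "PRON1")
```

Here first person pronouns make up 3 of 8 tokens in the toy spoken sample and 1 of 8 in the written one, so the ratio is 3. On real data, a significance test would be needed before calling a difference of this kind genre-specific.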

The syntagmatic approach focuses on the combination of lexical units and on the identification of syntagmatic segments that are preferred, or dispreferred, by a genre. To illustrate, consider the lexico-grammatical structure called a “pattern” or motif in Quiniou et al. (2012): ce N si ADJ et si ADJ (lit.: that N so ADJ and so ADJ). This semantico-evaluative pattern is specific to the 19th-century genre of Memoirs, in comparison with Travel narratives, Novels, Correspondence and Essays of the same period:

Oh ! Tant mieux, tant mieux de n’être pas bornés par ce temps si court et si triste ! E. de Guérin, Journal (1834-1840)
(lit.: that time so short and so sad)

Seulement, pour ne pas faire acte de désobéissance et de bravade envers cette mère si tendre et si aimée, Maurice lui annonça […] un petit voyage au Blanc. G. Sand, Histoire de ma vie, 1855
(lit.: that mother so tender and so loved)

On éprouve aujourd’hui encore, comme autrefois, une grande douceur intérieure à voir ces lieux si bénis, et maintenant si abandonnés. Mgr Dupanloup, Journal intime, 1876
(lit.: these places so blessed and now so abandoned)
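
To make the pattern concrete, the sketch below checks part-of-speech-tagged tokens against the fixed sequence demonstrative + N + si + ADJ + et + si + ADJ. It is a plain fixed-window matcher written for illustration, not the sequential data mining technique of Quiniou et al.; the `find_pattern` function, the tag names and the hand-assigned tags are assumptions made for this example.

```python
def find_pattern(tagged_tokens):
    """Return the spans matching: ce/cet/cette/ces N si ADJ et si ADJ.

    `tagged_tokens` is a list of (word, tag) pairs; tags are hand-assigned
    here and would normally come from a part-of-speech tagger.
    """
    demonstratives = {"ce", "cet", "cette", "ces"}
    matches = []
    # The pattern spans exactly seven tokens, so slide a window of seven.
    for i in range(len(tagged_tokens) - 6):
        window = tagged_tokens[i:i + 7]
        words = [w.lower() for w, _ in window]
        tags = [t for _, t in window]
        if (
            words[0] in demonstratives
            and tags[1] == "N"
            and words[2] == "si"
            and tags[3] == "ADJ"
            and words[4] == "et"
            and words[5] == "si"
            and tags[6] == "ADJ"
        ):
            matches.append(" ".join(w for w, _ in window))
    return matches


# Fragments of the quoted examples, tagged by hand for this sketch.
guerin = [("par", "P"), ("ce", "DET"), ("temps", "N"), ("si", "ADV"),
          ("court", "ADJ"), ("et", "CONJ"), ("si", "ADV"), ("triste", "ADJ")]
sand = [("envers", "P"), ("cette", "DET"), ("mère", "N"), ("si", "ADV"),
        ("tendre", "ADJ"), ("et", "CONJ"), ("si", "ADV"), ("aimée", "ADJ")]
dupanloup = [("ces", "DET"), ("lieux", "N"), ("si", "ADV"), ("bénis", "ADJ"),
             (",", "PUNCT"), ("et", "CONJ"), ("maintenant", "ADV"),
             ("si", "ADV"), ("abandonnés", "ADJ")]

guerin_hits = find_pattern(guerin)
sand_hits = find_pattern(sand)
dupanloup_hits = find_pattern(dupanloup)
```

The strict matcher finds the first two examples but not the third, where a comma and the adverb "maintenant" intervene. Capturing such variants requires a pattern that tolerates gaps, which is one reason to prefer mined patterns over hand-written templates.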

This ConSciLa study day, devoted to the grammar and stylistics of discourse genres, aims to bring together researchers in linguistics or NLP whose work focuses on the identification of lexico-grammatical textual features. Submitted papers must take the constraint of comprehensiveness into account: the focus should not be on a single type of form, but on as many genre-specific elements as possible. The following issues will be discussed:
– Techniques for the identification of generic properties;
– The complementarity or competition between paradigmatic and syntagmatic approaches;
– Data interpretation.

Proposals should therefore focus on the characterization of discourse genre (literary or otherwise) or style from a comprehensive perspective; methods may be discussed, provided linguistic description is not neglected. Also of interest are comparisons between authors and a focus on registers, discourse practices, and textual units (narrative, argumentative, descriptive, etc.).

Studies may deal with any language, and both oral and written genres are welcome. We also welcome a variety of perspectives, including: computing, didactics, stylistics, discourse study, syntax…
Papers may be presented in French or in English.

---------------------
Submission deadlines:
---------------------

1- A notice of intent to submit a paper should be sent by mid-September to
dominique.legallois@unicaen.fr

2- A detailed proposal of at least 1 full page should then be submitted by 1 November 2014. Authors of selected papers will be notified by 20 November 2014.


References

Biber, D. (2006). University language: A corpus-based study of spoken and written registers. Amsterdam: John Benjamins.

Biber, D. & Conrad, S. (2009). Register, genre and style. Cambridge: Cambridge University Press.

Dorgeloh, H. & Wanner, A. (eds) (2010). Syntactic variation and genre. Berlin/New York: De Gruyter Mouton.

Larjavaara, M. & Legallois, D. (in prep.). « Les genres discursifs et leur grammaire ».

Longrée, D. & Mellet, S. (2013). « Le motif : une unité phraséologique englobante ? Étendre le champ de la phraséologie de la langue au discours ». Langages 189 (D. Legallois & A. Tutin, eds), pp. 68-80.

Martin, J. R. & Rose, D. (2008). Genre relations: Mapping culture. London: Equinox.

Quiniou, S., Cellier, P., Charnois, Th. & Legallois, D. (2012). « What About Sequential Data Mining Techniques to Identify Linguistic Patterns for Stylistics? ». Lecture Notes in Computer Science, vol. 7181, Springer, pp. 166-177.

Linx (journal), no. 64-65. « Les genres de discours vus par la grammaire », edited by M. Krazem.