Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons

Stella Markantonatou, John McCrae, Jelena Mitrović, Carole Tiberius, Carlos Ramisch, Ashwini Vaidya, Petya Osenova, Agata Savary
2020 Zenodo  
Multiword Expressions Polish corpus of verbal multiword expressions Agata Savary and Jakub Waszczuk AlphaMWE: Construction of Multilingual Parallel Corpora with MWE Annotations Lifeng Han, Gareth Jones  ...  • Extracting and enriching MWE lists from traditional human-readable lexicons for NLP use • Formats for NLP-applicable MWE lexicons • Interlinking MWE lexicons with other language resources • Using MWE  ... 
doi:10.5281/zenodo.4320698 fatcat:tib7xaln6fbz3kfdzibc2diinq

Toward Universal Dependencies for Shipibo-Konibo

Alonso Vasquez, Renzo Ego Aguirre, Candy Angulo, John Miller, Claudia Villanueva, Željko Agić, Roberto Zariquiey, Arturo Oncevay
2018 Proceedings of the Second Workshop on Universal Dependencies (UDW 2018)  
We describe the linguistic aspects of how the tagset was defined and the treebank was annotated; in addition we present our specific treatment of linguistic units called clitics.  ...  We present an initial version of the Universal Dependencies (UD) treebank for Shipibo-Konibo, the first South American, Amazonian, Panoan and Peruvian language with a resource built under UD.  ...  Shipibo-Konibo Treebank Our current Shipibo-Konibo treebank is the result of the syntactic annotation of 407 sentences extracted from parallel Shipibo-Konibo and Spanish educational materials and storybooks  ... 
doi:10.18653/v1/w18-6018 dblp:conf/acludw/VasquezAAMVAZO18 fatcat:xeiskzsfqfff5d76ankwykpcoq

Parsing Models for Identifying Multiword Expressions

Spence Green, Marie-Catherine de Marneffe, Christopher D. Manning
2013 Computational Linguistics  
Morphological analyses are automatically transformed into rich feature tags that are scored jointly with lexical items.  ...  We develop two structured prediction models for joint parsing and multiword expression identification.  ...  MWE identification then becomes a trivial process of extracting such subtrees from full parses.  ... 
doi:10.1162/coli_a_00139 fatcat:pq7v6i4z2zg2jpzyrs52tll2ce

Without lexicons, multiword expression identification will never fly: A position statement

Agata Savary, Silvio Cordeiro, Carlos Ramisch
2019 Proceedings of the Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019)  
Because most multiword expressions (MWEs), especially verbal ones, are semantically non-compositional, their automatic identification in running text is a prerequisite for semantically-oriented downstream  ...  multilingual corpus annotation and in computational models.  ...  where relatively rich NE-annotated corpora and lexicons are available.  ... 
doi:10.18653/v1/w19-5110 dblp:conf/mwe/SavaryCR19 fatcat:d5rovrbxhnapno7ncnon45noay

VPCTagger: Detecting Verb-Particle Constructions With Syntax-Based Methods

István Nagy T., Veronika Vincze
2014 Proceedings of the 10th Workshop on Multiword Expressions (MWE)  
If a data-driven morphological parser or a syntactic parser is trained on a dataset annotated with extra information for VPCs, they will be able to identify VPCs in raw texts.  ...  Verb-particle combinations (VPCs) consist of a verbal and a preposition/particle component, which often have some additional meaning compared to the meaning of their parts.  ...  In the first step, we extracted potential VPCs from a running text with a syntax-based candidate extraction method and we applied a machine learning-based approach that made use of a rich feature set to  ... 
doi:10.3115/v1/w14-0803 dblp:conf/mwe/TV14 fatcat:p7rzt3ixmrcyhf4hcee57z2cde

Evaluating the English-Turkish parallel treebank for machine translation

2021 Turkish Journal of Electrical Engineering and Computer Sciences  
We manually generated parallel trees for about 17K sentences selected from the Penn Treebank 4 corpus. English sentences vary in length: 15 to 50 tokens including punctuation.  ...  In order to fill the morphological and syntactic gap between languages, we do morphological annotation and disambiguation.  ...  Turkish has a rich derivational affix inventory that allows the transition from one word class to another.  ... 
doi:10.3906/elk-2102-57 fatcat:ur7yed4k3zdylg4hrvyfdlmz24

The PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions

Agata Savary, Carlos Ramisch, Silvio Cordeiro, Federico Sangati, Veronika Vincze, Behrang QasemiZadeh, Marie Candito, Fabienne Cap, Voula Giouli, Ivelina Stoyanova, Antoine Doucet
2017 Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)  
Multiword expressions (MWEs) are known as a "pain in the neck" for NLP due to their idiosyncratic behaviour.  ...  While some categories of MWEs have been addressed by many studies, verbal MWEs (VMWEs), such as to take a decision, to break one's heart or to turn off, have been rarely modelled.  ...  Extracting Verbal Multiword Data from Rich Treebank Annotation. In Proceedings of the 15th International Workshop on Treebanks and Linguistic Theories (TLT 15), pages 13-24.  ... 
doi:10.18653/v1/w17-1704 dblp:conf/mwe/SavaryRCSVQCCGS17 fatcat:mdovesoakrbh3bk7xebyycuru4

Identifying verbal multiword expressions with POS tagging and parsing techniques [chapter]

Katalin Ilona Simkó, Viktória Kovács, Veronika Vincze
2018 Zenodo  
The chapter describes an extended version (USzeged+) of our previous system (USzeged) submitted to PARSEME's Shared Task on automatic identification of verbal multiword expressions.  ...  USzeged+ exploits POS tagging and dependency parsing to identify single- and multi-token verbal MWEs in text.  ...  The fourth and last step is to extract the MWE tags and labels from the output of the POS tagger and the dependency parser.  ... 
doi:10.5281/zenodo.1469562 fatcat:bvc2lnxmgbdonfyxpatnwjxzsi

Learning to detect English and Hungarian light verb constructions

Veronika Vincze, István Nagy T., János Zsibrita
2013 ACM Transactions on Speech and Language Processing  
Our results show that in spite of domain specificities, out-domain data can also contribute to the successful LVC detection in all domains.  ...  Light verb constructions consist of a verbal and a nominal component, where the noun preserves its original meaning while the verb has lost it (to some degree).  ...  Sass [2010] developed a method for extracting multiword verbs from parallel corpora.  ... 
doi:10.1145/2483691.2483695 dblp:journals/tslp/VinczeTZ13 fatcat:yoj6mc6myzbqvjjjyt7rqylc7m

Multi-word Entity Classification in a Highly Multilingual Environment

Sophie Chesney, Guillaume Jacquet, Ralf Steinberger, Jakub Piskorski
2017 Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)  
We would like to thank the members of the program committee for the timely reviews, authors for their valuable contributions, shared task organizers, annotators, and system developers for their hard work  ...  On the side of UD, Silveira and Manning (2015) explore whether the UD treebank formalism needs an additional representation to improve parsing.  ...  Acknowledgments The authors would like to thank Manuela Cherchi and Anna Desantis for their annotations and input on Italian examples.  ... 
doi:10.18653/v1/w17-1702 dblp:conf/mwe/ChesneyJSP17 fatcat:bv7aavgth5eurmzuphuowtuuhq

Comprehensive Supersense Disambiguation of English Prepositions and Possessives

Nathan Schneider, Jena D. Hwang, Vivek Srikumar, Jakob Prange, Austin Blodgett, Sarah R. Moeller, Aviram Stern, Adi Bitan, Omri Abend
2018 Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)  
We introduce a new annotation scheme, corpus, and task for the disambiguation of prepositions and possessives in English.  ...  Unlike previous approaches, our annotations are comprehensive with respect to types and tokens of these markers; use broadly applicable supersense classes rather than fine-grained dictionary definitions  ...  This research was supported in part by DTRA HDTRA1-16-1-0002/Project #1553695, by DARPA 15-18-CwC-FP-032, and by grant 2016375 from the United States-Israel Binational Science Foundation (BSF), Jerusalem  ... 
doi:10.18653/v1/p18-1018 dblp:conf/acl/AbendSSHPBMSB18 fatcat:nqy5xw6le5ckbe2bgx7cu7bjaa

Sentence Meaning Representations across Languages: What Can We Learn from Existing Frameworks?

Zdeněk Žabokrtský, Magda Ševčíková, Daniel Zeman
2020 Computational Linguistics  
This article gives an overview of how sentence meaning is represented in eleven deep-syntactic frameworks, ranging from those based on linguistic theories elaborated for decades to rather lightweight NLP-motivated  ...  Annotation of verbal multiword expressions was added in version 9.0 of the treebank (Candito et al. 2017). • Edges correspond to dependency relations between content words.  ...  The project started by marking clause nuclei composed of verbal predicates and their arguments (predicate-argument structure); PropBank annotation pointed to constituents in the original Penn Treebank annotation  ... 
doi:10.1162/coli_a_00385 fatcat:kku2zljemfhfpprixiv2lczjme

Putting the Horses Before the Cart: Identifying Multiword Expressions Before Translation [chapter]

Carlos Ramisch
2017 Lecture Notes in Computer Science  
Translating multiword expressions (MWEs) is notoriously difficult. Part of the challenge stems from the analysis of non-compositional expressions in source texts, preventing literal translation.  ...  For the 3 supervised configurations, annotated MWEs are extracted from the training data and then filtered: we only keep combinations that have been annotated often enough in the training corpus.  ...  When annotating the same corpus from which MWE types were extracted, source-based annotation can be used for best results.  ... 
doi:10.1007/978-3-319-69805-2_6 fatcat:4i7nipwtjfd6nhunlf2gfhtet4

PARSEME multilingual corpus of verbal multiword expressions [chapter]

Agata Savary, Marie Candito, Verginica Barbu Mititelu, Eduard Bejček, Fabienne Cap, Slavomír Čéplö, Silvio Ricardo Cordeiro, Gülşen Eryiğit, Voula Giouli, Maarten Van Gompel, Yaakov HaCohen-Kerner, Jolanta Kovalevskaitė (+10 others)
2018 Zenodo  
Multiword expressions (MWEs) are known as a "pain in the neck" due to their idiosyncratic behaviour.  ...  While some categories of MWEs have been largely studied, verbal MWEs (VMWEs) such as to take a walk, to break one's heart or to turn off have been relatively rarely modelled.  ...  We are grateful to all language teams for their contributions to preparing the annotation guidelines and the annotated corpora. The full composition of the annotation team is the following.  ... 
doi:10.5281/zenodo.1471590 fatcat:y32yvyh4xrbthi3uevojdlydoa

Comprehensive Supersense Disambiguation of English Prepositions and Possessives [article]

Nathan Schneider, Jena D. Hwang, Vivek Srikumar, Jakob Prange, Austin Blodgett, Sarah R. Moeller, Aviram Stern, Adi Bitan, Omri Abend
2018 arXiv   pre-print
We introduce a new annotation scheme, corpus, and task for the disambiguation of prepositions and possessives in English.  ...  Unlike previous approaches, our annotations are comprehensive with respect to types and tokens of these markers; use broadly applicable supersense classes rather than fine-grained dictionary definitions  ...  This research was supported in part by DTRA HDTRA1-16-1-0002/Project #1553695, by DARPA 15-18-CwC-FP-032, and by grant 2016375 from the United States-Israel Binational Science Foundation (BSF), Jerusalem  ... 
arXiv:1805.04905v1 fatcat:yrdehe3x7nc55eytibgftxkoqi