Annotation of multiword expressions in the Prague dependency treebank

Eduard Bejček, Pavel Straňák
2009 Language Resources and Evaluation  
In this article we want to demonstrate that annotation of multiword expressions in the Prague Dependency Treebank is a well defined task, that it is useful as well as feasible, and that we can achieve  ...  good consistency of such annotations in terms of inter-annotator agreement.  ...  Acknowledgement This work has been supported by grants 1ET2011205-05 of Grant Agency of the Academy of Science of the Czech Republic, projects MSM0021620838 and LC536 of the Ministry of Education and 201  ... 
doi:10.1007/s10579-009-9093-0 fatcat:3puwz47bazhizorw3tg5c42t3a

Use of Coreference in Automatic Searching for Multiword Discourse Markers in the Prague Dependency Treebank

Magdalena Rysova, Jiří Mírovský
2014 Proceedings of LAW VIII - The 8th Linguistic Annotation Workshop  
The paper introduces a possibility of new research offered by a multi-dimensional annotation of the Prague Dependency Treebank.  ...  It focuses on exploitation of the annotation of coreference for the annotation of discourse relations expressed by multiword expressions.  ...  in Prague from the resources of the Charles University Grant Agency in 2013-2015.  ... 
doi:10.3115/v1/w14-4902 dblp:conf/acllaw/RysovaM14 fatcat:wjnlrayb3jbrvjr3t7v6bqsu7u

Semi-Automated Resolution of Inconsistency for a Harmonized Multiword Expression and Dependency Parse Annotation

King Chan, Julian Brooke, Timothy Baldwin
2017 Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)  
This paper presents a methodology for identifying and resolving various kinds of inconsistency in the context of merging dependency and multiword expression (MWE) annotations, to generate a dependency  ...  treebank with comprehensive MWE annotations.  ...  Conclusion We have proposed a methodology for merging multiword expression and dependency parse annotations, to generate HAMSTER: a gold-standard MWE-annotated dependency treebank with high consistency  ... 
doi:10.18653/v1/w17-1726 dblp:conf/mwe/ChanBB17 fatcat:ofs2stwojfdbdlxa2vcmljk4mu

Prague Dependency Treebank – Consolidated 1.0 [article]

Jan Hajič, Eduard Bejček, Jaroslava Hlaváčová, Marie Mikulová, Milan Straka, Jan Štěpánek, Barbora Štěpánková
2020 arXiv   pre-print
We present a richly annotated and genre-diversified language resource, the Prague Dependency Treebank-Consolidated 1.0 (PDT-C 1.0), the purpose of which is - as it has always been the case for the family of  ...  the Prague Dependency Treebanks - to serve both as a training data for various types of NLP tasks as well as for linguistically-oriented research.  ...  The original annotation has been supported by multiple projects in the past, funded both nationally by the Ministry of Education, Youth and Sports of the Czech Republic and the Czech Science Foundation  ... 
arXiv:2006.03679v1 fatcat:aol3ozf6rjeu5cao6v2dimfgz4

Annotation and Extraction of Multiword Expressions in Turkish Treebanks

Gülşen Eryiğit, Kübra ADALI, Dilara Torunoğlu-Selamet, Umut Sulubacak, Tuğba Pamay
2015 Proceedings of the 11th Workshop on Multiword Expressions  
Unfortunately, the creation of necessary resources for this task is quite rigorous and many languages suffer from the lack of these; as in the case for Turkish.  ...  Multiword expressions (MWEs) present particular and distinctive semantic properties, hence their automatic extraction receives special attention from the natural language processing (NLP) and corpus linguistics  ...  ) 1001 program (grant number 112E276) and part of the 75 ICT COST Action IC1207 PARSEME (PARSing and Multi-word Expressions).  ... 
doi:10.3115/v1/w15-0912 dblp:conf/mwe/EryigitATSP15 fatcat:afgs7uybzbdjhb53g7sxcjwcny

Discourse Connectives: From Historical Origin To Present-Day Development [chapter]

Magdaléna Rysová
2017 Zenodo  
Finally, the paper demonstrates how these observations may be helpful for annotations of discourse in large corpora.  ...  The paper focuses on the description and delimitation of discourse connectives, i.e. linguistic expressions significantly contributing to text coherence and generally helping the reader to better understand  ...  Acknowledgments The author acknowledges support from the Czech Science Foundation (Grant Agency of the Czech Republic): project GA CR No. 17-06123S (Anaphoricity in Connectives: Lexical Description and  ... 
doi:10.5281/zenodo.814460 fatcat:62zhvpwykngilbwc4zwofix4gu

Semi-automated resolution of inconsistency for a harmonized multiword-expression and dependency-parse annotation [chapter]

Julian Brooke, King Chan, Timothy Baldwin
2018 Zenodo  
This chapter presents a methodology for identifying and resolving various kinds of inconsistency in the context of merging dependency and multiword expression (MWE) annotations, to generate a dependency  ...  treebank with comprehensive MWE annotations.  ...  Other MWE-aware dependency treebanks include the various UD treebanks, the Prague Dependency Treebank (Bejček et al. 2013), the Redwoods Treebank (Oepen et al. 2002), and others (Nivre 2004; Eryiğit  ... 
doi:10.5281/zenodo.1469565 fatcat:hyvsf3btjzgg5bf72x5kg7twka

A framework for (under)specifying dependency syntax without overloading annotators [article]

Nathan Schneider, Brendan O'Connor, Naomi Saphra, David Bamman, Manaal Faruqui, Noah A. Smith, Chris Dyer, Jason Baldridge
2013 arXiv   pre-print
Moreover, the formalism encourages annotators to underspecify parts of the syntax if doing so would streamline the annotation process.  ...  We demonstrate the efficacy of this annotation on three languages and develop algorithms to evaluate and compare underspecified annotations.  ...  Traditional syntactic annotation projects like the Penn Treebank (Marcus et al., 1993) or Prague Dependency Treebank (Hajič, 1998) require highly trained annotators and huge amounts of effort.  ... 
arXiv:1306.2091v2 fatcat:fkqe4cjo45gcfauxj7n2ocx5wu

Sentence Meaning Representations across Languages: What Can We Learn from Existing Frameworks?

Zdeněk Žabokrtský, Magda Ševčíková, Daniel Zeman
2020 Computational Linguistics  
This article gives an overview of how sentence meaning is represented in eleven deep-syntactic frameworks, ranging from those based on linguistic theories elaborated for decades to rather lightweight NLP-motivated  ...  We outline most important characteristics of each framework and then discuss how particular language phenomena are treated across those frameworks, while trying to shed light on commonalities as well as  ...  We dedicate this study to the memory of Petr Sgall, the founder of computational linguistics in former Czechoslovakia and one of the founding members of our Institute.  ... 
doi:10.1162/coli_a_00385 fatcat:kku2zljemfhfpprixiv2lczjme

Constructing a Turkish-English Parallel TreeBank

Olcay Taner Yıldız, Ercan Solak, Onur Görgün, Razieh Ehsani
2014 Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)  
In the corpus, we manually generated parallel trees for about 5,000 sentences from Penn Treebank. English sentences in our set have a maximum of 15 tokens, including punctuation.  ...  We constrained the translated trees to the reordering of the children and the replacement of the leaf nodes with appropriate glosses.  ...  Well-known parallel treebank efforts are • Prague Czech-English dependency treebank annotated with dependency structure (Cmejrek et al., 2004) • English-German parallel treebank, annotated with POS,  ... 
doi:10.3115/v1/p14-2019 dblp:conf/acl/YildizSGE14 fatcat:ygjbs6rkzjb37ixv55blrdufle

Inherently Pronominal Verbs in Czech: Description and Conversion Based on Treebank Annotation

Zdenka Uresova, Eduard Bejček, Jan Hajic
2016 Proceedings of the 12th Workshop on Multiword Expressions  
In this paper, we concentrate on one of the relevant MWE categories, namely on the quasi-universal category called "Inherently Pronominal Verbs" (IPronV) and describe its annotation in the Prague Dependency  ...  We will contribute to the Shared Task dataset, a multilingual open resource, by converting data from the Prague Dependency Treebank (PDT) to the Shared Task format.  ...  Acknowledgements The work described herein has been supported by the grant GP13-03351P of the Grant Agency of the Czech Republic, by the grant LD14117 of the Ministry of Education, Youth and Sports of  ... 
doi:10.18653/v1/w16-1812 dblp:conf/mwe/UresovaBH16 fatcat:e5z5ytulgvh37lrh4kkb3v44na

CzeDLex – A Lexicon of Czech Discourse Connectives

Jiří Mírovský, Pavlína Synková, Magdaléna Rysová, Lucie Poláková
2017 Prague Bulletin of Mathematical Linguistics  
Third, we describe the process of getting data for the lexicon by exploiting a large corpus manually annotated with discourse relations – the Prague Discourse Treebank 2.0: we elaborate on the automatic  ...  Second, we introduce the chosen technical solution based on the Prague Markup Language, which allows for an efficient incorporation of the lexicon into the family of Prague treebanks – it can be directly  ...  The research reported in the present contribution has been using language resources developed, stored and distributed by the LINDAT/CLARIN project of the Ministry of Education, Youth and Sports of the  ... 
doi:10.1515/pralin-2017-0039 fatcat:mqwo3tcbsje4jjn3i4xdph7kru

Conversion from Paninian Karakas to Universal Dependencies for Hindi Dependency Treebank

Juhi Tandon, Himani Chaudhry, Riyaz Ahmad Bhat, Dipti Sharma
2016 Proceedings of the 10th Linguistic Annotation Workshop held in conjunction with ACL 2016 (LAW-X 2016)  
We extend UD to Indian languages through conversion of Pāṇinian Dependencies to UD for the Hindi Dependency Treebank (HDTB).  ...  We discuss the differences in annotation in both the schemes, present parsing experiments for both the formalisms and empirically evaluate their weaknesses and strengths for Hindi.  ...  The work reported in this paper is supported by the NSF grant (Award Number: CNS 0751202; CFDA Number: 47.070)  ... 
doi:10.18653/v1/w16-1716 dblp:conf/acllaw/TandonCBS16 fatcat:7e3rdsipc5bjvjygymzhel7tlq

Extracting Headless MWEs from Dependency Parse Trees: Parsing, Tagging, and Joint Modeling Approaches [article]

Tianze Shi, Lillian Lee
2020 arXiv   pre-print
Despite their special status and prevalence, current dependency-annotation schemes require treating such flat structures as if they had internal syntactic heads, and most current parsers handle them in  ...  Experimental results on the MWE-Aware English Dependency Corpus and on six non-English dependency treebanks with frequent flat structures show that: (1) tagging is more accurate than parsing for identifying  ...  Acknowledgments We thank the three anonymous reviewers for their comments, and Igor Malioutov, Ana Smith and the Cornell NLP group for discussion and comments.  ... 
arXiv:2005.03035v1 fatcat:wyj2riyvc5h4fazrgefxwdbnui

A French corpus annotated for multiword expressions and named entities

Marie Candito, Mathieu Constant, Carlos Ramisch, Agata Savary, Bruno Guillaume, Yannick Parmentier, Silvio Cordeiro
2021 Journal of Language Modelling  
We present the enrichment of a French treebank of various genres with a new annotation layer for multiword expressions (MWEs) and named entities (NEs). Our contribution with respect to previous work on  ...  In addition to the span of the elements, annotation includes the subcategory of NEs (e.g., person, location) and one matching sufficient criterion for non-verbal MWEs (e.g., lexical substitution).  ...  The Prague Dependency Treebank (Hajič et al. 2017) is a project for the Czech language started in the nineties.  ... 
doi:10.15398/jlm.v8i2.265 fatcat:snxchivzyfcw5lq6hnhgsw3zvi