Word Sense Disambiguation Using Multiple Contextual Features

Liang-Chih Yu, Chung-Hsien Wu, Jui-Feng Yeh
2010 International Journal of Computational Linguistics and Chinese Language Processing  
Experiments are conducted to evaluate the classifiers' performance on the OntoNotes corpus and are compared with classifiers trained using a set of baseline features, such as the bag-of-words, n-grams,  ...  resulting in accuracy as high as 81.6% and 87.4%, respectively, for NB and ME.  ...  Conclusions are drawn in Section 5. Word Sense Annotation in OntoNotes Corpus The OntoNotes corpus contains a set of sentences with word senses annotated.  ... 
dblp:journals/ijclclp/YuWY10 fatcat:nkbecglusnajjcmvpawpzikfpe

Reducing the Need for Double Annotation

Dmitriy Dligach, Martha Palmer
2011 Linguistic Annotation Workshop  
The quality of annotated data is crucial for supervised learning. To eliminate errors in single annotated data, a second round of annotation is often used.  ...  We show that it is possible to reduce the amount of the second round of annotation by more than half without sacrificing the performance.  ...  Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.  ... 
dblp:conf/acllaw/DligachP11 fatcat:jsqgcj27vbgxtoycgds6lmm6em

OntoNotes

Liang-Chih Yu, Chung-Hsien Wu, Eduard Hovy
2008 Proceedings of the 22nd International Conference on Computational Linguistics - COLING '08   unpublished
In this paper, we use OntoNotes, a large-scale corpus of semantic annotations, including word senses, predicate-argument structure, ontology linking, and coreference.  ...  To determine the mistaken agreements in word sense annotation, we employ word sense disambiguation (WSD) to select a set of suspicious candidates for human evaluation.  ...  The OntoNotes sense tags have been used for many applications, including the SemEval-2007 evaluation (Pradhan et al., 2007b), sense merging (Snow et al., 2007), sense pool verification (Yu et al.,  ... 
doi:10.3115/1599081.1599214 fatcat:ol6foen7bbhbbnfxx7lwhszzou

Using Games to Create Language Resources: Successes and Limitations of the Approach [chapter]

Jon Chamberlain, Karën Fort, Udo Kruschwitz, Mathieu Lafourcade, Massimo Poesio
2013 The People's Web Meets NLP  
We discuss some key issues in using a gaming approach, including task design, player motivation and data quality, and compare the costs of each approach in terms of development, distribution and ongoing  ...  In conclusion, we summarise the benefits and limitations of using a gaming approach to resource creation and suggest key considerations for evaluating its utility in different research scenarios.  ...  Acknowledgements We would like to thank Jean Heutte (CREF-CNRS) for his help with the concepts of game flow and for the comments of the reviewers of this chapter.  ... 
doi:10.1007/978-3-642-35085-6_1 dblp:series/tanlp/ChamberlainFKLP13 fatcat:s4y3t2g7obfznkpleeg6maeonm

A Survey of Deep Active Learning [article]

Pengzhen Ren, Yun Xiao, Xiaojun Chang, Po-Yao Huang, Zhihui Li, Brij B. Gupta, Xiaojiang Chen, Xin Wang
2021 arXiv   pre-print
However, the acquisition of a large number of high-quality annotated datasets consumes a lot of manpower, which is not allowed in some fields that require high expertise, especially in the fields of speech  ...  In recent years, due to the rapid development of internet technology, we are in an era of information torrents and we have massive amounts of data.  ...  This is because the premature termination of AL annotation querying leads to large performance losses in the model, and excessive annotation behavior wastes a lot of annotation budget.  ... 
arXiv:2009.00236v2 fatcat:zuk2doushzhlfaufcyhoktxj7e

Bridging Chasms In Hindustani Music Retrieval

Joe Cheri Ross, Preeti Rao, Pushpak Bhattacharyya
2018 Zenodo  
The problem of raga similarity detection is investigated with two diverse data sources viz., discussions on Hindustani ragas and composition notations. Each of these sources help in extra [...]  ...  In the context of information extraction from audio signals, we present our investigations on melodic motif detection involving mukhda (main title phrase of a composition) and pakad (raga characteristic  ...  The annotation tool having color coding features make the visual verification easier while annotating. The annotations are further processed to convert it to CoNLL format.  ... 
doi:10.5281/zenodo.1237568 fatcat:mrshrvhbcbb2lf4n7gys6e6plq

Bridging Chasms In Hindustani Music Retrieval

Joe Cheri Ross, Preeti Rao, Pushpak Bhattacharyya
2018 Zenodo  
The problem of raga similarity detection is investigated with two diverse data sources viz., discussions on Hindustani ragas and composition notations. Each of these sources help in extra [...]  ...  In the context of information extraction from audio signals, we present our investigations on melodic motif detection involving mukhda (main title phrase of a composition) and pakad (raga characteristic  ...  The annotation tool having color coding features make the visual verification easier while annotating. The annotations are further processed to convert it to CoNLL format.  ... 
doi:10.5281/zenodo.1239418 fatcat:uix3v2agqrajtcdrn5wm74tnq4

Information extraction from digital social trace data with applications to social media and scholarly communication data

Shubhanshu Mishra
2020 SIGIR Forum  
, 2019] are introduced for identifying sentiment, named entities, part of speech tags, phrase chunks, and super-sense tags.  ...  The methods and tools presented here can help advance work in the areas of social media and scholarly data analysis.  ...  Some examples of such token level tasks are part of speech tagging, chunking, and super-sense tagging.  ... 
doi:10.1145/3451964.3451981 fatcat:36djwlckprhl5hymzhivrbnscu

Recent Advances in Natural Language Inference: A Survey of Benchmarks, Resources, and Approaches [article]

Shane Storks, Qiaozi Gao, Joyce Y. Chai
2020 arXiv   pre-print
learning and inference approaches in order to support a better understanding of this growing field.  ...  relying on reasoning and knowledge of the world.  ...  Acknowledgements We would like to thank the anonymous reviewers for their greatly helpful comments and suggestions.  ... 
arXiv:1904.01172v3 fatcat:minzpxrrwfebdipu55udwe5dxq

Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases [article]

Gerhard Weikum, Luna Dong, Simon Razniewski, Fabian Suchanek
2021 arXiv   pre-print
This machine knowledge can be harnessed to semantically interpret textual phrases in news, social media and web tables, and contributes to question answering, natural language processing and data analytics  ...  Equipping machines with comprehensive knowledge of the world's entities and their relationships has been a long-standing goal of AI.  ...  It is a great pleasure and honor to have such wonderful colleagues in our research community.  ... 
arXiv:2009.11564v2 fatcat:vh2lqfmhhbcwpf6dcsej3hhvgy

OASIcs, Volume 70, LDK'19, Complete Volume [article]

Maria Eskevich, Gerard de Melo, Christian Fäth, John P. McCrae, Paul Buitelaar, Christian Chiarcos, Bettina Klimek, Milan Dojchinovski
2019
In addition, we want to express our thanks to Richard Shapiro, Will Hunter and Sophie Wood for their great support in different aspects of the project.  ...  Acknowledgements We are very grateful to Charlotte Buxton and Rebecca Juganaru, the expert lexicographers who have contributed all the dictionary knowledge we were lacking and have helped with manual annotations  ...  H2.2 (reliability): Gathering event annotations from a large pool of crowd workers provides reliable results in terms of F1-score when compared to expert annotators.  ... 
doi:10.4230/oasics.ldk.2019 fatcat:gdzvkfimsrd27jq6xuon4vafn4

Biomedical Text Mining for Etiological Factor Identification in Mental Health Publications

Tilia Ellendorff
2021
An initial set of terminological items is analyzed and expanded to support the automatic annotation of disorder mentions in research publications.  ...  After providing information regarding data collection together with an overview and analysis of relevant data, the thesis describes the development of an annotation scheme used to annotate a gold standard  ...  Finally, the fourth phase consisted of verification and assurance of corpus consistency and partial re-annotation (by annotator 3).  ... 
doi:10.5167/uzh-205675 fatcat:o52fsxkqj5ejbjkjvb4e47i7qu

Joint Discourse-aware Concept Disambiguation and Clustering

Angela Petra Fahrni
2016
Concept disambiguation is the task of linking common nouns and proper names in a text - henceforth called mentions - to their corresponding concepts in a predefined inventory.  ...  Concept clustering is the task of clustering mentions, so that all mentions in one cluster denote the same concept.  ...  The annotators for the TAC data sets are asked to select challenging proper names so that both the variability and the ambiguity in the annotated mention pool are high.  ... 
doi:10.11588/heidok.00020737 fatcat:vhljgiqbrbcwtpjxce6k4vcp2a

Knowledge-intensive, high-performance relation extraction [article]

Sebastian Krause, Technische Universität Berlin, Volker Markl, Hans Uszkoreit
2018
Information extraction and its subtask relation extraction play a central role in data processing pipelines that make hidden knowledge such as the content of news articles available to downstream users  ...  Here, state-of-the-art methods employ simplistic assumptions at training time, which has a drastic negative effect on both precision and coverage.  ...  It took place in the context of the following research projects: 2013-2014: Deependance, funded by the German Federal Ministry of Education and Research (BMBF; contract 01IW11003) 2013-2014: Intellektix  ... 
doi:10.14279/depositonce-6626 fatcat:5e3jzx6e4fcgtex6ztql65xz24

Proceedings of the Australasian Language Technology Workshop 2006

Lawrence Cavedon, Ingrid Zukerman
the Australasian Language Technology Workshop (ALTW) 2006, held at the University of Sydney   unpublished
in Australia, New Zealand and overseas.  ...  Of the 36 papers submitted, 19 papers were selected by the programme committee for publication and appear in these proceedings.  ...  Acknowledgements We would like to thank James Curran, James Gorman and Jon Patrick from the University of Sydney for their invaluable insights.  ... 
fatcat:74kafjz7hnanhhqlybjlsxtgxm