Word Sense Disambiguation Using Multiple Contextual Features
2010
International Journal of Computational Linguistics and Chinese Language Processing
Experiments are conducted to evaluate the classifiers' performance on the OntoNotes corpus and are compared with classifiers trained using a set of baseline features, such as the bag-of-words, n-grams, ...
resulting in accuracy as high as 81.6% and 87.4%, respectively, for NB and ME. ...
Conclusions are drawn in Section 5.
Word Sense Annotation in OntoNotes Corpus The OntoNotes corpus contains a set of sentences with word senses annotated. ...
dblp:journals/ijclclp/YuWY10
fatcat:nkbecglusnajjcmvpawpzikfpe
Reducing the Need for Double Annotation
2011
Linguistic Annotation Workshop
The quality of annotated data is crucial for supervised learning. To eliminate errors in single annotated data, a second round of annotation is often used. ...
We show that it is possible to reduce the amount of the second round of annotation by more than half without sacrificing the performance. ...
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. ...
dblp:conf/acllaw/DligachP11
fatcat:jsqgcj27vbgxtoycgds6lmm6em
OntoNotes
2008
Proceedings of the 22nd International Conference on Computational Linguistics - COLING '08
unpublished
In this paper, we use OntoNotes, a large-scale corpus of semantic annotations, including word senses, predicate-argument structure, ontology linking, and coreference. ...
To determine the mistaken agreements in word sense annotation, we employ word sense disambiguation (WSD) to select a set of suspicious candidates for human evaluation. ...
The OntoNotes sense tags have been used for many applications, including the SemEval-2007 evaluation (Pradhan et al., 2007b), sense merging (Snow et al., 2007), sense pool verification (Yu et al., ...
doi:10.3115/1599081.1599214
fatcat:ol6foen7bbhbbnfxx7lwhszzou
Using Games to Create Language Resources: Successes and Limitations of the Approach
[chapter]
2013
The People's Web Meets NLP
We discuss some key issues in using a gaming approach, including task design, player motivation and data quality, and compare the costs of each approach in terms of development, distribution and ongoing ...
In conclusion, we summarise the benefits and limitations of using a gaming approach to resource creation and suggest key considerations for evaluating its utility in different research scenarios. ...
Acknowledgements We would like to thank Jean Heutte (CREF-CNRS) for his help with the concepts of game flow and for the comments of the reviewers of this chapter. ...
doi:10.1007/978-3-642-35085-6_1
dblp:series/tanlp/ChamberlainFKLP13
fatcat:s4y3t2g7obfznkpleeg6maeonm
A Survey of Deep Active Learning
[article]
2021
arXiv
pre-print
However, the acquisition of a large number of high-quality annotated datasets consumes a lot of manpower, which is not allowed in some fields that require high expertise, especially in the fields of speech ...
In recent years, due to the rapid development of internet technology, we are in an era of information torrents and we have massive amounts of data. ...
This is because the premature termination of AL annotation querying leads to large performance losses in the model, and excessive annotation behavior wastes a lot of annotation budget. ...
arXiv:2009.00236v2
fatcat:zuk2doushzhlfaufcyhoktxj7e
Bridging Chasms In Hindustani Music Retrieval
2018
Zenodo
The problem of raga similarity detection is investigated with two diverse data sources viz., discussions on Hindustani ragas and composition notations. Each of these sources help in extra [...] ...
In the context of information extraction from audio signals, we present our investigations on melodic motif detection involving mukhda (main title phrase of a composition) and pakad (raga characteristic ...
The annotation tool's color-coding features make visual verification easier while annotating. The annotations are further processed to convert them to CoNLL format. ...
doi:10.5281/zenodo.1237568
fatcat:mrshrvhbcbb2lf4n7gys6e6plq
Bridging Chasms In Hindustani Music Retrieval
2018
Zenodo
The problem of raga similarity detection is investigated with two diverse data sources viz., discussions on Hindustani ragas and composition notations. Each of these sources help in extra [...] ...
In the context of information extraction from audio signals, we present our investigations on melodic motif detection involving mukhda (main title phrase of a composition) and pakad (raga characteristic ...
The annotation tool's color-coding features make visual verification easier while annotating. The annotations are further processed to convert them to CoNLL format. ...
doi:10.5281/zenodo.1239418
fatcat:uix3v2agqrajtcdrn5wm74tnq4
Information extraction from digital social trace data with applications to social media and scholarly communication data
2020
SIGIR Forum
, 2019] are introduced for identifying sentiment, named entities, part of speech tags, phrase chunks, and super-sense tags. ...
The methods and tools presented here can help advance work in the areas of social media and scholarly data analysis. ...
Some examples of such token level tasks are part of speech tagging, chunking, and super-sense tagging. ...
doi:10.1145/3451964.3451981
fatcat:36djwlckprhl5hymzhivrbnscu
Recent Advances in Natural Language Inference: A Survey of Benchmarks, Resources, and Approaches
[article]
2020
arXiv
pre-print
learning and inference approaches in order to support a better understanding of this growing field. ...
relying on reasoning and knowledge of the world. ...
Acknowledgements We would like to thank the anonymous reviewers for their greatly helpful comments and suggestions. ...
arXiv:1904.01172v3
fatcat:minzpxrrwfebdipu55udwe5dxq
Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases
[article]
2021
arXiv
pre-print
This machine knowledge can be harnessed to semantically interpret textual phrases in news, social media and web tables, and contributes to question answering, natural language processing and data analytics ...
Equipping machines with comprehensive knowledge of the world's entities and their relationships has been a long-standing goal of AI. ...
It is a great pleasure and honor to have such wonderful colleagues in our research community. ...
arXiv:2009.11564v2
fatcat:vh2lqfmhhbcwpf6dcsej3hhvgy
OASIcs, Volume 70, LDK'19, Complete Volume
[article]
2019
In addition, we want to express our thanks to Richard Shapiro, Will Hunter and Sophie Wood for their great support in different aspects of the project. ...
Acknowledgements We are very grateful to Charlotte Buxton and Rebecca Juganaru, the expert lexicographers who have contributed all the dictionary knowledge we were lacking and have helped with manual annotations ...
H2.2 (reliability): Gathering event annotations from a large pool of crowd workers provides reliable results in terms of F1-score when compared to expert annotators. ...
doi:10.4230/oasics.ldk.2019
fatcat:gdzvkfimsrd27jq6xuon4vafn4
Biomedical Text Mining for Etiological Factor Identification in Mental Health Publications
2021
An initial set of terminological items is analyzed and expanded to support the automatic annotation of disorder mentions in research publications. ...
After providing information regarding data collection together with an overview and analysis of relevant data, the thesis describes the development of an annotation scheme used to annotate a gold standard ...
Finally, the fourth phase consisted of verification and assurance of corpus consistency and partial re-annotation (by annotator 3). ...
doi:10.5167/uzh-205675
fatcat:o52fsxkqj5ejbjkjvb4e47i7qu
Joint Discourse-aware Concept Disambiguation and Clustering
2016
Concept disambiguation is the task of linking common nouns and proper names in a text (henceforth called mentions) to their corresponding concepts in a predefined inventory. ...
Concept clustering is the task of clustering mentions, so that all mentions in one cluster denote the same concept. ...
The annotators for the TAC data sets are asked to select challenging proper names so that both the variability and the ambiguity in the annotated mention pool are high. ...
doi:10.11588/heidok.00020737
fatcat:vhljgiqbrbcwtpjxce6k4vcp2a
Knowledge-intensive, high-performance relation extraction
[article]
2018
Information extraction and its subtask relation extraction play a central role in data processing pipelines that make hidden knowledge such as the content of news articles available to downstream users ...
Here, state-of-the-art methods employ simplistic assumptions at training time, which has a drastic negative effect on both precision and coverage. ...
It took place in the context of the following research projects: 2013-2014: Deependance, funded by the German Federal Ministry of Education and Research (BMBF; contract 01IW11003); 2013-2014: Intellektix ...
doi:10.14279/depositonce-6626
fatcat:5e3jzx6e4fcgtex6ztql65xz24
Proceedings of the Australasian Language Technology Workshop 2006
the Australasian Language Technology Workshop (ALTW) 2006, held at the University of Sydney
unpublished
in Australia, New Zealand and overseas. ...
Of the 36 papers submitted, 19 papers were selected by the programme committee for publication and appear in these proceedings. ...
Acknowledgements We would like to thank James Curran, James Gorman and Jon Patrick from the University of Sydney for their invaluable insights. ...
fatcat:74kafjz7hnanhhqlybjlsxtgxm