48,178 Hits in 4.8 sec

Annotating Geographical Entities

Alexandru Sălăvăstru, Daniela Gîfu
2015 Research in Computing Science  
This paper describes a study based on exploration of relations between geographical entities. We suggested a new tool for training and evaluation required by related annotation experiments.  ...  It relates to an annotator used for semi-automatic annotation, starting with the geography manual.  ...  This tool can be used to annotate entities from different domains, such as: biology, computer science, literature, astronomy, physics and so on.  ... 
doi:10.13053/rcs-90-1-4 fatcat:bzoy5whku5fvrhx3pjw3gilf5q

Computer Science Named Entity Recognition in the Open Research Knowledge Graph [article]

Jennifer D'Souza, Sören Auer
2022 arXiv   pre-print
Domain-specific named entity recognition (NER) on Computer Science (CS) scholarly articles is an information extraction task that is arguably more challenging for the various annotation aims that can beset  ...  Currently, progress on CS NER -- the focus of this work -- is hampered in part by its recency and the lack of a standardized annotation aim for scientific entities/terms.  ...  The mappings we used are elicited in Table 2. Table 2: Mappings of nine scientific semantic types across Computer Science papers for CS NER.  ... 
arXiv:2203.14579v1 fatcat:fzq37ng56zhovomjnorusvfg3i

A Typology of Semantic Relations Dedicated to Scientific Literature Analysis [chapter]

Kata Gábor, Haïfa Zargayouna, Isabelle Tellier, Davide Buscaldi, Thierry Charnois
2016 Lecture Notes in Computer Science  
Our model relies on a typology of explicit semantic relations.  ...  These relations are instantiated in the abstract/introduction part of the papers and can be identified automatically using textual data and external ontologies.  ...  Relation classification uses the entity-annotated text as input and aims to identify the relations between two entities based on a combination of two information sources: the text sequence between the  ... 
doi:10.1007/978-3-319-53637-8_3 fatcat:z244jmqzmvcehohxuypawz73ey

Symlink: A New Dataset for Scientific Symbol-Description Linking [article]

Viet Dac Lai, Amir Pouran Ben Veyseh, Franck Dernoncourt, Thien Huu Nguyen
2022 arXiv   pre-print
Symlink annotates scientific papers of 5 different domains (i.e., computer science, biology, physics, mathematics, and economics).  ...  Our experiments on Symlink demonstrate the challenges of the symbol-description linking task for existing models and call for further research effort in this area.  ...  Annotation taxonomy: To prepare for the annotation, we design a taxonomy with 3 general entity types and four relation types.  ... 
arXiv:2204.12070v1 fatcat:pnxusnf2gvautfhkhx2smgke5m

Transfer Learning for Scientific Data Chain Extraction in Small Chemical Corpus with joint BERT-CRF Model

Na Pang, Li Qian, Weimin Lyu, Jin-Dong Yang
2019 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval  
This paper utilizes a combined BERT-CRF model to build scientific chemical data chains by extracting 7 chemical entities and relations from publications.  ...  7 types of entities: compound, solvent, method, bond, reaction, pKa and pKa value.  ...  We would like to thank the support by the Center of Basic molecular Science at Tsinghua University and National Science Library of Chinese Academy of Science.  ... 
dblp:conf/sigir/PangQLY19 fatcat:azgm6b2jdzgjbas3iymcietyaq

Transfer Learning for Scientific Data Chain Extraction in Small Chemical Corpus with BERT-CRF Model [article]

Na Pang, Li Qian, Weimin Lyu, Jin-Dong Yang
2019 arXiv   pre-print
This paper presents a novel BERT-CRF model to build scientific chemical data chains by extracting 7 chemical entities and relations from publications.  ...  for 7 types of entities: compound, solvent, method, bond, reaction, pKa and pKa value.  ...  We would like to thank the support by the Center of Basic molecular Science at Tsinghua University and National Science Library of Chinese Academy of Science.  ... 
arXiv:1905.05615v1 fatcat:qla4wiwwdrdjria4cgansfv4vm

Automatic Construction of a Semantic Knowledge Base from CEUR Workshop Proceedings [chapter]

Bahar Sateli, René Witte
2015 Communications in Computer and Information Science  
references, and produces semantic (typed) annotations; and (ii) a flexible exporting module, the LODeXporter, which translates the document annotations into RDF triples according to custom mapping rules  ...  Additionally, we leverage existing Named Entity Recognition (NER) tools to extract named entities from text and ground them to their corresponding resources on the Linked Open Data cloud, thus, briefly  ...  and their corresponding semantic type, and (iii) the relations between exported triples and the type of their relation.  ... 
doi:10.1007/978-3-319-25518-7_11 fatcat:sjrbad3zbbedjlvvnaxgyffad4

Extracting a Knowledge Base of Mechanisms from COVID-19 Papers [article]

Tom Hope, Aida Amini, David Wadden, Madeleine van Zuylen, Sravanthi Parasa, Eric Horvitz, Daniel Weld, Roy Schwartz, Hannaneh Hajishirzi
2021 arXiv   pre-print
We annotate a dataset of mechanisms with our schema and train a model to extract mechanism relations from papers.  ...  We pursue the construction of a knowledge base (KB) of mechanisms -- a fundamental concept across the sciences encompassing activities, functions and causal relations, ranging from cellular processes to  ...  Authors would also like to thank anonymous reviewers, members of AI2, UW-NLP and the H2Lab at The University of Washington for their valuable feedback and comments.  ... 
arXiv:2010.03824v3 fatcat:zibxa5i2j5bh5engtlv44eifhq

Semantic representation of scientific literature: bringing claims, contributions and named entities onto the Linked Open Data cloud

Bahar Sateli, René Witte
2015 PeerJ Computer Science  
We created a gold standard corpus from computer science conference proceedings and journal articles, where Claim and Contribution sentences are manually annotated with their respective types using LOD URIs  ...  Named Entity (NE) recognition based on the Linked Open Data (LOD) cloud; and (iii) automatic knowledge base construction for both NEs and REs using semantic web ontologies that interconnect entities in  ...  PeerJCompSci is a collection of 27 open-access papers from the computer science edition of the PeerJ journal (PeerJ Computer Science Journal, https://peerj.com/computer-science/).  ... 
doi:10.7717/peerj-cs.37 fatcat:ssogpzd45fhahbxi7jkwtofcwa

An Annotated Corpus for Machine Reading of Instructions in Wet Lab Protocols

Chaitanya Kulkarni, Wei Xu, Alan Ritter, Raghu Machiraju
2018 Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)  
format and benefit biological research.  ...  We make our annotated Wet Lab Protocol Corpus available to the research community.  ...  Acknowledgement We would like to thank the annotators: Bethany Toma, Esko Kautto, Sanaya Shroff, Alex Jacobs, Berkay Kaplan, Colins Sullivan, Junfa Zhu, Neena Baliga and Vardaan Gangal.  ... 
doi:10.18653/v1/n18-2016 dblp:conf/naacl/KulkarniXRM18 fatcat:7yvpj4ul65avrhosglh7qjsvta

Building Structured Databases of Factual Knowledge from Massive Text Corpora

Xiang Ren, Meng Jiang, Jingbo Shang, Jiawei Han
2017 Proceedings of the 2017 ACM International Conference on Management of Data - SIGMOD '17  
In this tutorial, we introduce data-driven methods on mining structured facts (i.e., entities and their relations/attributes for types of interest) from massive text corpora, to construct structured databases  ...  To turn such massive unstructured text data into structured, actionable knowledge, one of the grand challenges is to gain an understanding of the factual information (e.g., entities, attributes, relations  ...  W911NF-09-2-0053 (NSCTA), National Science Foundation IIS-1320617 and IIS 16-18481, and grant 1U54GM114838 awarded by NIGMS through funds provided by the trans-NIH Big Data to Knowledge (BD2K) initiative  ... 
doi:10.1145/3035918.3054781 dblp:conf/sigmod/RenJSH17 fatcat:3mrzx3e3yzapnknx5ciwcwm4i4

Semantic Web and Human Computation: The status of an emerging field

Marta Sabou, Lora Aroyo, Kalina Bontcheva, Alessandro Bozzon, Rehab K. Qarout
2018 Semantic Web Journal  
EP/I004327/1; and by the Amsterdam Institute for Advanced Metropolitan Solutions, with the AMS Social Bot grant.  ...  An Extended Study of Content and Crowdsourcing-related Performance Factors in Named Entity Annotation: This paper addresses an important problem related to named entity recognition (NER) performed on noisy  ...  Experiments were conducted on 120 untyped DBpedia entities, and have demonstrated the intrinsic complexity of the entity typing problem.  ... 
doi:10.3233/sw-180292 fatcat:ips6wqegjnemlnyoburlzsb7iy

PARADE: A New Dataset for Paraphrase Identification Requiring Computer Science Domain Knowledge [article]

Yun He, Zhuoer Wang, Yin Zhang, Ruihong Huang, James Caverlee
2020 arXiv   pre-print
PARADE contains paraphrases that overlap very little at the lexical and syntactic level but are semantically equivalent based on computer science domain knowledge, as well as non-paraphrases that overlap  ...  Experiments show that both state-of-the-art neural models and non-expert human annotators have poor performance on PARADE.  ...  As a first step, we focus in this paper on the computer science domain.  ... 
arXiv:2010.03725v1 fatcat:v2pld5lopfgjzm535ljvqqjrea

Leveraging Unannotated Texts for Scientific Relation Extraction

Qin DAI, Naoya INOUE, Paul REISERT, Kentaro INUI
2018 IEICE transactions on information and systems  
One of these tasks, Scientific Relation Extraction, aims at automatically capturing scientific semantic relationships among entities in scientific documents.  ...  In order to efficiently grasp such knowledge, various computational tasks are proposed that train machines to read and analyze scientific documents.  ...  Acknowledgements This work was supported by JST CREST Grant Number JPMJCR1513, Japan and KAKENHI Grant Number 16H06614.  ... 
doi:10.1587/transinf.2018edp7180 fatcat:btw3qvfxcbd6bbys3jma6lfhmq

Overview of STEM Science as Process, Method, Material, and Data Named Entities [article]

Jennifer D'Souza
2022 arXiv   pre-print
Agriculture, Astronomy, Biology, Chemistry, Computer Science, Earth Science, Engineering, Material Science, Mathematics, and Medicine.  ...  Our analysis is defined over a large-scale corpus comprising 60K abstracts structured as four scientific entities process, method, material, and data.  ...  Figure 18: Structured KG representation of a Computer Science domain publication Abstract (Dolev et al., 2014) as PROCESS, METHOD, MATERIAL, and DATA typed entities.  ... 
arXiv:2205.11863v1 fatcat:rvu7p4i6dnhz3euuevfqxdi6sm