1,647 Hits in 3.2 sec

Extreme Extraction: Only One Hour per Relation [article]

Raphael Hoffmann, Luke Zettlemoyer, Daniel S. Weld
2015 arXiv   pre-print
Information Extraction (IE) aims to automatically generate a large knowledge base from natural language text, but progress remains slow.  ...  Experiments show that experts can create quality extractors in under an hour and even NLP novices can author good extractors.  ...  Unlike our work, this line of research does not evaluate the effectiveness of these languages with users, in terms of development time and extraction quality.  ... 
arXiv:1506.06418v1 fatcat:b7h3yngagzfadhs5nqqaiult7q

Bootstrap Pattern Learning for Open-Domain CLQA

Hideki Shima, Teruko Mitamura
2010 NTCIR Conference on Evaluation of Information Access Technologies  
The key technical contribution of this paper is a minimally supervised bootstrapping approach to generating lexicosyntactic patterns used for answer extraction.  ...  based approach, for both monolingual and crosslingual tracks.  ...  Key term extraction The key term extractor is responsible for creating a list of terms that will be useful for both retrieving potentially relevant answer-bearing documents and subsequently extracting  ... 
dblp:conf/ntcir/ShimaM10 fatcat:rbapzevsdzfmjjey2vtmyzvp3m

Autonomously semantifying wikipedia

Fei Wu, Daniel S. Weld
2007 Proceedings of the sixteenth ACM conference on Conference on information and knowledge management - CIKM '07  
Berners-Lee's compelling vision of a Semantic Web is hindered by a chicken-and-egg problem, which can be best solved by a bootstrapping method - creating enough structured data to motivate the development  ...  We choose Wikipedia as an initial data source, because it is comprehensive, not too large, high-quality, and contains enough manually derived structure to bootstrap an autonomous, self-supervised process  ...  We also thank anonymous reviewers for valuable suggestions and comments.  ... 
doi:10.1145/1321440.1321449 dblp:conf/cikm/WuW07 fatcat:sqw6noesufhgletzdzz76bak34

A Survey on Ranking of Features Using Customer Opinion Reviews

Prasad Mahale, Sonali Borase
2020 International Journal of Scientific Research in Science Engineering and Technology  
These audits are significant for the clients just as for the dealers. The majority of the surveys are scattered so it produces trouble for utilizing significant data.  ...  Presently day's web based business is quickly developing which gives office for clients to buy items on the web.  ...  Be that as it may, this language model may be one-sided to visit terms in the audits and can't foresee the viewpoint score precisely accordingly can't sift through clamor effectively.  ... 
doi:10.32628/ijsrset2072109 fatcat:e5jkrmbmanf3dbyfcgvrthe4vm

Using Wikipedia to bootstrap open information extraction

Daniel S. Weld, Raphael Hoffmann, Fei Wu
2009 SIGMOD record  
After considering the roots of the problem, we discuss the extensions necessary to successfully bootstrap. • Reduced Recall When Extracting from the Web: In many cases, the language used on Wikipedia is  ...  By validating facts multiple times from different visitors, we believe we can achieve very high precision on extracted tuples.  ... 
doi:10.1145/1519103.1519113 fatcat:uxumqhkhq5agzgcv6bevrpyvqa

Unsupervised named-entity extraction from the Web: An experimental study

Oren Etzioni, Michael Cafarella, Doug Downey, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel S. Weld, Alexander Yates
2005 Artificial Intelligence  
List Extraction locates lists of class instances, learns a "wrapper" for each list, and extracts elements of each list.  ...  Since each method bootstraps from KNOWITALL's domain-independent methods, the methods also obviate hand-labeled training examples.  ...  The rule language also covers n-ary predicates with arbitrary relation name and multiple predicate arguments, such as the rule for CeoOf(Person,Company) shown in Fig. 9.  ... 
doi:10.1016/j.artint.2005.03.001 fatcat:qmdekfdvjnf53lbpjmdno2k7uy

Learning Open Information Extraction of Implicit Relations from Reading Comprehension Datasets [article]

Jacob Beckerman, Theodore Christakis
2019 arXiv   pre-print
Implicit tuples are our term for this type of extraction where the relation is not present in the input sentence.  ...  For example, it is evident that "Fed chair Powell indicates rate hike" implies (Powell, is a, Fed chair) and (Powell, works for, Fed).  ...  While specific extractors are important, there are a multiplicity of implicit relation types and it would be intractable to categorize and design extractors for each one.  ... 
arXiv:1905.07471v1 fatcat:tpjjnffyk5aavcojbnn7s52emu

API2MoL: Automating the building of bridges between APIs and Model-Driven Engineering

Javier Luis Cánovas Izquierdo, Frédéric Jouault, Jordi Cabot, Jesús García Molina
2012 Information and Software Technology  
We provide a toolkit (language and bootstrap tool) for the creation of bridges between APIs and MDE.  ...  Our proposal includes a complete prototype of a toolkit (language engine and bootstrap tool) focused on Java APIs, although an adaptation of the approach to deal with APIs for other statically-typed object-oriented  ...  Bearing this idea in mind, we have defined the API2MoL approach aimed to automate the building of the injector and extractor for a given API, by providing a Domain Specific Language (DSL) for specifying  ... 
doi:10.1016/j.infsof.2011.09.006 fatcat:d6f4gm73m5c4bm5lxiihl2dcqe

Bootstrapped Self Training for Knowledge Base Population

Gabor Angeli, Victor Zhong, Danqi Chen, Arun Tejasvi Chaganty, Jason Bolton, Melvin Jose Johnson Premkumar, Panupong Pasupat, Sonal Gupta, Christopher D. Manning
2015 Text Analysis Conference  
Pattern-based relation extractors suffer from low recall, whereas distant supervision yields noisy data which hurts precision.  ...  We propose bootstrapped self-training to capture the benefits of both systems: the precision of patterns and the generalizability of trained models.  ...  SQL is a powerful language for probing the corpus.  ... 
dblp:conf/tac/AngeliZCCBPPGM15 fatcat:cacm6nl67rht7jll4nvttdjthu

Designing an Extensible Domain-Specific Web Corpus for "Layfication" [chapter]

Marina Santini, Arne Jönsson, Wiktor Strandqvist, Gustav Cederblad, Mikael Nyström, Marjan Alirezaie, Leili Lind, Eva Blomqvist, Maria Lindén, Annica Kristoffersson
2019 Advances in Systems Analysis, Software Engineering, and High Performance Computing  
The main purpose of the corpus is to be used for building and training language technology applications for the "layfication" of the specialized medical jargon.  ...  ., patients, family caregivers, and home care aides) understand medical terms, which often appear opaque. Exploratory experiments are presented and discussed.  ...  This implies using and creating different types of terminologies for different levels of medical expertise and for multiple languages.  ... 
doi:10.4018/978-1-5225-7879-6.ch006 fatcat:tgaorpe5fvepnhl7j66mkp2taa

Bootstrapping to a semantic grid

J. Schwidder, T. Talbott, J. Myers
2005 CCGrid 2005. IEEE International Symposium on Cluster Computing and the Grid, 2005.  
The authors also acknowledge helpful discussions and ongoing collaborations with members of the Collaboratory for Multiscale Chemical Science (CMCS) project.  ...  Employees of Battelle Memorial Institute, which operates Pacific Northwest National Laboratory for the US Department of Energy under contract DE-AC06-76RL01830 and Oak Ridge National Laboratory under contract  ...  BFD [9] is an extension of the eXtensible Scientific Interchange Language (XSIL) [10] that can describe the layout of a binary or ASCII file format in terms of an XML data model.  ... 
doi:10.1109/ccgrid.2005.1558551 dblp:conf/ccgrid/SchwidderTM05 fatcat:xsa5jijblje5xe6gtui46q42t4

User-driven relational models for entity-relation search and extraction

Jay Urbain
2012 Proceedings of the 1st Joint International Workshop on Entity-Oriented and Semantic Search - JIWES '12  
To meet this need, we present a ranked retrieval and extraction framework for collectively learning and integrating evidence of entities and relational dependencies to predict at query time, a ranking  ...  Our goal is to develop user-driven relational models of entities and their relational dependencies, and a search system based on these models that allow users to search for known entities and relations  ...  Other issues include the relative simplicity of the relational patterns extracted, and the need to define an extractor for bootstrapping in advance.  ... 
doi:10.1145/2379307.2379312 fatcat:7lixxcmrd5cure3mj3dgg6rxam

ClinPhen extracts and prioritizes patient phenotypes directly from medical records to expedite genetic disease diagnosis

Cole A. Deisseroth, Johannes Birgmeier, Ethan E. Bodle, Jennefer N. Kohler, Dena R. Matalon, Yelena Nazarenko, Casie A. Genetti, Catherine A. Brownstein, Klaus Schmitz-Abe, Kelly Schoch, Heidi Cope, Rebecca Signer (+6 others)
2018 Genetics in Medicine  
Because the phenotype extractors cTAKES and MetaMap output Unified Medical Language System (UMLS) terms, while all gene-ranking tools require HPO terms, we converted UMLS terms to HPO using the UMLS Metathesaurus  ...  The average (column) and 95% confidence interval (calculated using bootstrapping with 1000 trials) of the precision and sensitivity values across all patients are displayed for each extractor.  ...  Supplementary Material Refer to Web version on PubMed Central for supplementary material.  ... 
doi:10.1038/s41436-018-0381-1 pmid:30514889 pmcid:PMC6551315 fatcat:4vai22oxrncefarsl6zgeum6fi

Bootstrapping Relation Extractors using Syntactic Search by Examples [article]

Matan Eyal, Asaf Amrami, Hillel Taub-Tabib, Yoav Goldberg
2021 arXiv   pre-print
In this work we propose a process for bootstrapping training datasets which can be performed quickly by non-NLP-experts.  ...  We use these to obtain positive examples by searching for sentences that are syntactically similar to user input examples.  ...  However, this work is only an initial step in exploring methods for bootstrapping relation extractors using minimal user effort, supported by strong pretrained neural LMs.  ... 
arXiv:2102.05007v1 fatcat:oxxjk4ktcvgqzppxtqlz2lbghe

Optique: OBDA Solution for Big Data [chapter]

D. Calvanese, Martin Giese, Peter Haase, Ian Horrocks, T. Hubauer, Y. Ioannidis, Ernesto Jiménez-Ruiz, E. Kharlamov, H. Kllapi, J. Klüwer, Manolis Koubarakis, S. Lamparter (+11 others)
2013 Lecture Notes in Computer Science  
The chosen expressiveness of the ontology and mapping language are focused on very concrete solutions. Management of streaming data is essentially ignored despite their importance for industry.  ...  The important limitations of the state of the art OBDA systems are as follows: - The usability of OBDA systems is hampered by the need to use a formal query language which is difficult for end-users even  ...  We now briefly introduce the main components: Optique Solution - The query formulation component aims at providing a friendly interface for non-technical users combining multiple representation paradigms  ... 
doi:10.1007/978-3-642-41242-4_48 fatcat:eaybvwjdcvbgrifpq3xpmhs2tm