
A Supervised Learning Approach to Entity Search [chapter]

Guoping Hu, Jingjing Liu, Hang Li, Yunbo Cao, Jian-Yun Nie, Jianfeng Gao
2006 Lecture Notes in Computer Science  
We propose using a linear model to combine the uses of different features and employing a supervised learning approach in training of the model.  ...  In entity search, given a query and an entity type, a search system returns a ranked list of entities in the type (e.g., person name, time expression) relevant to the query.  ...  In this study, our goal is to develop a general method for entity search and thus we employ the supervised learning approach to perform the task.  ... 
doi:10.1007/11880592_5 fatcat:qssjg5fewvfsvcxieom73zveam

Pairwise Webpage Coreference Classification Using Distant Supervision

S. Subramanian, Timothy Baldwin, Julian Brooke, Trevor Cohn
2017 Proceedings of the 26th International Conference on World Wide Web Companion - WWW '17 Companion  
To strike a balance between unsupervised and supervised methods that require annotated data, we build a positive and unlabelled (PU) learning model, where we obtain positive examples using web search-based  ...  A person or other entity is often associated with multiple URL endpoints on the web, motivating the task of determining whether a given pair of webpages is coreferent to a given entity.  ...  web-search, and employ a positive and unlabelled (PU) learning algorithm [3, 4].  ... 
doi:10.1145/3041021.3054224 dblp:conf/www/SubramanianBBC17 fatcat:tr4bxqamznelliaunywsvc4yiq

Named entity mining from click-through data using weakly supervised latent dirichlet allocation

Gu Xu, Shuang-Hong Yang, Hang Li
2009 Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '09  
It employs a method, referred to as Weakly Supervised Latent Dirichlet Allocation (WS-LDA), to accurately learn the topic model with partially labeled named entities.  ...  This paper proposes conducting NEM by using click-through data collected at a web search engine, employing a topic model that generates the click-through data, and learning the topic model by weak supervision  ...  It is hard to deal with the ambiguity problem by taking a deterministic approach. The use of search click-through data has been proposed in search and other applications.  ... 
doi:10.1145/1557019.1557165 dblp:conf/kdd/XuYL09 fatcat:nhbcnrjvnbbsvdqwfonb37vo2u

Supervised rank aggregation

Yu-Ting Liu, Tie-Yan Liu, Tao Qin, Zhi-Ming Ma, Hang Li
2007 Proceedings of the 16th international conference on World Wide Web - WWW '07  
To further enhance ranking accuracies, we propose employing supervised learning to perform the task, using labeled data. We refer to the approach as 'Supervised Rank Aggregation'.  ...  We set up a general framework for conducting Supervised Rank Aggregation, in which learning is formalized as an optimization which minimizes disagreements between ranking results and the labeled data.  ...  ACKNOWLEDGEMENTS The authors would like to thank Wei-Ying Ma at MSRA for his suggestions and comments on this work. They are also grateful to Shisheng Li at USTC for his help in the experiments.  ... 
doi:10.1145/1242572.1242638 dblp:conf/www/LiuLQML07 fatcat:axe6pj5c2neijo3e4guctojjre

A Parametric Layered Approach to Perform Web Page Ranking

Ratika Goel, Anchal Garg
2013 International Journal of Computer Applications  
Web crawling is used not only for searching for webpages over the web but also to order them according to user interest.  ...  In the present work, a dynamic parametric approach based on user interest evolution is defined to perform the web crawling and to arrange the web pages in a more definite way.  ...  Machine learning approaches can be separated into three categories: supervised learning (SL), semi-supervised learning (SSL) and unsupervised learning (UL).  ... 
doi:10.5120/11467-7251 fatcat:p4wl56r6vrcydk3qa4pezsb3da

Modeling Missing Data in Distant Supervision for Information Extraction

Alan Ritter, Luke Zettlemoyer, Mausam, Oren Etzioni
2013 Transactions of the Association for Computational Linguistics  
Despite the added complexity introduced by reasoning about missing data, we demonstrate that a carefully designed local search approach to inference is very accurate and scales to large datasets.  ...  This provides a natural way to incorporate side information, for instance modeling the intuition that text will often mention rare entities which are likely to be missing in the database.  ...  Acknowledgements The authors would like to thank Dan Weld, Chris Quirk, Raphael Hoffmann and the anonymous reviewers for helpful comments. Thanks to Wei Xu for providing data.  ... 
doi:10.1162/tacl_a_00234 fatcat:fgrqbhthk5hxhiaobrf6vxk7ji

Distant Supervision for Silver Label Generation of Software Mentions in Social Scientific Publications

Katarina Boland, Frank Krüger
2019 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval  
In this paper, we investigate the use of weakly supervised approaches with distant supervision to create silver labels to train supervised software mention extraction methods using transfer learning.  ...  We show that by combining even only a small number of weakly supervised approaches, a silver standard corpus can be created that serves as a useful basis for transfer learning.  ...  Zhou, Z.H.: A brief introduction to weakly supervised learning.  ... 
dblp:conf/sigir/BolandK19 fatcat:bymesjsx55b5nhp3wxmh6265a4

Exploiting entities for query expansion

Wladmir Cardoso Brandão
2014 SIGIR Forum  
To overcome this problem we propose WAVE, a self-supervised approach to autonomously generate infoboxes for Wikipedia articles.  ...  A substantial fraction of web search queries contain references to entities, such as persons, organizations, and locations.  ... 
doi:10.1145/2641383.2641393 fatcat:pni3c22krreazpn5zfjrmhwm2a

Joint Information Extraction from the Web Using Linked Data [chapter]

Isabelle Augenstein
2014 Lecture Notes in Computer Science  
The bottleneck for information extraction systems is obtaining training data to learn classifiers.  ...  In this doctoral research, we investigate how existing data in knowledge bases can be used to automatically annotate training data to learn classifiers to in turn extract more data to expand knowledge  ...  Acknowledgements We thank Fabio Ciravegna and Diana Maynard for helping to develop this research plan, Ruben Verborgh and Tom De Nies for their writing tips, as well as the anonymous reviewers for their  ... 
doi:10.1007/978-3-319-11915-1_32 fatcat:4oona332szck3i3gyr46nml4ca

An End-to-End Entity Linking Approach for Tweets

Ikuya Yamada, Hideaki Takeda, Yoshiyasu Takefuji
2015 Workshop on Making Sense of Microposts  
We present a novel approach for detecting, classifying, and linking entities from Twitter posts (tweets). The task is challenging because of the noisy, short, and informal nature of tweets.  ...  Consequently, the proposed approach introduces several methods that robustly facilitate successful realization of the task with enhanced performance in several measures.  ...  Mention Detection and Disambiguation In this step, we first assign a score to mention candidates using a supervised machine-learning model.  ... 
dblp:conf/msm/YamadaTT15 fatcat:tikhsyde6vhpjcgzytgzow6fwy

Fine-Grained Named Entity Recognition in Question Answering with DBpedia

Shimin Zhong, Yajun Du, Zhen Wei Gao
2018 Journal of Physics, Conference Series  
In this paper, we propose a novel model to address both problems, using a distant supervised method. Firstly, we use the web search to obtain more relevant information.  ...  Secondly, we present a greedy n-grams algorithm to extract the entity mentions.  ...  Higashinaka et al. [29] proposed a supervised machine learning model for classifying Wikipedia articles into the 200 fine-grained NE types.  ... 
doi:10.1088/1742-6596/1087/3/032003 fatcat:hjozejnyprfyzisml42ahxvfva

Extracting Relations between Non-Standard Entities using Distant Supervision and Imitation Learning

Isabelle Augenstein, Andreas Vlachos, Diana Maynard
2015 Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing  
While state-of-the-art distant supervision approaches use off-the-shelf named entity recognition and classification (NERC) systems to identify relation arguments, discrepancies in domain or genre between  ...  Distantly supervised approaches have become popular in recent years as they allow training relation extractors without text-bound annotation, using instead known relations from a knowledge base and a large  ...  Also referred to as search-based structured prediction or learning to search.  ... 
doi:10.18653/v1/d15-1086 dblp:conf/emnlp/AugensteinVM15 fatcat:txoviervmzhz3fkvsph4h4eboe

Learning to expand queries using entities

Wladmir C. Brandão, Rodrygo L. T. Santos, Nivio Ziviani, Edleno S. de Moura, Altigran S. da Silva
2014 Journal of the Association for Information Science and Technology  
In this article, we introduce a supervised learning approach that exploits named entities for query expansion using Wikipedia as a repository of high-quality feedback documents.  ...  In contrast with existing entity-oriented pseudo-relevance feedback approaches, we tackle query expansion as a learning-to-rank problem.  ...  We propose a novel learning-to-rank approach to identify and weight effective expansion terms related to entities in web search queries.  ... 
doi:10.1002/asi.23084 fatcat:rcbrgwdmnjfvris2kkl3qi7owy

FABLE: A Semi-Supervised Prescription Information Extraction System

Carson Tao, Michele Filannino, Özlem Uzuner
2018 AMIA Annual Symposium Proceedings  
FABLE utilizes unannotated data to enhance annotated training data: it performs semi-supervised extraction of medication information using pseudo-labels with Conditional Random Fields (CRFs) to improve  ...  As a result, narratives of EHRs need to be processed with natural language processing (NLP) methods that can extract medication and prescription information from free text.  ...  FABLE utilizes a semi-supervised machine learning approach with CRFs and conditional confidence thresholds that are tuned to individual entity categories.  ... 
pmid:30815199 pmcid:PMC6371278 fatcat:ycwiiuwa4bddbdgnvdev36w6um

Learning When Not to Answer: A Ternary Reward Structure for Reinforcement Learning based Question Answering [article]

Fréderic Godin, Anjishnu Kumar, Arpit Mittal
2019 arXiv pre-print
Employing a supervised learning strategy using depth-first-search paths to bootstrap the reinforcement learning algorithm further improves performance.  ...  structure used in prior work to a ternary reward structure which also rewards an agent for not answering a question rather than giving an incorrect answer.  ...  Hence an imitation learning approach could be beneficial here where we provide a number of expert paths to the learning algorithm to bootstrap the learning process.  ... 
arXiv:1902.10236v2 fatcat:cttva5hydvabzp7rgkgom3ttb4