Data Mining: Analysis of Structured and Unstructured Information [chapter]

Dyan Decker, Alexandre Blanc, John Loveland, Mona Clayton
2015 A Guide to Forensic Accounting Investigation  
In this retrieval model, the user's information need is exhibited as Indri's Structural Query Language.  ...  We utilize Wikipedia as a great, multilingual, free-content encyclopedia for our knowledge base and also some state of the art algorithms for extracting Wikipedia's concepts from the user's information  ...  Finally, we must of course acknowledge the tireless efforts of the Wikipedia community that make a valuable knowledge base during years. We are also indebted to the CLEF organizers too.  ... 
doi:10.1002/9781119200048.ch17 fatcat:k2uz5oxvnbe5jbmdfrotr2payu

Collaboratively built semi-structured content and Artificial Intelligence: The story so far

Eduard Hovy, Roberto Navigli, Simone Paolo Ponzetto
2013 Artificial Intelligence  
unstructured and (semi-)structured information.  ...  In this framework, named Wikipedia-based Evolutionary Semantics (Wiki-ES), wikified queries are learned using a variation of a genetic programming algorithm, and subsequently used to perform concept-based  ... 
doi:10.1016/j.artint.2012.10.002 fatcat:mwk5o254urb2dejsh7c224uu3q

Mining meaning from Wikipedia

Olena Medelyan, David Milne, Catherine Legg, Ian H. Witten
2009 International Journal of Human-Computer Studies  
The article addresses how Wikipedia is being used as is, how it is being improved and adapted, and how it is being combined with other structures to create entirely new resources.  ...  language processing; using it to facilitate information retrieval and information extraction; and as a resource for ontology building.  ...  We are also grateful to Enrico Motta and Susan Wiedenbeck for guiding us in the right direction.  ... 
doi:10.1016/j.ijhcs.2009.05.004 fatcat:mzxszf4jlfcizbgxuemgdwzdiy

Entity Disambiguation with Linkless Knowledge Bases

Yang Li, Shulong Tan, Huan Sun, Jiawei Han, Dan Roth, Xifeng Yan
2016 Proceedings of the 25th International Conference on World Wide Web - WWW '16  
Previous research has tackled this problem by making use of two types of context-aware features derived from the reference knowledge base, namely, the context similarity and the semantic relatedness.  ...  We propose a generative model to automatically mine such evidences out of noisy information. The mined evidences can mimic the role of the missing links and help boost the LNED performance.  ...  Previous research [7, 9, 12, 13, 15, 17, 23, 24, 26, 29] has tackled the NED problem (with respect to Wikipedia) by making use of various textual and structural information from Wikipedia.  ... 
doi:10.1145/2872427.2883068 dblp:conf/www/LiTSHRY16 fatcat:ilaegwv7yjcftk5yqzno5zvocm

VizByWiki

Allen Yilun Lin, Joshua Ford, Eytan Adar, Brent Hecht
2018 Proceedings of the 2018 World Wide Web Conference on World Wide Web - WWW '18  
We show that this problem is tractable through a new system, VizByWiki, that mines contextually relevant data visualizations from Wikimedia Commons, the central file repository for Wikipedia.  ...  We also demonstrate that VizByWiki can automatically rank visualizations according to their usefulness with reasonable accuracy (nDCG@5 of 0.82).  ...  Wikification Wikification involves disambiguating named entities in unstructured text to Wikipedia articles [24].  ... 
doi:10.1145/3178876.3186135 dblp:conf/www/LinFAH18 fatcat:jcvdpdolznadvkp3hpbucggcoi

Data Integration for Heterogenous Datasets

James Hendler
2014 Big Data  
A number of demos were developed for the 2010 Semantic Web Challenge at the International Semantic Web Conference.  ...  Acknowledgments Except where otherwise cited, the examples in this article are based on those developed by graduate students in the Tetherless World Constellation at Rensselaer Polytechnic Institute (RPI  ...  Clearly, to be able to answer our query, we now need some information that would allow us to assert where the equivalences between terms lie.  ... 
doi:10.1089/big.2014.0068 pmid:25553272 pmcid:PMC4276119 fatcat:pzyz76crpnhtrdsrukw4ji3wqe

Computational knowledge and information management in veterinary epidemiology

Svitlana Volkova, William H. Hsu
2010 2010 IEEE International Conference on Intelligence and Security Informatics  
For this purpose, we present a system for animal disease outbreak analysis by automatically extracting relational information from online data.  ...  We aim to detect and map infectious disease outbreaks by extracting information from unstructured sources. The system crawls web sites and classifies pages by topical relevance.  ...  : 1) data collection using crawler components; 2) information sharing through the web interface; 3) query-based search using a Lucene-based ranking component; 4) data analysis using entity extraction  ... 
doi:10.1109/isi.2010.5484764 dblp:conf/isi/VolkovaH10 fatcat:go3kilrfivgjnfwuqtl7k3n6fe

Utilising Wikipedia for Text Mining Applications

Muhammad Atif Qureshi
2016 SIGIR Forum  
As can be seen in the figure, unstructured or semi-structured textual data is extracted from the data sources which is then analysed using the knowledge base (Wikipedia) for different application domains  ...  DBPedia DBpedia is a knowledge base which extracts various types of structured information from Wikipedia [13].  ...  exploit the Wikipedia articles' hyperlink structure.  ... 
doi:10.1145/2888422.2888449 fatcat:lck3kkxoazcj5powaqhjs6epty

Discovering and disambiguating named entities in text

Johannes Hoffart
2013 Proceedings of the 2013 Sigmod/PODS Ph.D. symposium on PhD symposium - SIGMOD'13 PhD Symposium  
Discovering entities such as people, organizations, songs, or places in natural language texts is a valuable asset for semantic search, machine translation, and information extraction.  ...  Additionally, in this dynamic world, new entities are constantly emerging, and disambiguation methods need to cope with the resulting incompleteness of knowledge bases.  ...  Applications Links from unstructured to structured knowledge in the form of entities is not only a fundamental ingredient for further information extraction methods, but are immediately useful for end  ... 
doi:10.1145/2483574.2483582 dblp:conf/sigmod/Hoffart13 fatcat:bjqq5uusrragdm7ewraf63of4e

Entity Linking for Biomedical Literature

Jin Guang Zheng, Daniel Howsmon, Boliang Zhang, Juergen Hahn, Deborah McGuinness, James Hendler, Heng Ji
2014 Proceedings of the ACM 8th International Workshop on Data and Text Mining in Bioinformatics - DTMBIO '14  
The Entity Linking (EL) task links entity mentions from an unstructured document to entities in a knowledge base.  ...  The approach leverages the rich semantic information and structures in ontologies for similarity computation and entity ranking.  ...  Using this broad definition, Wikipedia is a popular knowledge base that is often used for entity linking because it contains structured information such as titles, hyperlinks, infoboxes as well as unstructured  ... 
doi:10.1145/2665970.2665974 dblp:conf/cikm/ZhengHZHMHJ14 fatcat:hrsnminpmfgunpwlz2eifjxxpy

Entity linking for biomedical literature

Jin G Zheng, Daniel Howsmon, Boliang Zhang, Juergen Hahn, Deborah McGuinness, James Hendler, Heng Ji
2015 BMC Medical Informatics and Decision Making  
The Entity Linking (EL) task links entity mentions from an unstructured document to entities in a knowledge base.  ...  The approach leverages the rich semantic information and structures in ontologies for similarity computation and entity ranking.  ...  Using this broad definition, Wikipedia is a popular knowledge base that is often used for entity linking because it contains structured information such as titles, hyperlinks, infoboxes as well as unstructured  ... 
doi:10.1186/1472-6947-15-s1-s4 pmid:26045232 pmcid:PMC4460707 fatcat:e5pwxes7tvgofewecbxiufvlzi

Leveraging web resources for keyword assignment to short text documents [article]

Ayush Singhal, Ravindra Kasturi, Ankit Sharma, Jaideep Srivastava
2017 arXiv   pre-print
We find that the proposed approach not just improves the accuracy of keyword assignment but offer a computationally efficient solution which can be used in real world applications.  ...  However, the approaches developed in the literature for full text documents cannot be used to assign keywords to low text content documents like twitter feeds, news clips, product reviews or even short  ...  The keywords in T are clustered using a pairwise semantic distance measure. Between any two keywords in T, the semantic distance is computed using the unstructured web in the following way.  ... 
arXiv:1706.05985v1 fatcat:mnfxumshgbgcxkg6nwwhjo4gjy

Resolving polysemy and pseudonymity in entity linking with comprehensive name and context modeling

Zhao-Yan Ming, Tat Seng Chua
2015 Information Sciences  
Names are important atomic information carriers in unstructured text.  ...  Specially, we harness entity coreferences within query and KB documents together with the external alias resources for modeling name variants, and further use the name variants to identify focused context  ...  Toward populating structured knowledge base, Text Analysis Conference 2009 introduced the entity linking task [47] that takes an entity mention and the document it appears in as the query and the Wikipedia  ... 
doi:10.1016/j.ins.2015.02.025 fatcat:cs2auka3afggflhcxawgjsxcla

Natural language processing for music knowledge discovery

Sergio Oramas, Luis Espinosa-Anke, Francisco Gómez, Xavier Serra
2018 Journal of New Music Research  
, information extraction, knowledge graph generation and sentiment analysis.  ...  Each of these approaches is presented alongside different use cases (i.e., flamenco, Renaissance and popular music) where large collections of documents are processed, and conclusions stemming from data-driven  ...  The taxonomy of categories can be explored by querying DBpedia, a knowledge base with structured content extracted from Wikipedia.  ... 
doi:10.1080/09298215.2018.1488878 fatcat:z3byyoyzjrh6rboj3l43lw26p4

Natural Language Query Processing Framework for Biomedical Literature

Carmen De Maio, Giuseppe Fenza, Vincenzo Loia, Mimmo Parente
2015 Proceedings of the 2015 Conference of the International Fuzzy Systems Association and the European Society for Fuzzy Logic and Technology   unpublished
Nevertheless, these information are often enclosed in unstructured documents stressing the need to define suitable framework to support execution of analytics services and richer information discovery  ...  This work introduces a general framework to support natural language user's query over facet-based data model.  ...  So, it is necessary to build a useful biomedical knowledge base from unstructured and heterogeneous content in order to support natural language query processing and richer information discovery tasks  ... 
doi:10.2991/ifsa-eusflat-15.2015.232 fatcat:43qoa3vvcffspmyll457fddka4