A Text Similarity Meta-Search Engine Based on Document Fingerprints and Search Results Records

Felipe Bravo-Marquez, Gaston L'Huillier, Sebastián A. Rios, Juan D. Velásquez
2011 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology  
In this work, a similarity function between the given document and retrieved results is estimated.  ...  The retrieval of similar documents from the Web using documents as input instead of key-term queries is not currently supported by traditional Web search engines.  ...  DOCODE: Document Copy Detection; and the Web Intelligence Research Group  ...
doi:10.1109/wi-iat.2011.27 dblp:conf/webi/Bravo-MarquezLRV11 fatcat:4mnsgx5ukrfvhbs3du2ogk5dzy

DOCODE 3.0 (DOcument COpy DEtector): A system for plagiarism detection by applying an information fusion process from multiple documental data sources

Juan D. Velásquez, Yerko Covacevich, Francisco Molina, Edison Marrese-Taylor, Cristián Rodríguez, Felipe Bravo-Marquez
2016 Information Fusion  
These algorithms have been successfully tested in the scientific community in solving tasks like the identification of plagiarized passages and the retrieval of source candidates from the Web, among other  ...  In this article, we present DOCODE 3.0, a Web system for educational institutions that performs automatic analysis of large quantities of digital documents in relation to their degree of originality.  ...  (13IDL4-24315) entitled, DOCODE: DOcument COpy DEtector  ...
doi:10.1016/j.inffus.2015.05.006 fatcat:45itivxa7zernds7xq647bvstm

Distribution of "Characteristic" Terms in MEDLINE Literatures

Neil R. Smalheiser, Wei Zhou, Vetle I. Torvik
2011 Information  
Characteristic terms are utilized in several of our web-based services (Anne O'Tate and Arrowsmith), and should be useful for a variety of other information-processing tasks designed to improve text mining  ...  In this report, we studied how the cut-off criterion varied as a function of literature size and term frequency in MEDLINE as a whole, and have compared the distribution of characteristic terms within  ...  Acknowledgements This Human Brain Project/Neuroinformatics research (LM007292 and LM08364) is funded jointly by the National Library of Medicine and the National Institute of Mental Health.  ... 
doi:10.3390/info2020266 fatcat:xaedpgxlwzcajjsmbfwtydu25e

Geospatial Information Retrieval for POIs with the use of a Data Mining System

Alexander Czech
2015 unpublished
For this, the HTML documents are transformed into a vector in the vector space model.  ...  For 9 classes, 18 classification vectors are created and compared with cosine similarity to the HTML document vectors. The results are then associated and summarized on an address basis.  ...  The Vector Space Model can be used to compare document similarity and search queries.  ... 
doi:10.25365/thesis.40077 fatcat:fna4pss43fbsxdjdxctoryvfi4

The World Trade Web: A Multiple-Network Perspective [article]

Paolo Sgrignoli
2014 pre-print
With respect to IM, a general positive correlation with IT is highlighted and product categories for which this effect is stronger are identified and cross-checked with previous classifications.  ...  Then, using the Heckman selection model with a gravity equation, (non-linear) components arising from distance, position in the Global Supply Chain and presence of Regional Trade Agreements are studied  ...  All the other controls used in the regressions (e.g. contiguity, common language, etc.) have been retrieved from the CEPII dataset documented in Mayer and Zignago (2011).  ... 
doi:10.6092/imtlucca/e-theses/137 arXiv:1409.3799v1 fatcat:k5guwsswqndmvpafrtjfaqn7ha