
Phrase Similarity through the Edit Distance [chapter]

Manuel Vilares, Francisco J. Ribadas, Jesús Vilares
2004 Lecture Notes in Computer Science  
The algorithm is based on a dynamic programming approach integrating both the edit distance between parse trees and single-term similarity.  ...  This work intends to capture the concept of similarity between phrases.  ...  Acknowledgments This research has been partially supported by the Spanish Government (projects TIC2000-0370-C02-01 and HF2002-0081, FPU grant AP2001-2545), the Autonomous Government of Galicia (projects  ... 
doi:10.1007/978-3-540-30075-5_30 fatcat:xu7fxgqtcvcfppk2pbg2wc6dke

Statistical Phrase Extraction and Indexing for Music Retrieval [chapter]

Atsuhiro Takasu, Teruhito Kanazawa, Jun Adachi
2003 IFIP Advances in Information and Communication Technology  
The proposed overall method accelerates query processing in two ways, by: (1) filtering songs by sub-sequences of notes, and (2) reducing the length of sequences in DP matching by using phrases instead  ...  We show experimentally that the proposed overall method reduces processing time to about 1/11 with 10% loss of retrieval accuracy compared with a retrieval method that does not use either the phrase information  ...  For the selected phrases, we measure the similarity based on the edit distance.  ... 
doi:10.1007/978-0-387-35660-0_34 fatcat:rysruesrtfe4pgrxbbrynlezzi

A case based approach to expressivity-aware tempo transformation

Maarten Grachten, Josep-Lluís Arcos, Ramon López de Mántaras
2006 Machine Learning  
The expressive resources for emphasizing the musical structure of the melody and the affective content differ depending on the performance tempo.  ...  Changing the tempo of a given melody is a problem that cannot be reduced to just applying a uniform transformation to all the notes of a musical piece.  ...  Acknowledgments This research has been partially supported by the Spanish Ministry of Science and Technology under the project TIC 2003-07776-C2-02 "CBR-ProMusic: Content-based Music Processing using CBR  ... 
doi:10.1007/s10994-006-9025-9 fatcat:xha6wi5z5ffz7h3yu5nfzkldvi

Automatically generating related queries in Japanese

Rosie Jones, Kevin Bartz, Pero Subasic, Benjamin Rey
2007 Language Resources and Evaluation  
The precision/recall curves show significant improvement with the new feature set and blocking rules, and are often better than the English counterpart.  ...  For Japanese, the opportunities for improving results are greater than for languages with a single character set, since documents may be written in multiple character sets, and a user may express the same  ...  Next to each pair is the edit distance measure designed to detect the similarity at hand.  ... 
doi:10.1007/s10579-007-9021-0 fatcat:y4bif35bzfh3xinylbq62nhe5i

A Novel Approach Based on Fault Tolerance and Recursive Segmentation to Query by Humming [chapter]

Xiaohong Yang, Qingcai Chen, Xiaolong Wang
2010 Lecture Notes in Computer Science  
Then improved edit distance, pitch deviation and overall bias are employed to measure the similarity between phrases and indexed entries.  ...  Query melody is segmented into phrases recursively with musical dictionary firstly.  ...  The authors would like to thank the anonymous reviewers for helpful suggestions.  ... 
doi:10.1007/978-3-642-13577-4_49 fatcat:gz54gwqwlzfizoiimwuaj56dwm

CharacTer: Translation Edit Rate on Character Level

Weiyue Wang, Jan-Thorsten Peter, Hendrik Rosendahl, Hermann Ney
2016 Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers  
This work proposes translation edit rate on character level (CharacTER), which calculates the character level edit distance while performing the shift edit on word level.  ...  In addition, we apply the hypothesis sentence length for normalizing the edit distance in CharacTER, which also provides significant improvements compared to using the reference sentence length.  ...  Acknowledgments This paper has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement no. 645452 (QT21).  ... 
doi:10.18653/v1/w16-2342 dblp:conf/wmt/WangPRN16 fatcat:oijkvht4nza47btjtt6qqryamq
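
The length normalization mentioned in the snippet above can be illustrated with a minimal sketch. This is not the CharacTER metric itself: the word-level shift edits are omitted, and the function names and the handling of an empty hypothesis are assumptions made for illustration only.

```python
def char_edit_distance(hypothesis: str, reference: str) -> int:
    """Plain character-level Levenshtein distance (no shift edits)."""
    previous = list(range(len(reference) + 1))
    for i, h_char in enumerate(hypothesis, start=1):
        current = [i]
        for j, r_char in enumerate(reference, start=1):
            current.append(min(
                previous[j] + 1,                        # delete h_char
                current[j - 1] + 1,                     # insert r_char
                previous[j - 1] + (h_char != r_char),   # substitute or match
            ))
        previous = current
    return previous[-1]


def hypothesis_normalized_rate(hypothesis: str, reference: str) -> float:
    """Edit distance divided by the hypothesis length, in characters.

    Assumption: an empty hypothesis scores 0.0 against an empty
    reference and 1.0 otherwise, to avoid dividing by zero.
    """
    if not hypothesis:
        return 0.0 if not reference else 1.0
    return char_edit_distance(hypothesis, reference) / len(hypothesis)
```

Dividing by the hypothesis length rather than the reference length is the only design choice taken from the snippet; everything else is a simplification.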

Wiki trust metrics based on phrasal analysis

Mark Kramer, Andy Gregorowicz, Bala Iyer
2008 Proceedings of the 4th International Symposium on Wikis - WikiSym '08  
Wiki users receive very little guidance on the trustworthiness of the information they find.  ...  It is difficult for them to determine how long the text in a page has existed, or who originally authored the text.  ...  One could use edit distance [10] [6], or semantic distance, if it could be measured.  ... 
doi:10.1145/1822258.1822291 dblp:conf/wikis/KramerGI08 fatcat:jlll3xxigvd25pgh4ior4zdw6m

Retrieving Lexical Semantics from Multilingual Corpora

Ahmad R. Shahid, Dimitar Kazakov
2010 POLIBITS Research Journal on Computer Science and Computer Engineering With Applications  
The paper also discusses how the success of this approach can be measured. The reported results are for English, German, French, and Greek using the Europarl parallel corpus.  ...  The approach can be extended to add relationships between these synsets that are akin to WordNet relationships of synonymy and hypernymy.  ...  Edit distance measures the minimum number of edit steps required to convert one string into another [10], [11], [12].  ... 
doi:10.17562/pb-41-4 fatcat:isrtk3bo4vgblfaxyka3nclrwu
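
The definition quoted in the snippet above (the minimum number of edit steps needed to convert one string into another) corresponds to the standard Levenshtein distance. A minimal sketch, assuming unit cost for each insertion, deletion, and substitution; the function name is illustrative and not taken from the paper:

```python
def edit_distance(source: str, target: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `source` into `target`."""
    # previous[j] holds the distance between the processed prefix of
    # `source` and the first j characters of `target`.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # turning i characters into "" takes i deletions
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_char != t_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]
```

Keeping only two rows of the dynamic programming table is enough when just the distance is needed, not the edit script.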

RedMed: Extending drug lexicons for social media applications [article]

Adam Lavertu, Russ B Altman
2019 bioRxiv   pre-print
Emerging drug abuse trends are identified through community surveillance programs, medical claims data, and other healthcare system data.  ...  In 2017, drug abuse caused over 73,000 deaths in the United States.  ...  Acknowledgements We'd like to thank the editorial staff and the reviewers for their excellent feedback during the publishing process.  ... 
doi:10.1101/663625 fatcat:rnkr5hquaveffjxfdhsrihkfiy


2004 Biocomputing 2005  
The sequences of contextual elements may be matched approximately by edit distance defined as the minimal cost incurred by the changes (including insertion, deletion and replacement) needed to transform  ...  Our approach augments the traditional concept of edit distance by elements of linguistic and biomedical knowledge, which together provide flexible selection of contextual features and their comparison.  ...  information of interest indirectly by the rules through alignment.  ... 
doi:10.1142/9789812702456_0019 fatcat:iys25yfi7re2vmvpodcu4r4cdm
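
The snippet above describes edit distance as the minimal cost of the insertions, deletions, and replacements needed to transform one sequence of elements into another. A minimal sketch of such a cost-weighted distance follows. The cost values and the `case_insensitive_cost` function are hypothetical stand-ins, not the linguistic or biomedical knowledge the authors use.

```python
from typing import Callable, Sequence


def weighted_edit_distance(
    a: Sequence[str],
    b: Sequence[str],
    replace_cost: Callable[[str, str], float],
    indel_cost: float = 1.0,
) -> float:
    """Minimal total cost of insertions, deletions, and replacements
    that transform sequence `a` into sequence `b`."""
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        table[i][0] = table[i - 1][0] + indel_cost  # delete all of a[:i]
    for j in range(1, cols):
        table[0][j] = table[0][j - 1] + indel_cost  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            table[i][j] = min(
                table[i - 1][j] + indel_cost,
                table[i][j - 1] + indel_cost,
                table[i - 1][j - 1] + replace_cost(a[i - 1], b[j - 1]),
            )
    return table[-1][-1]


def case_insensitive_cost(x: str, y: str) -> float:
    """Hypothetical cost: free match, cheap case-only change, else full cost."""
    if x == y:
        return 0.0
    if x.lower() == y.lower():
        return 0.5
    return 1.0
```

Passing the replacement cost as a function is one way to let domain knowledge decide how similar two elements are without changing the dynamic programming itself.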

Improving Assessment of Students through Semantic Space Construction

Roberto Pirrone, Giuseppe Russo, Vincenzo Cannella
2009 2009 International Conference on Complex, Intelligent and Software Intensive Systems  
Many tutoring systems offer only a limited set of assessment options like multiple-choice questions, fill-in-the-blanks tests or other types of predefined replies obtained through graphical widgets (radio-buttons  ...  We have reviewed the system design in the framework of a cognitive architecture with the aim to reach a double result: the reduction of the effort for the construction of the knowledge base and the improvement  ...  An optimal edit script between T1 and T2 has minimum cost and this cost is the tree edit distance. The tree edit distance problem is to compute the edit distance and the corresponding edit script.  ... 
doi:10.1109/cisis.2009.137 dblp:conf/cisis/PirroneRC09 fatcat:v6d3tyomejebxkgn4ocgve5ht4

Summarization by Analogy: An Example-based Approach for News Articles

Megumi Makino, Kazuhide Yamamoto
2008 International Joint Conference on Natural Language Processing  
Using example-based approach for the summarization task has the following three advantages: high modularity, absence of the necessity to score importance for each word, and high applicability to local  ...  Experimental results have proven that the summarization system attains approximately 60% accuracy by human judgment.  ...  (iii) Similarity with Enhanced Edit Distance We adopt the enhanced edit distance to link phrases including the same characters, because Japanese abbreviation tends to include the same characters as the  ... 
dblp:conf/ijcnlp/MakinoY08 fatcat:a6vigh4cqjehndohlafo4ksary

Identification of Synonyms Using Definition Similarities in Japanese Medical Device Adverse Event Terminology

Ayako Yagahara, Masahito Uesugi, Hideto Yokoi
2021 Applied Sciences  
Edit distances (Levenshtein and Jaro–Winkler distance) and distributed representations (Word2vec, fastText, and Doc2vec) were employed for calculating similarities.  ...  Such tools for edit distances and distributed representations have achieved good performance in previous studies.  ...  Similarity Calculations In the edit distance, the similarity index is the distance between two definition sentences without symbols using the python-Levenshtein module (version 0.12.0) [25].  ... 
doi:10.3390/app11083659 fatcat:3ymkjz5fevhxvnqvyrhk3irvam

A Semi-Automatic Data Cleaning & Coding Tool for Chinese Clinical Data Standardization [chapter]

Yani Chen, Qi Tian, Hailing Cai, Xudong Lu
2022 Studies in Health Technology and Informatics  
The process included the preprocessing, text similarity algorithm, and manual review.  ...  The complexity of data cleaning and coding for Chinese clinical data prompted the turn of low-effective manual coding into the computer-aided tool.  ...  Table 1: Examples of symbol processing. Edit distance: The edit distance refers to the minimum number of insertions, deletions, and substitutions required to transform one string into the other [16].  ... 
doi:10.3233/shti220041 pmid:35672980 fatcat:3wizafjrw5c6nbqqnjwpaz7nea


Caroline Suen, Sandy Huang, Chantat Eksombatchai, Rok Sosic, Jure Leskovec
2013 Proceedings of the 22nd international conference on World Wide Web - WWW '13  
We describe the News Information Flow Tracking, Yay! (NIFTY) system for large scale real-time tracking of "memes", short textual phrases that travel and mutate through the Web.  ...  The real-time information on news sites, blogs and social networking sites changes dynamically and spreads rapidly through the Web.  ...  D_edit(p_s, p_d) is the substring edit distance between the phrases, and T_peak(p_s, p_d) is the time difference between the first volume peaks for each of the two phrases.  ... 
doi:10.1145/2488388.2488496 dblp:conf/www/SuenHESL13 fatcat:mepoqet4dzaufhl4bvealmfqgi