Improved count suffix trees for natural language data

Guido Sautter, Cristina Abba, Klemens Böhm
2008 Proceedings of the 2008 international symposium on Database engineering & applications - IDEAS '08  
Existing pruning techniques are based on suffix frequency or tree depth.  ...  ., estimating the selectivity of query terms based on small summary statistics before query execution. Count Suffix Trees (CST) are commonly used to this end.  ...  routine based on morphological structure of words.  ... 
doi:10.1145/1451940.1451972 dblp:conf/ideas/SautterAB08 fatcat:qwfey72jkrd5peykukoznrud2i

Using Syntactic Dependency-Pairs Conflation to Improve Retrieval Performance in Spanish [chapter]

Jesús Vilares, Fco. Mario Barcala, Miguel A. Alonso
2000 Lecture Notes in Computer Science  
At sentence level, an approximate grammar is used to conflate syntactic and morphosyntactic variants of a given multi-word term into a common base form.  ...  This article presents two new approaches for term indexing which are particularly appropriate for languages with a rich lexis and morphology, such as Spanish, and need few resources to be applied.  ...  We start with a noun syntagma whose syntactic structure is shown in the left tree, with the head noun N1 modified by an adjectival syntagma. 2.  ... 
doi:10.1007/3-540-45715-1_40 fatcat:6glnqt5t6fb3hfxmechbwwx6di

A Theoretical Foundation for Syntactico-Semantic Pattern Recognition

Shrinivasan Patnaikuni, Dr Sachin Gengaje
2021 IEEE Access  
The paper formally puts forth an approach for syntactico-semantic pattern recognition.  ...  These algorithms essentially were dependent on the syntactic grammars defining the patterns.  ...  In the context of parse trees generated by the context free grammar works by [37] and [38] suggest an approach based on attention mechanism during the parse tree generation thereby paving a way to  ... 
doi:10.1109/access.2021.3115445 fatcat:oyifmss2qbfp3dv3ir5zmjfrgy

Texture-Cognition-Based 3D Building Model Generalization

Po Liu, Chengming Li, Fei Li
2017 ISPRS International Journal of Geo-Information  
In addition, a new cognition-based hierarchical algorithm is proposed for model-group clustering.  ...  In this paper, the texture features are first introduced into the generalization process, and a self-organizing mapping (SOM)-based algorithm is used for texture classification.  ...  Fei Li is mainly responsible for the design and technical guidance of SOM based texture classification algorithm. Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/ijgi6090260 fatcat:tms7kb6fkvfrzcvbu2pov7wogu

A Hierarchic Diff Algorithm For Collaborative Music Document Editing

Christopher Antila, Jeffrey Treviño, Gabriel Weaver
2017 Zenodo  
We describe an application of hierarchic diff to the collaborative editing of tree-based music representations, using Zhang and Shasha's tree edit distance algorithm as implemented within the XUDiff tool  ...  We consider common operations on the score tree—deleting, changing, and appending tree nodes—to derive a minimal edit sequence, known as an edit script, and we compare the performance of the widely used  ...  Acknowledgments Research conducted for the nCoda project has been supported by Colorado College's SEGway faculty support grant.  ... 
doi:10.5281/zenodo.924190 fatcat:dqdfvcgg5fhxnoo4x62ec2ksdq

Capturing divergence in dependency trees to improve syntactic projection

Ryan Georgi, Fei Xia, William D. Lewis
2014 Language Resources and Evaluation  
These patterns can then be used to improve structural projection algorithms, allowing for better performing NLP tools for resource-poor languages, in particular those that may not have large amounts of  ...  In this paper, we investigate the possibility of using small, parallel, annotated corpora to automatically detect divergent structural patterns between two languages.  ...  To measure how common C1-C3 is in a language pair, we design an algorithm that transforms a tree pair based on a word alignment.  ... 
doi:10.1007/s10579-014-9273-4 fatcat:x4abtxdevzhjze4cxhxunngmuy

A New Multi-Phase Algorithm for Stemming in Farsi Language Based on Morphology

Somayyeh Estahbanati, Reza Javidan, Mehdi Nikkhah
2011 International Journal of Computer Theory and Engineering  
This stemmer is based on removing the suffixes and prefixes, and a database is used for saving the exceptions to decrease error rate.  ...  In this paper a new algorithm for stemming in Farsi (Persian) language is presented.  ...  For example for the verb "می کنیم" the algorithm removes the term "می" as prefix.  ... 
doi:10.7763/ijcte.2011.v3.381 fatcat:vyptz4lifncdvb2rgh7s35g7x4

Conflation Methods in Stemming Algorithm

In the wake of building skill about the conflation algorithms domain, we rounded out system portrayal surveys for every last one of these algorithms.  ...  We gathered distributed papers in the conflation algorithms branch of knowledge as domain archives and the source code of conflation algorithms for system engineering examination.  ...  For instance, if a client needs to search for an archive "On the best way to cook" and presents a question on "cooking" he may not get all the significant outcomes.  ... 
doi:10.35940/ijitee.k1237.09811s19 fatcat:p46t7oautvbbbaem4gaxec3f64

Performance evaluation of SDIAGENT, a multi-agent system for distributed fuzzy geospatial data conflation

2006 Information Sciences  
In this paper, we evaluate SDIAGENT, our recently introduced multi-agent architecture for geospatial data integration and conflation, and compare its model performance with that of client/server and single-agent  ...  Experimental results for several realistic scenarios, under varying conditions, are presented for these three system architectures.  ...  For instance, if the feature is a point, a Point Conflation Agent (PCA), which contains a knowledge-based conflation algorithm for point features, will be generated.  ... 
doi:10.1016/j.ins.2005.07.009 fatcat:fbbxv6rqgzdc5pdmee7vl5xih4

Navigating the Semantic Horizon using Relative Neighborhood Graphs

Amaru Cuba Gyllensten, Magnus Sahlgren
2015 Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing  
The approach is based on relative neighborhood graphs, which uncover the topological structure of local neighborhoods in semantic space.  ...  This has the potential to overcome both the problem with selecting a proper k in k-NN search, and the problem that a ranked list of neighbors may conflate several different senses.  ...  Another graph-based approach is the HyperLex algorithm (Véronis, 2004), which constructs a graph connecting all pairs of terms that co-occur in the context of an ambiguous term.  ... 
doi:10.18653/v1/d15-1292 dblp:conf/emnlp/GyllenstenS15 fatcat:2vyoc4vxtbcp5nq3gziep6nhoa

An End-to-end Point of Interest (POI) Conflation Framework [article]

Raymond Low, Zeynep D. Tekler, Lynette Cheah
2021 arXiv   pre-print
Based on the evaluation conducted, the resulting unified dataset was found to be more comprehensive and complete than any of the five POI data sources alone.  ...  Furthermore, the proposed approach for identifying POI matches between different data sources outperformed all baseline approaches with a matching accuracy of 97.6% with an average run time below 3 minutes  ...  is another classification algorithm that aggregates the model output produced by relatively uncorrelated base learners (i.e., decision trees) to produce an ensemble model that is more powerful than any  ... 
arXiv:2109.06073v1 fatcat:rd654ccekvg4hctzwehyfbtfqy

COLE Experiments in the CLEF 2002 Spanish Monolingual Track [chapter]

Jesús Vilares, Miguel A. Alonso, Francisco J. Ribadas, Manuel Vilares
2003 Lecture Notes in Computer Science  
In this, our first participation in CLEF, we applied Natural Language Processing techniques for single word and multiword term conflation.  ...  on the employment of such families together with syntactic dependencies to deal with the syntactic content of the document.  ...  The best results we obtained were for the stemmer used by the open source search engine Muscat, based on Porter's algorithm [2].  ... 
doi:10.1007/978-3-540-45237-9_22 fatcat:rvesai4nwzfztbg2ulp6wawgga

A Tree Based Association Rule Approach For Xml Data With Semantic Integration

D. Sasikala, K. Premalatha
2015 Zenodo  
To improve the query answering, a Semantic Tree Based Association Rule (STAR) mining method is proposed.  ...  Semi-structured documents suffer due to their heterogeneity and dimensionality. XML structure and content mining represent convergence for research in semi-structured data and text mining.  ...  The TAR is constructed for frequent terms based on semantic appropriateness between the terms and the minimum support. 4.  ... 
doi:10.5281/zenodo.1337979 fatcat:iiscdioyzra5la4yazpq354lee

Verb phrase movement as a window into head movement

Nicholas Joseph LaCara
2016 Proceedings of the Linguistic Society of America  
This paper looks at cases where verb phrase fronting generates two copies of the verb (as in Portuguese or Hebrew), one in the fronted vP and one in an inflectional position, showing how a PF approach  ...  The interaction of verb movement with verb phrase fronting can shed light on how and when head movement occurs.  ...  For the purposes of this paper, I adopt the CONFLATION approach to head movement (Hale & Keyser 2002, Harley 2004). Under this view, morpho-phonological features are passed up the tree as structure  ... 
doi:10.3765/plsa.v1i0.3714 fatcat:gzkkfgxduvdk7lnvbgjtlxe3le

Survey of Machine Learning Techniques in Textual Document Classification

S.W. Mohod, Dr. C.A.Dhote
2014 IOSR Journal of Computer Engineering  
Many machine learning algorithms play an important role in training the system with predefined categories.  ...  The importance of the machine learning approach has been felt, because of which the study has been taken up for text document classification based on the statistical event models available.  ...  This method is an instance-based learning algorithm that categorizes objects based on closest feature space in the training set [10].  ... 
doi:10.9790/0661-16131721 fatcat:cbiobxubajhc7f66jabrgvbiwm