6,168 Hits in 7.7 sec

Parallel Construction and Query of Index Data Structures for Pattern Matching on Square Matrices

Raffaele Giancarlo, Roberto Grossi
1999 Journal of Complexity  
Dedicated to Zvi Galil in honor of his 50th birthday. We describe fast parallel algorithms for building index data structures that can be used to gather various statistics on square matrices.  ...  The main data structure is the Lsuffix tree, which is a generalization of the classical suffix tree for strings.  ...  We give some efficient algorithms for the construction and query of index data structures on the text, the main one being the Lsuffix tree of a square matrix [23].  ... 
doi:10.1006/jcom.1998.0496 fatcat:li3m4yo2s5hcxnm5nmaafy35iq

Physical Data Warehouse Design Using Neural Network

Mayank Sharma, Navin Rajpal, B.V.R. Reddy
2010 International Journal of Computer Applications  
Performance of the data warehouse depends on physical design.  ...  If the square root of the minimum Euclidean distance for a pattern is less than or equal to the threshold, then the pattern belongs to the cluster, and its number will be the index of the key value defined by this pattern  ...  DB2 physical design advisor uses horizontal partitioning for distributing data on non-shared parallel machines and multidimensional clustering for data cube construction [3, 7].  ... 
doi:10.5120/77-172 fatcat:2ccwbibw4vhcvfxbxlaqrthzvi

Querying and Mining Strings Made Easy [chapter]

Majed Sahli, Essam Mansour, Panos Kalnis
2017 Lecture Notes in Computer Science  
Therefore, we propose a scalable and efficient data structure that allows StarQL implementations to handle large sets of strings and utilize large computing infrastructures.  ...  StarQL is based on a native string data model that allows StarQL to support a large variety of string operations and provide semantic-based query optimization.  ...  This paper proposed StarQL, a declarative query language for strings; and StarIN, a scalable and efficient data structure.  ... 
doi:10.1007/978-3-319-69179-4_1 fatcat:7mwy47qmhzbkbd3m2mgv2qkgmm

On the Construction of Classes of Suffix Trees for Square Matrices: Algorithms and Applications

Raffaele Giancarlo, Roberto Grossi
1996 Information and Computation  
We provide a uniform framework for the study of index data structures for a two-dimensional matrix TEXT[1 : n, 1 : n] whose entries are drawn from an ordered alphabet Σ.  ...  (in "Proceedings, Fourth Symposium on Theory of Computing," pp. 125-136) for strings and matrices. © 1996 Academic Press article no.  ...  However, we need novel algorithmic techniques that are improvements and generalizations of the ones that have been devised for the construction of suffix tree data structures for strings and square matrices  ... 
doi:10.1006/inco.1996.0087 fatcat:3hpis6csqfedxnwrwc4ndweyde

A Fast Protein Structure Retrieval System Using Image-Based Distance Matrices and Multidimensional Index

Pin-Hao Chi, Grant Scott, Chi-Ren Shyu
2005 International Journal of Software Engineering and Knowledge Engineering  
Distance matrices are capable of representing specific protein structural topologies, and similar proteins will generate similar matrices.  ...  Indexing protein structures has been shown to provide a scalable solution for structure-to-structure comparisons in large protein structure retrieval systems.  ...  Feature Extraction: By mapping 3D protein structures into 2D distance matrices, we can analyze the 2D matrices and do further structure comparison based on the patterns in the matrices.  ... 
doi:10.1142/s0218194005002439 fatcat:fzxs76le7fb4bprwxtdtrgfyta

A High Performance Optoelectronic Machine for Automated Arabic-English Parallel Corpus Creation and for Text Mining Processing

Samy S. A. Ghoniemy, Omar H. Karam
2014 International Journal of Computer and Electrical Engineering  
and utilizing the parallel processing nature of optics.  ...  In this paper a parallel optoelectronic computer architecture is proposed for large-scale parallel corpus, full text search and text mining applications while achieving high speed and high performance  ...  Most of the previous work on parallel texts has been conducted on a few manually constructed parallel corpora such as the work published by Canadian Hansard Corpus and Linguistic Data Consortium (LDC)  ... 
doi:10.7763/ijcee.2014.v6.787 fatcat:3n3tdotvsrd5tgofhvif7k7hfi

Using latent semantic analysis to improve access to textual information

S. T. Dumais, G. W. Furnas, T. K. Landauer, S. Deerwester, R. Harshman
1988 Proceedings of the SIGCHI conference on Human factors in computing systems - CHI '88  
The latent semantic indexing approach tries to overcome these problems by automatically organizing text objects into a semantic structure more appropriate for matching user requests.  ...  Terms and objects are represented by 50 to 150 dimensional vectors and matched against user queries in this "semantic" space.  ...  The 13% average improvement over raw term matching shows that LSI captured some structure in the data which was missed by raw term matching.  ... 
doi:10.1145/57167.57214 fatcat:kg5j5upx4bhztdxgzec25kwpya

Page 1425 of Mathematical Reviews Vol. , Issue 2000b [page]

2000 Mathematical Reviews  
their enumeration.” 2000b:68047 68P05 6swi0 Giancarlo, Raffaele (I-PLRM; Palermo); Grossi, Roberto (I-FRNZ-I; Florence) Parallel construction and query of index data structures for pattern matching on  ...  This paper presents fast parallel algorithms for building data structures on square matrices (images). Using an L-suffix tree, the building algorithm requires O(log n) time with n?  ... 

Indexing text data under space constraints

Bijit Hore, Hakan Hacigumus, Bala Iyer, Sharad Mehrotra
2004 Proceedings of the Thirteenth ACM conference on Information and knowledge management - CIKM '04  
The notion of indexing on substrings (or q-grams) has been explored earlier without sufficient consideration of efficiency. q-grams are used to prune away rows that do not qualify for the query.  ...  , ii) performance evaluation of the application of the novel method to real data, and iii) parallelization of the algorithm, scaling considerations and a proposal to handle scaling issues.  ...  [11] provides an excellent survey of various data structures/algorithms developed for pattern queries.  ... 
doi:10.1145/1031171.1031212 dblp:conf/cikm/HoreHIM04 fatcat:krepf7n2jbb4xb4iq35ewdivn4

Page 7892 of Mathematical Reviews Vol. , Issue 99k [page]

1999 Mathematical Reviews  
Summary: “We propose multi-dimensional index data structures that generalize suffix arrays to square matrices and cubic matrices.  ...  Giancarlo proposed a two-dimensional index data structure, the Lsuffix tree, that generalizes suffix trees to square matrices.  ... 

Multistep Sparse Approximation Technology in Information Retrieval

Chi Shen, Wen Li, Mike F. Unuakhalu
2014 International Journal of Computer Applications  
Our numerical experiments on both small and large datasets show the advantage of such an approach in terms of storage costs and query time compared with the least-squares based approach while maintaining  ...  This method is based on document clustering techniques and least-squares matrix approximation to approximate the matrix of vectors.  ...  The dense data structure of the concept decomposition matrix poses a huge challenge for both disk and memory spaces of conventional computers.  ... 
doi:10.5120/17406-7991 fatcat:4jp4opwtzbg7tah3cwepkngufq

Survey on Information Retrieval and Pattern Matching for Compressed Data Size using the SVD Technique on Real Audio Dataset

Poonam Dhumal, S. S.
2016 International Journal of Computer Applications  
Text mining is a variant of a field called data mining that tries to discover curious patterns in large databases.  ...  Due to the increasing size of text and audio data on the internet, various techniques are needed to help with finding and extracting very specific information relevant to a user's task.  ...  Pattern matching is one of the important tasks in NLP [1]. It matches the patterns related to the user query and returns the result to the user.  ... 
doi:10.5120/ijca2016908120 fatcat:6vvcyrznm5gutfudfysvss5gvq

Travel time estimation of a path using sparse trajectories

Yilun Wang, Yu Zheng, Yexiang Xue
2014 Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '14  
In addition, we propose using frequent trajectory patterns (mined from historical trajectories) to scale down the candidates of concatenation and a suffix-tree-based index to manage the trajectories received  ...  In this paper, we propose a citywide and real-time model for estimating the travel time of any path (represented as a sequence of connected road segments) in real time in a city, based on the GPS trajectories  ...  Besides, we also construct another two matrices and to help the decomposition of .  ... 
doi:10.1145/2623330.2623656 dblp:conf/kdd/WangZX14 fatcat:mlfotn5e3bhtdpa2ivveorkryi

Matrix Algebra Framework for Portable, Scalable and Efficient Query Engines for RDF Graphs

Fuad Jamour, Ibrahim Abdelaziz, Yuanzhao Chen, Panos Kalnis
2019 Proceedings of the Fourteenth EuroSys Conference 2019 - EuroSys '19  
Existing query engines for RDF graphs follow one of two design paradigms: relational or graph-based.  ...  Our experiments on large-scale real and synthetic datasets show that MAGiQ performs comparably to or better than existing specialized SPARQL query engines for data-intensive queries, scales to very large  ...  We also thank the authors of Wukong [49] for their responsiveness and help with running and understanding their system.  ... 
doi:10.1145/3302424.3303962 dblp:conf/eurosys/JamourACK19 fatcat:yo3hnbrmczd57fiormkncs4jpe

On Symmetric Rectilinear Matrix Partitioning [article]

Abdurrahman Yaşar, Muhammed Fatih Balin, Xiaojing An, Kaan Sancak, Ümit V. Çatalyürek
2020 arXiv   pre-print
Even distribution of irregular workload to processing units is crucial for efficient parallelization in many applications.  ...  We experimentally show the proposed algorithms are efficient and effective on more than six hundred test matrices.  ...  Mücahid Benlioglu for his valuable comments and feedback for the initial draft of this manuscript and code-base. This work was partially supported by the NSF grant CCF-1919021.  ... 
arXiv:2009.07735v1 fatcat:iglykt6ssfdlxeer24fktnvigi