Dedicated to Zvi Galil in honor of his 50th birthday. We describe fast parallel algorithms for building index data structures that can be used to gather various statistics on square matrices. ... The main data structure is the Lsuffix tree, which is a generalization of the classical suffix tree for strings. ... We give some efficient algorithms for the construction and query of index data structures on the text, the main one being the Lsuffix tree of a square matrix. ...doi:10.1006/jcom.1998.0496 fatcat:li3m4yo2s5hcxnm5nmaafy35iq
Performance of the data warehouse depends on physical design. ... If the square root of the minimum Euclidean distance for a pattern is less than or equal to the threshold, then the pattern belongs to the cluster, and its number will be the index of the key value defined by this pattern ... The DB2 physical design advisor uses horizontal partitioning for distributing data on non-shared parallel machines and multidimensional clustering for data cube construction [3, 7]. ...doi:10.5120/77-172 fatcat:2ccwbibw4vhcvfxbxlaqrthzvi
Lecture Notes in Computer Science
Therefore, we propose a scalable and efficient data structure that allows StarQL implementations to handle large sets of strings and utilize large computing infrastructures. ... StarQL is based on a native string data model that allows StarQL to support a large variety of string operations and provide semantic-based query optimization. ... This paper proposed StarQL, a declarative query language for strings; and StarIN, a scalable and efficient data structure. ...doi:10.1007/978-3-319-69179-4_1 fatcat:7mwy47qmhzbkbd3m2mgv2qkgmm
We provide a uniform framework for the study of index data structures for a two-dimensional matrix TEXT[1 : n, 1 : n] whose entries are drawn from an ordered alphabet Σ. ... (in "Proceedings, Fourth Symposium on Theory of Computing," pp. 125-136) for strings and matrices. © 1996 Academic Press ... However, we need novel algorithmic techniques that are improvements and generalizations of the ones that have been devised for the construction of suffix tree data structures for strings and square matrices ...doi:10.1006/inco.1996.0087 fatcat:3hpis6csqfedxnwrwc4ndweyde
Distance matrices are capable of representing specific protein structural topologies, and similar proteins will generate similar matrices. ... Indexing protein structures has been shown to provide a scalable solution for structure-to-structure comparisons in large protein structure retrieval systems. ... FEATURE EXTRACTION By mapping 3D protein structures into 2D distance matrices, we can analyze the 2D matrices and do further structure comparison based on the patterns in the matrices. ...doi:10.1142/s0218194005002439 fatcat:fzxs76le7fb4bprwxtdtrgfyta
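The mapping from 3D coordinates to a 2D distance matrix mentioned in the entry above can be illustrated with a minimal sketch. It assumes each residue is reduced to a single (x, y, z) point; the function name `distance_matrix` and the toy coordinates are hypothetical and are not taken from the cited paper.

```python
import math


def distance_matrix(coords):
    """Return the symmetric matrix of pairwise Euclidean distances.

    `coords` is a list of (x, y, z) tuples, one per residue (an assumption
    made for this sketch; the cited paper may use a different representation).
    """
    n = len(coords)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])  # Euclidean distance
            matrix[i][j] = d
            matrix[j][i] = d  # the matrix is symmetric
    return matrix


# Hypothetical toy coordinates, not real protein data.
toy_coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
toy_matrix = distance_matrix(toy_coords)
```

Because the matrix depends only on pairwise distances, it is unchanged by rotating or translating the structure, which is one reason such matrices are convenient for comparing structures.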
and utilizing the parallel processing nature of optics. ... In this paper a parallel optoelectronic computer architecture is proposed for large-scale parallel corpus, full text search and text mining applications while achieving high speed and high performance ... Most of the previous work on parallel texts has been conducted on a few manually constructed parallel corpora such as the work published by Canadian Hansard Corpus and Linguistic Data Consortium (LDC) ...doi:10.7763/ijcee.2014.v6.787 fatcat:3n3tdotvsrd5tgofhvif7k7hfi
Proceedings of the SIGCHI conference on Human factors in computing systems - CHI '88
The latent semantic indexing approach tries to overcome these problems by automatically organizing text objects into a semantic structure more appropriate for matching user requests. ... Terms and objects are represented by 50 to 150 dimensional vectors and matched against user queries in this "semantic" space. ... The 13% average improvement over raw term matching shows that LSI captured some structure in the data which was missed by raw term matching. ...doi:10.1145/57167.57214 fatcat:kg5j5upx4bhztdxgzec25kwpya
their enumeration.” 2000b:68047 68P05 6swi0 Giancarlo, Raffaele (I-PLRM; Palermo) ; Grossi, Roberto (I-FRNZ-I; Florence) Parallel construction and query of index data structures for pattern matching on ... This paper presents fast parallel algorithms for building data structures on square matrices (images). Using an L-suffix tree, the building algorithm requires O(log n) time with n^2 ...
The notion of indexing on substrings (or q-grams) has been explored earlier without sufficient consideration of efficiency. q-grams are used to prune away rows that do not qualify for the query. ... , ii) performance evaluation of the application of the novel method to real data, and iii) parallelization of the algorithm, scaling considerations and a proposal to handle scaling issues. ...  provides an excellent survey of various data structures/algorithms developed for pattern queries. ...doi:10.1145/1031171.1031212 dblp:conf/cikm/HoreHIM04 fatcat:krepf7n2jbb4xb4iq35ewdivn4
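The q-gram pruning idea in the entry above can be sketched as follows. This is an illustrative sketch only, not the cited paper's method: it assumes a plain in-memory inverted index, and the names `qgrams`, `build_index`, and `substring_query` are hypothetical.

```python
from collections import defaultdict


def qgrams(text, q):
    """Return the set of all length-q substrings of `text`."""
    return {text[i:i + q] for i in range(len(text) - q + 1)}


def build_index(rows, q):
    """Map each q-gram to the set of row ids whose value contains it."""
    index = defaultdict(set)
    for row_id, value in enumerate(rows):
        for gram in qgrams(value, q):
            index[gram].add(row_id)
    return index


def substring_query(rows, index, pattern, q):
    """Return ids of rows containing `pattern`, pruning with the q-gram index."""
    grams = qgrams(pattern, q)
    if not grams:
        # Pattern shorter than q: no pruning is possible, so scan every row.
        candidates = set(range(len(rows)))
    else:
        # A matching row must contain every q-gram of the pattern.
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Pruning can leave false positives, so verify each surviving candidate.
    return sorted(r for r in candidates if pattern in rows[r])


# Hypothetical example rows.
rows = ["parallel index", "suffix tree", "pattern matching", "suffix array"]
index = build_index(rows, 3)
```

The index only discards rows that cannot match; the final `pattern in rows[r]` check is still needed because a row can contain all of the pattern's q-grams without containing the pattern itself.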
Summary: “We propose multi-dimensional index data structures that generalize suffix arrays to square matrices and cubic matrices. ... Giancarlo proposed a two-dimensional index data structure, the Lsuffix tree, that generalizes suffix trees to square matrices. ...
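For orientation, the one-dimensional structure that these entries generalize, the suffix array of a string, can be sketched as below. This is a deliberately naive construction for illustration, far slower than the algorithms in the literature, and it does not implement the matrix generalization described above; the function names are hypothetical.

```python
def build_suffix_array(text):
    """Return the starting positions of all suffixes in lexicographic order.

    Naive construction that sorts whole suffixes; fine for small inputs only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """Return all start positions of `pattern`, using binary search."""
    m = len(pattern)
    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(suffix_array[start:lo])


sa = build_suffix_array("banana")
```

All suffixes that begin with the pattern occupy one contiguous range of the sorted array, which is why two binary searches are enough to locate every occurrence.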
Our numerical experiments on both small and large datasets show the advantage of such an approach in terms of storage costs and query time compared with the least-squares based approach while maintaining ... This method is based on document clustering techniques and least-squares matrix approximation to approximate the matrix of vectors. ... The dense data structure of the concept decomposition matrix poses a huge challenge for both disk and memory spaces of conventional computers. ...doi:10.5120/17406-7991 fatcat:4jp4opwtzbg7tah3cwepkngufq
Text mining is a variant of a field called data mining that tries to discover interesting patterns in large databases. ... Due to the increasing size of text and audio data on the internet, various techniques are needed to help find and extract very specific information relevant to a user's task. ... Pattern matching is one of the important tasks in NLP. It matches patterns related to the user's query and returns the results to the user. ...doi:10.5120/ijca2016908120 fatcat:6vvcyrznm5gutfudfysvss5gvq
In addition, we propose using frequent trajectory patterns (mined from historical trajectories) to scale down the candidates of concatenation and a suffix-tree-based index to manage the trajectories received ... In this paper, we propose a citywide and real-time model for estimating the travel time of any path (represented as a sequence of connected road segments) in real time in a city, based on the GPS trajectories ... Besides , we also construct another two matrices and to help the decomposition of . ...doi:10.1145/2623330.2623656 dblp:conf/kdd/WangZX14 fatcat:mlfotn5e3bhtdpa2ivveorkryi
Existing query engines for RDF graphs follow one of two design paradigms: relational or graph-based. ... Our experiments on large-scale real and synthetic datasets show that MAGiQ performs comparably to or better than existing specialized SPARQL query engines for data-intensive queries, scales to very large ... We also thank the authors of Wukong  for their responsiveness and help with running and understanding their system. ...doi:10.1145/3302424.3303962 dblp:conf/eurosys/JamourACK19 fatcat:yo3hnbrmczd57fiormkncs4jpe
Even distribution of irregular workload to processing units is crucial for efficient parallelization in many applications. ... We experimentally show the proposed algorithms are efficient and effective on more than six hundred test matrices. ... Mücahid Benlioglu for his valuable comments and feedback on the initial draft of this manuscript and code-base. This work was partially supported by the NSF grant CCF-1919021. ...arXiv:2009.07735v1 fatcat:iglykt6ssfdlxeer24fktnvigi