Searching in compressed dictionaries

S.T. Klein, D. Shapira
Proceedings DCC 2002. Data Compression Conference  
Conclusion: We introduced two new methods to represent a POM file so that direct search could be done in these compressed dictionaries.  ...  We see that in the case of small files, which is the important application since dictionaries are usually kept in small chunks, the Fibonacci variant is much faster than decoding and searching or than  ...  The algorithm for searching in compressed POM dictionaries, based on decompressing each entry, is given in Figure 2.  ... 
doi:10.1109/dcc.2002.999952 dblp:conf/dcc/KleinS02 fatcat:tcu43kwudjfilil2ufxnksvxv4

Layered Lossless Compression Method of Massive Fault Recording Data

Jinhong Di, Pengkun Yang, Chunyan Wang, Lichao Yan
2022 North atlantic university union: International Journal of Circuits, Systems and Signal Processing  
The parallel search method is to divide the dictionary into several small dictionaries with different bit widths to realize the parallel search of the dictionary.  ...  In order to overcome the problems of large error and low precision in traditional power fault record data compression, a new layered lossless compression method for massive fault record data is proposed  ...  The conventional sequential search method needs to search the whole dictionary in each compression process, resulting in a long search delay and a long compression delay of LZW algorithm, which can not  ... 
doi:10.46300/9106.2022.16.3 fatcat:pq7ikgqcs5hgbgrhlrv727op3m

A new compression algorithm for fast text search

Aydın CARUS, Altan MESUT
2016 Turkish Journal of Electrical Engineering and Computer Sciences  
Our experimental results show that SoCAFTS is a good solution when it is necessary to search for long patterns in a compressed document.  ...  Although the search speed of ETDC is very good in short patterns, it can only search for exact words and its compression performance differs from one natural language to another because of its word-based  ...  Note that in the Compressed Form 1 of Figure 4, if the trigram 'mpr' could not be found in the sub-dictionary of 'co', the compression procedure would try to search for 'pre' in the sub-dictionary of  ... 
doi:10.3906/elk-1407-178 fatcat:f2koqulplja4zgp5tbcu43dca4

Order-Preserving Key Compression for In-Memory Search Trees [article]

Huanchen Zhang, Xiaoxuan Liu, David G. Andersen, Michael Kaminsky, Kimberly Keeton, Andrew Pavlo
2020 arXiv   pre-print
We present the High-speed Order-Preserving Encoder (HOPE) for in-memory search trees. HOPE is a fast dictionary-based compressor that encodes arbitrary keys while preserving their order.  ...  We first develop a theoretical model to reason about order-preserving dictionary designs. We then select six representative compression schemes using this model and implement them in HOPE.  ...  One could apply existing field/table-wise compression schemes to search tree keys. Whole-key dictionary compression is the most popular scheme used in DBMSs today.  ... 
arXiv:2003.02391v1 fatcat:4k5o6pbznfdezdvcwbxsel5voa

Lossless Text Compression using Dictionaries

Umesh S. Bhadade, A.I. Trivedi
2011 International Journal of Computer Applications  
The algorithm suggested here uses the dynamic dictionary created at run-time and is also suitable for searching the phrases from the compressed file.  ...  Compression is used just about everywhere. Reduction of both compression ratio and retrieval of data from large collection is important in today's era.  ...  Search 4-Char pair in the dictionary, if found construct code value and store it in compressed file, else search 3-Char pair in the dictionary, if found construct code value and store it in compressed file  ... 
doi:10.5120/1799-1767 fatcat:e3ci7oe3arcg7opbsrfgchokwe

Design and Implementation of LZW Data Compression Algorithm

Simrandeep Kaur
2012 International Journal of Information Sciences and Techniques  
In this paper, LZW data compression algorithm is implemented by finite state machine, thus the text data can be effectively compressed.  ...  LZW is dictionary based algorithm, which is lossless in nature and incorporated as the standard of the consultative committee on International telegraphy and telephony, which is implemented in this paper  ...  ch and again search data in dictionary; if ( it is not present in dictionary ) then add that string to dictionary; end if; Compression example: consider a string "BAABAABB" is given to LZW algorithm.  ... 
doi:10.5121/ijist.2012.2407 fatcat:435xl2vdjja5dcctyhxuiay65u
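
The dictionary-building loop quoted in the snippet above can be illustrated with a minimal Python sketch of textbook LZW. This is not the paper's finite state machine implementation. It assumes the dictionary starts with the 256 single-character codes 0 to 255 and that new phrases receive codes from 256 upward, so input is limited to those characters. The function names `lzw_compress` and `lzw_decompress` are hypothetical.

```python
def lzw_compress(text):
    """Textbook LZW: emit a code for the longest phrase already in the dictionary."""
    # Assumption: initial dictionary holds all single characters with codes 0..255.
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            # Phrase is known: keep extending it.
            current = candidate
        else:
            # Phrase is new: emit the known prefix, then add the phrase.
            codes.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the same dictionary on the fly while decoding."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # Code refers to the phrase that is being defined right now.
            entry = previous + previous[0]
        else:
            raise ValueError("invalid LZW code: %d" % code)
        out.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(out)


codes = lzw_compress("BAABAABB")
print(codes)                  # [66, 65, 65, 256, 258, 66]
print(lzw_decompress(codes))  # BAABAABB
```

Under these assumptions the string "BAABAABB" from the snippet compresses to six codes, with 256 standing for "BA" and 258 for "AB". The paper's own code assignment may differ.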

DATA COMPRESSION USING EFFICIENT DICTIONARY SELECTION METHOD

SRITULASI ADIGOPULA, P. BALANAGU, N. SURESH BABU
2015 International journal of computer and communication technology  
With the increase in silicon densities, it is becoming feasible for compression systems to be implemented in chip.  ...  The objective of the project is to design a lossless data compression system which operates in high-speed to achieve high compression rate.  ...  ACKNOWLEDGEMENTS The authors would like to thank the anonymous reviewers for their comments which were very helpful in improving the quality and presentation of this paper.  ... 
doi:10.47893/ijcct.2015.1287 fatcat:md4rez3lenhfrcy6bsky2jpuym

CLP: Efficient and Scalable Search on Compressed Text Logs

Kirk Rodrigues, Yu Luo, Ding Yuan
2021 USENIX Symposium on Operating Systems Design and Implementation  
A search query will be processed by first searching in the dictionary, and then searching those encoded messages for which the dictionary search suggests possible matches.  ...  In our evaluation, dictionary search time is negligible compared to a segment scan; furthermore, log type dictionary search time is negligible compared with variable dictionary search.  ... 
dblp:conf/osdi/RodriguesLY21 fatcat:ld2dfwyktbaz5braxufz3wje6y

Compressed permuterm index

Paolo Ferragina, Rossano Venturini
2007 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '07  
In this paper we propose the Compressed Permuterm Index which solves the Tolerant Retrieval problem in optimal query time, i.e. time proportional to the length of the searched pattern, and space close  ...  Experiments show that our index supports fast queries within a space occupancy that is close to the one achievable by compressing the string dictionary via gzip, bzip2 or ppmdi.  ...  Here, we are interested in the compressed indexing of the string dictionary D, which introduces more challenges.  ... 
doi:10.1145/1277741.1277833 dblp:conf/sigir/FerraginaV07 fatcat:kvjkt2lmhreurba5befqbowtji

Compressed String Dictionaries [article]

Nieves R. Brisaboa and Rodrigo Cánovas and Miguel A. Martínez-Prieto and Gonzalo Navarro
2011 arXiv   pre-print
The problem of storing a set of strings --- a string dictionary --- in compact form appears naturally in many cases.  ...  We show that space reductions of up to 20% of the original size of the strings is possible while supporting fast dictionary searches.  ...  In this paper we study Front-Coding and other solutions we propose for compressing large string dictionaries, so that two basic operations are supported: (1) given a string, give its position in the dictionary  ... 
arXiv:1101.5506v1 fatcat:s2n6i5pk3bhnjjgfuldomdmmpm

A new word-based compression model allowing compressed pattern matching

Halil Nusret BULUŞ, Aydın CARUS, Altan MESUT
2017 Turkish Journal of Electrical Engineering and Computer Sciences  
In the first phase a dictionary is constructed by adding a phrase, paying attention to word boundaries, and in the second phase compression is done by using codewords of phrases in this dictionary.  ...  In addition, the proposed method makes it possible to also search for the group of consecutively compressed words.  ...  Now the "abr" string is searched in the dictionary.  ... 
doi:10.3906/elk-1601-92 fatcat:q4p4rrbkzfadfhysapwmbgevkm

Compressed String Dictionaries [chapter]

Nieves R. Brisaboa, Rodrigo Cánovas, Francisco Claude, Miguel A. Martínez-Prieto, Gonzalo Navarro
2011 Lecture Notes in Computer Science  
The problem of storing a set of strings - a string dictionary - in compact form appears naturally in many cases.  ...  Thus efficient approaches to compress them are necessary. In this paper we empirically compare time and space performance of some existing alternatives, as well as new ones we propose.  ...  Final Remarks: Prefix search, that is, finding the dictionary strings that start with a given pattern, is easily supported by the methods we have explored, except hashing.  ... 
doi:10.1007/978-3-642-20662-7_12 fatcat:kzrzp7m62rgvhc6jkzld5y2o74

The compressed permuterm index

Paolo Ferragina, Rossano Venturini
2010 ACM Transactions on Algorithms  
In this article we propose the Compressed Permuterm Index which solves the Tolerant Retrieval problem in time proportional to the length of the searched pattern, and space close to the kth order empirical  ...  by compressing the string dictionary via gzip or bzip2.  ...  The authors would like to thank the anonymous referees and Gonzalo Navarro for their valuable technical comments and their help in improving the presentation of the article.  ... 
doi:10.1145/1868237.1868248 fatcat:ghfsoazcw5bzdlhmxc5p6wmt6y

Compressed Matching in Dictionaries

Shmuel T. Klein, Dana Shapira
2011 Algorithms  
We suggest to extend the problem to the search of patterns in the compressed form of structured files.  ...  The problem of compressed pattern matching, which has recently been treated in many papers dealing with free text, is extended to structured files, specifically to dictionaries, which appear in any full-text  ...  The algorithm for searching for a pattern P in a dictionary compressed by POM, based on decompressing each entry, is given in Figure 2.  ... 
doi:10.3390/a4010061 fatcat:cbzwe7dwrng3xpurrr4edgma7i

Development of word-based text compression algorithm for Indonesian language document

Ardiles Sinaga, Adiwijaya, Hertog Nugroho
2015 2015 3rd International Conference on Information and Communication Technology (ICoICT)  
Symbols, numbers and affixes will be indexed in the basic dictionary. The basic word will also be checked whether it exists in the basic dictionary or not.  ...  If there is not a match, then the word will be stored in the supplement dictionary.  ...  The proposed method creates a supplement dictionary in a trie [12] that is called prefix tree to speed up the word searching in the dictionary.  ... 
doi:10.1109/icoict.2015.7231466 fatcat:aksdpfdhafarjbgeh4wegg3fvy