Boosting textual compression in optimal linear time
2005
Journal of the ACM
of time; and (c) it admits a decompression algorithm again optimal in time. ...
We provide a general boosting technique for Textual Data Compression. ...
The authors are deeply indebted to the referees, and in particular one of them, for a very careful reading of the article that led to very useful, punctual and constructive comments. ...
doi:10.1145/1082036.1082043
fatcat:ljhmm5ehanc4lnaoasuaykbasu
What, where, and when
2014
Proceedings of the 8th Workshop on Geographic Information Retrieval - GIR '14
It exploits the structure of time-stamped data to dramatically shrink the temporal search space and uses a shallow tree based on the spatial distribution of tweets to allow speedy search over the spatial ...
With the adoption of timestamps and geotags on Web data, search engines are increasingly being asked questions of "where" and "when" in addition to the classic "what." ...
Acknowledgements This work was supported in part by the NSF (under grants 0966187 and 0904246) and by GAANN Grant P200A090157 from the US Department of Education. ...
doi:10.1145/2675354.2675358
dblp:conf/gir/NepomnyachiyGJM14
fatcat:e7mrj576uvdennsmxajmjq6wh4
On Optimally Partitioning a Text to Improve Its Compression
[chapter]
2009
Lecture Notes in Computer Science
In this paper we investigate the problem of partitioning an input string T in such a way that compressing individually its parts via a base-compressor C gets a compressed output that is shorter than applying ...
ACM 50(6):825-851, 2003) in the context of table compression, and then further elaborated and extended to strings and trees by Ferragina et al. (J. ...
showing that our algorithmic solution to the text partitioning problem could be used as a tool for approximating efficiently the interesting class of Dynamic-Programming Recurrences we have dealt with in ...
doi:10.1007/978-3-642-04128-0_38
fatcat:nsl7gu5y6fh4boojrtiypnnddu
On Optimally Partitioning a Text to Improve Its Compression
2010
Algorithmica
In this paper we investigate the problem of partitioning an input string T in such a way that compressing individually its parts via a base-compressor C gets a compressed output that is shorter than applying ...
ACM 50(6):825-851, 2003) in the context of table compression, and then further elaborated and extended to strings and trees by Ferragina et al. (J. ...
showing that our algorithmic solution to the text partitioning problem could be used as a tool for approximating efficiently the interesting class of Dynamic-Programming Recurrences we have dealt with in ...
doi:10.1007/s00453-010-9437-6
fatcat:4rmr2kxc4zct3b2trrl37pstpa
Optimally Partitioning a Text to Improve Its Compression
[chapter]
2013
Atlantis Studies in Computing
In this paper we investigate the problem of partitioning an input string T in such a way that compressing individually its parts via a base-compressor C gets a compressed output that is shorter than applying ...
ACM 50(6):825-851, 2003) in the context of table compression, and then further elaborated and extended to strings and trees by Ferragina et al. (J. ...
showing that our algorithmic solution to the text partitioning problem could be used as a tool for approximating efficiently the interesting class of Dynamic-Programming Recurrences we have dealt with in ...
doi:10.2991/978-94-6239-033-1_3
fatcat:oobkz6d7rfapjlh64t4kgsglyq
From first principles to the Burrows and Wheeler transform and beyond, via combinatorial optimization
2007
Theoretical Computer Science
Sciortino, Boosting textual compression in optimal linear time, Journal of the ACM 52 (2005) 688-713]. Therefore, they are all highly compressible. ...
We also show that the class of optimal word permutations defined here is identical to the one identified by Ferragina et al. for compression boosting [P. Ferragina, R. Giancarlo, G. Manzini, M. ...
In fact, they defined a class of word permutations well suited for compression boosting, i.e., bwt is not the only word permutation that is useful for boosting. ...
doi:10.1016/j.tcs.2007.07.019
fatcat:vcpnsui7fzf4rdtfqp2rieyz7i
Compressing Sentence Representation for Semantic Retrieval via Homomorphic Projective Distillation
[article]
2022
arXiv
pre-print
In this paper, we propose Homomorphic Projective Distillation (HPD) to learn compressed sentence embeddings. ...
We evaluate our method with different model sizes on both semantic textual similarity (STS) and semantic retrieval (SR) tasks. ...
Conclusion and Discussion In this paper, we propose an effective method to compress sentence representation using homomorphic projective distillation. ...
arXiv:2203.07687v1
fatcat:f6pxlypyjjgshctgvh4skjm26q
WIDIT in TREC-2003 Web Track
2003
Text Retrieval Conference
in real time. ...
Reranking Module In order to optimize retrieval performance in top ranks, fusion results were reranked based on combinations of site compression technique and content-link evidence ranking heuristic. ...
dblp:conf/trec/YangA03
fatcat:qbevhxaup5bqhipsoj3emw2t5y
A Survey on CDPCF: Concise Discriminative Patterns Based Classification Framework
2018
IJARCCE
Simple models such as generalized linear models have ordinary performance but strong interpretability on a set of simple features. ...
There are different series of models, including tree-based models, which organize numerical, categorical and high dimensional features into a comprehensive structure with rich interpretable information in the data ...
Their two-step approach [3], which combines random forest and a stepwise selection, provides a realistic approach for selecting an optimal set of features within a reasonable computational time. ...
doi:10.17148/ijarcce.2018.7104
fatcat:kzfjryb6cveilg7npqzq4q5t64
Text vs. space
2011
Proceedings of the 20th ACM international conference on Information and knowledge management - CIKM '11
We feel that previous work has often focused on the spatial aspect at the expense of performance considerations in text processing, such as inverted index access, compression, and caching. ...
In this paper, we take a fresh look at this problem. ...
We executed queries on our own document-at-a-time (DAAT) query processor, optimized through block-wise compression and forward skips in the inverted lists. ...
doi:10.1145/2063576.2063641
dblp:conf/cikm/ChristoforakiHDMS11
fatcat:f5kaghcwnzcrlgxkcvjlp344zy
A Comparative Assessment of Data Mining Algorithms to Predict Fraudulent Firms
2020
2020 10th International Conference on Cloud Computing, Data Science & Engineering (Confluence)
The process of data mining is helpful in discovering meaningful patterns in historical or unstructured data in order to make better business decisions. ...
We have implemented Decision Trees, Linear Support Vector Machines, RBF Kernel Support Vector Machines, K-Nearest Neighbor, Artificial Neural Network and logistic regression classification models. ...
PCA has many benefits, from compression to reduced computational complexity and noise reduction. ...
doi:10.1109/confluence47617.2020.9057968
fatcat:yxruvis7qzfqjgww6zz2gmxuae
Discovering gis sources on the web using summaries
2008
Proceedings of the 8th ACM/IEEE-CS joint conference on Digital libraries - JCDL '08
Existing techniques simply rely on textual metadata accompanying such datasets to compute relevance to user-queries. ...
Such approaches result in poor search results, often missing the most relevant sources on the web. ...
In practice the MinSkew algorithm runs very fast and the time taken is almost linear in n and |B|. ...
doi:10.1145/1378889.1378907
dblp:conf/jcdl/HariharanHM08
fatcat:qtdn6d7jmbdvxe6ncjj35aecyq
ITI-CERTH participation to TRECVID 2009 HLFE and Search
2009
TREC Video Retrieval Evaluation
In a separate run, the use of compressed video information to form a Bag-of-Words model for shot representation is studied. ...
The search task is based on an interactive retrieval application combining retrieval functionalities in various modalities (i.e. textual, visual and concept search) with a user interface supporting interactive ...
In this run, the use of compressed video information for BoW model generation was examined for the first time. ...
dblp:conf/trecvid/MoumtzidouDKVAM09
fatcat:w2dzfw6csnccbbr5q474dfefru
Page 3210 of Mathematical Reviews Vol. , Issue 2004d
[page]
2004
Mathematical Reviews
for compression and analysis of very large remote sensing data sets (429-441); Juan K. ...
Schapire, The boosting approach to machine learning: an overview (149-171); Dragos D. Margineantu and Thomas G. ...
Neural Markovian Predictive Compression: An Algorithm for Online Lossless Data Compression
2010
2010 Data Compression Conference
The result is an interesting combination of properties: Linear processing time, constant memory storage performance and great adaptability to parallelism. ...
This work proposes a novel practical and general-purpose lossless compression algorithm named Neural Markovian Predictive Compression (NMPC), based on a novel combination of Bayesian Neural Networks (BNNs ...
LZW is well-suited to online compression, as it does not require the input stream to be divided into blocks. Using a fixed size dictionary, LZW can be implemented in linear time. ...
doi:10.1109/dcc.2010.26
dblp:conf/dcc/ShermerAS10
fatcat:f4ovjsi5xjf7bi62dkp6zmjtiq