Compressed web indexes

Flavio Chierichetti, Ravi Kumar, Prabhakar Raghavan
2009 Proceedings of the 18th international conference on World wide web - WWW '09  
The problem of compressed indexes that permit such fast retrieval has a long history.  ...  Web search engines use indexes to efficiently retrieve pages containing specified query terms, as well as pages linking to specified pages.  ...  In the following tables, S denotes compressed index size and U denotes uncompressed index size; thus S/U is the compression ratio.  ... 
doi:10.1145/1526709.1526770 dblp:conf/www/ChierichettiKR09 fatcat:xai5wdopqbb23co5zdcxhzuulu

Compressing term positions in web indexes

Hao Yan, Shuai Ding, Torsten Suel
2009 Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval - SIGIR '09  
We focus on techniques for compressing term positions in web search engine indexes.  ...  We perform a detailed study of a number of existing and new techniques for compressing position data in web indexes.  ...  In this paper, we focus on techniques for compressing position information in web indexes.  ... 
doi:10.1145/1571941.1571969 dblp:conf/sigir/YanDS09 fatcat:2focpoagrjf3hm4qpnx5nvqkma

Web-based image indexing and retrieval in JPEG compressed domain

J. Jiang, A. Armstrong, G. C. Feng
2004 Multimedia Systems  
This technology will help control the explosion of media-rich content by offering users a powerful automated image indexing and retrieval tool for compressed images on the Web.  ...  , developed in the pixel domain, and the fact that an increasing number of images stored on the Web are already compressed by JPEG at the source.  ...  This type of retrieval technique involving data compression is not suitable for Web image retrieval simply because Web images are already in compressed format at the source and are not indexed.  ... 
doi:10.1007/s00530-003-0115-2 fatcat:r6ekttjzmfgbppx5bu2ymbj2ti

Using d-gap patterns for index compression

Jinlin Chen, Terry Cook
2007 Proceedings of the 16th international conference on World Wide Web - WWW '07  
In this paper the information of d-gap sequential patterns is used as a new dimension for improving inverted index compression.  ...  Sequential patterns of d-gaps exist pervasively in inverted lists of Web document collection indices due to the cluster property.  ...  Efficient indexing of Web pages is crucial for the performance of search engines.  ... 
doi:10.1145/1242572.1242769 dblp:conf/www/ChenC07a fatcat:5ep3t53vb5bbjh2xa4hdazkwpi

Assigning document identifiers to enhance compressibility of Web Search Engines indexes

Fabrizio Silvestri, Raffaele Perego, Salvatore Orlando
2004 Proceedings of the 2004 ACM symposium on Applied computing - SAC '04  
Granting efficient accesses to the index is a key issue for the performances of Web Search Engines (WSE).  ...  document identifiers) compressed using variable length encoding methods.  ...  The Indexing phase itself can be viewed as a Web Content Mining process.  ... 
doi:10.1145/967900.968024 dblp:conf/sac/SilvestriPO04 fatcat:xyemltmfynaobay5xlpgzyllwe

Compressed Indexes for String Searching in Labeled Graphs

Paolo Ferragina, Francesco Piccinno, Rossano Venturini
2015 Proceedings of the 24th International Conference on World Wide Web - WWW '15  
But, as far as we know, all these results are limited to design compressed graph indexes which support basic access operations onto the link structure of the input graph, such as: given a node u, return  ...  This paper takes inspiration from the Facebook Unicorn's platform and proposes some compressed-indexing schemes for large graphs whose nodes are labeled with strings of variable length, i.e., node's attributes  ...  We remark that there exist compression schemes specifically designed to achieve higher compression on graphs, especially Web graphs [6, 8].  ... 
doi:10.1145/2736277.2741140 dblp:conf/www/FerraginaPV15 fatcat:asrd4n5mkbevxd6w5576pybw3q

Inverted index compression via online document routing

Gal Lavee, Ronny Lempel, Edo Liberty, Oren Somekh
2011 Proceedings of the 20th international conference on World wide web - WWW '11  
Modern search engines are expected to make documents searchable shortly after they appear on the ever changing Web. To satisfy this requirement, the Web is frequently crawled.  ...  We show that there exists a tradeoff between the compression of a partitioned index and the distribution of documents from the same host across the index partitions (i.e., host distribution).  ...  However, this results in indexes which do not compress well.  ... 
doi:10.1145/1963405.1963475 dblp:conf/www/LaveeLLS11 fatcat:gte6hs3qqfdrbilhf7vsw7r5gi

A web search engine model based on index-query bit-level compression

Hussein Al-Bahadili, Saif Al-Saab, Reyadh Naoum, Shakir M. Hussain
2010 Proceedings of the 1st International Conference on Intelligent Semantic Web-Services and Applications - ISWSA '10  
In this paper, we propose a new web search engine model based on index-query bit-level compression.  ...  compressed index, and the second layer be located after the query parser for query compression to enable bit-level compressed index-query search.  ...  In this work we propose a new web search engine model that is based on index-query bit-level compression.  ... 
doi:10.1145/1874590.1874597 dblp:conf/iswsa/Al-BahadiliANH10 fatcat:ujtdl3kltnf6zlyfdmdd24yacq

Inverted index compression and query processing with optimized document ordering

Hao Yan, Shuai Ding, Torsten Suel
2009 Proceedings of the 18th international conference on World wide web - WWW '09  
Web search engines use highly optimized compression schemes to decrease inverted index size and improve query throughput, and many index compression techniques have been studied in the literature.  ...  It is known that this can significantly improve index compression compared to a random document ordering. We study index compression and query processing techniques for such reordered indexes.  ...  index size and query throughput on the TREC GOV2 data set of 25.2 million web pages.  ... 
doi:10.1145/1526709.1526764 dblp:conf/www/YanDS09 fatcat:gnkoqsxngnbt3ifyijodkkrmgy

A Mixed Coding Scheme for Inverted File Index Compression

Jinlin Chen, Ping Zhong, Terry Cook
2006 2006 1st IEEE Workshop on Hot Topics in Web Systems and Technologies  
Many codes have been proposed for compressing inverted lists. These codes use different codewords for different d-gaps.  ...  One way to improve inverted file compression is to use the cluster property [1] of document collection, which states that term occurrences are not uniformly distributed.  ...  The large amount of information available on the Web requires an efficient indexing mechanism for search engines.  ... 
doi:10.1109/hotweb.2006.355272 dblp:conf/hotweb/ChenZC06 fatcat:g4u6z5g2nbhdtdiho4cj4twuhu

Correlation Between P-wave Velocity and Strength Index for Shale to Predict Uniaxial Compressive Strength Value

H. Awang, N. R. Ahmad Rashidi, M. Yusof, K. Mohammad, M.J. Zainorizuan, L. Yee Yong, L. Alvin John Meng Siang, O. Mohamad Hanifi, R. Siti Nazahiyah, A. Mohd Shalahuddin
2017 MATEC Web of Conferences  
Prediction of uniaxial compressive strength (UCS) value was made via converting the point load strength value to UCS value using a correlation.  ...  This study aims to produce a correlation between P-wave velocity value and point load strength index value for shale. Both field and laboratory tests were carried out.  ...  The other aim is to predict uniaxial compressive strength by using P-wave velocity and point load strength index.  ... 
doi:10.1051/matecconf/201710307017 fatcat:vaicwaiubnevbcfc7yybjcrgeq

Clustered Webbase: A Repository of Web Pages Based on Top Level Domain

Geeta Rani, Nidhi Tyagi
2015 International Journal of Information Technology and Computer Science  
The research focuses on coordinator module which not only indexes the documents but also uses compression technique to increase the storage capacity of repository.  ...  The World Wide Web is a huge source of hyperlinked information; it is growing every moment in context of web documents.  ...  Compress document and update Index  ... 
doi:10.5815/ijitcs.2015.06.08 fatcat:eur3gahb5vbnpayj3wwtgrzmpy

Maintaining the search engine freshness using mobile agent

Marwa Badawi, Ammar Mohamed, Ahmed Hussein, Mervat Gheith
2013 Egyptian Informatics Journal  
Search engines must keep an up-to-date image to all Web pages and other web resources hosted in web servers in their index and data repositories, to provide better and accurate results to its clients.  ...  So we are interested in detecting the significant changes in web pages which reflect effectively in search engine's index and minimize the network load.  ...  It processes the assigned URL locally as follows: requests the web pages from the web server, generates the document index of the web pages, detect significantly changed web pages, compresses the generated  ... 
doi:10.1016/j.eij.2012.11.001 fatcat:fbesa2uwufhotpswwujirt3v2e

Index Structures for Querying the Deep Web

Jian Qiu, Feng Shao, Misha Zatsman, Jayavel Shanmugasundaram
2003 International Workshop on the Web and Databases  
Most current web search engines can only crawl, index, and query over static web pages, also referred to as the "surface web".  ...  This paper focuses on the organization of the deep web index structures once all the indexing data is obtained from the deep web data sources.  ... 
dblp:conf/webdb/QiuSZS03 fatcat:ft2uaxldwvhuvoc4t4x323xdga

On compressing the textual web

Paolo Ferragina, Giovanni Manzini
2010 Proceedings of the third ACM international conference on Web search and data mining - WSDM '10  
For the second scenario, we compare compressed-storage solutions with the new technology of compressed self-indexes [45].  ...  But we are not aware of any study which has deeply addressed the issue of compressing the raw Web pages.  ...  A modern solution, mostly unexplored at the Web-scale, consists of using the compressed self-indexes as a storage format that achieves (in theory) space-occupancy close to the kth order entropy of the indexed  ... 
doi:10.1145/1718487.1718536 dblp:conf/wsdm/FerraginaM10 fatcat:akeiabrftjeldhhebgyx5t3y3a