Boosting Web Retrieval Through Query Operations [chapter]

Gilad Mishne, Maarten de Rijke
2005 Lecture Notes in Computer Science  
We explore the use of phrase and proximity terms in the context of web retrieval, which is different from traditional ad-hoc retrieval both in document structure and in query characteristics.  ...  We also analyze why phrase and proximity terms are far more effective for web retrieval than for ad-hoc retrieval.  ...  Acknowledgments The authors wish to thank Jaap Kamps for many discussions and inspiration.  ... 
doi:10.1007/978-3-540-31865-1_36 fatcat:sna7jjdfrjh6ritpvjesn3onjq

Web Information Retrieval Using Genetic Algorithm-Particle Swarm Optimization

Priya I. Borkar, Leena H. Patil
2013 International Journal of Future Computer and Communication  
Conventional search engines use heuristics to determine which web pages are the best match for a given keyword.  ...  That's why more and more people begin to use focused crawler to get information in their special fields today.  ...  Yang et al. suggested using GAs with user feedback to choose weights for search terms in a query [7].  ... 
doi:10.7763/ijfcc.2013.v2.234 fatcat:5pnau65s35e27p7nd4cwl77s4a

Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval [article]

Shitao Xiao, Zheng Liu, Weihao Han, Jianjin Zhang, Yingxia Shao, Defu Lian, Chaozhuo Li, Hao Sun, Denvy Deng, Liangjie Zhang, Qi Zhang, Xing Xie
2022 arXiv   pre-print
In this work, we tackle this problem with Bi-Granular Document Representation, where the lightweight sparse embeddings are indexed and standby in memory for coarse-grained candidate search, and the heavyweight  ...  Nowadays, the embedding-based retrieval (EBR) becomes a promising solution, where deep learning based document representation and ANN search techniques are allied to handle this task.  ...  used for the training of Siamese BERT based document encoders.  ... 
arXiv:2201.05409v3 fatcat:eyofrsjwanh5zkmwuzl7ksvmmm

Language Models for Searching in Web Corpora

Jaap Kamps, Gilad Mishne, Maarten de Rijke
2004 Text Retrieval Conference  
We describe our participation in the TREC 2004 Web and Terabyte tracks.  ...  For the web track, we employ mixture language models based on document full-text, incoming anchortext, and documents titles, with a range of webcentric priors.  ...  Acknowledgments Thank you to Börkur Sigurbjörnsson for useful suggestions and discussion.  ... 
dblp:conf/trec/KampsMR04 fatcat:lz7gdqjnofhpdkow2t3fgvbmzm

XML and information retrieval

David Carmel, Yoelle Maarek, Aya Soffer
2001 SIGMOD record  
Acknowledgments: We would like to thank Sue Dumais for her quest to bridge the gap between the IR and Web communities that made this workshop possible, as well as all of the workshop participants for making  ...  For more information about the XML and Information Retrieval workshop, visit the workshop Web page at http://www.haifa.il.ibm.com/sigir00-xml/ or contact any of the organizers: David Carmel carmel@il.ibm.com  ...  It is believed that it will become a universal format for data exchange on the Web and that in the near future we will find vast amounts of documents in XML format on the Web.  ... 
doi:10.1145/373626.373705 fatcat:wg5c7vvlxfgc7orqpplixvlgki

XML and information retrieval

David Carmel, Yoelle Maarek, Aya Soffer
2000 SIGIR Forum  
Acknowledgments: We would like to thank Sue Dumais for her quest to bridge the gap between the IR and Web communities that made this workshop possible, as well as all of the workshop participants for making  ...  For more information about the XML and Information Retrieval workshop, visit the workshop Web page at http://www.haifa.il.ibm.com/sigir00-xml/ or contact any of the organizers: David Carmel carmel@il.ibm.com  ...  It is believed that it will become a universal format for data exchange on the Web and that in the near future we will find vast amounts of documents in XML format on the Web.  ... 
doi:10.1145/373593.373624 fatcat:y53o3p2lzfddnd2xdtstd3topu

Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval

Shitao Xiao, Zheng Liu, Weihao Han, Jianjin Zhang, Yingxia Shao, Defu Lian, Chaozhuo Li, Hao Sun, Denvy Deng, Liangjie Zhang, Qi Zhang, Xing Xie
2022 Proceedings of the ACM Web Conference 2022  
In this work, we tackle this problem with Bi-Granular Document Representation, where the lightweight sparse embeddings are indexed and standby in memory for coarse-grained candidate search, and the heavyweight  ...  Nowadays, the embedding-based retrieval (EBR) becomes a promising solution, where deep learning based document representation and ANN search techniques are allied to handle this task.  ...  RELATED WORK • Deep Document Representation. Document representation is a fundamental part in EBR.  ... 
doi:10.1145/3485447.3511957 fatcat:wdnqr756ingkdkdvrpcpg3iooe

A comparative study on the Assortment of Information Retrieval systems

L. Senthilvadivu
2018 International Journal of Scientific Research in Computer Sciences and Engineering  
For thousands of years people have realized the importance of archiving and finding information.  ...  The information retrieval by submitting the queries bring out millions of documents which consume the precious time of the user.  ...  incorporating additional operators such as term proximity operators.  ... 
doi:10.26438/ijsrcse/v6i2.109112 fatcat:5du75zj3ezh3rlzq5dgbgtufhu

The Continued Saga of DB-IR Integration [chapter]

R. Baeza-Yates, M. Consens
2004 Proceedings 2004 VLDB Conference  
• Integrated web-crawler and out-of-the-box-GUI with Ultra Search • Minimal operations: - Single-word and phrase search with stopwords - Suffix, prefix, infix - Proximity searching (with order) - Boolean  ...  Looking for "k-means" in lotus.com • The XML data conforms to the publisher's DTD • Web interface for the citation 2.  ... 
doi:10.1016/b978-012088469-8/50118-2 fatcat:dktiusnpj5hcfbu2fopto7psqq

The Continued Saga of DB-IR Integration [chapter]

Ricardo Baeza-Yates, Mariano Consens
2004 Proceedings 2004 VLDB Conference  
• Integrated web-crawler and out-of-the-box-GUI with Ultra Search • Minimal operations: - Single-word and phrase search with stopwords - Suffix, prefix, infix - Proximity searching (with order) - Boolean  ...  Looking for "k-means" in lotus.com • The XML data conforms to the publisher's DTD • Web interface for the citation 2.  ... 
doi:10.1016/b978-012088469-8.50118-2 dblp:conf/vldb/Baeza-YatesC04 fatcat:2lzk6qlgurgbdoj6do2qtxy2za

A MODEL OF HYBRID GENETIC ALGORITHM-PARTICLE SWARM OPTIMIZATION(HGAPSO) BASED QUERY OPTIMIZATION FOR WEB INFORMATION RETRIEVAL

Priya I. Borkar
2013 International Journal of Research in Engineering and Technology  
Conventional search engines use heuristics to determine which web pages are the best match for a given keyword.  ...  That's why more and more people begin to use focused crawler to get information in their special fields today.  ...  Thus, by using the genetic algorithm in this paper presents a model of hybrid GAPSO (HGAPSO) based for effective Web information retrieval.  ... 
doi:10.15623/ijret.2013.0201012 fatcat:6aes6cswjnha3oo4qmlf7dluai

Indri at TREC 2004: Terabyte Track

Donald Metzler, Trevor Strohman, Howard R. Turtle, W. Bruce Croft
2004 Text Retrieval Conference  
Our methods use term proximity information and HTML document structure. In addition, a number of optimization procedures for efficient query processing are explained.  ...  This paper provides an overview of experiments carried out at the TREC 2004 Terabyte Track using the Indri search engine. Indri is an efficient, effective distributed search engine.  ...  Acknowledgments This work was supported in part by the Center for Intelligent Information Retrieval and in part by Advanced Research and Development Activity and NSF grant #CCF-0205575.  ... 
dblp:conf/trec/MetzlerSTC04 fatcat:ntojfwj4rbgljplsmhpq3xteoa

PEx-WEB: Content-based Visualization of Web Search Results

Fernando V. Paulovich, Roberto Pinho, Charl P. Botha, Anton Heijs, Rosane Minghim
2008 2008 12th International Conference Information Visualisation  
The system (The Projection explorer for the WWW, or PEx-Web) implements these techniques and various additional tools as means to make better use of web search results for exploratory applications.  ...  The efficacy of search engines has expanded the uses for the information available on the Web. An increasing number of applications make use of the WWW as a primary source of information.  ...  We wish to acknowledge the work of our undergraduate and research students as well as research colleagues in processing some data and discussing various issues of the work.  ... 
doi:10.1109/iv.2008.94 dblp:conf/iv/PaulovichPBHM08 fatcat:2jz4x7mw7vaa5jdkwkaq6pzvvq

Quasi-succinct indices

Sebastiano Vigna
2013 Proceedings of the sixth ACM international conference on Web search and data mining - WSDM '13  
Compressed inverted indices in use today are based on the idea of gap compression: documents pointers are stored in increasing order, and the gaps between successive document pointers are stored using  ...  proximity queries.  ...  All the code used for experiments is available at the MG4J web site.  ... 
doi:10.1145/2433396.2433409 dblp:conf/wsdm/Vigna13 fatcat:l3pbv5ulzzb73as2p6vxmonsna

Document vector representations for feature extraction in multi-stage document ranking

Nima Asadi, Jimmy Lin
2012 Information retrieval (Boston)  
In this context, feature extraction can be accomplished using a document vector index, a mapping from document ids to document representations.  ...  In particular, we propose a novel document-adaptive hashing scheme for compactly encoding term ids.  ...  Acknowledgments This work has been supported in part by NSF under awards IIS-0916043, IIS-1144034, and IIS-1218043.  ... 
doi:10.1007/s10791-012-9217-9 fatcat:m3p24omrujc7natc2ai2op2xoa