Novel Methods for Forensic Multimedia Data Analysis: Part II [chapter]

Petra Perner
2020 Digital Forensic Science [Working Title]  
We continue our work of novel methods for forensic multimedia data analysis Part I by a description of related work and a proposal of the methods and techniques we are developing beyond the state of the  ...  Such data allow the investigator to identify and compare objects, events or persons based on data properties, including biometric features or more symbolic features that point to coincidences and anomalies  ...  While the commercially available optical character recognition systems are very successful for printed documents, recognition of handwritten text continues to be a challenge [17, 18].  ...
doi:10.5772/intechopen.92548 fatcat:oo2pjx6la5eqzm3rcph6bog6ie

Improving Query Correctness Using Centralized Probably Approximately Correct (PAC) Search [chapter]

Ingemar Cox, Jianhan Zhu, Ruoxun Fu, Lars Kai Hansen
2010 Lecture Notes in Computer Science  
A non-deterministic architecture for information retrieval, known as probably approximately correct (PAC) search, has recently been proposed.  ...  A theoretical analysis is presented that provides an upper bound on the performance of any iterative algorithm.  ...  To meet this challenge commercial search engines use a centralized distributed architecture in which the index is disjointly partitioned across a number of clusters [2].  ...
doi:10.1007/978-3-642-12275-0_25 fatcat:u2frgksedngxlkyn46se4n4tw4

Spatiotemporal Keyword Query Suggestion Based On Document Proximity and K-Means Method – A Review

Aju Tom Kuriakose, Sobhana N.V
2017 IJARCCE  
The commercial web search engines are looking for the efficient keyword suggestions methods for retrieving the relevant information.  ...  The K-Means method used for retrieving the highest ranked top k objects near to the current location of the user and Time aware query suggestion brings out the most relevant documents based on the temporal  ...  Extracting information are necessary for identifying the location based on features that are embedded in a document.  ... 
doi:10.17148/ijarcce.2017.63157 fatcat:v7a3huyozzfh7hizniut5wuhiy

Selective Search

Anagha Kulkarni, Jamie Callan
2015 ACM Transactions on Information Systems  
This search technique first partitions the corpus, based on documents' similarity, into topic-based shards.  ...  This article investigates and extends an alternative: selective search, an approach that partitions the dataset based on document similarity to obtain topic-based shards, and searches only a few shards  ...  We show that one of the main reasons for the success of topic-based partitioning strategies is their ability to concentrate relevant documents for a query into few shards, which is necessary to support  ... 
doi:10.1145/2738035 fatcat:fistpgm5abemdeecnpiqmt4szi

Content-Based Copy Retrieval Using Distortion-Based Probabilistic Similarity Search

A. Joly, O. Buisson, C. Frelicot
2007 IEEE transactions on multimedia  
We first propose a new approximate similarity search technique in which the probabilistic selection of the feature space regions is not based on the distribution in the database but on the distribution  ...  Content-based copy retrieval (CBCR) aims at retrieving in a database all the modified versions or the previous versions of a given candidate object.  ...  Usually, this step is performed by a vote on the document identifier provided with each retrieved local feature [4], [25].  ...
doi:10.1109/tmm.2006.886278 fatcat:5jyy3zdvana3zew55eip7lxpce

Tuning the capacity of search engines

Diego Puppin, Fabrizio Silvestri, Raffaele Perego, Ricardo Baeza-Yates
2010 ACM Transactions on Information Systems  
This article introduces an architecture for a document-partitioned search engine, based on a novel approach combining collection selection and load balancing, called load-driven routing.  ...  By trading off a small fraction of the results, our technique allows us to strongly reduce the computing pressure to a search engine back-end; we are able to retrieve more than 2/3 of the top-5 results  ...  A query first retrieves block identifiers from the centralized index, then searches the top-ranked blocks to retrieve single documents.  ...
doi:10.1145/1740592.1740593 fatcat:hpbv3ivyk5ctxozfvsmsjzgkeu

Automated Ontology Instantiation from Tabular Web Sources: The AllRight System

Dietmar Jannach, Kostyantyn Shchekotykhin, Gerhard Friedrich
2009 Social Science Research Network  
In many domains, however, one possible solution to this problem is to automate the instantiation process for a given ontology by searching (mining) the web for the required instance information.  ...  The main innovative pillars of the system are a new high-recall focused crawling technique (xCrawl), a novel table recognition algorithm, innovative methods for document clustering and instance name recognition  ...  Focused crawlers, in contrast to breadth-first crawlers used by search engines, therefore use an informed-search strategy and try to retrieve only those parts of the web relevant to a particular topic  ... 
doi:10.2139/ssrn.3199423 fatcat:b6mk4wibcnb2vkwfdyl726shd4

Automated ontology instantiation from tabular web sources—The AllRight system

Dietmar Jannach, Kostyantyn Shchekotykhin, Gerhard Friedrich
2009 Journal of Web Semantics  
In many domains, however, one possible solution to this problem is to automate the instantiation process for a given ontology by searching (mining) the web for the required instance information.  ...  The main innovative pillars of the system are a new high-recall focused crawling technique (xCrawl), a novel table recognition algorithm, innovative methods for document clustering and instance name recognition  ...  Focused crawlers, in contrast to breadth-first crawlers used by search engines, therefore use an informed-search strategy and try to retrieve only those parts of the web relevant to a particular topic  ... 
doi:10.1016/j.websem.2009.04.002 fatcat:pxijgeakcvad3aeazbg56ansoi

Earlybird: Real-Time Search at Twitter

Michael Busch, Krishna Gade, Brian Larson, Patrick Lok, Samuel Luckenbill, Jimmy Lin
2012 2012 IEEE 28th International Conference on Data Engineering  
Earlybird represents a point in the design space of real-time search engines that has worked well for Twitter's needs.  ...  A key requirement of real-time search is the ability to ingest content rapidly and make it searchable immediately, while concurrently supporting low-latency, high-throughput query evaluation.  ...  Thanks also goes out to Jeff Dalton, Donald Metzler, and Ian Soboroff for comments on earlier drafts of this paper.  ...
doi:10.1109/icde.2012.149 dblp:conf/icde/BuschGLLLL12 fatcat:bmbfnesddndz7jy3buunnhz72i

Peer-to-peer information retrieval

Almer S. Tigelaar
2012 SIGIR Forum  
In this article we provide an overview of the key challenges for peer-to-peer information retrieval and the work done so far.  ...  Peer-to-peer technology is widely used for file sharing. In the past decade a number of prototype peer-to-peer information retrieval systems have been developed.  ...  Separate indexing strategies could be used for fast and slow changing Web documents. A further challenge is caching of postings lists or search results.  ... 
doi:10.1145/2422256.2422276 fatcat:u5n62556ezbrllenb7frlatiri

An adaptive network-constrained clustering (ANCC) model for fine-scale urban functional zones

Jie Song, Hanfa Xing, Huanxue Zhang, Yuetong Xu, Yuan Meng
2021 IEEE Access  
A comparison of a block-level mapping model, a non-adaptive network-based model and the ANCC model reveals accuracies of 53.10%, 59.20% and 77.10%, respectively, indicating the advantages of the ANCC model  ...  On this basis, a term frequency-inverse document frequency-weighted latent Dirichlet allocation (TW-LDA) topic model is designed to delineate urban functions from semantic information.  ...  In particular, it is very challenging to identify the main functions in residential-commercial mixed functions.  ... 
doi:10.1109/access.2021.3070345 fatcat:ckiyfiicjvgztfgfojrsgbxadm

A Secure Search Engine for the Personal Cloud

Saliha Lallali, Nicolas Anciaux, Iulian Sandu Popa, Philippe Pucheral
2015 Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data - SIGMOD '15  
The main difficulty lies in the design of an inverted document index and its related search and update algorithms capable of tackling the strong hardware constraints of these devices.  ...  constraints of tamper-resistant devices and provide scalable solutions for the Personal Cloud.  ...  The second part of the demonstration will focus on the search engine performance, scalability and compliance with the low RAM constraint.  ... 
doi:10.1145/2723372.2735376 dblp:conf/sigmod/LallaliAPP15 fatcat:aebkt7eignhjbdlr3is2jy75xe

Efficient query processing in geographic web search engines

Yen-Yu Chen, Torsten Suel, Alexander Markowetz
2006 Proceedings of the 2006 ACM SIGMOD international conference on Management of data - SIGMOD '06  
Query processing is a major bottleneck in standard web search engines, and the main reason for the thousands of machines used by the major engines.  ...  Academic research in this area has focused primarily on techniques for extracting geographic knowledge from the web.  ...  We also thank Bernhard Seeger, Thomas Brinkhoff, and Xiaohui Long for earlier collaboration and discussions on geographic search.  ... 
doi:10.1145/1142473.1142505 dblp:conf/sigmod/ChenSM06 fatcat:nnl4xqiulrft5kqi3dbitxmf3y

SpeechFind: advances in spoken document retrieval for a National Gallery of the Spoken Word

J.H.L. Hansen, Rongqing Huang, Bowen Zhou, M. Seadle, J.R. Deller, A.R. Gurijala, M. Kurimo, P. Angkititrakul
2005 IEEE Transactions on Speech and Audio Processing  
Advances in formulating spoken document retrieval for a new National Gallery of the Spoken Word (NGSW) are addressed.  ...  Our experimental online system entitled "SpeechFind" is presented, which allows for audio retrieval from a portion of the NGSW corpus.  ...  AUDIO CORPUS STRUCTURE OF NGSW Spoken document retrieval focuses on employing text-based search strategies from transcripts of audio materials.  ... 
doi:10.1109/tsa.2005.852088 fatcat:73g7x2ea6fg2lfjbw7rhrsa6we

Inverted files for text search engines

Justin Zobel, Alistair Moffat
2006 ACM Computing Surveys  
The technology underlying text search engines has advanced dramatically in the past decade.  ...  The development of a family of new index representations has led to a wide range of innovations in index storage, index construction, and query evaluation.  ...  Jamie Callan, Bruce Croft, Donna Harman, Mike Lesk, and Ellen Voorhees helped us identify some of the early work in the area.  ... 
doi:10.1145/1132956.1132959 fatcat:u56re4tqtfg6zcpyfnzl5ne57m