Probabilistic document indexing from relevance feedback data

N. Fuhr, C. Buckley
1990 Proceedings of the 13th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '90  
Based on the binary independence indexing model, we apply three new concepts for probabilistic document indexing from relevance feedback data:  ...  Acknowledgement We thank Keith van Rijsbergen for his constructive comments on an earlier version of this paper.  ...  After a while, when there is enough learning data available, the probabilistic approach can be applied.  ... 
doi:10.1145/96749.98008 dblp:conf/sigir/FuhrB90 fatcat:eqgdee67jnerzon3qouuenl2da

A probabilistic description-oriented approach for categorizing web documents

Norbert Gövert, Mounia Lalmas, Norbert Fuhr
1999 Proceedings of the eighth international conference on Information and knowledge management - CIKM '99  
Our categorisation approach is based on a probabilistic description-oriented representation of web documents, and a probabilistic interpretation of the k-nearest neighbour classifier.  ...  Experimental results show that (1) using an enhanced representation of web documents is crucial for an effective categorisation of web documents, and (2) a theoretical interpretation of the k-nearest neighbour  ...  Our approach requires a test-bed of pre-categorised documents (for the learning phase). The creation of the test-bed is described in Section 4.1.  ... 
doi:10.1145/319950.320053 dblp:conf/cikm/GovertLF99 fatcat:24za4tjts5gxraocue4cgeg44a

I-vector based language modeling for spoken document retrieval

Kuan-Yu Chen, Hung-Shin Lee, Hsin-Min Wang, Berlin Chen, Hsin-Hsi Chen
2014 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
In this paper, we make a step forward to formulate an i-vector based language modeling (IVLM) framework for SDR.  ...  techniques (such as probabilistic linear discriminative analysis, PLDA) can be readily and effectively used.  ...  The results indicate that probabilistic approaches are a school of simple but powerful methods for SDR, and there are still potential research areas for non-probabilistic approaches.  ... 
doi:10.1109/icassp.2014.6854974 dblp:conf/icassp/ChenLWCC14 fatcat:o6nhwymkk5b4hcrj3jjyidu5dy

Combining model-oriented and description-oriented approaches for probabilistic indexing

Norbert Fuhr, Ulrich Pfeifer
1991 Proceedings of the 14th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '91  
In this paper, we combine a probabilistic model for the Darmstadt Indexing Approach with logistic regression.  ...  For the combination of both approaches, we present a new probabilistic indexing model, which can be transformed into a log-linear form.  ... 
doi:10.1145/122860.122865 dblp:conf/sigir/FuhrP91 fatcat:fyya2foaf5buto6vjsbpxet7ae

Probabilistic Models in Information Retrieval

N. Fuhr
1992 Computer journal  
A new approach regards IR as uncertain inference; here, imaging is used as a new technique for estimating the probabilistic parameters, and probabilistic inference networks support more complex forms of  ...  For the estimation of these parameters, three different learning strategies are distinguished, namely query-related, document-related and description-related learning.  ...  Document-related learning is orthogonal to the query-related strategy: probabilistic indexing models (see Section 3.1) collect relevance feedback data for a specific document d_m from a set of queries Q  ... 
doi:10.1093/comjnl/35.3.243 fatcat:4boq3ci2hzg4dbiwyfshicixcm

Models for retrieval with probabilistic indexing

Norbert Fuhr
1989 Information Processing & Management  
The probabilistic indexing weights required for any of these models can be provided by an application of the Darmstadt indexing approach (DIA) for indexing with descriptors from a controlled vocabulary  ...  In this model, the indexing weight of a descriptor in a document is an estimate of the probability of relevance of this document with respect to queries using this descriptor.  ...  The DIA is a dictionary-based indexing approach for automatic indexing from document titles and abstracts, with a prescribed indexing vocabulary.  ... 
doi:10.1016/0306-4573(89)90091-5 fatcat:lkf75n35k5hkvl4uqxzrsfalgy

Optimum polynomial retrieval functions

N. Fuhr
1989 Proceedings of the 12th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '89  
We give experimental results for the application of this approach to documents with weighted indexing as well as to documents with complex representations.  ...  Then we describe an approach for the development of optimum polynomial retrieval functions: request-document pairs (f_l, d_m) are mapped onto description vectors x(f_l, d_m), and a polynomial function of the  ...  Especially, the probabilistic indexing weights of the documents cover all the information necessary for achieving a good ranking (this also has been shown for probabilistic models, see [Fuhr 86], [Fuhr  ... 
doi:10.1145/75334.75343 dblp:conf/sigir/Fuhr89 fatcat:5amkkkzunfcdtacwmrvmbxwenm

A comparative study of various text mining approaches for analysis of drug reaction using social media post

Smruti J. Dave, Prof. Hardik H. Maheta
2019 International Journal of Research in Advent Technology  
We have done comparative study of various text mining approaches on social media posts to understand and obtain adverse drug reaction.  ...  Adverse drug reactions can be referred as unwanted, uncomfortable or a harmful effect that any drug may cause.  ...  An index is a compressed version of the same information which these documents contain. When a user query comes in, index is queried and the documents are obtained which match the particular query.  ... 
doi:10.32622/ijrat.74201994 fatcat:hnaurfn5erdb5amzclmtzhr6si

The Revival of Subject Analysis: A Knowledge-based Approach facilitating Semantic Search

Sebastian Furth, Volker Belli, Joachim Baumeister
2016 Lernen, Wissen, Daten, Analysen  
The approach is based on a simple but powerful and intuitive probabilistic model that allows for the easy integration of expert knowledge.  ...  Therefore, in this paper we present a novel approach that makes the underlying Subject Indexing task rather a Knowledge Engineering than a Natural Language Processing task.  ...  Acknowledgments The work described in this paper is supported by the Bundesministerium für Wirtschaft und Energie (BMWi) under the grant ZIM ZF4172701 "APOSTL - Accessible Performant Ontology Supported Text Learning  ... 
dblp:conf/lwa/FurthBB16 fatcat:m7okldp6eff43jwxbossetz5bm

Towards rapidly developing database-supported machine learning applications

Frank Rosner, Alexander Hinneburg
2016 Lernen, Wissen, Daten, Analysen  
TopicExplorer is an interactive web application for text mining that uses Bayesian topic models as a core component.  ...  The development of a big data analytics application benefits from a conceptual model that jointly represents aspects about data management as well as machine learning.  ...  Each document n ∈ N consists of a set of tokens M_n that represents the words occurring in this document. The document specific token index sets M_n partition the total index set of tokens M.  ... 
dblp:conf/lwa/RosnerH16 fatcat:7ixzpye5njcr5afgmvk6x5gyzq

Translating Bayesian Networks into Entity Relationship Models, Extended Version [article]

Frank Rosner, Alexander Hinneburg
2016 arXiv   pre-print
The main contribution of the paper is a method to translate Bayesian networks, a main conceptual language for probabilistic graphical models, into usable entity relationship models.  ...  The transformed representation of a Bayesian network leaves out mathematical details about probabilistic relationships but unfolds all information relevant for data management tasks.  ...  Each document n ∈ N consists of a set of tokens M_n that represents the words occurring in this document. The document specific token index sets M_n partition the total index set of tokens M.  ... 
arXiv:1607.02399v1 fatcat:rkdvzwqefvd6fa35w2tamihemm

"Is this document relevant?…probably": a survey of probabilistic models in information retrieval

Fabio Crestani, Mounia Lalmas, Cornelis J. Van Rijsbergen, Iain Campbell
1998 ACM Computing Surveys  
This article surveys probabilistic approaches to modeling information retrieval.  ...  The basic concepts of probabilistic approaches to information retrieval are outlined and the principles and assumptions upon which the approaches are based are presented.  ...  ACKNOWLEDGMENTS We would like to thank the anonymous reviewers for their interesting and helpful comments.  ... 
doi:10.1145/299917.299920 fatcat:saq74jbtzzbgvorqsrq7tminjq

A Survey of Information Retrieval Techniques

Mang'are Fridah Nyamisa
2017 Advances in Networks  
As such, they are not ideal for high precision information retrieval applications.  ...  These models are used to find similarities between the query and the documents to retrieve documents that reflect the query.  ...  index term and freedom of supposition for the index term; generally, the idea of probabilistic model is within a probabilistic scope, which allows the user to retrieve and which documents are relevant  ... 
doi:10.11648/ fatcat:i3ahtt53bfd2rcc5aytt6jvb6e

Model-Guided Segmentation and Layout Labelling of Document Images Using a Hierarchical Conditional Random Field [chapter]

Santanu Chaudhury, Megha Jindal, Sumantra Dutta Roy
2009 Lecture Notes in Computer Science  
, paragraph, etc.; and 3. probabilistic layout model for encoding global relations between the above blocks for a particular class of documents.  ...  The system extracts features which encode contextual information and spatial configurations of a given document image, and learns relations between these layout entities using hierarchical CRFs.  ...  Common approaches have often been heuristic in nature, for instance [1]. A learning-based approach is more general than using assumptions about document layouts.  ... 
doi:10.1007/978-3-642-11164-8_61 fatcat:xgaadnwbvjem5mqxz4m7eqahxq

A Survey on Automatic Semantic Subject Indexing of Documents using Big Data Analytics

K. Swanthana
2018 International Journal for Research in Applied Science and Engineering Technology  
So there is a need for effective and efficient indexing and retrieval techniques. Indexing is a crucial aspect that allows the documents to be located quickly.  ...  There is a need for the design of indexing strategies that can support.  ...  Probabilistic Latent Semantic Analysis (PLSA): Probabilistic Latent Semantic Indexing is a novel approach to automated document indexing which is based on a statistical latent class model for factor analysis  ... 
doi:10.22214/ijraset.2018.4282 fatcat:mki2fowtfff4jfug2mlcnkypdi