Exploring a Few Good Tuples from Text Databases

Alpa Jain, Divesh Srivastava
2009 Proceedings / International Conference on Data Engineering  
In this paper, we present a novel exploration methodology of finding a few good tuples for a relation that can be extracted from a database which allows for judging the relevance of the database for the  ...  Information extraction from text databases is a useful paradigm to populate relational tables and unlock the considerable value hidden in plain-text documents.  ...  How can we identify a few good tuples for a relation buried in a text database?  ... 
doi:10.1109/icde.2009.120 dblp:conf/icde/JainS09 fatcat:47nqayo3hffevbenbodghvwqd4

Building query optimizers for information extraction

Alpa Jain, Panagiotis Ipeirotis, Luis Gravano
2009 SIGMOD record  
over text databases.  ...  This paper discusses our SQoUT project, which focuses on processing structured queries over relations extracted from text databases.  ...  Intuitively, precision measures the fraction of tuples extracted by the system from a text database that are good, while recall measures the fraction of good tuples that the system manages to extract from  ...
doi:10.1145/1519103.1519108 fatcat:pq6q6cntqnfpjncupq6ajnurga
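
The precision and recall definitions quoted in the snippet above can be illustrated with a minimal sketch. The function name `precision_recall` and the sample tuples are hypothetical and are not taken from the SQoUT paper. The sketch also assumes the full set of good tuples is known, which in practice it usually is not.

```python
def precision_recall(extracted, good):
    """Return (precision, recall) for extracted tuples against the known good tuples."""
    extracted, good = set(extracted), set(good)
    hits = extracted & good  # extracted tuples that are actually good
    # Precision: fraction of extracted tuples that are good.
    precision = len(hits) / len(extracted) if extracted else 0.0
    # Recall: fraction of good tuples that were extracted.
    recall = len(hits) / len(good) if good else 0.0
    return precision, recall


# Hypothetical (company, headquarters) tuples, for illustration only.
good = {
    ("Acme", "Boston"),
    ("Globex", "Springfield"),
    ("Initech", "Austin"),
    ("Umbrella", "Raccoon City"),
}
extracted = {
    ("Acme", "Boston"),
    ("Globex", "Springfield"),
    ("Initech", "Dallas"),  # a bad (spurious) tuple
}

precision, recall = precision_recall(extracted, good)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the three extracted tuples are good and two of the four good tuples are found, so precision is 2/3 and recall is 1/2.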

SQL Queries Over Unstructured Text Databases

Alpa Jain, AnHai Doan, Luis Gravano
2007 IEEE 23rd International Conference on Data Engineering
By processing a text database with information extraction systems, we can define a variety of structured "relations," over which we can then issue SQL queries.  ...  Text documents often embed data that is structured in nature.  ...  As a summary of our conclusions, our query processing approach produced high goodness executions for both selection and projection queries and for all values of user-specified parameter w (see Definition  ... 
doi:10.1109/icde.2007.368986 dblp:conf/icde/JainDG07 fatcat:ggmyaurtynhx5khoktgmppg7ay

Structured Querying of Web Text Data: A Technical Challenge

Michael J. Cafarella, Christopher Ré, Dan Suciu, Oren Etzioni
2007 Conference on Innovative Data Systems Research  
We propose a general-purpose query system called the extraction database, or ExDB, which supports SQL-like structured queries over Web text.  ...  The Web contains a huge amount of text that is currently beyond the reach of structured access tools.  ...  It was also supported by DARPA contract NBCHD030010, ONR grant N00014-02-1-0324, the University of Washington's Turing Center, as well as gifts from Google.  ... 
dblp:conf/cidr/CafarellaRSE07 fatcat:unbtublzdzcqzhen5sai32rn3y

Optimizing SQL Queries over Text Databases

Alpa Jain, AnHai Doan, Luis Gravano
2008 IEEE 24th International Conference on Data Engineering
a text database might produce answers that are not fully accurate or complete, for a number of reasons.  ...  By processing a text database with information extraction systems, we can materialize a variety of structured "relations," over which we can then issue regular SQL queries.  ...  The remaining authors are supported by a generous gift from the Data Management, Exploration, and Mining Group, Microsoft Research.  ... 
doi:10.1109/icde.2008.4497472 dblp:conf/icde/JainDG08 fatcat:zdiggonxhvcuhmeahucjolhmyy

Ziggy

Thibault Sellam, Martin Kersten
2016 Proceedings of the VLDB Endowment  
Data exploration has received much attention during the last few years. The aim is to learn interesting new facts from a possibly unfamiliar data set.  ...  To assist them, it detects characteristic views, that is, small sets of columns on which the tuples in the results are different from those in the rest of the database.  ...  Ziggy detects and plots characteristic views, that is, small sets of columns on which the user's tuples are different from those in the rest of the database.  ... 
doi:10.14778/3007263.3007287 fatcat:czyqm5n6jnbthh6fjnk7pfbd5e

Join Optimization of Information Extraction Output: Quality Matters!

Alpa Jain, Panagiotis G. Ipeirotis, AnHai Doan, Luis Gravano
2009 Proceedings / International Conference on Data Engineering  
Information extraction (IE) systems are trained to extract specific relations from text databases.  ...  We establish the accuracy of our analytical models, as well as study the effectiveness of a quality-aware join optimizer, with a large-scale experimental evaluation over real-world text collections and  ...  Ideally, when processing a text database with an IE system, we should focus on good documents and process as few empty documents as possible, for efficiency reasons; we should also process as few bad documents  ...
doi:10.1109/icde.2009.138 dblp:conf/icde/JainIDG09 fatcat:p7at27xy2facrgz3w3wgq5fsxi

Automatic categorization of query results

Kaushik Chakrabarti, Surajit Chaudhuri, Seung-won Hwang
2004 Proceedings of the 2004 ACM SIGMOD international conference on Management of data - SIGMOD '04  
We first develop analytical models to estimate information overload faced by a user for a given exploration.  ...  We dynamically generate a labeled, hierarchical category structure: users can determine whether a category is relevant or not by examining simply its label; she can then explore just the relevant categories  ...  Another scenario is that the user is interested in just one (or two or a few) tuple(s) in R; so she explores R using T till she finds that one (or few) tuple(s).  ...
doi:10.1145/1007568.1007653 dblp:conf/sigmod/ChakrabartiCH04 fatcat:54hvyxp2svb4lfd2sq7kuyazoq

An Approach to Integrating Query Refinement in SQL [chapter]

Michael Ortega-Binderberger, Kaushik Chakrabarti, Sharad Mehrotra
2002 Lecture Notes in Computer Science  
This paper explores how to enhance database systems with query refinement for content-based (similarity) searches in object-relational databases.  ...  With the emergence of applications that require content-based similarity retrieval, techniques to support such a retrieval paradigm over database systems have emerged as a critical area of research.  ...  Empirically, even a few feedback judgments can improve query results substantially. We explore the same 4 queries from above and give tuple level feedback for 2, 4 and 8 tuples.  ...
doi:10.1007/3-540-45876-x_4 fatcat:nyr4qrpptbanfeom2tnhifm5ha

Large-scale extraction and use of knowledge from text

Peter Clark, Phil Harrison
2009 Proceedings of the fifth international conference on Knowledge capture - K-CAP '09  
can fly", "people can drive cars") from text by abstracting from a parser's output, and we have used it to create a database of 23 million propositions of this kind.  ...  Building on ideas by Schubert, we have developed a system called DART (Discovery and Aggregation of Relations in Text) that extracts simple, semi-formal statements of world knowledge (e.g., "airplanes  ...  DISCUSSION AND CONCLUSIONS While there is new interest in creating general knowledge resources from text, there are still few such resources available.  ... 
doi:10.1145/1597735.1597763 dblp:conf/kcap/ClarkH09 fatcat:np3jqqwwufdanor35f4iwdwccy

Standing Out in a Crowd: Selecting Attributes for Maximum Visibility

Muhammed Miah, Gautam Das, Vagelis Hristidis, Heikki Mannila
2008 IEEE 24th International Conference on Data Engineering
In this paper we focus on a novel and complementary problem: how to guide a seller in selecting the best attributes of a new tuple (e.g., new product) to highlight such that it stands out in the crowd  ...  searching for products in a catalog).  ...  The work of Gautam Das and Muhammed Miah was partially supported by unrestricted gifts from Microsoft Research and start-up funds from the University of Texas, Arlington.  ... 
doi:10.1109/icde.2008.4497444 dblp:conf/icde/MiahDHM08 fatcat:rfvkudznengm3f6642miuoe7ua

A quality-aware optimizer for information extraction

Alpa Jain, Panagiotis G. Ipeirotis
2009 ACM Transactions on Database Systems  
., contains spurious tuples and misses good tuples). Typically, an extraction system has a set of parameters that can be used as "knobs" to tune the system to be either precision- or recall-oriented.  ...  Information extraction systems can extract structured relations from the documents and enable sophisticated, SQL-like queries over unstructured text.  ...  T_good: good tuples in the text database; T_bad: bad tuples in the text database; T_retr: tuples extracted from D_proc using E; gd(t): number of distinct documents in D_g that contain t; bd(t): number of  ...
doi:10.1145/1508857.1508862 fatcat:7cyge2fkwnaa5aou3h25aop44y

An Empirical Study of Effective and Versatile Keyword Query Search

Tejashree R. Shinde, Prof. Sanchika A. Bajpai
2015 International Journal of Engineering Research and  
In this paper, a survey of work on keyword querying in databases is presented.  ...  A huge amount of research work focusing on the keyword searching, retrieval and query processing has been done in the relational database.  ...  A. Bajpai.  ... 
doi:10.17577/ijertv4is050573 fatcat:ogylhz7p6rh7blto7nszeatjuu

Determining Attributes to Maximize Visibility of Objects

M. Miah, G. Das, V. Hristidis, H. Mannila
2009 IEEE Transactions on Knowledge and Data Engineering  
We introduce a complementary problem: how to guide a seller in selecting the best attributes of a new tuple (e.g., a new product) to highlight so that it stands out in the crowd of existing competitive  ...  In recent years, there has been significant interest in the development of ranking functions and efficient top-k retrieval algorithms to help users in ad-hoc search and retrieval in databases (e.g., buyers  ...  We introduce the problem of selecting attributes of a tuple for maximum visibility as a new data exploration problem.  ... 
doi:10.1109/tkde.2009.72 fatcat:eavx2lkgmvg27mlqn65qema65u

Abstract Information Model for presenting database query results

M. Sumathi, T. Kalaikumaran
2012 International Conference on Computer Communication and Informatics
Even though full-text search engines are developed using database search techniques as a base, full-text search engines have acquired a good response from users because of their simplicity and  ...  But in the case of database search, there are database results that cannot be ranked by the usual sorting techniques used in database engines. This results in information overload.  ...  They have used queries from a movie recommender system for implementing that. This system is really good at presenting a large number of results in a single spiral.  ...
doi:10.1109/iccci.2012.6158823 fatcat:ckduyxysnrfmhbue25hsu7oufq