A Survey of Automatic Deep Web Classification Techniques

Umara Noor, Zahid Rashid, Azhar Rauf
2011 International Journal of Computer Applications  
In this paper, apart from the literature survey, we propose a framework for analysis of automatic classification techniques of the deep web.  ...  A large part of the deep web comprises online structured domain-specific databases that are accessed using web query interfaces.  ...  Such extracted content works for both simple/advanced search query interfaces and structured/unstructured deep web sources.  ... 
doi:10.5120/2362-3099 fatcat:5xzytnesv5cbfnraompzuwlzhu

A Feature-Weighted Instance-Based Learner for Deep Web Search Interface Identification

Hong Wang, Qingsong Xu, Youyang Chen, Jinsong Lan
2013 Research Journal of Applied Sciences Engineering and Technology  
Determining whether a site has a search interface is a crucial priority for further research of deep web databases.  ...  This study first reviews the current approaches employed in search interface identification for deep web databases.  ...  ACKNOWLEDGMENT This study was supported in part by the National Natural Science Foundation of China (No. 90820302 & No. 10771217).  ... 
doi:10.19026/rjaset.5.4862 fatcat:cwlmmupbrzf5ffzgdo4ltkh56m

Survey of Techniques for Deep Web Source Selection and Surfacing the Hidden Web Content

Khushboo Khurana, M.B. Chandak
2016 International Journal of Advanced Computer Science and Applications  
The paper discusses various techniques that can be used to surface the deep web information and techniques for Deep Web Source Selection.  ...  Large and continuously growing dynamic web content has created new opportunities for large-scale data analysis in the recent years.  ...  The paper also shows the comparative analysis of the two techniques widely used for surfacing the hidden web form processing and querying the deep web by Hidden Web Crawlers and Schema Matching for Virtual  ... 
doi:10.14569/ijacsa.2016.070555 fatcat:yb6ffo7nv5gv3aorditiarfeca

Accessing the deep web

Bin He, Mitesh Patel, Zhen Zhang, Kevin Chen-Chuan Chang
2007 Communications of the ACM  
In overlap analysis, the number of deep Web sites is estimated by exploiting two search engines.  ...  When conducting the survey, we first find the number of query interfaces for each Web site, then the number of Web databases, and finally the number of deep Web sites.  ... 
doi:10.1145/1230819.1241670 fatcat:dguqjfdlx5cdfpuoanoafmkefu

Query Intensive Interface Information Extraction Protocol for deep web

Dilip Kumar Sharma, A. K. Sharma
2009 2009 International Conference on Intelligent Agent & Multi-Agent Systems  
The deep web information can play a significant role in research and development. This paper also discusses the aspect of deep web with analysis of few existing deep web search engines.  ...  A new Query Intensive Interface Information Extraction Protocol (QIIIEP) for deep web retrieval process is proposed.  ...  Schema extraction of query interface is one of the very prime research challenges for comparing and analysis of an integrated query interface for the deep web.  ... 
doi:10.1109/iama.2009.5228052 fatcat:m4o5p72kereajlqpm3dg2zoqs4

Deep web integration with VisQI

Thomas Kabisch, Eduard C. Dragut, Clement Yu, Ulf Leser
2010 Proceedings of the VLDB Endowment  
VisQI is capable of (1) transforming Web query interfaces into hierarchically structured representations, (2) of classifying them into application domains and (3) of matching the elements of different  ...  Thus VisQI contains solutions for the major challenges in building Deep Web integration systems.  ...  VisQI provides a classification algorithm which automatically infers the domain of a search engine using the query interface of the search engine.  ... 
doi:10.14778/1920841.1921053 fatcat:3jtrhxrrljgs5jaz6ckmvfffcy

Structured databases on the web

Kevin Chen-Chuan Chang, Bin He, Chengkai Li, Mitesh Patel, Zhen Zhang
2004 SIGMOD record  
With the potentially unlimited information hidden behind their query interfaces, this "deep Web" of searchable databases is clearly an important frontier for data access.  ...  On one hand, our "macro" study surveys the deep Web at large, in April 2004, adopting the random IP-sampling approach, with one million samples. (How large is the deep Web?  ...  CONCLUSION This paper presents our survey of databases on the Web, or the so called "deep Web."  ... 
doi:10.1145/1031570.1031584 fatcat:6razqtluiffkjhlgvwnxq5olfe

A Survey on Uniform Resource Locator and Content Matching to Discover Deep-Web Pages

Sayali Shelke, Parth Sagar
2017 Indian Journal of Science and Technology  
Objectives: 1) Harvest deep web pages efficiently. 2) Personalize search according to user interest. 3) Combine pre-query and post-query approaches.  ...  Finding deep web pages in databases is challenging because they are not registered with any web index and change constantly.  ...  the web pages which is like the queries. As the deep web expands, there has been increased interest in strategies that help to find deep-web interfaces.  ... 
doi:10.17485/ijst/2017/v10i17/110304 fatcat:m22ayj53crh73d3fxptzhxibkm

Latent Dirichlet Allocation Based Semantic Clustering of Heterogeneous Deep Web Sources

Umara Noor, Ali Daud, Ayesha Manzoor
2013 2013 5th International Conference on Intelligent Networking and Collaborative Systems  
Allocation (LDA) for modeling content representative of deep web databases.  ...  Among that, a large part comprises online subject-specific databases hidden behind query interface forms, known as the deep web.  ...  Web structure mining involves hyperlink structure analysis and document structure analysis techniques that are usually based on graph theory.  ... 
doi:10.1109/incos.2013.28 dblp:conf/incos/NoorDM13 fatcat:xm3uq63b5zcnljam2uyorzkrxy

Research on discovering deep web entries

Ying Wang, Huilai Li, Wanli Zuo, Fengling He, Xin Wang, Kerui Chen
2011 Computer Science and Information Systems  
Ontology plays an important role in locating Domain-Specific Deep Web contents; therefore, this paper presents a novel framework WFF for efficiently locating Domain-Specific Deep Web databases based on  ...  Then, FSC analyzes the interesting pages and determines whether these pages subsume searchable forms based on structural characteristics.  ...  Experiments. Through the above analysis, we implement the graphical interface for discovering Deep Web entries, which is shown in Fig. 8.  ... 
doi:10.2298/csis100322028w fatcat:vs5ll74p75dankfuszist3wpjq

Advance Frameworks for Hidden Web Retrieval Using Innovative Vision-Based Page Segmentation

Kopal Maheshwari
2013 IOSR Journal of Computer Engineering  
The structure of hidden web pages makes it infeasible for conventional web crawlers to access hidden web contents.  ...  We propose an innovative Vision-based Page Segmentation (IVBPS) algorithm for hidden web retrieval and develop an intelligent crawler and interpretation of hidden web query interfaces.  ...  An IVBPS approach for multi data-region deep web pages to extract structured results from deep Web pages automatically.  ... 
doi:10.9790/0661-1235258 fatcat:bfwuo364xvethgjzn644gf7ndq

Integrating and querying similar tables from PDF documents using deep learning [article]

Rahul Anand, Hye-Young Paik, Cheng Wang
2019 arXiv   pre-print
This is achieved through table type classification and nearest row search.  ...  This paper proposes a deep learning based method to enable SQL-like query and analysis of financial tables from annual reports in PDF format.  ...  At the end of this processing, we have a backend data structure which contains rich information about the table, and we can run filtering queries on this. Server and Web Interface.  ... 
arXiv:1901.04672v1 fatcat:wdicg4ztljg4tfvvhot7pcvxmq

Understanding deep web search interfaces

Ritu Khare, Yuan An, Il-Yeol Song
2010 SIGMOD record  
This paper presents a survey on the major approaches to search interface understanding. The Deep Web consists of data that exist on the Web but are inaccessible via text search engines.  ...  The traditional way to access these data, i.e., by manually filling-up HTML forms on search interfaces, is not scalable given the growing size of Deep Web.  ...  INTERFACE UNDERSTANDING The Deep Web consists of data that exist on the Web but are inaccessible by text search engines through traditional crawling and indexing [17].  ... 
doi:10.1145/1860702.1860708 fatcat:n577ypwgu5b2xctg2m6zfhevty

Creating and exploring web form repositories

Luciano Barbosa, Hoa Nguyen, Thanh Nguyen, Ramesh Pinnamaneni, Juliana Freire
2010 Proceedings of the 2010 international conference on Management of data - SIGMOD '10  
DeepPeep allows users to explore the entry points to hidden-Web sites whose contents are out of reach for traditional search engines.  ...  We also present the analysis component of DeepPeep which allows users to explore and visualize information in form repositories, helping them not only to better search and understand forms in different  ...  This work has been partially supported by the National Science Foundation (under grants IIS-0713637, IIS-0746500, CNS-0751152) and a University of Utah Seed Grant.  ... 
doi:10.1145/1807167.1807311 dblp:conf/sigmod/BarbosaNNPF10 fatcat:wl45ukw4hzf5lh2memiu75lbvy

Deep Web Search Interface Identification: A Semi-Supervised Ensemble Approach

Hong Wang, Qingsong Xu, Lifeng Zhou
2014 Information  
To surface the Deep Web, one crucial task is to predict whether a given web page has a search interface (searchable HyperText Markup Language (HTML) form) or not.  ...  In this research, we consider the plausibility of using both labeled and unlabeled data to train better models to identify search interfaces more effectively.  ...  The authors want to thank all of the reviewers for their valuable and constructive comments, which greatly improved the quality of this paper.  ... 
doi:10.3390/info5040634 fatcat:dyysfqid7bgu3pb5fvbjfrm56a