251,118 Hits in 8.9 sec

A Framework for Extracting Information from Semi-Structured Web Data Sources

Mahmoud Shaker, Hamidah Ibrahim, Aida Mustapha, Lili Nurliyana Abdullah
2008 2008 Third International Conference on Convergence and Hybrid Information Technology  
We proposed a framework for extracting information from different web data sources.  ...  Over the past decade, researchers have developed a rich family of generic Information Extraction techniques that are suitable for a wide variety of sources from rigidly formatted documents such as HTML  ...  A Framework for Extracting Information from Semi-Structured Web Data Sources, Convergence and Hybrid Information Technologies, Marius Crisan (Ed.), ISBN: 978-953-307-068-1, InTech, Available from: http  ... 
doi:10.1109/iccit.2008.60 fatcat:zutqejtmcfcgrp54ez5m4ny7xy

Conceptual Extraction of Domain Knowledge Graph in Different Data Sources

HUI-ZHEN BIAN, SI HA
2018 DEStech Transactions on Computer Science and Engineering  
All the industries and universities have built their own areas of knowledge graph, how to extract the concepts from different data sources becomes the key technology of knowledge graph construction, the  ...  This paper will illustrate the conceptual extraction methods of knowledge graph, in order to guide scholars to choose reasonable methods for academic analysis and enhance the application level of knowledge  ...  This paper explores the concept extraction of knowledge graphs, summarizes and proposes some extraction methods of concepts, and discusses the method of extracting concepts from different data sources.  ... 
doi:10.12783/dtcse/iceit2017/19850 fatcat:4qagp3eford7bgtamxclrpz3cu

Information discovery from semi-structured sources – Application to astronomical literature

Taoufiq Dkaki, Bernard Dousset, Daniel Egret, Josiane Mothe
2000 Computer Physics Communications  
Textual information systems provide different kinds of information seeking that answer different user needs.  ...  Among them, knowledge discovery systems aim at providing global views and useful patterns from raw information.  ...  Technologies from different fields are used to achieve this. We present a framework that aims to extract the information to mine from different heterogeneous document sources.  ... 
doi:10.1016/s0010-4655(99)00509-3 fatcat:ckgts4tu2ncnzn4d46dxlo4avu

Open-Source Android App Detection considering the Effects of Code Obfuscation

Seong-je Cho, Kyeonghwan Lim, Jungkyu Han, Byoung-chir Kim, Minkyu Park, Sangchul Han
2018 Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications  
In this paper, we propose an effective technique to extract software birthmarks (i.e., features) from executable code of Android apps and find out whether the executable code is created from OSS by comparing  ...  The proposed technique uses class hierarchy information (CHI) and control flow graphs (CFGs) as software birthmarks of Java bytecode code level.  ...  When a target app is given, CHI and CFG birthmarks are extracted from the target app in a similar way to extracting the reference birthmark (Extracting Features).  ... 
doi:10.22667/jowua.2018.09.30.050 dblp:journals/jowua/ChoLHKPH18 fatcat:26f3225offfmbpnsuqn6is3kge

Extractive and Abstractive Text Summarization Techniques

2020 International journal of recent technology and engineering  
Text summarization generates an abstract version of information on a particular topic from various sources without modifying its originality.  ...  The manual summarization consumes a large amount of time and hence an automated text summarization model is required. The summarization can be performed from a single source or multiple sources.  ...   Abstract: Text summarization generates an abstract version of information on a particular topic from various sources without modifying its originality.  ... 
doi:10.35940/ijrte.a2235.059120 fatcat:4bfnvpyaxbbw7apxayo4zkuy7u

Summarization [chapter]

Jimmy Lin
2016 Encyclopedia of Database Systems  
Nevertheless, extractive techniques have proven to be effective in various summarization tasks. With extractive techniques, generation is trivial since systems simply copy material from the source.  ...  Similar issues exist with temporal expressions. Note that these problems become more severe in the multi-document case, since extracts are drawn from different sources.  ... 
doi:10.1007/978-1-4899-7993-3_953-2 fatcat:fyegdbfyavgrtcmhbgjiddm4hm

Shiva++: An Enhanced Graph based Ontology Matcher

Iti Mathur, Nisheeth Joshi, Hemant Darbari, Ajai Kumar
2014 International Journal of Computer Applications  
We have used a graph matching technique which works at the core of the system.  ...  With the web getting bigger and assimilating knowledge about different concepts and domains, it is becoming very difficult for simple database driven applications to capture the data for a domain.  ...  Our system is capable of recognizing different formats and extract concepts, sub-concepts, and attributes from ontologies.  ... 
doi:10.5120/16095-5393 fatcat:xfbqvwnmbffepo5bvucy6wkzo4

Automatically maintaining wrappers for semi-structured web sources

Juan Raposo, Alberto Pan, Manuel Álvarez, Justo Hidalgo
2007 Data & Knowledge Engineering  
In order to let software programs gain full benefit from semi-structured web sources, wrapper programs must be built to provide a "machine-readable" view over them.  ...  In our approach the system collects some query results during normal wrapper operation and, when the source changes, it uses them as input to generate a set of labeled examples for the source which can  ...  Typically, a program for extracting data elements of a given type T will have a DEXTL element for each field from T.  ... 
doi:10.1016/j.datak.2006.06.006 fatcat:dexkokkpd5ghpkd3s3ridqt6um

Mining data records in Web pages

Bing Liu, Robert Grossman, Yanhong Zhai
2003 Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '03  
It is useful to mine such data records in order to extract information from them to provide value-added services.  ...  A large amount of information on the Web is contained in regularly structured objects, which we call data records.  ...  Acknowledgement: We thank Chris Livadas for identifying some errors in the original pseudo-code.  ... 
doi:10.1145/956750.956826 dblp:conf/kdd/LiuGZ03 fatcat:aacxpche3vaztmee3mlxjyyhz4

Mining data records in Web pages

Bing Liu, Robert Grossman, Yanhong Zhai
2003 Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '03  
It is useful to mine such data records in order to extract information from them to provide value-added services.  ...  A large amount of information on the Web is contained in regularly structured objects, which we call data records.  ...  Acknowledgement: We thank Chris Livadas for identifying some errors in the original pseudo-code.  ... 
doi:10.1145/956804.956826 fatcat:cv5l6nk3vjhevoorg7hp2y2txu

Ontology-Driven Data Semantics Discovery for Cyber-Security [chapter]

Marcello Balduccini, Sarah Kushner, Jacquelin Speck
2015 Lecture Notes in Computer Science  
We present an architecture for data semantics discovery capable of extracting semantically-rich content from human-readable files without prior specification of the file format.  ...  The ontology also provides an abstraction layer for querying the extracted data. We provide a general description of the architecture as well as details of the current implementation.  ...  Yoon for useful discussions on the topic of ad hoc data sources.  ... 
doi:10.1007/978-3-319-19686-2_1 fatcat:shcpjacf3zh4zmqtluheysm4t4

An Ontological Framework for Information Extraction From Diverse Scientific Sources

Gohar Zaman, Hairulnizam Mahdin, Khalid Hussain, Atta-Ur-Rahman, Jemal Abawajy, Salama A. Mostafa
2021 IEEE Access  
We have also compared the proposed information extraction approach against state-of-the-art techniques.  ...  Although various information extraction techniques have been proposed in the literature, their efficiency demands domain specific documents with static and well-defined format.  ...  Various techniques from different fields have been proposed to automate information extraction from scientific repositories [10] - [12] .  ... 
doi:10.1109/access.2021.3063181 fatcat:bsoz4v7ndvb7xeltpmvo2wmkym

Distilling Scenarios from Patterns for Software Architecture Evaluation – A Position Paper [chapter]

Liming Zhu, Muhammad Ali Babar, Ross Jeffery
2004 Lecture Notes in Computer Science  
, which can be extracted and documented for the SA evaluation.  ...  Most of the existing techniques for developing scenarios use stakeholders and requirements documents as main sources of collecting scenarios.  ...  Similar scenarios can be extracted from States Holder, Value Object, Detailed Object [13] .  ... 
doi:10.1007/978-3-540-24769-2_19 fatcat:v2a4c43avvepppays2edg2iope

On the Fragmentation of Process Information: Challenges, Solutions, and Outlook [chapter]

Han van der Aa, Henrik Leopold, Felix Mannhardt, Hajo A. Reijers
2015 Lecture Notes in Business Information Processing  
The existence of efficient techniques to combine and integrate process information from different sources can therefore provide much value to an organization.  ...  Such information fragmentation poses considerable problems if, for example, stakeholders wish to develop a comprehensive understanding of their operations.  ...  First, we argue for the importance of improved and new techniques for the extraction and alignment of process information from various sources.  ... 
doi:10.1007/978-3-319-19237-6_1 fatcat:poskx3iu7rgrjgqqcn2cz3vple

Exploring the Field of Text Mining

Radha Guha
2017 International Journal of Computer Applications  
Text mining is the technique of automatically deducing nonobvious but statistically supported novel information from various text data sources written in natural languages.  ...  Thus text mining is becoming very essential for business intelligence extraction as volume of internet data generation is growing exponentially.  ...  Veracity and value of the extracted information from different sources depends on how filtered the information is and how fast the information is extracted to be useful at the right time.  ... 
doi:10.5120/ijca2017915682 fatcat:5slojxqclfcybgrhmyfvtr5dl4