24,458 Hits in 2.8 sec

Data Cleaning Utilizing Ontology Tool

Jing Ting Wong, Jer Lang Hong
2016 International Journal of Grid and Distributed Computing  
Before data cleaning is carried out, we first need to extract the data from deep webpages. To do this, we use the WISH wrapper to extract data.  ...  On the other hand, Stefan and Fabian [3] use ontology for Data Quality Management, and Julian et al. [7] use an ontology-based approach for data cleaning which is able to scale across big data in  ... 
doi:10.14257/ijgdc.2016.9.7.05 fatcat:47d3igrdpjhdpbyqsbyahoky2q

Automated Model-to-Metamodel Transformations Based on the Concepts of Deep Instantiation [chapter]

Gerd Kainz, Christian Buckl, Alois Knoll
2011 Lecture Notes in Computer Science  
The resulting approach combines the clean and compact specification of deep instantiation with the easy applicability of model-to-metamodel transformations in an automated way.  ...  Numerous systems, especially component-based systems, are based on a multi-phase development process in which an ontological hierarchy is established.  ...  The resulting approach uses the clean and compact description of deep instantiation to automate M2MM transformations.  ... 
doi:10.1007/978-3-642-24485-8_3 fatcat:lpi752bhevaplhovxi7ehqrfey

Aura: Privacy-preserving augmentation to improve test set diversity in noise suppression applications [article]

Xavier Gitiaux, Aditya Khant, Ebrahim Beyrami, Chandan Reddy, Jayant Gupchup, Ross Cutler
2022 arXiv   pre-print
As an application of Aura, we augment the INTERSPEECH 2021 DNS challenge by sampling audio files from a new batch of data of 20K clean speech clips from Librivox mixed with noise clips obtained from Audio  ...  However, this approach leads to regressions due to the lack of training/testing on representative customer data. Moreover, for privacy reasons, developers cannot listen to customer content.  ...  Since real-world applications contain speech mixed with noise, most deep learning methods rely on synthetic data that mix clean and noisy speech [4] .  ... 
arXiv:2110.04391v2 fatcat:3n5bjbgqhvcs3ghgj3dv3mmyyu

Research on Semantic Text Mining Based on Domain Ontology [chapter]

Lihua Jiang, Hong-bin Zhang, Xiaorong Yang, Nengfu Xie
2013 IFIP Advances in Information and Communication Technology  
Ontology provides a theoretical basis and technical support for semantic information representation and organization.  ...  Before acquiring text information, the text data must be pretreated, including data cleaning (such as noise reduction and duplication removal), data selection, and text segmentation such as Chinese word segmentation  ...  and paragraph segmentation. ⑶ Text information extraction: After pretreatment, the text data must be cleaned and then feature information must be extracted, including word segmentation, feature representation  ... 
doi:10.1007/978-3-642-36124-1_40 fatcat:rmkpgf5xcfh7xbfo3rdbzsnrhe
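The duplication-removal step mentioned in the snippet above could be sketched roughly as follows; this is a minimal illustration, not the authors' method, and the whitespace/case normalization key is an assumption (real systems often use near-duplicate detection such as shingling or MinHash):

```python
def deduplicate_texts(docs: list[str]) -> list[str]:
    """Remove exact-duplicate documents while preserving order.

    Duplicates are detected after normalizing whitespace and case
    (an assumed normalization, for illustration only).
    """
    seen = set()
    unique = []
    for doc in docs:
        key = " ".join(doc.split()).lower()  # normalized comparison key
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique
```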

Foundational Challenges in Automated Semantic Web Data and Ontology Cleaning

J.A. Alonso-Jimenez, J. Borrego-Diaz, A.M. Chavez-Gonzalez, F.J. Martin-Mateos
2006 IEEE Intelligent Systems  
Applying automated reasoning systems to Semantic Web data cleaning and to cleaning-agent design raises many challenges.  ...  / loading process [11] for data cleaning.  ...  The challenges: The long-term goal for Semantic Web data cleaning is to design general-purpose cleaning agents: intelligent agents that can find and repair both ontology and data anomalies in KDBs.  ... 
doi:10.1109/mis.2006.7 fatcat:y5x567uyl5ak3cflh27e35epay

ISENS: A Multi-ontology Query System for the Semantic Deep Web

Abir Qasem, Dimitre A. Dimitrov, Jeff Heflin
2008 Advanced Issues of E-Commerce and Web-Based Information Systems (WECWIS), International Workshop on  
Furthermore, it retrieves facts from sources where the data is not directly described in terms of the query ontology.  ...  Instead, its ontology can be translated from the query ontology using mapping axioms. In our solution, we use the concept of source relevance to summarize the content of a data source.  ...  We collected city specific data of about 70,000 cities from this source. The data is fairly clean and we have converted it to OWL with simple scripts.  ... 
doi:10.1109/cecandeee.2008.104 dblp:conf/wecwis/QasemDH08 fatcat:m5nzeuenwfe5rl6xyus6cf7xfe

Taking a Dive: Experiments in Deep Learning for Automatic Ontology-based Annotation of Scientific Literature [article]

Prashanti Manda, Lucas Beasley, Somya Mohanty
2018 bioRxiv   pre-print
These findings indicate that deep learning algorithms are a promising avenue to be explored for automated ontology-based curation of data.  ...  Recent advances in deep learning have shown increased accuracy for textual data annotation.  ...  Annotations for GO, CHEBI, Cell, Protein, and Sequence ontologies were converted from the cleaned files to separate ontology-specific text files that represent the presence or absence of ontology terms  ... 
doi:10.1101/365874 fatcat:amhb4kzqobao7kkcunoc4gkfl4

Data Warehouse Design and Optimization for Drilling Engineering

Ning Jing
2012 Open Petroleum Engineering Journal  
The authors also present a method of drilling data integration based on ontology.  ...  With the development of petroleum informatization and the increase of drilling data, data storage, analysis, and integration have become key parts of the planning process.  ...  management module, which performs data extraction, calibration, cleaning, and conversion.  ... 
doi:10.2174/1874834101205010124 fatcat:66jipy5esje6xhqi4sv4kmxzz4

Automatic Generation of Integration and Preprocessing Ontologies for Biomedical Sources in a Distributed Scenario

Alberto Anguita, David Pérez-Rey, José Crespo, Víctor Maojo
2008 2008 21st IEEE International Symposium on Computer-Based Medical Systems  
These ontologies are obtained from the analysis of data sources, searching for: (i) valid numerical ranges (using clustering techniques), (ii) different scales, (iii) synonym transformations based on known  ...  This paper describes a method to automatically generate preprocessing structures (ontologies) within an ontology-based KDD model.  ...  The use of ontologies facilitates this stage, as it offers users a deep yet intuitive view of the cleaning model.  ... 
doi:10.1109/cbms.2008.71 dblp:conf/cbms/AnguitaPCM08 fatcat:74m6eocqgremzl2ohfhuozwjpi

Research on Text Mining Based on Domain Ontology [chapter]

Jiang Li-hua, Xie Neng-fu, Zhang Hong-bin
2014 IFIP Advances in Information and Communication Technology  
The author discusses text mining methods based on ontology and puts forward a text mining model based on domain ontology.  ...  An ontology structure is built first and the "concept-concept" similarity matrix is introduced; then a concept vector space model based on domain ontology is used to take the place of the traditional vector  ...  Text feature extraction: After data pretreatment, the text feature words must be extracted from the "clean" data.  ... 
doi:10.1007/978-3-642-54341-8_38 fatcat:x6py6tmzmzcvhnv74hzw3tyb7i

Surviving the Legal Jungle: Text Classification of Italian Laws in Extremely Noisy Conditions

Riccardo Coltrinari, Alessandro Antinori, Fabio Celli
2020 Italian Conference on Computational Linguistics  
The results show that Linear Discriminant Analysis obtains very good performance in both clean and noisy conditions, if used as a classifier in ensemble learning and in multi-label text classification.  ...  In this paper, we present a method based on Linear Discriminant Analysis for legal text classification of extremely noisy data, such as duplicated documents classified in different classes.  ...  well as automatic ontology population, in particular when dealing with very noisy data.  ... 
dblp:conf/clic-it/ColtrinariAC20 fatcat:4lni6n24cbegzkynvg5hhi4tq4

A Proposal for Masonry Bridge Health Assessment Using AI and Semantics [chapter]

Raissa Garozzo
2021 Representation Challenges. Augmented Reality and Artificial Intelligence in Cultural Heritage and Innovative Design Domain  
After the images were scraped, several days of manual data cleaning (erasing bad-quality images) were needed to obtain the final dataset used for training.  ...  a deep neural network.  ... 
doi:10.3280/oa-686.60 fatcat:al2oyz6zjvcanbm7ijnhlna2he

An Ontology-Based and Deep Learning-Driven Method for Extracting Legal Facts from Chinese Legal Texts

Yong Ren, Jinfeng Han, Yingcheng Lin, Xiujiu Mei, Ling Zhang
2022 Electronics  
The construction of smart courts promotes the in-depth integration of the internet, big data, cloud computing, and artificial intelligence with judicial trial work, which can both improve trials and ensure judicial  ...  Based on the strongly normative characteristics of Chinese legal text content and structure, and the strong text-feature learning ability of deep learning, this paper proposes an ontology-based  ...  The data cleaning removes non-ASCII noise data contained in the text.  ... 
doi:10.3390/electronics11121821 fatcat:ou3zrd6nlnhalk22ayh3i3l73y
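The non-ASCII noise-removal step described in the snippet above can be sketched as a simple filter; note this is an assumed illustration (for Chinese legal text one would of course retain CJK characters and strip only control or garbage characters):

```python
import re

def remove_non_ascii_noise(text: str) -> str:
    """Drop every character outside the printable ASCII range.

    A minimal sketch of non-ASCII noise removal; the paper's actual
    cleaning rules are not reproduced here.
    """
    return re.sub(r"[^\x20-\x7E]", "", text)
```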

Extracting and Networking Emotions in Extremist Propaganda

Travis Morris
2012 2012 European Intelligence and Security Informatics Conference  
An ontology is used to identify emotional content before being analyzed in text-networking software, specifically Automap and ORA (Organizational Risk Analyzer).  ...  The data is cleaned: punctuation is removed, symbols are deleted, and words are stemmed. The ontology is then loaded into Automap and processed over the cleaned data.  ...  The workflow begins with raw data. Raw data is manually coded according to an ethnographic ontology. The data, along with the embedded ontological codes, is then loaded into Automap.  ... 
doi:10.1109/eisic.2012.33 dblp:conf/eisic/Morris12 fatcat:hejuhldarjhnfkpasn4u6k35rm
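The cleaning workflow described above (remove punctuation and symbols, then stem) can be sketched as below; the suffix-stripping stemmer is a toy stand-in for a real one such as Porter's, and Automap itself is not reproduced:

```python
import re

def clean_and_stem(text: str) -> list[str]:
    """Strip punctuation/symbols, lowercase, tokenize, and crudely stem.

    A minimal sketch of the described preprocessing; the stemming rule
    (strip a few common English suffixes) is illustrative only.
    """
    text = re.sub(r"[^\w\s]", " ", text.lower())  # drop punctuation and symbols
    stemmed = []
    for tok in text.split():
        for suffix in ("ing", "ed", "es", "s"):
            if tok.endswith(suffix) and len(tok) > len(suffix) + 2:
                tok = tok[: -len(suffix)]
                break
        stemmed.append(tok)
    return stemmed
```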

OntoMaven API4KB - A Maven-based API for Knowledge Bases

Adrian Paschke
2013 Workshop on Semantic Web Applications and Tools for Life Sciences  
Introduction: In the life science domain there is a growing number of semantic knowledge bases (KBs) published on the Web, e.g. as linked data stores in the Linked Open Data (LOD) cloud or as semantic Deep  ...  There are three predefined life cycles: the Clean life cycle, which cleans the project; the Default life cycle, which processes, builds, tests, and installs locally or deploys remotely; and the Site  ... 
dblp:conf/swat4ls/Paschke13 fatcat:rlmhqwkdv5hureltw25uio5thm
Showing results 1 — 15 out of 24,458 results