Data Cleaning Utilizing Ontology Tool
2016
International Journal of Grid and Distributed Computing
Before data cleaning is carried out, we first need to extract the data from deep webpages. To do this, we use the WISH wrapper. ...
On the other hand, Stefan and Fabian [3] use ontology for Data Quality Management and Julian et al. [7] use an ontology-based approach for data cleaning which is able to scale across big data in ...
doi:10.14257/ijgdc.2016.9.7.05
fatcat:47d3igrdpjhdpbyqsbyahoky2q
Automated Model-to-Metamodel Transformations Based on the Concepts of Deep Instantiation
[chapter]
2011
Lecture Notes in Computer Science
The resulting approach combines the clean and compact specification of deep instantiation with the easy applicability of model-to-metamodel transformations in an automated way. ...
Numerous systems, especially component-based systems, are based on a multi-phase development process where an ontological hierarchy is established. ...
The resulting approach uses the clean and compact description of the deep instantiation to automate the M2MM transformations approach. ...
doi:10.1007/978-3-642-24485-8_3
fatcat:lpi752bhevaplhovxi7ehqrfey
Aura: Privacy-preserving augmentation to improve test set diversity in noise suppression applications
[article]
2022
arXiv
pre-print
As an application of Aura, we augment the INTERSPEECH 2021 DNS challenge by sampling audio files from a new batch of data of 20K clean speech clips from Librivox mixed with noise clips obtained from Audio ...
However, this approach leads to regressions due to the lack of training/testing on representative customer data. Moreover, due to privacy reasons, developers cannot listen to customer content. ...
Since real-world applications contain speech mixed with noise, most deep learning methods rely on synthetic data that mix clean and noisy speech [4]. ...
arXiv:2110.04391v2
fatcat:3n5bjbgqhvcs3ghgj3dv3mmyyu
Research on Semantic Text Mining Based on Domain Ontology
[chapter]
2013
IFIP Advances in Information and Communication Technology
Ontology provides theoretical basis and technical support for semantic information representation and organization. ...
Before acquisition text information, the text data must be pretreated including data cleaning such as noise reduction and duplication removal, data selection, text segmentation such as Chinese Word Segmentation ...
and Paragraphs segmentation.
⑶ Text information extraction After pretreatment, the text data must be clean and then feature information must be extracted including word segmentation, feature representation ...
doi:10.1007/978-3-642-36124-1_40
fatcat:rmkpgf5xcfh7xbfo3rdbzsnrhe
Foundational Challenges in Automated Semantic Web Data and Ontology Cleaning
2006
IEEE Intelligent Systems
Applying automated reasoning systems to Semantic Web data cleaning and to cleaning-agent design raises many challenges. ...
/ loading process 11 for data cleaning. ...
The challenges. The long-term goal for Semantic Web data cleaning is to design general-purpose cleaning agents: intelligent agents that can find and repair both ontology and data anomalies in KDBs. ...
doi:10.1109/mis.2006.7
fatcat:y5x567uyl5ak3cflh27e35epay
ISENS: A Multi-ontology Query System for the Semantic Deep Web
2008
Advanced Issues of E-Commerce and Web-Based Information Systems (WECWIS), International Workshop on
Furthermore, it retrieves facts from sources where the data is not directly described in terms of the query ontology. ...
Instead, its ontology can be translated from the query ontology using mapping axioms. In our solution, we use the concept of source relevance to summarize the content of a data source. ...
We collected city specific data of about 70,000 cities from this source. The data is fairly clean and we have converted it to OWL with simple scripts. ...
doi:10.1109/cecandeee.2008.104
dblp:conf/wecwis/QasemDH08
fatcat:m5nzeuenwfe5rl6xyus6cf7xfe
Taking a Dive: Experiments in Deep Learning for Automatic Ontology-based Annotation of Scientific Literature
[article]
2018
bioRxiv
pre-print
These findings indicate that deep learning algorithms are a promising avenue to be explored for automated ontology-based curation of data. ...
Recent advances in deep learning have shown increased accuracy for textual data annotation. ...
Annotations for GO, CHEBI, Cell, Protein, and Sequence ontologies were converted from the cleaned files to separate ontology-specific text files that represent the presence or absence of ontology terms ...
doi:10.1101/365874
fatcat:amhb4kzqobao7kkcunoc4gkfl4
Data Warehouse Design and Optimization for Drilling Engineering
2012
Open Petroleum Engineering Journal
The authors also present a method of drilling data integration based on ontology. ...
With the development of petroleum informatization and increase of drilling data, data storage, analysis and integration is attached to the key planning process. ...
management module which will finish data extraction, calibration, cleaning, and conversion here. ...
doi:10.2174/1874834101205010124
fatcat:66jipy5esje6xhqi4sv4kmxzz4
Automatic Generation of Integration and Preprocessing Ontologies for Biomedical Sources in a Distributed Scenario
2008
2008 21st IEEE International Symposium on Computer-Based Medical Systems
These ontologies are obtained from the analysis of data sources, searching for: (i) valid numerical ranges (using clustering techniques), (ii) different scales, (iii) synonym transformations based on known ...
This paper describes a method to automatically generate preprocessing structures (ontologies) within an ontology-based KDD model. ...
The use of ontologies facilitates this stage as it offers users a deep yet intuitive view of the cleaning model. ...
doi:10.1109/cbms.2008.71
dblp:conf/cbms/AnguitaPCM08
fatcat:74m6eocqgremzl2ohfhuozwjpi
Research on Text Mining Based on Domain Ontology
[chapter]
2014
IFIP Advances in Information and Communication Technology
The author discusses the text mining methods based on ontology and puts forward text mining model based on domain ontology. ...
Ontology structure is built firstly and the "concept-concept" similarity matrix is introduced, then a conception vector space model based on domain ontology is used to take the place of traditional vector ...
Text Feature Extraction After data pretreatment, the text feature words must be extracted from the "clean" data. ...
doi:10.1007/978-3-642-54341-8_38
fatcat:x6py6tmzmzcvhnv74hzw3tyb7i
Surviving the Legal Jungle: Text Classification of Italian Laws in Extremely Noisy Conditions
2020
Italian Conference on Computational Linguistics
The results show that Linear Discriminant Analysis obtains very good performances both in clean and noisy conditions, if used as classifier in ensemble learning and in multi-label text classification. ...
In this paper, we present a method based on Linear Discriminant Analysis for legal text classification of extremely noisy data, such as duplicated documents classified in different classes. ...
well as automatic ontology population, in particular when dealing with very noisy data. ...
dblp:conf/clic-it/ColtrinariAC20
fatcat:4lni6n24cbegzkynvg5hhi4tq4
A Proposal for Masonry Bridge Health Assessment Using AI and Semantics
[chapter]
2021
Representation Challenges. Augmented Reality and Artificial Intelligence in Cultural Heritage and Innovative Design Domain
After images were scraped it took several days of manual data cleaning, erasing bad quality images, to obtain the final dataset used for the training. ...
a deep neural network. ...
doi:10.3280/oa-686.60
fatcat:al2oyz6zjvcanbm7ijnhlna2he
An Ontology-Based and Deep Learning-Driven Method for Extracting Legal Facts from Chinese Legal Texts
2022
Electronics
The construction of smart courts promotes the in-depth integration of internet, big data, cloud computing and artificial intelligence with judicial trial work, which can both improve trials and ensure judicial ...
Based on the strong normative characteristics of Chinese legal text content and structure composition and the strong text feature learning ability of deep learning, this paper proposes an ontology-based ...
The data cleaning removes non-ASCII word noise data contained in the text. ...
doi:10.3390/electronics11121821
fatcat:ou3zrd6nlnhalk22ayh3i3l73y
Extracting and Networking Emotions in Extremist Propaganda
2012
2012 European Intelligence and Security Informatics Conference
An ontology is used to identify emotional content before being analyzed in text networking software, specifically Automap and ORA (Organizational Risk Analyzer). ...
Data is cleaned, punctuation removed, symbols are deleted, and stemmed. The ontology is then loaded into Automap and processed over the cleaned data. ...
Work flow begins with raw data. Raw data is manually coded according to an ethnographic ontology. The data, along with the embedded ontological codes, is then loaded into Automap. ...
doi:10.1109/eisic.2012.33
dblp:conf/eisic/Morris12
fatcat:hejuhldarjhnfkpasn4u6k35rm
OntoMaven API4KB - A Maven-based API for Knowledge Bases
2013
Workshop on Semantic Web Applications and Tools for Life Sciences
Introduction In the life science domain there is a growing number of semantic knowledge bases (KBs) published on the Web, e.g. as linked data stores in the Linked Open Data (LOD) cloud or as semantic Deep ...
There are three predefined life cycles, namely the Clean life cycle, which cleans the project, the Default life cycle, which processes, builds, tests and installs locally or deploys remotely, and the Site ...
dblp:conf/swat4ls/Paschke13
fatcat:rlmhqwkdv5hureltw25uio5thm