Advanced grouping and aggregation for data integration
2001
Proceedings of the tenth international conference on Information and knowledge management - CIKM'01
New applications from the areas of analytical data processing and data integration require powerful features to condense and reconcile available data. ...
The general concept of grouping and aggregation appears to be a fitting paradigm for a number of the mentioned issues, but in its common form of equality based groups and restricted aggregate functions ...
In [7], Galhardas et al. propose a framework for data cleaning as a SQL extension, with macro-operators to support, among other data cleaning issues, duplicate elimination by similarity-based clustering ...
doi:10.1145/502684.502685
fatcat:5oipvqkm45hwzifrvmr4wu2ari
Advanced grouping and aggregation for data integration
2001
Proceedings of the tenth international conference on Information and knowledge management - CIKM'01
New applications from the areas of analytical data processing and data integration require powerful features to condense and reconcile available data. ...
The general concept of grouping and aggregation appears to be a fitting paradigm for a number of the mentioned issues, but in its common form of equality based groups and restricted aggregate functions ...
In [7], Galhardas et al. propose a framework for data cleaning as a SQL extension, with macro-operators to support, among other data cleaning issues, duplicate elimination by similarity-based clustering ...
doi:10.1145/502585.502685
dblp:conf/cikm/SchallehnSS01
fatcat:zol5g4tmvzev5feyvvjel77fhm
Life science research and data management---what can they give each other?
2004
SIGMOD record
I sincerely thank Ling Liu for coming up with the idea of this special issue of the SIGMOD Record. I also gratefully acknowledge all the reviewers who donated their time. ...
What then calls for a special issue of the SIGMOD Record devoted to the topic? ...
The paper entitled "Automatic Composite Wrapper Generation for Semi-Structured Biological Data Based on Table Structure Identification" by Chen, Jamil and Wang describes the Tag-tree data model that extracts ...
doi:10.1145/1024694.1024696
fatcat:o3ud2qq3orbqrcnzk5274dmgae
Introduction to Information Extraction: Basic Notions and Current Trends
2012
Datenbank-Spektrum
This introduction gives a broad overview about the major topics and current trends in information extraction. ...
While this abstract goal is still unreached and probably unreachable, intelligent information extraction techniques are considered key ingredients on the way to generating and representing knowledge for ...
linking other data sets on the Web to Wikipedia data [8]. ...
doi:10.1007/s13222-012-0090-x
fatcat:cqfncv2xzvczhnmsnlaprmx3lm
D2.2 – Data Services
2022
Zenodo
The developed services are based on the detailed analysis of business, data and technical requirements of all use cases. The best practices and standards are taken into consideration as well. ...
It will be updated again at M36 to describe the final versions of data services, appropriately fine-tuned according to the TheFSM project needs ...
Version 2 of the Data Services already provides access to several public or specialized reconciliation endpoints. ...
doi:10.5281/zenodo.6106652
fatcat:t5xllthcwjdg7hqjzjbpapvoaq
Energy efficiency analysis and optimization of industrial processes based on a novel data reconciliation
2021
IEEE Access
Therefore, an energy efficiency analysis and optimization method based on a novel data reconciliation (DR) integrating Gaussian mixture model (GMM) and mutual information (MI) is put forward. ...
data reconciliation result is evaluated by the hypothesis testing. ...
It means that the introduction of one random variable leads to the uncertainty reduction of the other random variable. ...
doi:10.1109/access.2021.3068374
fatcat:io2f3rjen5huvciirfeqtyhyvm
ETL Auto Reconciliation
2010
Mapana Journal of Sciences
The pattern will help in implementing both integrated and independent Auto Reconciliation. ...
This ensures that data in the warehouse is consistent with the source data and all stakeholders have clarity on the quality of the data. ...
In line ETL tests applied systematically to all data flows checking for data quality issues. One of the feeds to the error event handler. Sub System 8: Error Event Handler. ...
doi:10.12723/mjs.17.5
fatcat:hf5cnm6l3jhwbeklzkdlmxy5cu
Towards Cleaning-up Open Data Portals: A Metadata Reconciliation Approach
[article]
2015
arXiv
pre-print
As our empirical analysis of ODPs shows, these issues are currently prevalent in most ODPs and effectively hinder the reuse of Open Data. ...
Portal managers use several types of metadata to organize the datasets, one of the most important ones being the tags. ...
ACKNOWLEDGEMENTS This work was supported by a grant of the European Commission within the Horizon2020 framework programme for the project OpenBudgets. A. ...
arXiv:1510.04501v1
fatcat:p3gkbcpafbep7k3xvkqlfwcmce
Data Munging Tools in Preparation for RDF: Catmandu and LODRefine
2015
Code4Lib Journal
Many times we know how we want our data to look, as well as how we want our data to act in discovery interfaces or when exposed, but we are uncertain how to make the data we have into the data we want. ...
data with LOD sets, and transform that data to a RDF model. ...
Depending on the work required, the user might wish to extract data from an external source and save it locally, which can use the convert or export commands, or import the data to a local development ...
doaj:a0b1e1cf5c4c48f7ba2831f2bd0b27d8
fatcat:jprq3zdytrcrroppm67zohy2oe
Turkish Communication and Media Policies as Reflected in Government Programs: A Historical Analysis between 1923 and 2014
2020
International Conference on Cultural Informatics, Communication & Media Studies
As unstructured data, the texts of 60 governmental programs (a corpus of 892 pages) are pre-processed (tokenized, stemmed, tagged and cleaned) by KNIME, an open source software for text analysis and data ...
Finally, we graphed the data suitably for the historical analysis of themes in order to trace the policy changes. ...
KNIME, formerly known as Hades, is a very versatile data mining, cleaning and analysis package. ...
doi:10.12681/cicms.2756
fatcat:uzsbuyfymrbctk6lmma33572im
What does It Look Like, Really? Imagining how Citizens might Effectively, Usefully and Easily Find, Explore, Query and Re-present Open/Linked Data
[chapter]
2010
Lecture Notes in Computer Science
Are we in the semantic web/linked data community effectively attempting to make possible a new literacy, one of data rather than document analysis? ...
The purpose of this talk therefore will be to look at key interaction issues around defining and delivering a useful, usable *data explorotron* for citizens. ...
Do I perform lots of searches on travel websites to extract that? Do I go to airlines one by one and examine their schedule (in PDF or HTML format), and get the data that way? ...
doi:10.1007/978-3-642-17749-1_28
fatcat:zsm2lmlhk5aqthoja3j4hzoewe
An Unsupervised Method for Automatic Translation Memory Cleaning
2016
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
translation units (TUs) contained in the TM, and iii) use the automatically labelled data to train an ensemble of binary classifiers. ...
We apply our method to clean a test set composed of 1,000 TUs randomly extracted from the English-Italian version of MyMemory, the world's largest public TM. ...
The authors would like to thank Anders Søgaard for sharing the initial version of the code for computing word embeddings. ...
doi:10.18653/v1/p16-2047
dblp:conf/acl/SabetNTB16
fatcat:hsnrnty7qrh5nbphehseltfrsi
Capturing Carbon & Conserving Biodiversity: The Market Approach edited by Ian R. Swingland (2003), xxiv + 368 pp., Earthscan, London, UK. ISBN 185283 950 7 (hbk), £55.00, 1 85383 951 5 (pbk), £19.95
2004
Oryx
They gingerly take on several important issues, including the special status being accorded to indigenous peoples in the current debates, maintaining that seeking conservation based on the broad set of ...
Proof of this success is that the species has been downlisted to Least Concern on the IUCN Red List, with only the Mexican wolf subspecies and a few isolated populations requiring special protection. ...
This book makes no claim to guide the reader through the complexities of the arguments surrounding the United Nations Framework Convention on Climate Change (UNFCCC) and the Kyoto Protocol. ...
doi:10.1017/s0030605304240418
fatcat:qp6mvn5vqbfjjhgihcxj6okbhu
Rationalizing the Use of Water in Industry—Part 2: Instruments Developed by the Clean Technology Network in the State of Bahia
2013
Journal of Environmental Protection
The instruments developed by the Clean Technology Network of Bahia (TECLIM) at the Federal University of Bahia (UFBA) (cited in Part 1 of this paper) are presented. ...
Factors regarding water management in industry were examined, on the basis of experience acquired over the period of a decade in cooperative research projects with large industrial process plants located ...
Special thanks go to all the researchers who were part of the project's teams. ...
doi:10.4236/jep.2013.45058
fatcat:ganwcvbmarakjnpgl4wqhiy5l4
Win-Win Ecology. How the Earth's Species Can Survive in the Midst of Human Enterprise by Michael L. Rosenzweig (2003), xii + 211 pp., Oxford University Press, New York, USA. ISBN 0 19 515604 8 (hbk), $27.00
2004
Oryx
They gingerly take on several important issues, including the special status being accorded to indigenous peoples in the current debates, maintaining that seeking conservation based on the broad set of ...
Proof of this success is that the species has been downlisted to Least Concern on the IUCN Red List, with only the Mexican wolf subspecies and a few isolated populations requiring special protection. ...
This book makes no claim to guide the reader through the complexities of the arguments surrounding the United Nations Framework Convention on Climate Change (UNFCCC) and the Kyoto Protocol. ...
doi:10.1017/s0030605304210419
fatcat:c2z5qdai5rbidlleycig57ts7i