Advanced grouping and aggregation for data integration

Eike Schallehn, Kai-Uwe Sattler, Gunter Saake
2001 Proceedings of the tenth international conference on Information and knowledge management - CIKM'01  
New applications from the areas of analytical data processing and data integration require powerful features to condense and reconcile available data.  ...  The general concept of grouping and aggregation appears to be a fitting paradigm for a number of the mentioned issues, but in its common form of equality based groups and restricted aggregate functions  ...  In [7] Galhardas et al. propose a framework for data cleaning as a SQL extension and macro-operators to support among other data cleaning issues duplicate elimination by similarity-based clustering  ... 
doi:10.1145/502684.502685 fatcat:5oipvqkm45hwzifrvmr4wu2ari

Advanced grouping and aggregation for data integration

Eike Schallehn, Kai-Uwe Sattler, Gunter Saake
2001 Proceedings of the tenth international conference on Information and knowledge management - CIKM'01  
New applications from the areas of analytical data processing and data integration require powerful features to condense and reconcile available data.  ...  The general concept of grouping and aggregation appears to be a fitting paradigm for a number of the mentioned issues, but in its common form of equality based groups and restricted aggregate functions  ...  In [7] Galhardas et al. propose a framework for data cleaning as a SQL extension and macro-operators to support among other data cleaning issues duplicate elimination by similarity-based clustering  ... 
doi:10.1145/502585.502685 dblp:conf/cikm/SchallehnSS01 fatcat:zol5g4tmvzev5feyvvjel77fhm

Life science research and data management---what can they give each other?

Amarnath Gupta
2004 SIGMOD record  
I sincerely thank Ling Liu for coming up with the idea of this special issue of the SIGMOD Record. I also gratefully acknowledge all the reviewers who donated their time.  ...  What then calls for a special issue of the SIGMOD Record devoted to the topic?  ...  The paper entitled "Automatic Composite Wrapper Generation for Semi-Structured Biological Data Based on Table Structure Identification" by Chen, Jamil and Wang describes the Tag-tree data model that extracts  ... 
doi:10.1145/1024694.1024696 fatcat:o3ud2qq3orbqrcnzk5274dmgae

Introduction to Information Extraction: Basic Notions and Current Trends

Wolf-Tilo Balke
2012 Datenbank-Spektrum  
This introduction gives a broad overview about the major topics and current trends in information extraction.  ...  While this abstract goal is still unreached and probably unreachable, intelligent information extraction techniques are considered key ingredients on the way to generating and representing knowledge for  ...  linking other data sets on the Web to Wikipedia data [8] .  ... 
doi:10.1007/s13222-012-0090-x fatcat:cqfncv2xzvczhnmsnlaprmx3lm

D2.2 – Data Services

Svetla Boytcheva, Plamen Tarkalanov, Nikola Tulechki, Pavlin Gyurov, Nikola Rusinov, Antoniy Kunchev
2022 Zenodo  
The developed services are based on the detailed analysis of business, data and technical requirements of all use cases. The best practices and standards are taken into consideration as well.  ...  It will be updated again at M36 to describe the final versions of data services, appropriately fine-tuned according to the TheFSM project needs  ...  Version 2 of the Data Services already provides access to several public or specialized reconciliation endpoints.  ... 
doi:10.5281/zenodo.6106652 fatcat:t5xllthcwjdg7hqjzjbpapvoaq

Energy efficiency analysis and optimization of industrial processes based on a novel data reconciliation

Sen Xie, Huaizhi Wang, Jianchun Peng
2021 IEEE Access  
Therefore, an energy efficiency analysis and optimization method based on a novel data reconciliation (DR) integrating Gaussian mixture model (GMM) and mutual information (MI) is put forward.  ...  data reconciliation result is evaluated by the hypothesis testing.  ...  It means that the introduction of one random variable leads to the uncertainty reduction of the other random variable.  ... 
doi:10.1109/access.2021.3068374 fatcat:io2f3rjen5huvciirfeqtyhyvm

ETL Auto Reconciliation

Jibrael Jos, Bragadishwaran U.
2010 Mapana Journal of Sciences  
The pattern will help in implementing both integrated and independent Auto Reconciliation.  ...  This ensures that data in the warehouse is consistent with the source data and all stakeholders have clarity on the quality of the data.  ...  In line ETL tests applied systematically to all data flows checking for data quality issues. One of the feeds to the error event handler. Sub System 8: Error Event Handler.  ... 
doi:10.12723/mjs.17.5 fatcat:hf5cnm6l3jhwbeklzkdlmxy5cu

Towards Cleaning-up Open Data Portals: A Metadata Reconciliation Approach [article]

Alan Tygel, Sören Auer, Jeremy Debattista, Fabrizio Orlandi, Maria Luiza Machado Campos
2015 arXiv   pre-print
As our empiric analysis of ODPs shows, these issues are currently prevalent in most ODPs and effectively hinder the reuse of Open Data.  ...  Portal managers use several types of metadata to organize the datasets, one of the most important ones being the tags.  ...  ACKNOWLEDGEMENTS This work was supported by a grant of the European Commission within the Horizon2020 framework programme for the project OpenBudgets. A.  ... 
arXiv:1510.04501v1 fatcat:p3gkbcpafbep7k3xvkqlfwcmce

Data Munging Tools in Preparation for RDF: Catmandu and LODRefine

Christina Harlow
2015 Code4Lib Journal  
Many times we know how we want our data to look, as well as how we want our data to act in discovery interfaces or when exposed, but we are uncertain how to make the data we have into the data we want.  ...  data with LOD sets, and transform that data to an RDF model.  ...  Depending on the work required, the user might wish to extract data from an external source and save it locally, which can use the convert or export commands, or import the data to a local development  ... 
doaj:a0b1e1cf5c4c48f7ba2831f2bd0b27d8 fatcat:jprq3zdytrcrroppm67zohy2oe

Turkish Communication and Media Policies as Reflected in Government Programs: A Historical Analysis between 1923 and 2014

Ümit Atabek, Gülseren Şendur Atabek
2020 International Conference on Cultural Informatics, Communication & Media Studies  
As unstructured data, the texts of 60 governmental programs (a corpus of 892 pages) are pre-processed (tokenized, stemmed, tagged and cleaned) by KNIME, an open source software for text analysis and data  ...  Finally, we graphed the data suitably for the historical analysis of themes in order to trace the policy changes.  ...  KNIME, formerly known as Hades, is a very versatile data mining, cleaning and analysis package.  ... 
doi:10.12681/cicms.2756 fatcat:uzsbuyfymrbctk6lmma33572im

What does It Look Like, Really? Imagining how Citizens might Effectively, Usefully and Easily Find, Explore, Query and Re-present Open/Linked Data [chapter]

mc schraefel
2010 Lecture Notes in Computer Science  
Are we in the semantic web/linked data community effectively attempting to make possible a new literacy, one of data rather than document analysis?  ...  The purpose of this talk therefore will be to look at key interaction issues around defining and delivering a useful, usable *data explorotron* for citizens.  ...  Do I perform lots of searches on travel websites to extract that? Do I go to airlines one by one and examine their schedule (in PDF or HTML format), and get the data that way?  ... 
doi:10.1007/978-3-642-17749-1_28 fatcat:zsm2lmlhk5aqthoja3j4hzoewe

An Unsupervised Method for Automatic Translation Memory Cleaning

Masoud Jalili Sabet, Matteo Negri, Marco Turchi, Eduard Barbu
2016 Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)  
translation units (TUs) contained in the TM, and iii) use the automatically labelled data to train an ensemble of binary classifiers.  ...  We apply our method to clean a test set composed of 1,000 TUs randomly extracted from the English-Italian version of MyMemory, the world's largest public TM.  ...  The authors would like to thank Anders Søgaard for sharing the initial version of the code for computing word embeddings.  ... 
doi:10.18653/v1/p16-2047 dblp:conf/acl/SabetNTB16 fatcat:hsnrnty7qrh5nbphehseltfrsi

Capturing Carbon & Conserving Biodiversity: The Market Approach edited by Ian R. Swingland (2003), xxiv + 368 pp., Earthscan, London, UK. ISBN 185283 950 7 (hbk), £55.00, 1 85383 951 5 (pbk), £19.95

Paul Smith, Clare Tenner
2004 Oryx  
They gingerly take on several important issues, including the special status being accorded to indigenous peoples in the current debates, maintaining that seeking conservation based on the broad set of  ...  Proof of this success is that the species has been downlisted to Least Concern on the IUCN Red List, with only the Mexican wolf subspecies and a few isolated populations requiring special protection.  ...  This book makes no claim to guide the reader through the complexities of the arguments surrounding the United Nations Framework Convention on Climate Change (UNFCCC) and the Kyoto Protocol.  ... 
doi:10.1017/s0030605304240418 fatcat:qp6mvn5vqbfjjhgihcxj6okbhu

Rationalizing the Use of Water in Industry—Part 2: Instruments Developed by the Clean Technology Network in the State of Bahia

Asher Kiperstok, Karla Esquerre, Ricardo Kalid, Emerson Sales, Geiza Oliveira
2013 Journal of Environmental Protection  
The instruments developed by the Clean Technology Network of Bahia (TECLIM) at the Federal University of Bahia (UFBA) (cited in Part 1 of this paper) are presented.  ...  Factors regarding water management in industry were examined, on the basis of experience acquired over the period of a decade in cooperative research projects with large industrial process plants located  ...  Special thanks go to all the researchers who were part of the project's teams.  ... 
doi:10.4236/jep.2013.45058 fatcat:ganwcvbmarakjnpgl4wqhiy5l4

Win-Win Ecology. How the Earth's Species Can Survive in the Midst of Human Enterprise by Michael L. Rosenzweig (2003), xii + 211 pp., Oxford University Press, New York, USA. ISBN 0 19 515604 8 (hbk), $27.00

Richard Cowling
2004 Oryx  
They gingerly take on several important issues, including the special status being accorded to indigenous peoples in the current debates, maintaining that seeking conservation based on the broad set of  ...  Proof of this success is that the species has been downlisted to Least Concern on the IUCN Red List, with only the Mexican wolf subspecies and a few isolated populations requiring special protection.  ...  This book makes no claim to guide the reader through the complexities of the arguments surrounding the United Nations Framework Convention on Climate Change (UNFCCC) and the Kyoto Protocol.  ... 
doi:10.1017/s0030605304210419 fatcat:c2z5qdai5rbidlleycig57ts7i