Data Cleaning for Accurate, Fair, and Robust Models: A Big Data - AI Integration Approach [article]

Ki Hyun Tae, Yuji Roh, Young Hun Oh, Hyunsu Kim, Steven Euijong Whang
2019 arXiv   pre-print
We identify dependencies among the data preprocessing techniques and propose MLClean, a unified data cleaning framework that integrates the techniques and helps train accurate and fair models.  ...  This work is part of a broader trend of Big data -- Artificial Intelligence (AI) integration.  ...  From a data management standpoint, we contend that it is a good time to extend the data cleaning problem for the pressing needs of modern machine learning for accurate, fair, and robust model training.  ... 
arXiv:1904.10761v1 fatcat:fcvusaqesndyxlnnndleu4pkh4

A Survey on Data Collection for Machine Learning: a Big Data – AI Integration Perspective [article]

Yuji Roh, Geon Heo, Steven Euijong Whang
2019 arXiv   pre-print
The integration of machine learning and data management for data collection is part of a larger trend of Big data and Artificial Intelligence (AI) integration and opens many opportunities for new research  ...  Data collection largely consists of data acquisition, data labeling, and improvement of existing data or models.  ...  In the future, we expect the integration of Big data and AI to happen not only in data collection, but in all aspects of machine learning.  ... 
arXiv:1811.03402v2 fatcat:wviufzo2p5dtrnfrbisgkzrpd4

Data Economy 2.0: From Big Data Value to AI Value and a European Data Space [chapter]

Sonja Zillner, Jon Ander Gomez, Ana García Robles, Thomas Hahn, Laure Le Bars, Milan Petkovic, Edward Curry
2021 The Elements of Big Data Value  
The chapter describes the European AI framework as a foundation for deploying AI successfully and the critical need for a common European data space to power this vision.  ...  This chapter explores the opportunities and challenges of big data and AI in exploiting data ecosystems and creating AI value.  ...  These tools, methods and processes integrate AI, Data and Robotics technologies into systems and are responsible for ensuring that core system properties and characteristics such as safety, robustness,  ... 
doi:10.1007/978-3-030-68176-0_16 fatcat:n7fgc76zbfbznm4npmt3iis7si

Big Continuous Data: Dealing with Velocity by Composing Event Streams [chapter]

Genoveva Vargas-Solar, Javier A. Espinosa-Oviedo, José Luis Zechinelli-Martini
2016 Big Data Concepts, Theories, and Applications  
Nevertheless, to the best of our knowledge, few approaches integrate different composition techniques (online and post-mortem) for dealing with Big Data velocity.  ...  Event streams with their volume and continuous production cope mainly with two of the characteristics given to Big Data by the 5V's model: volume & velocity.  ...  Nevertheless, to the best of our knowledge, few approaches integrate different composition techniques (online and post-mortem) for dealing with Big Data velocity and volume.  ... 
doi:10.1007/978-3-319-27763-9_1 fatcat:vk67le6wnjbd5cdztvhq5i3dra

Manufacturing process data analysis pipelines: a requirements analysis and survey

Ahmed Ismail, Hong-Linh Truong, Wolfgang Kastner
2019 Journal of Big Data  
AI thanks G. Gridling and R. Trubko for their helpful discussions.  ...  Thus, there is a strong need for versatile and well-integrated controls for platform-wide data governance and policy enforcement.  ... 
doi:10.1186/s40537-018-0162-3 fatcat:6tlovbsubzhqfagjjyimsythm4

Quality Assurance Technologies of Big Data Applications: A Systematic Literature Review [article]

Pengcheng Zhang, Wennan Cao, Henry Muccini
2020 arXiv   pre-print
the quality for big data applications include correctness, performance, availability, scalability, reliability and so on; 3) the existing QA technologies, including analysis, specification, model-driven  ...  This study provides a solid foundation for research on QA technologies of big data applications. However, many challenges of big data applications regarding quality still remain.  ...  Paper P4 proposes an approach which integrates Architecture Analysis & Design Language (AADL) to consider big data properties through customized concepts and models in a rigorous way.  ... 
arXiv:2002.01759v2 fatcat:k6fdzjfsujbxbo7pegh6qtz5oe

A study of data analytics and applications in multiple field using big data and internet of things(IoT)

Dr Sachin Kumar
2021 Zenodo  
In the process of making accurate decisions based on facts and statistics, Islamic fund managers have recently begun integrating AI and big data analytics into their approach, removing any stereotypes  ...  Data analysts and data engineers work together in the process of data analytics to collect, integrate, and prepare data for analytical model creation, testing, and revision, ensuring accurate performance  ... 
doi:10.5281/zenodo.5148785 fatcat:ysqdqumwgbfr7kvbpbrd4n6vr4

Survey on data analysis in social media: A practical application aspect

Qixuan Hou, Meng Han, Zhipeng Cai
2020 Big Data Mining and Analytics  
It serves as a critical information source with large volumes, high velocity, and a wide variety of data.  ...  Data collection Most social media platforms, such as Twitter and Facebook, provide robust official API for developers to collect data.  ... 
doi:10.26599/bdma.2020.9020006 fatcat:msf6yz7tozbdne2mutwepo2ujy

The Role of AI, Machine Learning, and Big Data in Digital Twinning: A Systematic Literature Review, Challenges, and Opportunities

M. Mazhar Rathore, Syed Attique Shah, Dhirendra Shukla, Elmahdi Bentafat, Spiridon Bakiras
2021 IEEE Access  
Further, we designed a big data driven and AI-enriched reference architecture that leads developers to a complete DT-enabled system.  ...  The integration of big data analytics and artificial intelligence/machine learning (AI-ML) techniques with digital twinning, further enriches its significance and research potential with new opportunities  ...  Finally, we designed a reference model for digital twinning that exploits IoT, big data, and AI-ML approaches. The rest of the paper is organized as follows.  ... 
doi:10.1109/access.2021.3060863 fatcat:cvm5ubwwrbcdph5z37dvdodgx4

Quality Assurance Technologies of Big Data Applications: A Systematic Literature Review

Shunhui Ji, Qingqiu Li, Wennan Cao, Pengcheng Zhang, Henry Muccini
2020 Applied Sciences  
This study provides a solid foundation for research on QA technologies of big data applications and can help developers of big data applications apply suitable QA technologies.  ...  We have conducted a systematic literature review (SLR) by searching major scientific databases, resulting in 83 primary and relevant studies on QA technologies for big data applications.  ...  Metamodels and model mapping approaches for other kinds of big data applications are also urgently needed.  ... 
doi:10.3390/app10228052 fatcat:yqutaywdafduxhlnlm74pps6r4

Topic analysis and forecasting for science, technology and innovation: Methodology with a case study focusing on big data research

Yi Zhang, Guangquan Zhang, Hongshu Chen, Alan L. Porter, Donghua Zhu, Jie Lu
2016 Technological forecasting & social change  
The resulting knowledge may hold interest for R&D management and science policy in practice.  ...  An empirical case study of Awards data from the United States National Science Foundation, Division of Computer and Communication Foundation, is performed to demonstrate the proposed method.  ...  Introduction: The coming of the Big Data Age introduces big opportunities and big challenges for modern society.  ... 
doi:10.1016/j.techfore.2016.01.015 fatcat:zjqqq2pwondftgvl6zjbpq6c2y

A Roadmap For Big Data Incorporating Both The Research Roadmap And The Policy Roadmap: Byte Policy And Research Roadmap

Stéphane Grumbach, Aurélien Faravelon, Martí Cuquet, Anna Fensel, Scott Cunningham, Rachel Finn
2017 Zenodo  
A roadmap for big data incorporating both the research roadmap and the policy roadmap: BYTE Policy and Research Roadmap. Deliverable D6.1 BYTE Project  ...  There is a need of better integration between algorithmic and human computation approaches (Freitas and Curry 2016).  ...  The policy roadmap focuses on safeguarding the conditions for the creation of a European big data infrastructure while promoting social good and ensuring a fair data governance.  ... 
doi:10.5281/zenodo.1195744 fatcat:aqpyxix3crawhpheliz3v33kye

A Roadmap for Big Model [article]

Sha Yuan, Hanyu Zhao, Shuai Zhao, Jiahong Leng, Yangxiao Liang, Xiaozhi Wang, Jifan Yu, Xin Lv, Zhou Shao, Jiaao He, Yankai Lin, Xu Han (+88 others)
2022 arXiv   pre-print
With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks becomes a popular paradigm.  ...  We introduce 16 specific BM-related topics in those four parts, they are Data, Knowledge, Computing System, Parallel Training System, Language Model, Vision Model, Multi-modal Model, Theory&Interpretability  ...  The goal also needs to ensure the security of the big model's underlying data, the interpretation and fairness of algorithms, and the robustness and accountability of the model. AI FOR GOOD.  ... 
arXiv:2203.14101v4 fatcat:rdikzudoezak5b36cf6hhne5u4

Selectivity Estimation with Deep Likelihood Models [article]

Zongheng Yang, Eric Liang, Amog Kamsetty, Chenggang Wu, Yan Duan, Xi Chen, Pieter Abbeel, Joseph M. Hellerstein, Sanjay Krishnan, Ion Stoica
2019 arXiv   pre-print
However, direct application of these models leads to a limited estimator that is prohibitively expensive to evaluate for range and wildcard predicates.  ...  To make a truly usable estimator, we develop a Monte Carlo integration scheme on top of likelihood models that can efficiently handle range queries with dozens of filters or more.  ...  Outlier detection or data cleaning can benefit from a statistical model to check how likely a tuple is dirty [18] (i.e., outside the data distribution).  ... 
arXiv:1905.04278v1 fatcat:ls6n36rjyrge3jqhs4jofwvnrq

Ensuring the Robustness and Reliability of Data-Driven Knowledge Discovery Models in Production and Manufacturing

Shailesh Tripathi, David Muhr, Manuel Brunner, Herbert Jodlbauer, Matthias Dehmer, Frank Emmert-Streib
2021 Frontiers in Artificial Intelligence  
Overall, such a customizable GCRISP-DS framework provides an enhancement for model improvements and reusability by minimizing robustness-issues.  ...  However, the practical application of robust industry-specific data-driven knowledge discovery models faces multiple data- and model development-related issues.  ...  Robustness Issues of ML and AI Models: In the following, we discuss data-related issues for robust data analytics because the shortcomings of data reflect on model evaluation and on the deployment phase  ... 
doi:10.3389/frai.2021.576892 fatcat:zyf6bk2mhvd2fnoy7hpdhoagtu