A Survey of Machine Learning for Big Code and Naturalness [article]

Miltiadis Allamanis, Earl T. Barr, Premkumar Devanbu, Charles Sutton
2018 arXiv   pre-print
Research at the intersection of machine learning, programming languages, and software engineering has recently taken important steps in proposing learnable probabilistic models of source code that exploit  ...  In this article, we survey this work. We contrast programming languages against natural languages and discuss how these similarities and differences drive the design of probabilistic models.  ...  "Naturalness" and "big code" should be viewed as instances of a more general concept that there is exploitable regularity across human-written code that can be "absorbed" and generalized by a learning  ... 
arXiv:1709.06182v2 fatcat:hbvgyonqsjgq3nqwji6jf3aybe

A Survey of Machine Learning for Big Code and Naturalness

Miltiadis Allamanis, Earl T. Barr, Premkumar Devanbu, Charles Sutton
2018 ACM Computing Surveys  
Research at the intersection of machine learning, programming languages, and software engineering has recently taken important steps in proposing learnable probabilistic models of source code that exploit  ...  In this article, we survey this work. We contrast programming languages against natural languages and discuss how these similarities and differences drive the design of probabilistic models.  ...  "Naturalness" and "big code" should be viewed as instances of a more general concept that there is exploitable regularity across human-written code that can be "absorbed" and generalized by a learning  ... 
doi:10.1145/3212695 fatcat:iuuocyctg5adjmobhc2zw23rfu

A Survey on Domain-Specific Languages for Machine Learning in Big Data [article]

Ivens Portugal, Paulo Alencar, Donald Cowan
2016 arXiv   pre-print
Therefore, this literature survey identifies and describes domain-specific languages and frameworks used for Machine Learning in Big Data.  ...  Machine Learning algorithms can be used in Big Data to make better and more accurate inferences.  ...  The authors would like to thank the Natural Sciences and Engineering Research Council of Canada (NSERC), the Ontario Research Fund of the Ontario Ministry of Research and Innovation, SAP, and the Centre  ... 
arXiv:1602.07637v2 fatcat:kn34njlaojdqpn35xz6q6gk5im

Modelling Natural Language, Programs, and their Intersection

Graham Neubig, Miltiadis Allamanis
2018 Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorial Abstracts  
, much of source code itself contains elements that are expressed in natural language (e.g. variable names and code comments), giving a form of grounded semantics to these aspects of code. • Analysis of  ...  Because of the increasing need for programs and programming in our working and everyday lives, there are now massive amounts of source code being produced every day.  ...  He received his PhD at the University of Edinburgh, UK advised by Dr. Charles Sutton. Bibliography [1] Allamanis, Miltiadis, et al. "A Survey of Machine Learning for Big Code and Naturalness."  ... 
doi:10.18653/v1/n18-6001 dblp:conf/naacl/NeubigA18 fatcat:uufv3ygllnbhbltlti66752oqe

A Survey on Cleaning Dirty Data Using Machine Learning Paradigm for Big Data Analytics

Jesmeen M. Z. H, J. Hossen, S. Sayeed, CK Ho, Tawsif K, Armanur Rahman, E.M.H. Arif
2018 Indonesian Journal of Electrical Engineering and Computer Science  
Also challenges faced in cleaning big data due to nature of data are discussed. Machine learning algorithms can be used to analyze data and make predictions and finally clean data automatically.  ...  Almost all big data sets are dirty, i.e. the set may contain inaccuracies, missing data, miscoding and other issues that influence the strength of big data analytics.  ...  Figure 4. Machine learning algorithms  ... 
doi:10.11591/ijeecs.v10.i3.pp1234-1243 fatcat:4a6hqthlavcy5disedzsww67ha

NLP Algorithms Endowed for Automatic Extraction of Information from Unstructured Free-Text Reports of Radiology Monarchy

2020 VOLUME-8 ISSUE-10, AUGUST 2019, REGULAR ISSUE  
, supervised machine learning algorithm and many more.  ...  Natural Language Processing (NLP) Algorithms are the key factors for automatic information extraction from the unstructured free-text radiology reports. To extract clinically important findings and recommendations  ...  author gave survey on six different categories of machine learning applications in radiology listed in the table below.  ... 
doi:10.35940/ijitee.l8009.1091220 fatcat:sjth33dnvjfnhn442figt75llq

Morpheus's big dreams

May Chiao
2020 Nature Astronomy  
But with large surveys such as the Legacy Survey of Space and Time expected to capture billions of galaxies, machine learning techniques that can increase the speed and capacity will be vital.  ...  To this end, Ryan Hausen and Brant Robertson have created a deep learning model, Morpheus, for generating morphological classifications of astronomical sources at the pixel level.  ... 
doi:10.1038/s41550-020-1135-y fatcat:splhw5vxzvf5zmtgpytiy4z74a

Astronomy in the Big Data Era

Yanxia Zhang, Yongheng Zhao
2015 Data Science Journal  
The book “Statistics, Data Mining, and Machine Learning in Astronomy”, written by Ivezic, Connolly, VanderPlas, and Gray (2014), is a practical Python guide for the analysis of survey data.  ...  We live in a big data era, and we should learn about the world with big data views.  ... 
doi:10.5334/dsj-2015-011 fatcat:ufes5d3ng5bnvkyeprznbjzjwa

The Key Questions in Data Sciences and Machine Learning - A Literature Review

Aman Kumar, Nidhi Upadhyay, Ankita Singh, Ankit Raj
2021 International Journal of Engineering and Applied Sciences (IJEAS)  
This multi-disciplinary field has concepts overlaying with data-driven technologies like Big Data, Machine Learning, Statistical Inferences, Cloud computing and mathematics.  ...  With problems ranging from medical sciences to addressing problems of business intelligence, the application of data sciences in various domains is accepted as a major factor for decision making which  ...  ACKNOWLEDGEMENTS We would like to thank Kaggle Platform for access to relevant 'Data Science -Survey' datasets and Google Trends (2020) for data on Google keywords search patterns.  ... 
doi:10.31873/ijeas.8.8.09 fatcat:b5yanykrovaghbw2ziptq3pexm

Table of contents

2020 2020 11th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON)  
for Diagnosis of Obstructive Sleep Apnea Utilizing Machine Learning Models 0522 SESSION 18: DATA MINING AND MACHINE LEARNING 18.1 1570681958 Mobile Natural Gas Concentration Intelligence Device  ...  Edge Classification of Ocean Sounds 0343 SESSION 12: BIG DATA, DATA MANAGEMENT AND ANALYTICS 12.1 1570681396 A Provenance Meta Learning Framework for Missing Data Handling Methods Selection  ... 
doi:10.1109/uemcon51285.2020.9298057 fatcat:p4v3pn2m2zaaxdgobcaw5db76m

West-Life: Big Data Software Introduced

Martyn Winn
2017 Zenodo  
In this report, we survey some of the technologies being applied in modern Big Data science, and discuss their applicability to Structural Biology.  ...  These prototypes are being developed within the West-Life consortium, with input from IBM and the STFC Hartree Centre.  ...  We have deployed this package on our test platform, and tried some example applications. Anna Paola Carrieri (IBM) is working on applications of Hadoop for genomics, and is providing advice.  ... 
doi:10.5281/zenodo.1040457 fatcat:4u5uistxifcbto457nxvr4cfju

MORF: A Framework for Predictive Modeling and Replication At Scale With Privacy-Restricted MOOC Data [article]

Josh Gardner, Christopher Brooks, Juan Miguel L. Andres, Ryan Baker
2018 arXiv   pre-print
Big data repositories from online learning platforms such as Massive Open Online Courses (MOOCs) represent an unprecedented opportunity to advance research on education at scale and impact a global population  ...  To date, such research has been hindered by poor reproducibility and a lack of replication, largely due to three types of barriers: experimental, inferential, and data.  ...  methods and data -are widespread in the field of big data research, which covers a variety of methods spanning artificial intelligence, machine learning, data mining, and simulation-based research.  ... 
arXiv:1801.05236v3 fatcat:jl47wjv5rrd2lds5uqswm6xkkq

A Survey of Chatbot creation tools for non-coder

Asia Hamdan Al-Sinani, Bayan Said Al-Saidi
2020 Journal of student research  
Chatbot is an application of artificial intelligence that has a big role in saving time and effort by automating different aspects of human life.  ...  Thereby this survey will help the non-coder to develop the Chatbot without undergoing to learn the programming aspect of developing the Chatbot  ...  advice during the survey.  ... 
doi:10.47611/jsr.vi.896 fatcat:uo3klccmizdslci7xwqeqbduei

Machine Learning in Official Statistics [article]

Martin Beck, Florian Dumpert, Joerg Feuerhake
2018 arXiv   pre-print
A major component of this was surveys on the use of machine learning methods in official statistics, which were conducted at selected national and international statistical institutions and among the divisions  ...  It was of particular interest to find out in which statistical areas and for which tasks machine learning is used and which methods are applied.  ...  to get a picture of use cases for machine learning currently pursued in national and international statistics producing institutions, a survey was conducted.  ... 
arXiv:1812.10422v1 fatcat:qpuvfdzbevcfxgdd7pmq2ifhmi

Survey on categorical data for neural networks

John T. Hancock, Taghi M. Khoshgoftaar
2020 Journal of Big Data  
Practitioners working with big data often have a need to encode categorical values in their datasets in order to leverage machine learning algorithms.  ...  Some of these domains are natural language processing, fraud detection, and clinical document automation.  ...  Acknowledgements The authors would like to thank the anonymous reviewers for their constructive evaluation of this paper, and the various members of the Data Mining and Machine Learning Laboratory, Florida  ... 
doi:10.1186/s40537-020-00305-w fatcat:u3rhdtbkd5hljf76265mglrpmy