Online Similarity Learning for Big Data with Overfitting

Yang Cong, Ji Liu, Baojie Fan, Peng Zeng, Haibin Yu, Jiebo Luo
2018 IEEE Transactions on Big Data  
In this paper, we propose a general model to address the overfitting problem in online similarity learning for big data, which is generally generated by two kinds of redundancies: 1) feature redundancy  ...  Our model is as efficient as the fastest online similarity learning model OASIS, while performing generally as well as the accurate model OMLLR.  ...  [16] , [17] , [23] design an Online Algorithm for Scalable Image Similarity learning (OASIS) for learning pairwise similarity.  ... 
doi:10.1109/tbdata.2017.2688360 fatcat:tkbedhntdfbvdeg3lnuthxifuy

Similarity Based Prediction System using Machine Learning Algorithms in Big Data Analytics

2019 VOLUME-8 ISSUE-10, AUGUST 2019, REGULAR ISSUE  
Using datamining, the big data are having lack of compatibility with database systems and analysis tools; large dataset clustering and analyzing is a big issue in big data.  ...  The big data utilizes machine learning algorithms to process large datasets which comes from various places such as histories, weblogs, and data repositories, large datasets and data warehousing, etc.  ...  [1] describes a model for big data to control the overfitting problem which where comes under online similarity learning.  ... 
doi:10.35940/ijitee.l3524.1081219 fatcat:yiq3x3ohrvb45cv3x7lfz62w2u

JavaDL: a Java-based Deep Learning Tool to Predict Drug Responses [article]

Shuxing Zhang, Lon Wolf Fong, Rajan Chaudhari, Zhi Tan, Shuxing Zhang
2020 bioRxiv   pre-print
Recently, deep learning techniques have been witnessed with revival in a variety of areas such as image processing and genomic data analysis, and they will be useful for the coming age of big data analysis  ...  To evaluate our program, we compared it with several machine learning programs including SVM and kNN.  ...  Acknowledgements Special thanks to the Deeplearning4j and CDK toolkit development teams for providing us academic licenses.  ... 
doi:10.1101/2020.05.04.077701 fatcat:7ecfg4xzw5akxa4gceiz25t5pu

English Web-Based Teaching Supervision Based on Intelligent Face Image Perception and Processing for IoT

Juan Ma, Jiangyi Li
2021 Complexity  
combined with the old result-oriented effectiveness monitoring method for online teaching, with certain theoretical research significance and practical application value.  ...  In this paper, the Internet of Things (IoT) with intelligent face perception and processing function is used to supervise online English teaching.  ...  In other words, the owner of big data can only give full play to the advantages of big data by establishing effective models and tools based on big data. The combination of big data and artificial intelligence  ... 
doi:10.1155/2021/6368880 doaj:7ed2ce1adec04932a41b05339e8ac806 fatcat:msfwsincxvcx3cl2r3mopfgwsm

Stream-Based Extreme Learning Machine Approach for Big Data Problems

Euler Guimarães Horta, Cristiano Leite de Castro, Antônio Pádua Braga
2015 Mathematical Problems in Engineering  
Big Data problems demand data models with abilities to handle time-varying, massive, and high dimensional data.  ...  The importance of Active Learning for Big Data becomes more evident when labeling cost is high and data is presented to the learner via data streams.  ...  Acknowledgment The authors would like to acknowledge FAPEMIG for the financial support.  ... 
doi:10.1155/2015/126452 fatcat:3p4uaantuva6bh2zu2ry34oxma

I Tried a Bunch of Things: The Dangers of Unexpected Overfitting in Classification [article]

Michael Skocik, John Collins, Chloe Callahan-Flintoft, Howard Bowman, Brad Wyble
2016 bioRxiv   pre-print
Adopting similar safeguards is critical for ensuring the robustness of machine-learning techniques.  ...  With these new techniques come new dangers of overfitting that are not well understood by the neuroscience community.  ...  For example, machine-learning competitions on websites such as Kaggle.com provide contestants with sample data on which to optimize their models.  ... 
doi:10.1101/078816 fatcat:4maqmjypwzhapm3zbt62nhti44

Using big data to enhance the bosch production line performance: A Kaggle challenge

Ankita Mangal, Nishant Kumar
2016 2016 IEEE International Conference on Big Data (Big Data)  
At the Bosch assembly line, data is recorded for products as they progress through each stage.  ...  Data science methods are applied to this huge data repository consisting records of tests and measurements made for each component along the assembly line to predict internal failures.  ...  Acknowledgments The authors would like to thank Kaggle user Belluga for posting their insights about the date-time data on the Kaggle competition forum [16] , and Kaggle user Dmitry Sergeev [17] for  ... 
doi:10.1109/bigdata.2016.7840826 dblp:conf/bigdataconf/MangalK16 fatcat:ngq65ktfwzfqlbjrdpzrlvmpva

Table of Contents

2021 2021 IEEE/ACIS 19th International Conference on Computer and Information Science (ICIS)  
...  Sustainable Energy Resources of Bangladesh: A Big Data Approach (Sowvik Kanti Das, H.M. Mahir Shahriyar, Mahady Hasan)  ...  A Partitioning Algorithm Based on Exact Resolution to Solve Big Data PTSP (Mohamed Abdellahi Amar)  ... 
doi:10.1109/icis51600.2021.9516860 fatcat:yg5imhvvcje57kztadp5zlcb2m

Application of digital image processing technology in online education under COVID-19 epidemic

Baoxian Jia, Wunong Zhang
2021 Journal of Intelligent & Fuzzy Systems  
quality of teaching and learning, and provide technical support for the improvement of online learning.  ...  In the current epidemic situation, appropriate learning resources are the prerequisite and basis for effective online education.  ...  With the help of the previous data classification, a large number of word pairs are labeled for similarity and relevance, and at the same time data values for similarity and relevance are  ... 
doi:10.3233/jifs-219045 fatcat:jrm5u7jcy5hbthkgeiesewzaqu

Remember More by Recalling Less: Investigating the Role of Batch Size in Continual Learning with Experience Replay (Student Abstract)

Maciej Wolczyk, Andrii Krutsylo
2021 AAAI Conference on Artificial Intelligence  
Experience replay is a simple and well-performing strategy for continual learning problems, often used as a basis for more advanced methods.  ...  We show that this phenomenon does not disappear under learning rate tuning and we propose possible directions for further analysis.  ...  POIR.04.04.00-00-14DE/18-00) within the Team-Net program of the Foundation for Polish Science co-financed by the European Union under the European Regional Development Fund.  ... 
dblp:conf/aaai/WolczykK21 fatcat:n4sl77oggnaolpsne7o6xfr5ka

An Approximation of Label Distribution-Based Ensemble Learning Method for Online Educational Prediction

Long Zhang, Shu Kai, Huang Keyu, Zhang Ruiqiu
2021 International Journal of Computers Communications & Control  
To better develop personalized learning plans for students, it is necessary to build a model that can automatically evaluate students' performance in online education.  ...  Online education becomes increasingly important since traditional learning is shocked heavily by COVID-19.  ...  in processing data without obvious regularity and performs unsatisfactorily in accuracy and anti-overfitting compared with ensemble learning methods [21] .  ... 
doi:10.15837/ijccc.2021.3.4153 fatcat:totiy64q6raurkdjxdsrm2khju

Investigating the impact of development and internal validation design when training prognostic models using a retrospective cohort in big US observational healthcare data

Jenna M Reps, Patrick Ryan, P R Rijnbeek
2021 BMJ Open  
This indicates even with large data the 'no test/validation set' design causes models to overfit.  ...  These designs had similar internal performance estimates and performed similarly when externally validated in the two external databases. Conclusions: Even with big data, it is important to use some validation  ...  This makes them suitable for learning in big p data, but the optimal hyperparameter needs to be identified.  ... 
doi:10.1136/bmjopen-2021-050146 pmid:34952871 pmcid:PMC8710861 fatcat:j2ws5cc7o5ayddtsoebpokmbx4

Machine Learning and Integrative Analysis of Biomedical Big Data

Bilal Mirza, Wei Wang, Jie Wang, Howard Choi, Neo Christopher Chung, Peipei Ping
2019 Genes  
In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing  ...  Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods.  ...  Online machine learning algorithms including online sequential extreme learning machine (OS-ELM), incremental decremental support vector machine (IDSVM), and online deep learning are attractive for big  ... 
doi:10.3390/genes10020087 pmid:30696086 pmcid:PMC6410075 fatcat:vopnjgke4fculmr7t3n43ewfiy

Machine Learning-Based State-of-the-Art Methods for the Classification of RNA-Seq Data [chapter]

Almas Jabeen, Nadeem Ahmad, Khalid Raza
2017 Lecture Notes in Computational Vision and Biomechanics  
Advancements in bioinformatics, along with developments in machine learning based classification, would provide powerful toolboxes for classifying transcriptome information available through RNA-Seq data  ...  In this chapter, we are going to discuss various machine learning approaches for RNA-Seq data classification and their implementation.  ...  alternative for big data analytics with high  ...  The traditional machine learning methods are found inadequate in handling voluminous data using the current computational resources. Therefore, deep learning  ... 
doi:10.1007/978-3-319-65981-7_6 fatcat:ybc2r3cx5vdsnexel3bqq3rinm

Machine Learning in Big Data

Lidong Wang, Cheryl Ann Alexander
2016 International journal of mathematical, engineering and management sciences  
Machine learning is an artificial intelligence method of discovering knowledge for making intelligent decisions. Big Data has great impacts on scientific discoveries and value creation.  ...  This paper introduces methods in machine learning, main technologies in Big Data, and some applications of machine learning in Big Data.  ...  First is that the data size is too big to be relaxed by either online or distributed learning. Sequential online learning on big data requires too much time for training on a single machine.  ... 
doi:10.33889/ijmems.2016.1.2-006 fatcat:eidif7z3afbihflemxwcnxo7xi