1,347 Hits in 7.8 sec

Practical Recommendations for Gradient-Based Training of Deep Architectures [chapter]

Yoshua Bengio
2012 Lecture Notes in Computer Science  
Learning algorithms related to artificial neural networks and in particular for Deep Learning may seem to involve many bells and whistles, called hyperparameters.  ...  Overall, it describes elements of the practice used to successfully and efficiently train and debug large-scale and often deep multi-layer neural networks.  ...  Frederic Bastien, and Sina Honari, as well as for the financial support of NSERC, FQRNT, CIFAR, and the Canada Research Chairs.  ... 
doi:10.1007/978-3-642-35289-8_26 fatcat:k6lsp2fxv5ei3efgkmf5p5okyy

Practical recommendations for gradient-based training of deep architectures [article]

Yoshua Bengio
2012 arXiv   pre-print
Learning algorithms related to artificial neural networks and in particular for Deep Learning may seem to involve many bells and whistles, called hyper-parameters.  ...  Overall, it describes elements of the practice used to successfully and efficiently train and debug large-scale and often deep multi-layer neural networks.  ...  Frederic Bastien, and Sina Honari, as well as for the financial support of NSERC, FQRNT, CIFAR, and the Canada Research Chairs.  ... 
arXiv:1206.5533v2 fatcat:xbtvaaby2jfjjae4hvwyxks7yu

LiBRe: A Practical Bayesian Approach to Adversarial Detection [article]

Zhijie Deng, Xiao Yang, Shizhen Xu, Hang Su, Jun Zhu
2021 arXiv   pre-print
In this work, we propose a more practical approach, Lightweight Bayesian Refinement (LiBRe), in the spirit of leveraging Bayesian neural networks (BNNs) for adversarial detection.  ...  Concretely, we build the few-layer deep ensemble variational and adopt the pre-training & fine-tuning workflow to boost the effectiveness and efficiency of LiBRe.  ...  Bayesian Neural Networks In essence, the problem of distinguishing adversarial examples from benign ones can be viewed as a specialized out-of-distribution (OOD) detection problem of particular concern  ... 
arXiv:2103.14835v2 fatcat:uzyyhyhdabd3piotnowwzgknuy

Scalable Exploration for Neural Online Learning to Rank with Perturbed Feedback

Yiling Jia, Hongning Wang
2022 Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval  
Driven by the recent developments in optimization and generalization of DNNs, learning a neural ranking model online from its interactions with users becomes possible.  ...  CCS CONCEPTS • Information systems → Learning to rank; • Theory of computation → Online learning algorithms; Pseudorandomness and derandomization; Regret bounds.  ...  ACKNOWLEDGEMENTS We want to thank the reviewers for their insightful comments.  ... 
doi:10.1145/3477495.3532057 fatcat:vdh4qxuxwzbpncjqquqechsq6y

Machine learning-based clinical prediction modeling – A practical guide for clinicians [article]

Julius M. Kernbach, Victor E. Staartjes
2020 arXiv   pre-print
investigators, reviewers and editors, who even as experts in their clinical field, sometimes find themselves insufficiently equipped to evaluate machine learning methodologies.  ...  discussion of common caveats and other points of significance (Part III), as well as offer a practical guide to classification (Part IV) and regression modelling (Part V), with a complete coding pipeline  ...  This is often the case for highly complex models such as deep neural networks or gradient boosting machines.  ... 
arXiv:2006.15069v1 fatcat:ovhpshrz5nbn7dwjrrzrvouraq

On Practical Reinforcement Learning: Provable Robustness, Scalability, and Statistical Efficiency [article]

Thanh Nguyen-Tang
2022 arXiv   pre-print
This thesis rigorously studies fundamental reinforcement learning (RL) methods in modern practical considerations, including robust RL, distributional RL, and offline RL with neural function approximation  ...  The thesis makes fundamental contributions to the three settings above, algorithmically, theoretically, and empirically, while staying relevant to practical considerations.  ...  Acknowledgements I would like to thank my co-authors at A 2 I 2 for the fruitful discussions and collaborations: Sunil Gupta (A/Prof. at Deakin University), Svetha Venkatesh (Prof.  ...
arXiv:2203.01758v1 fatcat:a5lievly6nbthgcjvume4yzsnu

Scalable Exploration for Neural Online Learning to Rank with Perturbed Feedback [article]

Yiling Jia, Hongning Wang
2022 arXiv   pre-print
Driven by the recent technical developments in optimization and generalization of DNNs, learning a neural ranking model online from its interactions with users becomes possible.  ...  However, the required exploration for model learning has to be performed in the entire neural network parameter space, which is prohibitively expensive and limits the application of such online solutions  ...  Acknowledgements We want to thank the reviewers for their insightful comments.  ... 
arXiv:2206.05954v1 fatcat:zqlaek6y7vabjfrjpi5dd7c42q

Forecasting: theory and practice [article]

Fotios Petropoulos, Daniele Apiletti, Vassilios Assimakopoulos, Mohamed Zied Babai, Devon K. Barrow, Souhaib Ben Taieb, Christoph Bergmeir, Ricardo J. Bessa, Jakub Bijak, John E. Boylan, Jethro Browell, Claudio Carnevale (+68 others)
2022 arXiv   pre-print
This article provides a non-systematic review of the theory and the practice of forecasting.  ...  and practice.  ...  theory and practice.  ... 
arXiv:2012.03854v4 fatcat:p32c67sy65cfdejq7ndfs3g7dm

Learning Groupwise Multivariate Scoring Functions Using Deep Neural Networks

Qingyao Ai, Xuanhui Wang, Sebastian Bruch, Nadav Golbandi, Michael Bendersky, Marc Najork
2019 Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval - ICTIR '19  
We learn GSFs with a deep neural network architecture, and demonstrate that several representative learning-to-rank algorithms can be modeled as special cases in our framework.  ...  This difference leads to the notion of relative relevance between documents in ranking.  ...  ACKNOWLEDGMENTS This work would not be possible without the support provided by the TF-Ranking team.  ... 
doi:10.1145/3341981.3344218 dblp:conf/ictir/AiWBGBN19 fatcat:ds2xix2qgbfqfbjxuk6aeeq6ha

Deep Neural Networks and Tabular Data: A Survey [article]

Vadim Borisov, Tobias Leemann, Kathrin Seßler, Johannes Haug, Martin Pawelczyk, Gjergji Kasneci
2022 arXiv   pre-print
On homogeneous data sets, deep neural networks have repeatedly shown excellent performance and have therefore been widely adopted.  ...  Our results, which we have made publicly available as competitive benchmarks, indicate that algorithms based on gradient-boosted tree ensembles still mostly outperform deep learning models on supervised  ...  It distills the knowledge from gradient boosting decision trees to retrieve feature groups; it clusters them and then constructs the neural network based on those feature combinations.  ... 
arXiv:2110.01889v3 fatcat:4d4lwkzfjrb75bofncvn2x5ohi

Understanding Random Forests: From Theory to Practice [article]

Gilles Louppe
2015 arXiv   pre-print
Accordingly, the goal of this thesis is to provide an in-depth analysis of random forests, consistently calling into question each and every part of the algorithm, in order to shed new light on its learning  ...  In particular, the use of algorithms should ideally require a reasonable understanding of their mechanisms, properties and limitations, in order to better apprehend and interpret their results.  ...  Neural networks The family of neural networks methods finds its origins in attempts to identify mathematical representations of information processing in biological systems.  ... 
arXiv:1407.7502v3 fatcat:vb62j3zs7ndwnbmiula3exwqea

A formal approach to good practices in Pseudo-Labeling for Unsupervised Domain Adaptive Re-Identification [article]

Fabian Dubourvieux, Romaric Audigier, Angélique Loesch, Samia Ainouz, Stéphane Canu
2022 arXiv   pre-print
It can be hard to deduce from them general good practices, which can be implemented in any Pseudo-Labeling method, to consistently improve its performance.  ...  (ii) General good practices for Pseudo-Labeling, directly deduced from the interpretation of the proposed theoretical framework, in order to improve the target re-ID performance.  ...  In addition, we were able to derive from the theory a set of general good practices for Pseudo-Labeling UDA re-ID.  ... 
arXiv:2112.12887v3 fatcat:vgwj4xciiffydfocscz2h6f6lm

Modeling Housing Rent in the Atlanta Metropolitan Area Using Textual Information and Deep Learning

Xiaolu Zhou, Weitian Tong, Dongying Li
2019 ISPRS International Journal of Geo-Information  
We tested a number of machine learning and deep learning models (e.g., convolutional neural network, recurrent neural network) for the prediction of rental prices based on data collected from Atlanta,  ...  Given the importance of rent prediction in urban studies, this study aims to develop and evaluate models of rental market dynamics using deep learning approaches on spatial and textual data from Craigslist  ...  Acknowledgments: The authors thank reviewers for giving valuable suggestions and comments. Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/ijgi8080349 fatcat:acj5qdskmveevl2mc7m5nicpre

The Neural State Pushdown Automata [article]

Ankur Mali, Alexander Ororbia, C. Lee Giles
2019 arXiv   pre-print
In order to learn complex grammars, recurrent neural networks (RNNs) require sufficient computational resources to ensure correct grammar recognition.  ...  Next, we introduce a noise regularization scheme for higher-order (tensor) networks, to our knowledge the first of its kind, and design an algorithm for improved incremental learning.  ...  Unbiased Online Recurrent Optimization Unbiased Online Recurrent Optimization (UORO) [28] uses a rank-one trick to approximate the operations needed to make RTRL's gradient computation work.  ...
arXiv:1909.05233v2 fatcat:imaxvlic4fepppsleoxw3xmez4

Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training [article]

Diego Granziol, Stefan Zohren, Stephen Roberts
2021 arXiv   pre-print
We study the effect of mini-batching on the loss landscape of deep neural networks using spiked, field-dependent random matrix theory.  ...  (linear scaling) and adaptive algorithms, such as Adam (square root scaling), for smooth, non-convex deep neural networks.  ...  Vetrov for the opportunity to develop software relevant to this line of research in Moscow.  ... 
arXiv:2006.09092v6 fatcat:stgxpqfvbjfzpc2kvr7hebk4ei
Showing results 1–15 out of 1,347 results