
A Q-Q Plot Dissection Kit [article]

Sean Kross
2019 Zenodo  
Q-Q plots are used in many scientific fields to compare distributions of data; however, interpreting a Q-Q plot is often not a straightforward task.  ...  Ultimately I explore a few real-world Q-Q plots to demonstrate how principles of Q-Q plot analysis can be applied.  ...  At first the data is farther from zero than it would be theoretically, and then the "thin tails" effect comes into play toward the right side of the histogram. "Ah, a Q-Q plot.  ... 
doi:10.5281/zenodo.3478585 fatcat:g3ff37sogbbrbj7k2hilq5ofxq
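The quantile comparison this article analyzes can be computed directly. The sketch below (distribution choices and sample size are illustrative, not from the article) builds the point pairs a normal Q-Q plot would draw for heavy-tailed data, using only NumPy and the standard library:

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(0)
sample = rng.standard_t(df=3, size=500)  # heavy-tailed data

# The quantile pairs a normal Q-Q plot would draw: sorted sample
# values against standard-normal quantiles at matching probabilities.
n = len(sample)
probs = (np.arange(1, n + 1) - 0.5) / n
theoretical = np.array([NormalDist().inv_cdf(p) for p in probs])
observed = np.sort(sample)

# Heavy tails bend the points above the y = x line on the right
# and below it on the left, relative to a normal reference.
print(observed[-1] - theoretical[-1], observed[0] - theoretical[0])
```

Plotting `observed` against `theoretical` reproduces the familiar Q-Q picture; the tail deviations are the "thin tails"/heavy-tails signatures the article dissects.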

Neural Network Training Techniques Regularize Optimization Trajectory: An Empirical Study [article]

Cheng Chen, Junjie Yang, Yi Zhou
2020 arXiv   pre-print
., nonlinear activation functions, batch normalization, skip-connections, etc. Despite their effectiveness, it is still mysterious how they help accelerate DNN training in practice.  ...  Theoretically, we show that such a regularity principle leads to a convergence guarantee in nonconvex optimization and the convergence rate depends on a regularization parameter.  ...  In this paper, we take a step toward understanding DNN training techniques by providing a systematic empirical study of their regularization effect from the perspective of optimization.  ... 
arXiv:2011.06702v1 fatcat:gi4akjtmardmtnoph3xnyxbwau


L.M.M. Tijskens, P. Konopacki
2003 Acta Horticulturae  
By modelling the dynamics on the level of the individual units that constitute a batch, rather than modelling the mean value for the batch itself, more fundamental models can be developed.  ...  Since technology became increasingly important, the presence of biological variance in our food became more and more of a nuisance. Techniques and procedures (statistical, technical) were developed.  ...  Suppose the distribution in maturity (expressed as days of development) can be approximated with a normal distribution. Out of this population, smaller consumer batches of 10 tomatoes are prepared.  ... 
doi:10.17660/actahortic.2003.600.99 fatcat:j3lw6dfkqvfj5n6mlzyrmwogxa
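The batch-construction idea in the snippet can be simulated in a few lines. This is a minimal sketch with hypothetical numbers (a maturity mean of 30 days and standard deviation of 3 days are assumptions, not values from the paper): draw a normally distributed population, partition it into consumer batches of 10, and observe the spread of batch means.

```python
import numpy as np

# Hypothetical parameters: maturity (days of development) ~ N(30, 3).
rng = np.random.default_rng(1)
population = rng.normal(loc=30.0, scale=3.0, size=10_000)

# Form consumer batches of 10 tomatoes without replacement.
# Modelling the individual units, not just the batch mean, exposes
# the batch-to-batch variation among small batches.
batches = rng.choice(population, size=(1000, 10), replace=False)
batch_means = batches.mean(axis=1)

# Standard error of a batch mean should be near 3 / sqrt(10) ~ 0.95.
print(batch_means.std())
```

The gap between the unit-level spread (3 days) and the batch-mean spread (about 0.95 days) is exactly why models of the individual units carry more information than models of the batch average.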

Towards Understanding Normalization in Neural ODEs [article]

Julia Gusak, Larisa Markeeva, Talgat Daulbaev, Alexandr Katrutsa, Andrzej Cichocki, Ivan Oseledets
2020 arXiv   pre-print
This paper investigates how different normalization techniques affect the performance of neural ODEs.  ...  Normalization is an important and vastly investigated technique in deep learning. However, its role for Ordinary Differential Equation based networks (neural ODEs) is still poorly understood.  ...  ACKNOWLEDGEMENT Sections 2 and 3 were supported by Ministry of Education and Science of the Russian Federation grant 14.756.31.0001.  ... 
arXiv:2004.09222v2 fatcat:q4qoencwjvchzclm2qkvfhjnrq

The Break-Even Point on Optimization Trajectories of Deep Neural Networks [article]

Stanislaw Jastrzebski, Maciej Szymczak, Stanislav Fort, Devansh Arpit, Jacek Tabor, Kyunghyun Cho, Krzysztof Geras
2020 arXiv   pre-print
Complementing prior work, we also show that using a low learning rate results in bad conditioning of the loss surface even for a neural network with batch normalization layers.  ...  We argue that studying the impact of the identified effects on generalization is a promising future direction.  ...  Theoretical approaches to understanding deep networks also increasingly focus on the early part of the optimization trajectory Arora et al., 2019) .  ... 
arXiv:2002.09572v1 fatcat:qyrskuopzrex7f2zz5mrq6w764

A Loss Curvature Perspective on Training Instability in Deep Learning [article]

Justin Gilmer, Behrooz Ghorbani, Ankush Garg, Sneha Kudugunta, Behnam Neyshabur, David Cardoze, George Dahl, Zachary Nado, Orhan Firat
2021 arXiv   pre-print
Inspired by the conditioning perspective, we show that learning rate warmup can improve training stability just as much as batch normalization, layer normalization, MetaInit, GradInit, and Fixup initialization  ...  In this work, we study the evolution of the loss Hessian across many classification tasks in order to understand the effect the curvature of the loss has on the training dynamics.  ...  Our results are generally consistent with this current understanding of Batch Normalization, however some of our experiments provide additional nuance-notably we observe several instances where models  ... 
arXiv:2110.04369v1 fatcat:ml5q7fdbyjg3nke7hgdq3nxqt4

A Robust Initialization of Residual Blocks for Effective ResNet Training without Batch Normalization [article]

Enrico Civitelli, Alessio Sortino, Matteo Lapucci, Francesco Bagattini, Giulio Galvan
2021 arXiv   pre-print
Batch Normalization is an essential component of all state-of-the-art neural network architectures.  ...  In particular, we propose a slight modification to the summation operation of a block output to the skip connection branch, so that the whole network is correctly initialized.  ...  Towards understanding regularization in batch normalization, 2019. Andrew Brock, Soham De, and Samuel L Smith.  ... 
arXiv:2112.12299v1 fatcat:q7ms67kuejduffprk3diuyvswm

Toward Understanding the Impact of Staleness in Distributed Machine Learning [article]

Wei Dai, Yi Zhou, Nanqing Dong, Hao Zhang, Eric P. Xing
2018 arXiv   pre-print
In this work, we study the convergence behaviors of a wide array of ML models and algorithms under delayed updates.  ...  The empirical findings also inspire a new convergence analysis of stochastic gradient descent in non-convex optimization under staleness, matching the best-known convergence rate of O(1/√(T)).  ...  Understanding the impact of staleness on ML convergence independently from the underlying distributed systems is a crucial step towards decoupling statistical efficiency from the system complexity.  ... 
arXiv:1810.03264v1 fatcat:drjb6zxycnawpgd7j62r4d7gbu

Truth or Backpropaganda? An Empirical Investigation of Deep Learning Theory [article]

Micah Goldblum, Jonas Geiping, Avi Schwarzschild, Michael Moeller, Tom Goldstein
2020 arXiv   pre-print
not optimal for generalization; (3) demonstrate that ResNets do not conform to wide-network theories, such as the neural tangent kernel, and that the interaction between skip connections and batch normalization  ...  plays a role; (4) find that rank does not correlate with generalization or robustness in a practical setting.  ...  Recent theoretical work has certainly made impressive strides towards understanding optimization and generalization in neural networks.  ... 
arXiv:1910.00359v3 fatcat:oas2iunoyfantiepiklcz5pude

Page 197 of SMPTE Motion Imaging Journal Vol. 83, Issue 3 [page]

1974 SMPTE Motion Imaging Journal  
will have a greater influence on distributing the variations toward the extremes, thus limiting the practicality of normal distribution application.  ...  contributes to an understanding of the problem.  ... 

Towards Stabilizing Batch Statistics in Backward Propagation of Batch Normalization [article]

Junjie Yan, Ruosi Wan, Xiangyu Zhang, Wei Zhang, Yichen Wei, Jian Sun
2020 arXiv   pre-print
Based on our analysis, we propose a novel normalization method, named Moving Average Batch Normalization (MABN).  ...  Batch Normalization (BN) is one of the most widely used techniques in the Deep Learning field. But its performance can degrade badly with insufficient batch size.  ...  First of all, let's review the formulation of Batch Normalization (Ioffe & Szegedy, 2015): assume the input of a BN layer is denoted  ... 
arXiv:2001.06838v2 fatcat:ffa3atitovandl7j2abcuzpblq
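The snippet breaks off just as it begins the standard BN formulation, which can be stated compactly: normalize each feature by its batch mean and variance, then apply a learned scale and shift. A minimal NumPy sketch of the training-time forward pass (the shapes and sample values here are illustrative):

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Batch Normalization forward pass (Ioffe & Szegedy, 2015):
    normalize each feature over the batch dimension, then apply a
    learned scale (gamma) and shift (beta)."""
    mu = x.mean(axis=0)                    # per-feature batch mean
    var = x.var(axis=0)                    # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # normalized activations
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(5.0, 2.0, size=(64, 8))     # batch of 64, 8 features
out = batch_norm_forward(x, gamma=np.ones(8), beta=np.zeros(8))
print(out.mean(axis=0).round(6), out.std(axis=0).round(3))
```

The batch-size sensitivity the paper targets is visible here: `mu` and `var` are statistics of the batch itself, so with very small batches they become noisy estimates, which is what MABN's moving-average statistics are designed to stabilize.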

Normalization Techniques in Training DNNs: Methodology, Analysis and Application [article]

Lei Huang, Jie Qin, Yi Zhou, Fan Zhu, Li Liu, Ling Shao
2020 arXiv   pre-print
Finally, we discuss the current progress in understanding normalization methods, and provide a comprehensive review of the applications of normalization for particular tasks, in which it can effectively  ...  We provide a unified picture of the main motivation behind different approaches from the perspective of optimization, and present a taxonomy for understanding the similarities and differences between them  ...  (2) How can we reduce the gap between the empirical success of normalization techniques and our theoretical understanding of them?  ... 
arXiv:2009.12836v1 fatcat:fei3jdfm2rajfdzqdmjghmmjsq

Metabolic network analysis and experimental study of lipid production in Rhodosporidium toruloides grown on single and mixed substrates

Rajesh Reddy Bommareddy, Wael Sabra, Garima Maheshwari, An-Ping Zeng
2015 Microbial Cell Factories  
A lipid yield as high as 0.53 (C-mol TAG/C-mol) has been experimentally obtained for growth on glycerol, compared to a theoretical maximum of 0.63 (C-mol TAG/C-mol).  ...  Results: A simplified metabolic network of R. toruloides was reconstructed based on a combination of genome and proteome annotations.  ...  George Aggelis from the University of Patras and Prof. Seraphim Papanikolaou from the Agricultural University of Athens for helpful discussion. We also thank Dr.  ... 
doi:10.1186/s12934-015-0217-5 pmid:25888986 pmcid:PMC4377193 fatcat:ibmvkv2hhzdslbjylrvskdasum

Characterization of Solid-State Drug Polymorphs and Real-Time Evaluation of Crystallization Process Consistency by Near-Infrared Spectroscopy

Shu-Ye Qi, Ye Tian, Wen-Bo Zou, Chang-Qin Hu
2018 Frontiers in Chemistry  
Theoretical analysis using a combination of 13C solid-state nuclear magnetic resonance spectroscopy with other polymorphism analysis techniques identified a number of marker signals, the changes of which  ...  Thus, we established a technique for the rapid evaluation of crystallization process consistency and deepened our understanding of crystallization behavior by using NIR in combination with polymorphism  ...  According to the quality by design (QbD) approach, the quality target product profile (QTPP) is mostly described by critical quality attributes (CQAs), a deep understanding of which is required for the  ... 
doi:10.3389/fchem.2018.00506 pmid:30406084 pmcid:PMC6204365 fatcat:exs4o47vvjc7li7tiwqmtnxuc4

An Investigation into Neural Net Optimization via Hessian Eigenvalue Density [article]

Behrooz Ghorbani, Shankar Krishnan, Ying Xiao
2019 arXiv   pre-print
We then thoroughly analyze a crucial structural feature of the spectra: in non-batch normalized networks, we observe the rapid appearance of large isolated eigenvalues in the spectrum, along with a surprising  ...  To understand the dynamics of optimization in deep neural networks, we develop a tool to study the evolution of the entire Hessian spectrum throughout the optimization process.  ...  We uncovered surprising phenomena, some of which run contrary to the widely held beliefs in the machine learning community.  ... 
arXiv:1901.10159v1 fatcat:gu5hltcqijcohitchnyoxksece