A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2022; you can also visit the original URL. The file type is application/pdf.
A Q-Q Plot Dissection Kit
[article]
2019
Zenodo
Q-Q plots are used in many scientific fields to compare distributions of data, however interpreting a Q-Q plot is often not a straightforward task. ...
Ultimately I explore a few real-world Q-Q plots to demonstrate how principles of Q-Q plot analysis can be applied. ...
At first the data is farther from zero than it would be theoretically, and then the "thin tails" effect comes into play toward the right side of the histogram. "Ah, a Q-Q plot. ...
doi:10.5281/zenodo.3478585
fatcat:g3ff37sogbbrbj7k2hilq5ofxq
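The snippet above describes how thin tails show up on a Q-Q plot: extreme data points sit closer to zero than the reference distribution predicts. A minimal sketch of that pattern using only the standard library (not the article's own code; the uniform sample and plotting positions are illustrative assumptions):

```python
from statistics import NormalDist
import random

random.seed(0)
n = 500
# A thin-tailed sample relative to the normal: uniform on [-1, 1].
sample = sorted(random.uniform(-1, 1) for _ in range(n))

# Theoretical normal quantiles at plotting positions (i + 0.5) / n.
nd = NormalDist()
theo = [nd.inv_cdf((i + 0.5) / n) for i in range(n)]

# On a Q-Q plot (theoretical quantiles on x, ordered sample on y),
# thin tails pull the extreme points toward zero relative to the
# reference line at both ends:
print(theo[0] < sample[0], sample[-1] < theo[-1])  # → True True
```

Plotting `sample` against `theo` would show the flattened S-shape the abstract alludes to.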
Neural Network Training Techniques Regularize Optimization Trajectory: An Empirical Study
[article]
2020
arXiv
pre-print
Nonlinear activation functions, batch normalization, skip-connections, etc. Despite their effectiveness, it is still mysterious how they help accelerate DNN training in practice. ...
Theoretically, we show that such a regularity principle leads to a convergence guarantee in nonconvex optimization and the convergence rate depends on a regularization parameter. ...
In this paper, we take a step toward understanding DNN training techniques by providing a systematic empirical study of their regularization effect in the perspective of optimization. ...
arXiv:2011.06702v1
fatcat:gi4akjtmardmtnoph3xnyxbwau
BIOLOGICAL VARIANCE IN AGRICULTURAL PRODUCTS: THEORETICAL CONSIDERATIONS
2003
Acta Horticulturae
By modelling the dynamics at the level of the individual units that constitute a batch, rather than modelling the mean value for the batch itself, more fundamental models can be developed. ...
Since technology became increasingly important, the presence of biological variance in our food became more and more of a nuisance. Techniques and procedures (statistical, technical) were developed. ...
Suppose the distribution in maturity (expressed as days of development) can be approximated with a normal distribution. Out of this population, smaller consumer batches of 10 tomatoes are prepared. ...
doi:10.17660/actahortic.2003.600.99
fatcat:j3lw6dfkqvfj5n6mlzyrmwogxa
Towards Understanding Normalization in Neural ODEs
[article]
2020
arXiv
pre-print
This paper investigates how different normalization techniques affect the performance of neural ODEs. ...
Normalization is an important and vastly investigated technique in deep learning. However, its role in networks based on ordinary differential equations (neural ODEs) is still poorly understood. ...
ACKNOWLEDGEMENT Sections 2 and 3 were supported by Ministry of Education and Science of the Russian Federation grant 14.756.31.0001. ...
arXiv:2004.09222v2
fatcat:q4qoencwjvchzclm2qkvfhjnrq
The Break-Even Point on Optimization Trajectories of Deep Neural Networks
[article]
2020
arXiv
pre-print
Complementing prior work, we also show that using a low learning rate results in bad conditioning of the loss surface even for a neural network with batch normalization layers. ...
We argue that studying the impact of the identified effects on generalization is a promising future direction. ...
Theoretical approaches to understanding deep networks also increasingly focus on the early part of the optimization trajectory (Arora et al., 2019). ...
arXiv:2002.09572v1
fatcat:qyrskuopzrex7f2zz5mrq6w764
A Loss Curvature Perspective on Training Instability in Deep Learning
[article]
2021
arXiv
pre-print
Inspired by the conditioning perspective, we show that learning rate warmup can improve training stability just as much as batch normalization, layer normalization, MetaInit, GradInit, and Fixup initialization ...
In this work, we study the evolution of the loss Hessian across many classification tasks in order to understand the effect the curvature of the loss has on the training dynamics. ...
Our results are generally consistent with this current understanding of Batch Normalization; however, some of our experiments provide additional nuance: notably, we observe several instances where models ...
arXiv:2110.04369v1
fatcat:ml5q7fdbyjg3nke7hgdq3nxqt4
A Robust Initialization of Residual Blocks for Effective ResNet Training without Batch Normalization
[article]
2021
arXiv
pre-print
Batch Normalization is an essential component of all state-of-the-art neural network architectures. ...
In particular, we propose a slight modification to the summation operation of a block output to the skip connection branch, so that the whole network is correctly initialized. ...
Towards understanding regularization in batch normalization,
2019.
Andrew Brock, Soham De, and Samuel L Smith. ...
arXiv:2112.12299v1
fatcat:q7ms67kuejduffprk3diuyvswm
Toward Understanding the Impact of Staleness in Distributed Machine Learning
[article]
2018
arXiv
pre-print
In this work, we study the convergence behaviors of a wide array of ML models and algorithms under delayed updates. ...
The empirical findings also inspire a new convergence analysis of stochastic gradient descent in non-convex optimization under staleness, matching the best-known convergence rate of O(1/√(T)). ...
Understanding the impact of staleness on ML convergence independently from the underlying distributed systems is a crucial step towards decoupling statistical efficiency from the system complexity. ...
arXiv:1810.03264v1
fatcat:drjb6zxycnawpgd7j62r4d7gbu
Truth or Backpropaganda? An Empirical Investigation of Deep Learning Theory
[article]
2020
arXiv
pre-print
not optimal for generalization; (3) demonstrate that ResNets do not conform to wide-network theories, such as the neural tangent kernel, and that the interaction between skip connections and batch normalization plays a role; (4) find that rank does not correlate with generalization or robustness in a practical setting. ...
Recent theoretical work has certainly made impressive strides towards understanding optimization and generalization in neural networks. ...
arXiv:1910.00359v3
fatcat:oas2iunoyfantiepiklcz5pude
Page 197 of SMPTE Motion Imaging Journal Vol. 83, Issue 3
[page]
1974
SMPTE Motion Imaging Journal
will have a greater influence on distributing the variations toward the extremes, thus limiting the practicality of normal distribution application. ...
contributes to an understanding of the problem. ...
Towards Stabilizing Batch Statistics in Backward Propagation of Batch Normalization
[article]
2020
arXiv
pre-print
Based on our analysis, we propose a novel normalization method, named Moving Average Batch Normalization (MABN). ...
Batch Normalization (BN) is one of the most widely used techniques in the deep learning field, but its performance can degrade badly with insufficient batch size. ...
STATISTICS IN BATCH NORMALIZATION
REVIEW OF BATCH NORMALIZATION
First of all, let's review the formulation of Batch Normalization (Ioffe & Szegedy, 2015): assume the input of a BN layer is denoted ...
arXiv:2001.06838v2
fatcat:ffa3atitovandl7j2abcuzpblq
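The snippet above begins reviewing the standard Batch Normalization formulation before truncating. A minimal NumPy sketch of that standard formulation (Ioffe & Szegedy, 2015), not the paper's MABN variant; the shapes and epsilon are illustrative assumptions:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Standard Batch Normalization over the batch axis: normalize each
    feature by its batch mean and variance, then apply a learned affine
    transform (gamma, beta)."""
    mu = x.mean(axis=0)            # per-feature batch mean
    var = x.var(axis=0)            # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

# A batch of 64 examples with 8 features, off-center and scaled.
x = np.random.default_rng(0).normal(5.0, 2.0, size=(64, 8))
y = batch_norm(x, gamma=np.ones(8), beta=np.zeros(8))

# After BN each feature has (approximately) zero mean and unit variance.
print(np.allclose(y.mean(axis=0), 0, atol=1e-6),
      np.allclose(y.var(axis=0), 1, atol=1e-3))  # → True True
```

The paper's point is that the batch statistics `mu` and `var` above become unreliable when the batch dimension is small, which is what MABN addresses with moving averages.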
Normalization Techniques in Training DNNs: Methodology, Analysis and Application
[article]
2020
arXiv
pre-print
Finally, we discuss the current progress in understanding normalization methods, and provide a comprehensive review of the applications of normalization for particular tasks, in which it can effectively ...
We provide a unified picture of the main motivation behind different approaches from the perspective of optimization, and present a taxonomy for understanding the similarities and differences between them ...
(2) How can we reduce the gap between the empirical success of normalization techniques and our theoretical understanding of them? ...
arXiv:2009.12836v1
fatcat:fei3jdfm2rajfdzqdmjghmmjsq
Metabolic network analysis and experimental study of lipid production in Rhodosporidium toruloides grown on single and mixed substrates
2015
Microbial Cell Factories
A lipid yield as high as 0.53 (C-mol TAG/C-mol) has been experimentally obtained for growth on glycerol, compared to a theoretical maximum of 0.63 (C-mol TAG/C-mol). ...
Results: A simplified metabolic network of R.toruloides was reconstructed based on a combination of genome and proteome annotations. ...
George Aggelis from the University of Patras and Prof. Seraphim Papanikolaou from the Agricultural University of Athens for helpful discussion. We also thank Dr. ...
doi:10.1186/s12934-015-0217-5
pmid:25888986
pmcid:PMC4377193
fatcat:ibmvkv2hhzdslbjylrvskdasum
Characterization of Solid-State Drug Polymorphs and Real-Time Evaluation of Crystallization Process Consistency by Near-Infrared Spectroscopy
2018
Frontiers in Chemistry
Theoretical analysis using a combination of 13C solid-state nuclear magnetic resonance spectroscopy with other polymorphism analysis techniques identified a number of marker signals, the changes of which ...
Thus, we established a technique for the rapid evaluation of crystallization process consistency and deepened our understanding of crystallization behavior by using NIR in combination with polymorphism ...
According to the quality by design (QbD) approach, the quality target product profile (QTPP) is mostly described by critical quality attributes (CQAs), a deep understanding of which is required for the ...
doi:10.3389/fchem.2018.00506
pmid:30406084
pmcid:PMC6204365
fatcat:exs4o47vvjc7li7tiwqmtnxuc4
An Investigation into Neural Net Optimization via Hessian Eigenvalue Density
[article]
2019
arXiv
pre-print
We then thoroughly analyze a crucial structural feature of the spectra: in non-batch normalized networks, we observe the rapid appearance of large isolated eigenvalues in the spectrum, along with a surprising ...
To understand the dynamics of optimization in deep neural networks, we develop a tool to study the evolution of the entire Hessian spectrum throughout the optimization process. ...
We uncovered surprising phenomena, some of which run contrary to the widely held beliefs in the machine learning community. ...
arXiv:1901.10159v1
fatcat:gu5hltcqijcohitchnyoxksece
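The paper above builds a tool for tracking the full Hessian spectrum during training (via a Lanczos-based estimator). A minimal sketch of the underlying idea, estimating just the largest Hessian eigenvalue with power iteration on a toy quadratic loss; the explicit matrix and iteration count are illustrative assumptions, and a real network would compute Hessian-vector products with autodiff rather than forming the Hessian:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy quadratic loss L(w) = 0.5 * w^T A w with known Hessian A.
G = rng.normal(size=(20, 20))
A = G @ G.T  # symmetric positive semi-definite Hessian

def hvp(v):
    """Hessian-vector product; for a neural network this is done
    matrix-free (e.g. Pearlmutter's trick) without materializing A."""
    return A @ v

# Power iteration converges to the eigenvector of the largest eigenvalue.
v = rng.normal(size=20)
for _ in range(200):
    v = hvp(v)
    v /= np.linalg.norm(v)
lam = v @ hvp(v)  # Rayleigh quotient: top Hessian eigenvalue estimate

print(np.isclose(lam, np.linalg.eigvalsh(A).max(), rtol=1e-4))  # → True
```

Large isolated eigenvalues like `lam`, appearing early in training of non-batch-normalized networks, are exactly the structural feature the abstract highlights.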
Showing results 1 — 15 out of 56,241 results