9,694 Hits in 2.6 sec

Neural Kernels Without Tangents [article]

Vaishaal Shankar, Alex Fang, Wenshuo Guo, Sara Fridovich-Keil, Ludwig Schmidt, Jonathan Ragan-Kelley, Benjamin Recht
2020 arXiv   pre-print
We show that these operations correspond to many of the building blocks of "neural tangent kernels (NTK)".  ...  In particular, we find that compositional kernels outperform NTKs and neural networks outperform both kernel methods.  ...  Acknowledgements We would like to thank Achal Dave for his insights on accelerating kernel operations for the GPU and Eric Jonas for his guidance on parallelizing our kernel operations with AWS Batch.  ... 
arXiv:2003.02237v2 fatcat:gmynxkneszb2pbbtpyq3qehryu

Stability & Generalisation of Gradient Descent for Shallow Neural Networks without the Neural Tangent Kernel [article]

Dominic Richards, Ilja Kuzborskij
2021 arXiv   pre-print
We revisit on-average algorithmic stability of GD for training overparameterised shallow neural networks and prove new generalisation and excess risk bounds without the NTK or PL assumptions.  ...  While this was known for kernelised interpolants, our proof applies directly to networks trained by GD without intermediate kernelisation.  ...  Our Contributions In this paper we revisit algorithmic stability of GD for training overparameterised shallow neural networks, and prove new risk bounds without the Neural Tangent Kernel (NTK) or Polyak-Łojasiewicz  ... 
arXiv:2107.12723v2 fatcat:fvohu7lqvzc7vhjfy77i5tdfjq

On the infinite width limit of neural networks with a standard parameterization [article]

Jascha Sohl-Dickstein, Roman Novak, Samuel S. Schoenholz, Jaehoon Lee
2020 arXiv   pre-print
There are currently two parameterizations used to derive fixed kernels corresponding to infinite width neural networks, the NTK (Neural Tangent Kernel) parameterization and the naive standard parameterization  ...  We release code implementing this improved standard parameterization as part of the Neural Tangents library at  ...  Additionally, Monte Carlo validation of the correctness of the introduced kernels is performed as part of the Neural Tangents [24] unit test suite.  ... 
arXiv:2001.07301v3 fatcat:tgseieby2rdfxp5ukzu6573cfq

Gradient Kernel Regression [article]

Matt Calder
2021 arXiv   pre-print
In this article a surprising result is demonstrated using the neural tangent kernel.  ...  This kernel is defined as the inner product of the gradients of an underlying model evaluated at pairs of training points, and is used to perform kernel regression.  ...  Introduction There have been a number of papers written recently on the so-called neural tangent kernel [1, 3].  ... 
arXiv:2104.05874v1 fatcat:6yfsipza6zh2zmtcnyryuhdkwi
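As an illustration of the gradient-kernel construction this entry describes, here is a minimal numpy sketch: the kernel is built from inner products of per-example parameter gradients of a small model, then used for kernel regression. The model (a one-hidden-layer tanh network), the sizes, and all names are illustrative assumptions, not the paper's code.

```python
import numpy as np

# Minimal sketch of gradient kernel regression: the kernel entry K[i, j]
# is the inner product of the parameter gradients of a model evaluated
# at training points x_i and x_j; the kernel is then used for ordinary
# (ridge-stabilized) kernel regression. All sizes are illustrative.

rng = np.random.default_rng(0)
d, h, n = 3, 16, 20
W = rng.normal(size=(h, d)) / np.sqrt(d)   # hidden-layer weights
v = rng.normal(size=h) / np.sqrt(h)        # output weights

def param_grad(x):
    """Gradient of f(x) = v . tanh(W x) w.r.t. all parameters, flattened."""
    z = np.tanh(W @ x)
    dv = z                                  # df/dv_j
    dW = np.outer(v * (1 - z**2), x)        # df/dW_jk
    return np.concatenate([dv, dW.ravel()])

X = rng.normal(size=(n, d))
y = np.sin(X[:, 0])                         # toy regression targets

G = np.stack([param_grad(x) for x in X])    # n x P matrix of gradients
K = G @ G.T                                 # gradient (tangent) kernel

# Kernel regression with a small ridge term for numerical stability.
alpha = np.linalg.solve(K + 1e-6 * np.eye(n), y)
pred = K @ alpha                            # fitted values on the training set
print(np.linalg.norm(pred - y))             # typically a small residual
```

Because the gradient features here outnumber the training points, the kernel matrix is generically full rank and the fit nearly interpolates the targets.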

Implicit Regularization via Neural Feature Alignment [article]

Aristide Baratin, Thomas George, César Laurent, R Devon Hjelm, Guillaume Lajoie, Pascal Vincent, Simon Lacoste-Julien
2021 arXiv   pre-print
We highlight a regularization effect induced by a dynamical alignment of the neural tangent features introduced by Jacot et al., along a small number of task-relevant directions.  ...  By extrapolating a new analysis of Rademacher complexity bounds for linear models, we motivate and study a heuristic complexity measure that captures this phenomenon, in terms of sequences of tangent kernel  ...  Figure 12: Same as Figure 11 but without centering the kernel.  ... 
arXiv:2008.00938v3 fatcat:xtcsbf4kcnbn3itixjq3ddrwhy

Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel [article]

Stanislav Fort, Gintare Karolina Dziugaite, Mansheej Paul, Sepideh Kharaghani, Daniel M. Roy, Surya Ganguli
2020 arXiv   pre-print
In suitably initialized wide networks, small learning rates transform deep neural networks (DNNs) into neural tangent kernel (NTK) machines, whose training dynamics is well-approximated by a linear weight  ...  In multiple neural architectures and datasets, we find these diverse measures evolve in a highly correlated manner, revealing a universal picture of the deep learning process.  ...  Neural tangent kernels, linearized training and the infinite width limit.  ... 
arXiv:2010.15110v1 fatcat:cgusggzoe5ch3dg3dqnfz7224q
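The "NTK machine" picture in this entry can be checked numerically: for a small learning rate, one gradient-descent step on the squared loss moves the network outputs by approximately -eta * K (f - y), where K = J Jᵀ is the tangent kernel. A hedged sketch on a tiny tanh network (all sizes and names are illustrative assumptions):

```python
import numpy as np

# Numerical check of linearized training: for a small step size eta,
# one gradient-descent step on the squared loss changes the outputs by
# about -eta * K (f - y), where K = J J^T is the empirical tangent
# kernel. Tiny one-hidden-layer tanh network; sizes are illustrative.

rng = np.random.default_rng(3)
d, h, n = 2, 8, 5
W = rng.normal(size=(h, d)) / np.sqrt(d)
v = rng.normal(size=h)
X = rng.normal(size=(n, d))
y = rng.normal(size=n)

def outputs(W, v):
    return np.tanh(X @ W.T) @ v

def jacobian(W, v):
    """n x P Jacobian of the outputs w.r.t. (v, W) flattened."""
    Z = np.tanh(X @ W.T)                                # n x h
    dv = Z
    dW = ((1 - Z**2) * v)[:, :, None] * X[:, None, :]   # n x h x d
    return np.concatenate([dv, dW.reshape(n, -1)], axis=1)

eta = 1e-5
f0 = outputs(W, v)
J = jacobian(W, v)
K = J @ J.T                                 # empirical tangent kernel

step = -eta * (J.T @ (f0 - y))              # one true gradient step
f_true = outputs(W + step[h:].reshape(h, d), v + step[:h])
f_lin = f0 - eta * K @ (f0 - y)             # first-order (kernel) prediction

print(np.max(np.abs(f_true - f_lin)))       # much smaller than the change in f
```

The discrepancy between the true and linearized updates is second order in the step size, which is why small learning rates keep wide networks close to their kernel approximation.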

NFT-K: Non-Fungible Tangent Kernels [article]

Sina Alemohammad, Hossein Babaei, CJ Barberan, Naiming Liu, Lorenzo Luzi, Blake Mason, Richard G. Baraniuk
2021 arXiv   pre-print
One type of deep neural network is the neural tangent kernel, which is similar to a kernel machine and provides some degree of interpretability.  ...  network individually as opposed to past work which attempts to represent the entire network via a single neural tangent kernel.  ...  λ and a deterministic kernel known as its neural tangent kernel (NTK).  ... 
arXiv:2110.04945v1 fatcat:h7froeung5fjjamyfxryxfk7ry

On the linearity of large non-linear models: when and why the tangent kernel is constant [article]

Chaoyue Liu, Libin Zhu, Mikhail Belkin
2021 arXiv   pre-print
We present a general framework for understanding the constancy of the tangent kernel via Hessian scaling applicable to the standard classes of neural networks.  ...  We show that the transition to linearity of the model and, equivalently, constancy of the (neural) tangent kernel (NTK) result from the scaling properties of the norm of the Hessian matrix of the network  ...  We can see that a narrow bottleneck layer in a wide neural network prevents the neural tangent kernel from being constant during training.  ... 
arXiv:2010.01092v3 fatcat:fmjlqmsjdzaplllq6frahmhgcm

Training Tangent Similarities with N-SVM for Alphanumeric Character Recognition

Hassiba Nemmour, Youcef Chibani
2012 American Journal of Signal Processing  
In addition, we investigate the use of tangent similarities to deal with data variability.  ...  Specifically, a neural SVM (N-SVM) combination is adopted for the classification stage in order to accelerate the running time of SVM classifiers.  ...  In the present work, we aim to introduce the TV concept for alphanumeric character recognition without extending the runtime of SVMs and N-SVM.  ... 
doi:10.5923/j.ajsp.20110101.06 fatcat:unsegeh5x5bwvaljig64hsqbgu

Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks [article]

Sanjeev Arora, Simon S. Du, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang, Dingli Yu
2019 arXiv   pre-print
regression with respect to so-called Neural Tangent Kernels (NTKs) (Jacot et al., 2018).  ...  However, the super-quadratic running time of kernel methods makes them best suited for small-data tasks. We report results suggesting neural tangent kernels perform strongly on low-data tasks.  ...  kernel, neural tangent kernel (NTK) (Jacot et al., 2018).  ... 
arXiv:1910.01663v3 fatcat:7msqn37qyzehfpc4h3r7izatmy

Federated Transfer Learning for EEG Signal Classification [article]

Ce Ju, Dashan Gao, Ravikiran Mane, Ben Tan, Yang Liu, Cuntai Guan
2020 arXiv   pre-print
Space Mapping (TSM): Classification algorithm on the tangent space of SPD manifolds [9]. • Riemannian-based Kernel Method (R-Kernel): Kernel method with the specific kernel derived from Riemannian geometry  ...  W_1 ∈ R^{32×4}, W_2 ∈ R^{4×4} and W_3 ∈ R^{4×4}), tangent projection layer and federated layer without federated aggregation. The learning rate is initialized to 0.1 with a decay after 50 epochs.  ... 
arXiv:2004.12321v2 fatcat:kg4lpgdcgnbhthjad4jiufbsf4

On the Equivalence between Neural Network and Support Vector Machine [article]

Yilan Chen, Wei Huang, Lam M. Nguyen, Tsui-Wei Weng
2021 arXiv   pre-print
Recent research shows that the dynamics of an infinitely wide neural network (NN) trained by gradient descent can be characterized by the Neural Tangent Kernel (NTK).  ...  Under the squared loss, the infinite-width NN trained by gradient descent with an infinitely small learning rate is equivalent to kernel regression with the NTK.  ...  This limiting kernel is called the Neural Tangent Kernel (NTK).  ... 
arXiv:2111.06063v1 fatcat:udd3xu6huzavpohrvtkifuw5ni

Towards Understanding the Spectral Bias of Deep Learning [article]

Yuan Cao and Zhiying Fang and Yue Wu and Ding-Xuan Zhou and Quanquan Gu
2020 arXiv   pre-print
In this paper, we give a comprehensive and rigorous explanation for spectral bias and relate it with the neural tangent kernel function proposed in recent work.  ...  We prove that the training process of neural networks can be decomposed along different directions defined by the eigenfunctions of the neural tangent kernel, where each direction has its own convergence  ...  tangent kernel.  ... 
arXiv:1912.01198v3 fatcat:ttb33kgee5eb7mdfez3xgocjla
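The decomposition this entry proves can be illustrated with any fixed positive semi-definite kernel: kernel gradient descent shrinks the residual along each eigenvector of the kernel matrix at its own rate (1 - eta * lambda_i)^t, so large-eigenvalue directions are fit first. A hedged numpy sketch using a stand-in Gaussian kernel (not an actual network's tangent kernel):

```python
import numpy as np

# Illustration of spectral bias in the kernel regime: gradient descent
# shrinks the residual along each eigenvector u_i of the kernel matrix
# at its own rate (1 - eta * lambda_i)^t, so directions with large
# eigenvalues converge first. A stand-in Gaussian (RBF) kernel is used.

rng = np.random.default_rng(1)
n = 40
X = np.sort(rng.uniform(-1.0, 1.0, size=n))
K = np.exp(-(X[:, None] - X[None, :])**2 / 0.1)   # RBF kernel matrix

lam, U = np.linalg.eigh(K)        # eigenvalues in ascending order
y = rng.normal(size=n)
eta = 0.9 / lam[-1]               # step size below the stability limit

f = np.zeros(n)
for _ in range(200):
    f = f + eta * K @ (y - f)     # one kernel gradient-descent step

resid = U.T @ (y - f)             # residual expressed in the eigenbasis
# The top-eigenvalue component is essentially fully fit, while the
# smallest-eigenvalue components are barely reduced.
print(abs(resid[-1]), abs(resid[0]))
```

Because the RBF spectrum decays quickly, the smallest-eigenvalue directions are effectively frozen at this step size, which is the mechanism behind the low-frequency-first bias the entry describes.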

Label Propagation across Graphs: Node Classification using Graph Neural Tangent Kernels [article]

Artun Bayer, Arindam Chowdhury, Santiago Segarra
2021 arXiv   pre-print
To that end, we employ a graph neural tangent kernel (GNTK) that corresponds to infinitely wide GNNs to find correspondences between nodes in different graphs based on both the topology and the node features  ...  Graph neural networks (GNNs) have achieved superior performance on node classification tasks in the last few years.  ...  Current research has shown that under certain limiting conditions on the neural architectures, GD resembles kernel regression with a specialized deterministic kernel, namely, the Neural Tangent Kernel  ... 
arXiv:2110.03763v1 fatcat:zewfkrt5zjgsnam3557s4bef34

Neural Tangent Kernel: Convergence and Generalization in Neural Networks [article]

Arthur Jacot, Franck Gabriel, Clément Hongler
2020 arXiv   pre-print
vectors) follows the kernel gradient of the functional cost (which is convex, in contrast to the parameter cost) w.r.t. a new kernel: the Neural Tangent Kernel (NTK).  ...  At initialization, artificial neural networks (ANNs) are equivalent to Gaussian processes in the infinite-width limit, thus connecting them to kernel methods.  ...  As a consequence, the derivatives ∂_{θ_p} F^{(L)}(θ) and the neural tangent kernel depend on the parameters θ.  ... 
arXiv:1806.07572v4 fatcat:vqgyqhbr4vfnrbgyfcq6uz75h4
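The entry defines the NTK through parameter derivatives; at finite width the resulting kernel depends on the random initialization, and it concentrates around a deterministic limit as width grows. A rough Monte Carlo check for a one-hidden-layer ReLU network in NTK-style parameterization, f(x) = (1/sqrt(m)) * v . relu(W x); all names and sizes below are illustrative assumptions:

```python
import numpy as np

# Monte Carlo check that the empirical tangent kernel
#   K(x, x') = <grad_theta f(x), grad_theta f(x')>
# of a one-hidden-layer ReLU network concentrates as the width m grows
# (its infinite-width limit is the deterministic NTK).

rng = np.random.default_rng(2)
x1 = np.array([1.0, 0.0])
x2 = np.array([0.6, 0.8])

def empirical_ntk(m):
    """Tangent-kernel value K(x1, x2) at a fresh random initialization."""
    W = rng.normal(size=(m, 2))
    v = rng.normal(size=m)
    def grad(x):
        pre = W @ x
        dv = np.maximum(pre, 0.0) / np.sqrt(m)                   # df/dv
        dW = ((pre > 0) * v)[:, None] * x[None, :] / np.sqrt(m)  # df/dW
        return np.concatenate([dv, dW.ravel()])
    return grad(x1) @ grad(x2)

# Spread of the kernel value across 50 random initializations.
stds = {m: np.std([empirical_ntk(m) for _ in range(50)]) for m in (10, 1000)}
print(stds)   # the spread shrinks as the width m grows
```

The kernel value at each width is an average of m independent per-neuron contributions, so its standard deviation across initializations shrinks roughly like 1/sqrt(m).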