This article studies the Gram random matrix model $G = \frac{1}{T} \Sigma^{\mathsf{T}} \Sigma$, $\Sigma = \sigma(WX)$, classically found in the analysis of random feature maps and random neural networks, where $X = [x_1, \ldots, x_T] \in \mathbb{R}^{p \times T}$ is a (data) matrix of bounded norm, $W \in \mathbb{R}^{n \times p}$ is a matrix of independent zero-mean unit-variance entries, and $\sigma : \mathbb{R} \to \mathbb{R}$ is a Lipschitz continuous (activation) function, with $\sigma(WX)$ understood entrywise. By means of a key concentration of measure lemma arising from non-asymptotic random matrix ...

doi:10.1214/17-aap1328
fatcat:m4ctfbrezrcgljxmxk3g3na76u
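
The Gram model defined in the abstract can be made concrete with a small numerical sketch. The code below is only an illustration under stated assumptions, not the authors' implementation: the function name `gram_matrix`, the choice of standard Gaussian entries for W (one admissible zero-mean, unit-variance distribution), and the use of `tanh` as the Lipschitz activation σ are all hypothetical choices made here. It uses only the Python standard library.

```python
import math
import random


def gram_matrix(X, n, sigma=math.tanh, seed=0):
    """Return G = (1/T) * Sigma^T Sigma, where Sigma = sigma(W X).

    X     : p x T data matrix as a list of rows (each column is one x_t).
    n     : number of random features (rows of W).
    sigma : scalar activation applied entrywise (tanh is an assumption).
    seed  : seed for the random draw of W (hypothetical parameter).
    """
    rng = random.Random(seed)
    p, T = len(X), len(X[0])

    # W is n x p with i.i.d. zero-mean, unit-variance entries.
    # Gaussian entries are an assumption; the abstract does not fix the law.
    W = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]

    # Sigma is n x T: Sigma[i][t] = sigma((W X)[i][t]), applied entrywise.
    Sigma = [
        [sigma(sum(W[i][k] * X[k][t] for k in range(p))) for t in range(T)]
        for i in range(n)
    ]

    # G is T x T: G[s][t] = (1/T) * sum_i Sigma[i][s] * Sigma[i][t].
    return [
        [sum(Sigma[i][s] * Sigma[i][t] for i in range(n)) / T for t in range(T)]
        for s in range(T)
    ]


if __name__ == "__main__":
    # Toy data: p = 2 features, T = 3 samples.
    X = [[0.5, -0.2, 0.1],
         [0.3, 0.4, -0.6]]
    G = gram_matrix(X, n=50)
    for row in G:
        print(["%.3f" % v for v in row])
```

By construction G is a symmetric T × T matrix with nonnegative diagonal, which is a quick sanity check on any implementation of this model.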