A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit the original URL.
The file type is `application/pdf`


### Adjoinable Homology
[article]

2014 · *arXiv* · pre-print

The notion of a duality between two derived functors is introduced, together with an extension theorem for derived functors to larger categories in which they need not be defined. These ideas are then applied to extend the coext functors to an arbitrary coalgebra and to study them there. A new homology theory is then built therefrom and is shown to exhibit certain duality relations to the Hochschild cohomology of certain coalgebras. Lastly, a certain exceptional type of coalgebra is introduced and is used to make explicit connections between this new homology theory and the continuous cohomology of this exceptional algebra's pro-finite dual algebra.

arXiv:1402.4197v1
fatcat:mp3jjzqunjajlcnw3k2uju67am
### Metric Hypertransformers are Universal Adapted Maps
[article]

2022 · *arXiv* · pre-print

We introduce a universal class of geometric deep learning models, called metric hypertransformers (MHTs), capable of approximating any adapted map F:𝒳^ℤ→𝒴^ℤ with approximable complexity, where 𝒳⊆ℝ^d and 𝒴 is any suitable metric space, and 𝒳^ℤ (resp. 𝒴^ℤ) captures all discrete-time paths on 𝒳 (resp. 𝒴). Suitable spaces 𝒴 include various (adapted) Wasserstein spaces, all Fréchet spaces admitting a Schauder basis, and a variety of Riemannian manifolds arising from information geometry. Even in the static case, where f:𝒳→𝒴 is a Hölder map, our results provide the first (quantitative) universal approximation theorem compatible with any such 𝒳 and 𝒴. Our universal approximation theorems are quantitative, and they depend on the regularity of F, the choice of activation function, the metric entropy and diameter of 𝒳, and on the regularity of the compact set of paths whereon the approximation is performed. Our guiding examples originate from mathematical finance. Notably, the MHT models introduced here are able to approximate a broad range of stochastic processes' kernels, including solutions to SDEs, many processes with arbitrarily long memory, and functions mapping sequential data to sequences of forward rate curves.

arXiv:2201.13094v1
fatcat:ptqcjo2e2zf25hx4wesdvq3q6e
### The Entropic Measure Transform
[article]

2019 · *arXiv* · pre-print

We introduce the entropic measure transform (EMT) problem for a general process and prove the existence of a unique optimal measure characterizing the solution. The density process of the optimal measure is characterized using a semimartingale BSDE under general conditions. The EMT is used to reinterpret the conditional entropic risk measure and to obtain a convenient formula for the conditional expectation of a process which admits an affine representation under a related measure. The entropic measure transform is then used to provide a new characterization of defaultable bond prices, forward prices, and futures prices when the asset is driven by a jump diffusion. The characterization of these pricing problems in terms of the EMT provides economic interpretations as a maximization of returns subject to a penalty for removing financial risk, as expressed through the aggregate relative entropy. The EMT is shown to extend the optimal stochastic control characterization of default-free bond prices of Gombani and Runggaldier (Math. Financ. 23(4):659-686, 2013). These methods are illustrated numerically with an example in the defaultable bond setting.

arXiv:1511.06032v2
fatcat:5ziwwa64rvavbdk5zghecek4ia
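The conditional entropic risk measure mentioned in this abstract has a simple static prototype. A minimal sketch under the empirical measure of a sample, using the common convention ρ_γ(X) = γ⁻¹ log E[exp(−γX)]; the function name and setup here are illustrative, not taken from the paper:

```python
import numpy as np

def entropic_risk(x, gamma):
    """Entropic risk of a payoff sample x under the empirical measure, using the
    common convention rho_gamma(X) = (1/gamma) * log E[exp(-gamma * X)].
    (Static prototype; the paper's conditional, dynamic version is richer.)"""
    z = -gamma * np.asarray(x, dtype=float)
    m = z.max()                       # log-sum-exp stabilization against overflow
    return (m + np.log(np.mean(np.exp(z - m)))) / gamma

# A riskless payoff of c has entropic risk exactly -c for every gamma > 0:
entropic_risk(np.array([1.0, 1.0, 1.0]), 2.0)  # → -1.0
```

As γ grows, the risk of a non-constant payoff interpolates from minus its mean toward minus its worst case, which is the risk-aversion behaviour the penalty-for-entropy interpretation captures.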
### Non-Euclidean Universal Approximation
[article]

2020 · *arXiv* · pre-print

Modifications to a neural network's input and output layers are often required to accommodate the specificities of most practical learning tasks. However, the impact of such changes on an architecture's approximation capabilities is largely not understood. We present general conditions describing feature and readout maps that preserve an architecture's ability to approximate any continuous function uniformly on compacts. As an application, we show that if an architecture is capable of universal approximation, then modifying its final layer to produce binary values creates a new architecture capable of deterministically approximating any classifier. In particular, we obtain guarantees for deep CNNs and deep feed-forward networks. Our results also have consequences within the scope of geometric deep learning. Specifically, when the input and output spaces are Cartan-Hadamard manifolds, we obtain geometrically meaningful feature and readout maps satisfying our criteria. Consequently, commonly used non-Euclidean regression models between spaces of symmetric positive definite matrices are extended to universal DNNs. The same result allows us to show that the hyperbolic feed-forward networks used for hierarchical learning are universal. Our result is also used to show that the common practice of randomizing all but the last two layers of a DNN produces a universal family of functions with probability one. We also provide conditions on a DNN's first (resp. last) few layers' connections and activation function which guarantee that these layers can have a width equal to the input (resp. output) space's dimension without negatively affecting the architecture's approximation capabilities.

arXiv:2006.02341v3
fatcat:x46kjbqjsjeblgzx2mgnpx4zdq
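The "binary readout" idea from this abstract can be shown in miniature: compose any real-valued model with a hard threshold to obtain a deterministic {0, 1}-valued classifier. In this illustrative sketch the sigmoid merely stands in for a trained universal approximator:

```python
import numpy as np

def binary_readout(f, threshold=0.5):
    """Turn a real-valued model f into a deterministic 0/1 classifier by
    composing it with a hard threshold (the modified final layer)."""
    return lambda x: (f(x) >= threshold).astype(int)

score = lambda x: 1.0 / (1.0 + np.exp(-x))   # hypothetical trained network output
classifier = binary_readout(score)
labels = classifier(np.array([-2.0, 0.0, 3.0]))  # → [0, 1, 1]
```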
### Noncommutative Algebra and Noncommutative Geometry
[article]

2014 · *arXiv* · pre-print

Divided into three parts, the first marks out enormous geometric issues with the notion of quasi-freeness of an algebra and seeks to replace this notion of formal smoothness with an approximation by means of a minimal unital commutative algebra's smoothness. The second part of this text is then devoted to approximating properties of nc. schemes through the properties of two uniquely determined (classical) schemes: one estimating the nc. scheme in question in a maximal way from the inside, and the minimal scheme approximating it from the outside. The very brief final part of this exposition aims to understand and distil the properties at work in constructing any "scheme-like" object over an "appropriate" category, purely out of philosophical interest.

arXiv:1404.0126v3
fatcat:rwleqiua6rdpbckcn5wcyf477a
### Hochschild Cohomological Dimension is Not Upper Semi-Continuous
[article]

2019 · *arXiv* · pre-print

It is shown that the Hochschild cohomological dimension of an associative algebra is not an upper semi-continuous function, so the semi-continuity theorem is no longer valid for non-commutative algebras. This is exhibited by a family of C-algebras parameterized by C, all but one of which have Hochschild cohomological dimension 2, the remaining one having Hochschild cohomological dimension 1.

arXiv:1407.4825v2
fatcat:mw6f3e6pmffvhestha3c3nkrai
### Optimizing Optimizers: Regret-optimal gradient descent algorithms
[article]

2021 · *arXiv* · pre-print

*Kratsios*. ... See Corollary 3 for the precise statement. ... Let f_r(x, y) be the Rosenbrock function on R^2 (Rosenbrock, 1960). ...
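The Rosenbrock function referenced in this snippet is the standard benchmark f(x, y) = (1 − x)² + 100(y − x²)². A minimal sketch of plain fixed-step gradient descent on it, as a baseline; this is not the regret-optimal algorithm the paper constructs:

```python
import numpy as np

def rosenbrock(p):
    """Rosenbrock's function on R^2: f(x, y) = (1 - x)^2 + 100 * (y - x^2)^2."""
    x, y = p
    return (1 - x) ** 2 + 100 * (y - x ** 2) ** 2

def rosenbrock_grad(p):
    """Analytic gradient of the Rosenbrock function."""
    x, y = p
    return np.array([-2 * (1 - x) - 400 * x * (y - x ** 2),
                     200 * (y - x ** 2)])

def gradient_descent(p0, lr=1e-3, steps=20000):
    """Plain fixed-step gradient descent (the baseline, not the paper's method)."""
    p = np.asarray(p0, dtype=float)
    for _ in range(steps):
        p = p - lr * rosenbrock_grad(p)
    return p

p = gradient_descent([-1.0, 1.0])  # drifts along the curved valley toward (1, 1)
```

The unique global minimizer is (1, 1), where f vanishes; fixed-step descent makes slow progress along the banana-shaped valley, which is exactly why smarter optimizers are worth studying.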

### Partial Uncertainty and Applications to Risk-Averse Valuation
[article]

2019 · *arXiv* · pre-print

This paper introduces an intermediary between conditional expectation and conditional sublinear expectation, called R-conditioning. The R-conditioning of a random vector in L^2 is defined as the best L^2-estimate, given a σ-subalgebra and a degree of model uncertainty. When the random vector represents the payoff of a derivative security in a complete financial market, its R-conditioning with respect to the risk-neutral measure is interpreted as its risk-averse value. The optimization problem defining the R-conditioning is shown to be well-posed. We show that the R-conditioning operators can be used to approximate a large class of sublinear expectations to arbitrary precision. We then introduce a novel numerical algorithm for computing the R-conditioning. This algorithm is shown to be strongly convergent. Implementations are used to compare the risk-averse value of a Vanilla option to its traditional risk-neutral value within the Black-Scholes-Merton framework. Concrete connections to robust finance, sensitivity analysis, and high-dimensional estimation are all treated in this paper.

arXiv:1909.13610v2
fatcat:ibgacybgrnanpgqhwmxddfdowa
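The "best L^2-estimate" baseline that R-conditioning generalizes is the ordinary conditional expectation, whose defining property is easy to check numerically on a finite sample. A toy illustration (the names and setup are ours, not the paper's):

```python
import numpy as np

# On a finite sample, E[X | G] is the G-measurable estimate minimizing the MSE.
rng = np.random.default_rng(0)
g = rng.integers(0, 3, size=1000)      # generator of the sigma-subalgebra G
x = g + rng.normal(size=1000)          # X = (G-measurable part) + independent noise

# Pathwise conditional expectation: replace each observation by its group mean.
cond_exp = np.array([x[g == k].mean() for k in range(3)])[g]

def mse(y):
    """Mean squared error E[(X - Y)^2] under the empirical measure."""
    return np.mean((x - y) ** 2)

# Any other G-measurable candidate (e.g. a shifted version, or the constant 0)
# does strictly worse than cond_exp.
```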
### Universal Regular Conditional Distributions
[article]

2022 · *arXiv* · pre-print

We introduce a deep learning model which can generically approximate regular conditional distributions (RCDs). The proposed model operates in three phases: first, it linearizes inputs from a given metric space 𝒳 to ℝ^d via a feature map; then, these linearized features are processed by a deep feedforward neural network; and the network's outputs are then translated to the 1-Wasserstein space 𝒫_1(ℝ^D) via a probabilistic extension of the attention mechanism introduced by Bahdanau et al. (2014). We find that the models built using our framework can approximate any continuous function from ℝ^d to 𝒫_1(ℝ^D) uniformly on compact sets, quantitatively. We identify two ways of avoiding the curse of dimensionality when approximating 𝒫_1(ℝ^D)-valued functions. The first strategy describes functions in C(ℝ^d,𝒫_1(ℝ^D)) which can be efficiently approximated on any compact subset of ℝ^d. The second approach describes compact subsets of ℝ^d on which any map in C(ℝ^d,𝒫_1(ℝ^D)) can be efficiently approximated. The results are verified experimentally.

arXiv:2105.07743v3
fatcat:nnndb37txfcw3okefxsvzt3xqe
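The final "probabilistic attention" phase described in this abstract can be sketched minimally: a softmax over the network's outputs yields mixture weights over finitely many trainable atoms in ℝ^D, so the model emits a finitely supported measure, i.e. a point of 𝒫_1(ℝ^D). This is an illustrative reduction under our own naming, not the paper's full construction:

```python
import numpy as np

def attention_readout(logits, particles):
    """Map network outputs (logits) to a probability measure sum_i w_i * delta_{y_i}
    supported on the given particles y_1, ..., y_N in R^D."""
    w = np.exp(logits - logits.max())   # numerically stabilized softmax
    w /= w.sum()
    return w, particles

logits = np.array([0.0, 1.0, 2.0])            # hypothetical network outputs
particles = np.array([[0.0], [1.0], [2.0]])   # hypothetical trained atoms in R^1
w, atoms = attention_readout(logits, particles)
# w is nonnegative and sums to 1: a valid mixture of Dirac masses.
```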
### Non-Euclidean Conditional Expectation and Filtering
[article]

2018 · *arXiv* · pre-print

A non-Euclidean generalization of conditional expectation is introduced and characterized as the minimizer of expected intrinsic squared distance from a manifold-valued target. A computationally tractable formulation expresses this non-convex optimization problem through transformations of Euclidean conditional expectation. This gives computationally tractable filtering equations for the dynamics of the intrinsic conditional expectation of a manifold-valued signal, and is used to obtain accurate numerical forecasts of efficient portfolios by incorporating their geometric structure into the estimates.

arXiv:1710.05829v3
fatcat:6h6fzut2fbg6bcqp3l57snwiuu
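The minimizer-of-intrinsic-squared-distance characterization can be illustrated on the simplest manifold, the circle, where it is a Fréchet mean. A brute-force toy sketch under our own naming, not the paper's filtering equations:

```python
import numpy as np

def geodesic_dist(a, b):
    """Arc-length (intrinsic) distance between angles a and b on the circle S^1."""
    d = np.abs(a - b) % (2 * np.pi)
    return np.minimum(d, 2 * np.pi - d)

def intrinsic_mean(angles, grid_size=3600):
    """Brute-force minimizer of the mean squared geodesic distance over a grid,
    i.e. the empirical analogue of the intrinsic (Frechet) expectation."""
    grid = np.linspace(0.0, 2 * np.pi, grid_size, endpoint=False)
    costs = (geodesic_dist(grid[:, None], angles[None, :]) ** 2).mean(axis=1)
    return grid[costs.argmin()]

angles = np.array([0.2, 0.4, 0.6])
m = intrinsic_mean(angles)  # close to 0.4, agreeing with the Euclidean mean here
```

For tightly clustered samples the intrinsic mean agrees with the Euclidean one, but for samples straddling the cut locus the two diverge, which is what makes the manifold-aware formulation necessary.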
### Deep Arbitrage-Free Learning in a Generalized HJM Framework via Arbitrage-Regularization

2020 · *Risks*

As is common practice in machine learning, further details of our code and implementation can be found on ...

doi:10.3390/risks8020040
fatcat:demlghmnrbfxrkzug3mvbc3vze
*Kratsios* (2019a). ... A Deep Learning Approach to Arbitrage-Regularization: The flexibility of feed-forward neural networks (ffNNs), as described in the universal approximation theorems of Hornik (1991); *Kratsios* (2019b) ...
### Universal Approximation Under Constraints is Possible with Transformers
[article]

2022 · *arXiv* · pre-print

ACKNOWLEDGMENTS

arXiv:2110.03303v2
fatcat:2bnbxkaf7fhhrptscjz3453moy
*Anastasis* *Kratsios* and Ivan Dokmanić were supported by the European Research Council (ERC) Starting Grant 852821-SWING. ... Thus, the map (*Kratsios* & Papon, 2021, Definition 16); thus, (*Kratsios* & Papon, 2021, Corollary 43) (activation function parameter α set to α = 0. ... & Bilokopytov (2020); *Kratsios* & Papon (2021) for Riemannian-manifold-valued functions which does not need explicit charts. ...
### Optimal Stochastic Decensoring and Applications to Calibration of Market Models
[article]

2017 · *arXiv* · pre-print

Typically, flat-filling, linear, or polynomial interpolation methods are used to generate missing historical data. We introduce a novel optimal method for recreating data generated by a diffusion process. The results are then applied to recreate historical data for stocks.

arXiv:1712.04844v2
fatcat:k7c2ntgdgfd3pouo4vp2mhzykq
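The baseline methods the abstract refers to are easy to sketch; here is the linear case for a price series with gaps, as an illustrative baseline rather than the paper's diffusion-based decensoring:

```python
import numpy as np

def fill_missing(t, values):
    """Gap-fill a historical series by linear interpolation over NaN entries
    (one of the standard baselines; flat filling would instead carry the last
    observed value forward)."""
    values = np.asarray(values, dtype=float)
    mask = np.isnan(values)
    filled = values.copy()
    filled[mask] = np.interp(t[mask], t[~mask], values[~mask])
    return filled

t = np.arange(5.0)
prices = np.array([10.0, np.nan, np.nan, 16.0, 18.0])
filled = fill_missing(t, prices)  # → [10., 12., 14., 16., 18.]
```

Linear interpolation ignores the volatility of the underlying process, which is precisely the information a diffusion-based reconstruction can exploit.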
### Characterizing the Universal Approximation Property
[article]

2020 · *arXiv* · pre-print

To better understand the approximation capabilities of various currently available neural network architectures, this paper studies the universal approximation property itself across a broad scope of function spaces. We characterize universal approximators, on most function spaces of practical interest, as implicitly decomposing that space into topologically regular subspaces on which a transitive dynamical system describes the architecture's structure. We obtain a simple criterion for constructing universal approximators as transformations of the feed-forward architecture, and we show that every architecture, on most function spaces of practical interest, is approximately constructed in this way. Moreover, we show that most function spaces admit universal approximators built using a single function. The results are used to show that certain activation functions, such as Leaky-ReLU but not ReLU, create expressibility through depth by eventually mixing any two functions' open neighbourhoods. For those activation functions, we obtain improved approximation rates described in terms of the network breadth and depth. We show that feed-forward networks built using such activation functions can encode constraints into their final layers while simultaneously maintaining their universal approximation capabilities. We construct a modification of the feed-forward architecture which can approximate any continuous function, with a controlled growth rate, uniformly on the entire domain space, and we show that the feed-forward architecture typically cannot.

arXiv:1910.03344v3
fatcat:j3mhfpj3mbf35iz2ypotvyt5de
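One concrete difference behind the Leaky-ReLU-versus-ReLU distinction above is injectivity: Leaky-ReLU is a bijection on ℝ, so layers built from it lose no information, while ReLU collapses the entire negative half-line to 0. A minimal numerical illustration (not the paper's dynamical-systems argument):

```python
import numpy as np

def relu(x):
    """ReLU: collapses all negative inputs to 0, hence not injective."""
    return np.maximum(x, 0.0)

def leaky_relu(x, alpha=0.1):
    """Leaky-ReLU with slope alpha on the negative half-line: a bijection on R."""
    return np.where(x >= 0, x, alpha * x)

def leaky_relu_inv(y, alpha=0.1):
    """Exact inverse of Leaky-ReLU, which ReLU cannot have."""
    return np.where(y >= 0, y, y / alpha)

x = np.array([-3.0, -1.0, 0.0, 2.0])
roundtrip = leaky_relu_inv(leaky_relu(x))  # recovers x up to floating-point error
```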
### Do ReLU Networks Have An Edge When Approximating Compactly-Supported Functions?
[article]

2022 · *arXiv* · pre-print

Bibliography Beatrice Acciaio,

arXiv:2204.11231v2
fatcat:yzbiwihjyvhvfd4c4rq42rxzba
*Anastasis* *Kratsios*, and Gudmund Pammer. Metric hypertransformers are universal adapted maps. arXiv preprint arXiv:2201.13094, 2022. ... & Bilokopytov (2020); Siegel & Xu (2020); *Kratsios* & Hyndman (2021); *Kratsios* et al. (2022); Yarotsky (, Proposition 1.7 (i)) we have Lip(F) ≤ c log_2(cap(X)) Lip(f). (12) By Jung's Theorem, there ... Another example of a non-metric universal approximation theorem in the deep learning literature is the universal classification result of (*Kratsios* & Bilokopytov, 2020, Corollary 3.12). ...

*Showing results 1 — 15 out of 31 results*