31 Hits in 0.75 sec

Adjoinable Homology [article]

Anastasis Kratsios
2014 arXiv   pre-print
The notion of a duality between two derived functors, as well as an extension theorem for derived functors to larger categories on which they need not be defined, is introduced. These ideas are then applied to extend the coext functors to arbitrary coalgebras and to study the resulting functors. A new homology theory is then built therefrom and is shown to exhibit certain duality relations with the Hochschild cohomology of certain coalgebras. Lastly, a certain exceptional type of coalgebra is introduced and used to make explicit connections between this new homology theory and the continuous cohomology of the exceptional coalgebra's pro-finite dual algebra.
arXiv:1402.4197v1 fatcat:mp3jjzqunjajlcnw3k2uju67am

Metric Hypertransformers are Universal Adapted Maps [article]

Beatrice Acciaio, Anastasis Kratsios, Gudmund Pammer
2022 arXiv   pre-print
We introduce a universal class of geometric deep learning models, called metric hypertransformers (MHTs), capable of approximating any adapted map F:𝒳^ℤ→𝒴^ℤ with approximable complexity, where 𝒳⊆ℝ^d and 𝒴 is any suitable metric space, and 𝒳^ℤ (resp. 𝒴^ℤ) captures all discrete-time paths on 𝒳 (resp. 𝒴). Suitable spaces 𝒴 include various (adapted) Wasserstein spaces, all Fréchet spaces admitting a Schauder basis, and a variety of Riemannian manifolds arising from information geometry. Even in the static case, where f:𝒳→𝒴 is a Hölder map, our results provide the first (quantitative) universal approximation theorem compatible with any such 𝒳 and 𝒴. Our universal approximation theorems are quantitative, depending on the regularity of F, the choice of activation function, the metric entropy and diameter of 𝒳, and the regularity of the compact set of paths on which the approximation is performed. Our guiding examples originate from mathematical finance. Notably, the MHT models introduced here can approximate a broad range of stochastic processes' kernels, including solutions to SDEs, many processes with arbitrarily long memory, and functions mapping sequential data to sequences of forward rate curves.
arXiv:2201.13094v1 fatcat:ptqcjo2e2zf25hx4wesdvq3q6e
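The defining property of an adapted map above — the output at time t depends only on the path up to time t — can be illustrated with a toy example. The sketch below is purely illustrative of adaptedness (a trailing moving average on a discrete-time path), not of the MHT architecture itself:

```python
import numpy as np

def causal_map(path, window=3):
    """A toy adapted (causal) map on discrete-time paths: the output at
    time t depends only on path[0..t] (here, a trailing moving average).
    Illustrates adaptedness only; this is not the MHT model."""
    out = []
    for t in range(len(path)):
        past = path[max(0, t - window + 1): t + 1]  # only past and present values
        out.append(float(np.mean(past)))
    return out

print(causal_map([1.0, 2.0, 3.0, 4.0]))  # [1.0, 1.5, 2.0, 3.0]
```

Changing a future value of the input path never changes an earlier output, which is exactly the constraint an adapted map must satisfy.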

The Entropic Measure Transform [article]

Renjie Wang, Cody Hyndman, Anastasis Kratsios
2019 arXiv   pre-print
We introduce the entropic measure transform (EMT) problem for a general process and prove the existence of a unique optimal measure characterizing the solution. The density process of the optimal measure is characterized using a semimartingale BSDE under general conditions. The EMT is used to reinterpret the conditional entropic risk-measure and to obtain a convenient formula for the conditional expectation of a process which admits an affine representation under a related measure. The entropic measure transform is then used to provide a new characterization of defaultable bond prices, forward prices, and futures prices when the asset is driven by a jump diffusion. The characterization of these pricing problems in terms of the EMT provides economic interpretations as a maximization of returns subject to a penalty for removing financial risk as expressed through the aggregate relative entropy. The EMT is shown to extend the optimal stochastic control characterization of default-free bond prices of Gombani and Runggaldier (Math. Financ. 23(4):659-686, 2013). These methods are illustrated numerically with an example in the defaultable bond setting.
arXiv:1511.06032v2 fatcat:5ziwwa64rvavbdk5zghecek4ia

Non-Euclidean Universal Approximation [article]

Anastasis Kratsios, Eugene Bilokopytov
2020 arXiv   pre-print
Modifications to a neural network's input and output layers are often required to accommodate the specificities of most practical learning tasks. However, the impact of such changes on an architecture's approximation capabilities is largely not understood. We present general conditions describing feature and readout maps that preserve an architecture's ability to approximate any continuous function uniformly on compacts. As an application, we show that if an architecture is capable of universal approximation, then modifying its final layer to produce binary values creates a new architecture capable of deterministically approximating any classifier. In particular, we obtain guarantees for deep CNNs and deep feed-forward networks. Our results also have consequences within the scope of geometric deep learning. Specifically, when the input and output spaces are Cartan-Hadamard manifolds, we obtain geometrically meaningful feature and readout maps satisfying our criteria. Consequently, commonly used non-Euclidean regression models between spaces of symmetric positive definite matrices are extended to universal DNNs. The same result allows us to show that the hyperbolic feed-forward networks, used for hierarchical learning, are universal. Our result is also used to show that the common practice of randomizing all but the last two layers of a DNN produces a universal family of functions with probability one. We also provide conditions on a DNN's first (resp. last) few layers' connections and activation function which guarantee that these layers can have a width equal to the input (resp. output) space's dimension while not negatively affecting the architecture's approximation capabilities.
arXiv:2006.02341v3 fatcat:x46kjbqjsjeblgzx2mgnpx4zdq
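The final-layer modification described above can be sketched directly: compose any real-valued network with a thresholding readout to obtain a deterministic binary classifier. The two-layer ReLU network below is a hypothetical stand-in for a universal approximator, not an architecture taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def feedforward(x, W1, b1, W2, b2):
    """A small ReLU feed-forward network R^d -> R (stand-in universal approximator)."""
    h = np.maximum(0.0, x @ W1 + b1)  # hidden ReLU layer
    return h @ W2 + b2

def binarized_readout(score):
    """Modified final layer: threshold the real-valued output to {0, 1}."""
    return (score > 0.0).astype(int)

# Randomly initialized weights for illustration only.
d, width = 2, 16
W1 = rng.standard_normal((d, width))
b1 = rng.standard_normal(width)
W2 = rng.standard_normal(width)
b2 = 0.0

x = rng.standard_normal((5, d))
scores = feedforward(x, W1, b1, W2, b2)   # real-valued regression outputs
labels = binarized_readout(scores)        # deterministic class labels
print(labels)  # five entries, each 0 or 1
```

The point of the result above is that this composition does not degrade universality: if the inner network family is universal for continuous functions, the thresholded family can approximate any classifier.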

Noncommutative Algebra and Noncommutative Geometry [article]

Anastasis Kratsios
2014 arXiv   pre-print
Divided into three parts, the first marks out enormous geometric issues with the notion of quasi-freeness of an algebra and seeks to replace this notion of formal smoothness with an approximation by the smoothness of a minimal unital commutative algebra. The second part of this text is then devoted to approximating the properties of nc. schemes through the properties of two uniquely determined (classical) schemes: one estimating the nc. scheme in question in a maximal way from the inside, and the other, the minimal scheme approximating it from the outside. The very brief final part of this exposition aims to understand and distil the properties at work in constructing any "scheme-like" object over an "appropriate" category, purely out of philosophical interest.
arXiv:1404.0126v3 fatcat:rwleqiua6rdpbckcn5wcyf477a

Hochschild Cohomological Dimension is Not Upper Semi-Continuous [article]

Anastasis Kratsios
2019 arXiv   pre-print
It is shown that the Hochschild cohomological dimension of an associative algebra is not an upper semi-continuous function, so the semi-continuity theorem is no longer valid for non-commutative algebras. This is exhibited by a family of ℂ-algebras parameterized by ℂ, all but one of which have Hochschild cohomological dimension 2, the remaining one having Hochschild cohomological dimension 1.
arXiv:1407.4825v2 fatcat:mw6f3e6pmffvhestha3c3nkrai

Optimizing Optimizers: Regret-optimal gradient descent algorithms [article]

Philippe Casgrain, Anastasis Kratsios
2021 arXiv   pre-print
See Corollary 3 for the precise statement. ... Let fr(x, y) be the Rosenbrock function on ℝ^2 (Rosenbrock, 1960). ...
arXiv:2101.00041v2 fatcat:o6jzea6iyja4bjvveltgslhclq
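The snippet mentions the Rosenbrock function as a test objective for the regret-optimal algorithms. Below is a minimal baseline of plain gradient descent on it, assuming the standard constants a = 1 and b = 100 (an assumption here; the paper's exact setup may differ):

```python
import numpy as np

def rosenbrock(x, y, a=1.0, b=100.0):
    """Standard Rosenbrock function on R^2; global minimum 0 at (a, a^2)."""
    return (a - x) ** 2 + b * (y - x ** 2) ** 2

def rosenbrock_grad(x, y, a=1.0, b=100.0):
    """Analytic gradient of the Rosenbrock function."""
    dx = -2.0 * (a - x) - 4.0 * b * x * (y - x ** 2)
    dy = 2.0 * b * (y - x ** 2)
    return np.array([dx, dy])

# Plain (non-regret-optimal) gradient descent as a point of comparison.
p = np.array([-1.0, 1.0])
for _ in range(20000):
    p = p - 1e-3 * rosenbrock_grad(*p)

print(rosenbrock(*p))  # well below the starting value rosenbrock(-1, 1) = 4
```

The narrow curved valley of this function is what makes it a standard stress test for first-order optimizers.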

Partial Uncertainty and Applications to Risk-Averse Valuation [article]

Anastasis Kratsios
2019 arXiv   pre-print
This paper introduces an intermediary between conditional expectation and conditional sublinear expectation, called R-conditioning. The R-conditioning of a random vector in L^2 is defined as the best L^2-estimate, given a σ-subalgebra and a degree of model uncertainty. When the random vector represents the payoff of a derivative security in a complete financial market, its R-conditioning with respect to the risk-neutral measure is interpreted as its risk-averse value. The optimization problem defining the R-conditioning is shown to be well-posed. We show that the R-conditioning operators can be used to approximate a large class of sublinear expectations to arbitrary precision. We then introduce a novel numerical algorithm for computing the R-conditioning. This algorithm is shown to be strongly convergent. Implementations are used to compare the risk-averse value of a Vanilla option to its traditional risk-neutral value, within the Black-Scholes-Merton framework. Concrete connections to robust finance, sensitivity analysis, and high-dimensional estimation are all treated in this paper.
arXiv:1909.13610v2 fatcat:ibgacybgrnanpgqhwmxddfdowa

Universal Regular Conditional Distributions [article]

Anastasis Kratsios
2022 arXiv   pre-print
We introduce a deep learning model which can generically approximate regular conditional distributions (RCDs). The proposed model operates in three phases: first, it linearizes inputs from a given metric space 𝒳 to ℝ^d via a feature map; then these linearized features are processed by a deep feedforward neural network; and the network's outputs are then translated to the 1-Wasserstein space 𝒫_1(ℝ^D) via a probabilistic extension of the attention mechanism introduced by Bahdanau et al. (2014). We find that the models built using our framework can approximate any continuous function from ℝ^d to 𝒫_1(ℝ^D) uniformly on compact sets, quantitatively. We identify two ways of avoiding the curse of dimensionality when approximating 𝒫_1(ℝ^D)-valued functions. The first strategy describes functions in C(ℝ^d,𝒫_1(ℝ^D)) which can be efficiently approximated on any compact subset of ℝ^d. The second approach describes compact subsets of ℝ^d, on which any map in C(ℝ^d,𝒫_1(ℝ^D)) can be efficiently approximated. The results are verified experimentally.
arXiv:2105.07743v3 fatcat:nnndb37txfcw3okefxsvzt3xqe
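The third phase described above — translating network outputs into a probability measure via a probabilistic attention mechanism — can plausibly be sketched as softmax weights over a fixed family of atoms in ℝ^D; a finitely supported measure of this kind is a point of 𝒫_1(ℝ^D). The atoms and scores below are illustrative placeholders, not the paper's construction:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def probabilistic_attention(scores, atoms):
    """Map real-valued network outputs to a finitely supported probability
    measure: softmax weights over a fixed set of atoms in R^D.
    The atoms are hypothetical reference points, not taken from the paper."""
    weights = softmax(scores)
    return weights, atoms  # (weights, support) describe a point of P_1(R^D)

rng = np.random.default_rng(1)
atoms = rng.standard_normal((4, 3))  # four atoms in R^3 (placeholder support)
scores = rng.standard_normal(4)      # stand-in for deep network outputs
w, a = probabilistic_attention(scores, atoms)
print(w.sum())  # the weights form a probability vector
```

Because the weights are non-negative and sum to one, the output is always a valid probability measure, regardless of the network's raw scores.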

Non-Euclidean Conditional Expectation and Filtering [article]

Anastasis Kratsios, Cody B. Hyndman
2018 arXiv   pre-print
A non-Euclidean generalization of conditional expectation is introduced and characterized as the minimizer of expected intrinsic squared-distance from a manifold-valued target. The computationally tractable formulation expresses the non-convex optimization problem as transformations of Euclidean conditional expectation. This gives computationally tractable filtering equations for the dynamics of the intrinsic conditional expectation of a manifold-valued signal and is used to obtain accurate numerical forecasts of efficient portfolios by incorporating their geometric structure into the estimates.
arXiv:1710.05829v3 fatcat:6h6fzut2fbg6bcqp3l57snwiuu
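The characterization stated in the abstract can be written compactly. With d the intrinsic (Riemannian) distance on the manifold M, X the M-valued target, and 𝒢 the conditioning σ-algebra, a sketch consistent with the abstract (not necessarily the paper's exact notation) is:

```latex
\mathbb{E}^{M}\!\left[X \mid \mathcal{G}\right]
\in
\operatorname*{arg\,min}_{\substack{Z \in M \\ Z\ \mathcal{G}\text{-measurable}}}
\mathbb{E}\!\left[\, d^{2}(X, Z) \mid \mathcal{G} \,\right].
```

When M = ℝ^n with the Euclidean distance, the unique minimizer is the ordinary conditional expectation, which is the sense in which this is a generalization.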

Deep Arbitrage-Free Learning in a Generalized HJM Framework via Arbitrage-Regularization

Anastasis Kratsios, Cody Hyndman
2020 Risks  
As is common practice in machine learning, further details of our code and implementation can be found on Kratsios (2019a). ... A Deep Learning Approach to Arbitrage-Regularization: The flexibility of feed-forward neural networks (ffNNs), as described in the universal approximation theorems of Hornik (1991) and Kratsios (2019b) ...
doi:10.3390/risks8020040 fatcat:demlghmnrbfxrkzug3mvbc3vze

Universal Approximation Under Constraints is Possible with Transformers [article]

Anastasis Kratsios, Behnoosh Zamanlooy, Tianlin Liu, Ivan Dokmanić
2022 arXiv   pre-print
ACKNOWLEDGMENTS: Anastasis Kratsios and Ivan Dokmanić were supported by the European Research Council (ERC) Starting Grant 852821-SWING. ... Thus, the map (Kratsios & Papon, 2021, Definition 16); thus, (Kratsios & Papon, 2021, Corollary 43) (activation function parameter α set to α = 0). ... Kratsios & Bilokopytov (2020); Kratsios & Papon (2021) for Riemannian-manifold-valued functions which does not need explicit charts. ...
arXiv:2110.03303v2 fatcat:2bnbxkaf7fhhrptscjz3453moy

Optimal Stochastic Decensoring and Applications to Calibration of Market Models [article]

Anastasis Kratsios
2017 arXiv   pre-print
Typically, flat-filling, linear, or polynomial interpolation methods are used to generate missing historical data. We introduce a novel optimal method for recreating data generated by a diffusion process. The results are then applied to recreate historical data for stocks.
arXiv:1712.04844v2 fatcat:k7c2ntgdgfd3pouo4vp2mhzykq
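The naive baselines named above (flat filling, linear or polynomial interpolation) are easy to make concrete. Below is a linear-interpolation fill of missing entries — the kind of simple method the proposed decensoring approach is meant to improve on, not the paper's method itself:

```python
import numpy as np

def linear_fill(t, values):
    """Baseline linear interpolation of missing (NaN) entries in a time
    series, anchored at the observed entries."""
    values = np.asarray(values, dtype=float)
    t = np.asarray(t, dtype=float)
    observed = ~np.isnan(values)
    # np.interp linearly interpolates at the query times t from the observed points.
    return np.interp(t, t[observed], values[observed])

t = np.arange(5)
series = [1.0, np.nan, 3.0, np.nan, 5.0]
print(linear_fill(t, series))  # [1. 2. 3. 4. 5.]
```

For data generated by a diffusion, such straight-line fills systematically understate the path's variability between observations, which is the gap a stochastic decensoring method targets.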

Characterizing the Universal Approximation Property [article]

Anastasis Kratsios
2020 arXiv   pre-print
To better understand the approximation capabilities of various currently available neural network architectures, this paper studies the universal approximation property itself across a broad scope of function spaces. We characterize universal approximators, on most function spaces of practical interest, as implicitly decomposing that space into topologically regular subspaces on which a transitive dynamical system describes the architecture's structure. We obtain a simple criterion for constructing universal approximators as transformations of the feed-forward architecture, and we show that every architecture, on most function spaces of practical interest, is approximately constructed in this way. Moreover, we show that most function spaces admit universal approximators built using a single function. The results are used to show that certain activation functions, such as Leaky-ReLU but not ReLU, create expressibility through depth by eventually mixing any two functions' open neighbourhoods. For those activation functions, we obtain improved approximation rates described in terms of the network breadth and depth. We show that feed-forward networks built using such activation functions can encode constraints into their final layers while simultaneously maintaining their universal approximation capabilities. We construct a modification of the feed-forward architecture, which can approximate any continuous function, with a controlled growth rate, uniformly on the entire domain space, and we show that the feed-forward architecture typically cannot.
arXiv:1910.03344v3 fatcat:j3mhfpj3mbf35iz2ypotvyt5de
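The contrast drawn above between Leaky-ReLU and ReLU rests on an elementary fact that is easy to verify: Leaky-ReLU is strictly increasing, hence invertible on all of ℝ (so applying it never collapses open neighbourhoods), whereas ReLU maps every negative input to 0. A quick check:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    return np.where(x >= 0.0, x, alpha * x)

def leaky_relu_inv(y, alpha=0.01):
    """Leaky-ReLU is strictly increasing, hence invertible on all of R."""
    return np.where(y >= 0.0, y, y / alpha)

x = np.array([-2.0, -0.5, 0.0, 1.5])
assert np.allclose(leaky_relu_inv(leaky_relu(x)), x)  # invertible everywhere
print(relu(np.array([-2.0, -0.5])))  # both negatives collapse to 0.0: not injective
```

This is only the pointwise mechanism behind the theorem, of course; the abstract's claim is about how this invertibility propagates through depth.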

Do ReLU Networks Have An Edge When Approximating Compactly-Supported Functions? [article]

Anastasis Kratsios, Behnoosh Zamanlooy
2022 arXiv   pre-print
Bibliography: Beatrice Acciaio, Anastasis Kratsios, and Gudmund Pammer. Metric hypertransformers are universal adapted maps. arXiv preprint arXiv:2201.13094, 2022. ... Kratsios & Bilokopytov (2020); Siegel & Xu (2020); Kratsios & Hyndman (2021); Kratsios et al. (2022); Yarotsky ( , Proposition 1.7 (i)) we have Lip(F) ≤ c log_2(cap(X)) Lip(f). (12) By Jung's Theorem, there ... Another example of a non-metric universal approximation theorem in the deep learning literature is the universal classification result of (Kratsios & Bilokopytov, 2020, Corollary 3.12). ...
arXiv:2204.11231v2 fatcat:yzbiwihjyvhvfd4c4rq42rxzba
Showing results 1 — 15 out of 31 results