Universal Regular Conditional Distributions [article]

Anastasis Kratsios
2023 arXiv pre-print
We introduce a deep learning model that can universally approximate regular conditional distributions (RCDs). The proposed model operates in three phases: first, it linearizes inputs from a given metric space 𝒳 to ℝ^d via a feature map; next, a deep feedforward neural network processes the linearized features; finally, the network's outputs are transformed to the 1-Wasserstein space 𝒫_1(ℝ^D) via a probabilistic extension of the attention mechanism of Bahdanau et al. (2014). Our model, called the probabilistic transformer (PT), can approximate any continuous function from ℝ^d to 𝒫_1(ℝ^D) uniformly on compact sets, quantitatively. We identify two ways in which the PT avoids the curse of dimensionality when approximating 𝒫_1(ℝ^D)-valued functions. The first strategy builds functions in C(ℝ^d,𝒫_1(ℝ^D)) which can be efficiently approximated by a PT, uniformly on any given compact subset of ℝ^d. In the second approach, given any function f in C(ℝ^d,𝒫_1(ℝ^D)), we build compact subsets of ℝ^d whereon f can be efficiently approximated by a PT.
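The three-phase architecture described in the abstract can be illustrated with a short sketch. The following is a minimal PyTorch version, assuming the feature map is a learned affine embedding and the probabilistic attention head outputs softmax weights over a trainable set of atoms in ℝ^D, so the model's output is a finitely supported measure in 𝒫_1(ℝ^D). The class name `ProbabilisticTransformer`, the layer sizes, and the choice of a linear feature map are illustrative assumptions, not the paper's reference implementation.

```python
# Minimal sketch of the three-phase probabilistic transformer (PT) architecture.
# Assumptions (not from the paper's code): linear feature map, two-hidden-layer MLP,
# and a fixed number of trainable atoms supporting the output measure.
import torch
import torch.nn as nn


class ProbabilisticTransformer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int, n_atoms: int = 32, hidden: int = 64):
        super().__init__()
        # Phase 1: feature map linearizing inputs into a Euclidean feature space.
        self.feature_map = nn.Linear(in_dim, hidden)
        # Phase 2: deep feedforward network acting on the linearized features.
        self.mlp = nn.Sequential(
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_atoms),
        )
        # Atoms y_1, ..., y_N in R^D supporting the output probability measure.
        self.atoms = nn.Parameter(torch.randn(n_atoms, out_dim))

    def forward(self, x: torch.Tensor):
        # Phase 3: probabilistic attention -- softmax weights w(x) over the atoms,
        # representing the measure sum_n w_n(x) * delta_{y_n} in P_1(R^D).
        weights = torch.softmax(self.mlp(self.feature_map(x)), dim=-1)
        return weights, self.atoms


# Usage: the returned (weights, atoms) pair describes a discrete measure per input;
# it can be trained against target empirical measures with a 1-Wasserstein-type loss
# (e.g. via the POT library or a Sinkhorn approximation).
model = ProbabilisticTransformer(in_dim=3, out_dim=2)
w, atoms = model(torch.randn(5, 3))
print(w.shape, atoms.shape)  # (5, 32) mixture weights, (32, 2) shared atoms
```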
arXiv:2105.07743v5