IA Scholar Query: Analysis of the Spectral Properties of the Radon Transform for the Design of Optimal Sampling Grids.
https://scholar.archive.org/
Internet Archive Scholar query results feed (info@archive.org), Thu, 11 Aug 2022

Theory of Deep Learning: Neural Tangent Kernel and Beyond
https://scholar.archive.org/work/tebuyad2ijgbde2twqk5tbtl3u
In recent years, Deep Neural Networks (DNNs) have managed to succeed at tasks that previously appeared impossible, such as human-level object recognition, text synthesis, translation, playing games, and many more. In spite of these major achievements, our understanding of these models, in particular of what happens during their training, remains very limited. This PhD started with the introduction of the Neural Tangent Kernel (NTK) to describe the evolution of the function represented by the network during training. In the infinite-width limit, i.e. when the number of neurons in the layers of the network grows to infinity, the NTK converges to a deterministic and time-independent limit, leading to a simple yet complete description of the dynamics of infinitely-wide DNNs. This enabled the first general proof of convergence of DNNs to a global minimum, and yielded the first description of the limiting spectrum of the Hessian of the loss surface of DNNs throughout training. More importantly, the NTK plays a crucial role in describing the generalization abilities of DNNs, i.e. the performance of the trained network on unseen data. The NTK analysis uncovered a direct link between the function learned by infinitely-wide DNNs and Kernel Ridge Regression (KRR) predictors, whose generalization properties are studied in this thesis using tools of random matrix theory. Our analysis of KRR reveals the importance of the eigendecomposition of the NTK, which is affected by a number of architectural choices. In very deep networks, an ordered regime and a chaotic regime appear, determined by the choice of non-linearity and the balance between the weight and bias parameters; these two phases are characterized by different speeds of decay of the eigenvalues of the NTK, leading to a tradeoff between convergence speed and generalization.
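The spectral filtering behind this eigenvalue-decay tradeoff can be illustrated with a small numerical sketch. The snippet below is my own toy illustration, not code from the thesis: an RBF kernel and synthetic data stand in for the NTK, and the KRR predictor's eigendecomposition shows how each eigencomponent of the data is shrunk by eigval / (eigval + n*lambda), so faster eigenvalue decay means stronger smoothing of high-frequency components.

```python
import numpy as np

# Toy illustration of kernel ridge regression (KRR) and the role of the
# kernel eigendecomposition. The RBF kernel here is a stand-in for the NTK;
# data, bandwidth and ridge parameter are placeholders.
rng = np.random.default_rng(0)

def rbf_kernel(X, Y, bandwidth=1.0):
    d2 = (X[:, None] - Y[None, :]) ** 2
    return np.exp(-d2 / (2 * bandwidth ** 2))

# Training data: noisy samples of a smooth target function.
X = np.linspace(-3, 3, 40)
y = np.sin(X) + 0.1 * rng.standard_normal(40)

K = rbf_kernel(X, X)
lam = 1e-2  # ridge parameter

# KRR predictor: f(x) = k(x, X) @ (K + n*lam*I)^{-1} y
alpha = np.linalg.solve(K + len(X) * lam * np.eye(len(X)), y)

# Eigendecomposition of K: the fitted values K @ alpha shrink each
# eigencomponent of y by eigval / (eigval + n*lam), so the decay speed of
# the eigenvalues controls how aggressively high frequencies are smoothed.
eigvals, eigvecs = np.linalg.eigh(K)
shrinkage = eigvals / (eigvals + len(X) * lam)

X_test = np.linspace(-3, 3, 100)
f_test = rbf_kernel(X_test, X) @ alpha
```

Components aligned with large eigenvalues pass through almost unchanged (shrinkage near 1), while components in the tail of the spectrum, where noise typically lives, are suppressed.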
In practical contexts such as Generative Adversarial Networks or Topology Optimization, the network architecture can be chosen to guarantee certain properties of the NTK and its spectrum. These results give an almost complete description of infinitely-wide DNNs in the NTK regime. It is then natural to wonder how this picture extends to the finite-width networks used in practice. In the NTK regime, the discrepancy between finite- and infinite-width DNNs is mainly a result of the variance with respect to the sampling of the parameters, as shown empirically and mathematically, relying on the similarity between DNNs and random feature models. In contrast to the NTK regime, where the NTK remains constant during training, there exist so-called active regimes, where the evolution of the NTK is significant, and which appear in a number of settings. We describe one such regime in Deep Linear Networks with a very small initialization, where the training dynamics approach a sequence of saddle points, representing linear maps of increasing rank, leading to a low-rank bias which is absent in the NTK regime.

Arthur Ulysse Jacot-Guillarmod. Thu, 11 Aug 2022.

Geometric Methods for Sampling, Optimisation, Inference and Adaptive Agents
https://scholar.archive.org/work/cipvt3b4qjacjexqwwdhptlmfm
In this chapter, we identify fundamental geometric structures that underlie the problems of sampling, optimisation, inference and adaptive decision-making. Based on this identification, we derive algorithms that exploit these geometric structures to solve these problems efficiently. We show that a wide range of geometric theories emerge naturally in these fields, ranging from measure-preserving processes and information divergences to Poisson geometry and geometric integration. Specifically, we explain how (i) leveraging the symplectic geometry of Hamiltonian systems enables us to construct (accelerated) sampling and optimisation methods, (ii) the theory of Hilbertian subspaces and Stein operators provides a general methodology to obtain robust estimators, and (iii) preserving the information geometry of decision-making yields adaptive agents that perform active inference. Throughout, we emphasise the rich connections between these fields; e.g., inference draws on sampling and optimisation, and adaptive decision-making assesses decisions by inferring their counterfactual consequences. Our exposition provides a conceptual overview of the underlying ideas rather than a technical discussion, which can be found in the references herein.

Alessandro Barp, Lancelot Da Costa, Guilherme França, Karl Friston, Mark Girolami, Michael I. Jordan, Grigorios A. Pavliotis. Mon, 25 Jul 2022.

Surface and borehole geophysical analysis of structures within the Callide Basin, eastern Central Queensland
https://scholar.archive.org/work/dlfqfjg3tvgthap2kwueadza7a
Traditional geophysical techniques, such as electrical, magnetic, seismic and gamma spectroscopic methods, have been deployed across the Callide Basin, Eastern Central Queensland, with the intent of delineating basin-wide structures. Further, innovative surface and borehole geophysical techniques have been applied for coal mine-scale exploration and production with the intention of reducing global geological ambiguity and optimising exploration resources at Callide Coalfields. A very low frequency electromagnetic surface impedance mapping method, the SIROLOG downhole technique, acoustic scanning, electromagnetic tomography and full wave-form sonic borehole logging have been trialed for geological hazard and mine design applications at Callide Coalfields as the precursor to their wider application and acceptance in the Australian coal industry. In this thesis, the theoretical basis for these techniques is provided. However, more importantly, the case studies presented demonstrate the role that these geophysical techniques have played in identifying geological structures critical to mining. Reverse faults that daylight in highwalls and intrusions constitute geological hazards that affect safety, costs and scheduling in mining operations. Identification of the limit of oxidation of coal seams (coal subcrop) is critical in mine design.
During the course of this thesis, the application of geophysical techniques resulted in: a) a major structure (the "Trap Gully Monocline") being redefined from its original interpretation as a normal fault to a monocline that is stress-relieved by minor-scale thrust faulting; b) the discovery of two previously unidentified intrusions (the Kilburnie "Homestead" plug and The Hut "Crater" plug) that impinge on mining; c) the delineation of two coal subcrop lines, resulting in the discovery of an additional 1.5 million tonnes of coal reserve at Boundary Hill mine and the successful redesign of mining strips at The Hut Central Valley and Eastern Hillside brownfield sites; and d) [...]

Wesley James Foi Nichols. Tue, 19 Jul 2022.

Efficient approximation of high-dimensional exponentials by tensor networks
https://scholar.archive.org/work/ksn4rx4tdjf5fawafnkj7knk7m
In this work a general approach to compute a compressed representation of the exponential exp(h) of a high-dimensional function h is presented. Such exponential functions play an important role in several problems in Uncertainty Quantification, e.g. the approximation of log-normal random fields or the evaluation of Bayesian posterior measures. Usually, these high-dimensional objects are intractable numerically and can only be accessed pointwise in sampling methods. In contrast, the proposed method constructs a functional representation of the exponential by exploiting its nature as a solution of an ordinary differential equation. The application of a Petrov–Galerkin scheme to this equation provides a tensor train representation of the solution for which we derive an efficient and reliable a posteriori error estimator. Numerical experiments with a log-normal random field and a Bayesian likelihood illustrate the performance of the approach in comparison to other recent low-rank representations for the respective applications. Although the present work considers only a specific differential equation, the presented method can be applied in a more general setting. We show that the composition of a generic holonomic function and a high-dimensional function corresponds to a differential equation that can be used in our method. Moreover, the differential equation can be modified to adapt the norm in the a posteriori error estimates to the problem at hand.

Martin Eigel, Nando Farchmin, Sebastian Heidenreich, Philipp Trunschke. Mon, 18 Jul 2022.

Greedy Training Algorithms for Neural Networks and Applications to PDEs
https://scholar.archive.org/work/njh6brufnzbqdbxo7lrpotahhy
Recently, neural networks have been widely applied to solving partial differential equations (PDEs). However, with current training algorithms the numerical convergence of neural networks when solving PDEs has not been empirically observed. The primary difficulty lies in solving the highly non-convex optimization problems resulting from the neural network discretization. Theoretically analyzing the optimization process presents significant difficulties, and empirical experiments require extensive hyperparameter tuning to achieve acceptable results. To overcome this challenge, we develop a novel greedy training algorithm for shallow neural networks in this paper. We also analyze the resulting method and obtain a priori error bounds when solving PDEs from the function class defined by shallow networks. This rigorously establishes the convergence of the method as the network size increases. Finally, we test the algorithm on several benchmark examples, including high-dimensional PDEs, to confirm the theoretical convergence rate and to establish its efficiency and robustness. An advantage of this method is its straightforward applicability to high-order equations on general domains.

Jonathan W. Siegel, Qingguo Hong, Xianlin Jin, Wenrui Hao, Jinchao Xu. Thu, 14 Jul 2022.

A Spectral Representation of Kernel Stein Discrepancy with Application to Goodness-of-Fit Tests for Measures on Infinite Dimensional Hilbert Spaces
https://scholar.archive.org/work/uuz2ejljd5enpfnq7gnvypd3wq
Kernel Stein discrepancy (KSD) is a widely used kernel-based measure of discrepancy between probability measures. It is often employed in the scenario where a user has a collection of samples from a candidate probability measure and wishes to compare them against a specified target probability measure. A useful property of KSD is that it may be calculated with samples from only the candidate measure and without knowledge of the normalising constant of the target measure. KSD has been employed in a range of settings including goodness-of-fit testing, parametric inference, MCMC output assessment and generative modelling. Two main issues with current KSD methodology are (i) the lack of applicability beyond the finite dimensional Euclidean setting and (ii) a lack of clarity on what influences KSD performance. This paper provides a novel spectral representation of KSD which remedies both of these, making KSD applicable to Hilbert-valued data and revealing the impact of kernel and Stein operator choice on the KSD. We demonstrate the efficacy of the proposed methodology by performing goodness-of-fit tests for various Gaussian and non-Gaussian functional models in a number of synthetic data experiments.

George Wynne, Mikołaj Kasprzak, Andrew B. Duncan. Thu, 14 Jul 2022.

Bridging Mean-Field Games and Normalizing Flows with Trajectory Regularization
https://scholar.archive.org/work/fspbmmbponcbdfv342yvuzs4de
Mean-field games (MFGs) are a modeling framework for systems with a large number of interacting agents. They have applications in economics, finance, and game theory. Normalizing flows (NFs) are a family of deep generative models that compute data likelihoods by using an invertible mapping, typically parameterized by neural networks. They are useful for density modeling and data generation. While active research has been conducted on both models, few have noted the relationship between the two. In this work, we unravel the connections between MFGs and NFs by contextualizing the training of an NF as solving an MFG. This is achieved by reformulating the MFG problem in terms of agent trajectories and parameterizing a discretization of the resulting MFG with flow architectures. With this connection, we explore two research directions. First, we employ expressive NF architectures to accurately solve high-dimensional MFGs, sidestepping the curse of dimensionality in traditional numerical methods. Compared with other deep learning approaches, our trajectory-based formulation encodes the continuity equation in the neural network, resulting in a better approximation of the population dynamics. Second, we regularize the training of NFs with transport costs and demonstrate its effectiveness in controlling the model's Lipschitz bound, resulting in better generalization performance. We demonstrate numerical results through comprehensive experiments on a variety of synthetic and real-life datasets.

Han Huang, Jiajia Yu, Jie Chen, Rongjie Lai. Thu, 30 Jun 2022.

Dissecting U-net for Seismic Application: An In-Depth Study on Deep Learning Multiple Removal
https://scholar.archive.org/work/7hc45wcjnzadpcariu5qdlto74
Seismic processing often requires suppressing the multiples that appear when collecting data. To tackle these artifacts, practitioners usually rely on Radon transform-based algorithms as post-migration gather conditioning. However, such traditional approaches are both time-consuming and parameter-dependent, making them fairly complex. In this work, we present a deep learning-based alternative that provides competitive results while being simpler to use, thereby democratizing its applicability. We observe excellent performance of our network on complex field data, despite it being trained solely on synthetics. Furthermore, extensive experiments show that our proposal can preserve the inherent characteristics of the data, avoiding undesired over-smoothed results, while removing the multiples. Finally, we conduct an in-depth analysis of the model, relating the effects of the main hyperparameters to physical events. To the best of our knowledge, this study pioneers the unboxing of neural networks for the demultiple process, helping the user gain insight into the inner workings of the network.

Ricard Durall, Ammar Ghanim, Norman Ettrich, Janis Keuper. Fri, 24 Jun 2022.

Simulation Based Software and Hardware Development for the Active Reduction of Muon Induced Background in the Liquid Scintillator Detectors JUNO and OSIRIS
https://scholar.archive.org/work/ye3uflygdfe5rmohologwl3nfq
The Jiangmen Underground Neutrino Observatory (JUNO) is a 20 kton liquid scintillator neutrino detector currently under construction in southern China. Its main goal is the determination of the neutrino mass ordering by measuring the energy spectrum of reactor electron antineutrinos from nearby nuclear power plants. In addition, JUNO pursues a broad physics program by observing neutrinos from terrestrial and extra-terrestrial sources, including the sun, supernovae, the atmosphere and potentially dark matter. Besides the unprecedented energy resolution of 3%/sqrt(E_vis/MeV), a detailed understanding and reduction of background signals in the detector is necessary. Among the main background sources are muon-induced cosmogenic isotopes and radioactive contamination in the liquid scintillator. The first part of this thesis is dedicated to the rejection of cosmogenic background produced in events with multiple muons in the detector. For this event type, a dedicated algorithm is developed which aims to reconstruct the muon tracks with high precision and reliability. The performance of the algorithm is tested with simulated muon events, and it is demonstrated that the method can contribute substantially to the rejection of cosmogenic background. In the second part of the thesis, the estimation of cosmogenic background and the development of a muon veto system for the JUNO pre-detector OSIRIS (Online Scintillator Internal Radioactivity Investigation System) are presented. The purpose of this sub-system is the monitoring of radioactive contamination in the liquid scintillator before it is filled into the JUNO detector. As part of this thesis, the OSIRIS simulation framework is extended to address the production of cosmogenic isotopes in the detector, important features of the detector geometry and the components of the veto system.
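The muon track reconstruction described above can be illustrated generically. The sketch below is a simplified illustration of my own, not the thesis's multi-muon algorithm: a single straight, through-going muon track is fit to smeared 3D hit positions by a least-squares line fit, using the top singular vector of the centred hits as the track direction.

```python
import numpy as np

# Generic illustration (not the algorithm from the thesis): fit a straight
# muon track to 3D hit positions. A through-going muon is well modelled as
# a line; SVD of the centred hits gives the least-squares track direction.
rng = np.random.default_rng(1)

true_dir = np.array([0.3, -0.2, 1.0])
true_dir = true_dir / np.linalg.norm(true_dir)
true_point = np.array([0.5, 1.0, 0.0])

# Simulated hits: points along the track plus position smearing.
t = np.linspace(-10, 10, 200)
hits = true_point + t[:, None] * true_dir + 0.05 * rng.standard_normal((200, 3))

def fit_track(hits):
    """Least-squares line fit: returns (point on track, unit direction)."""
    centroid = hits.mean(axis=0)
    _, _, vt = np.linalg.svd(hits - centroid)
    direction = vt[0]
    if direction[2] < 0:          # fix an orientation convention
        direction = -direction
    return centroid, direction

point, direction = fit_track(hits)
```

A real multi-muon event would first require clustering hits into per-track subsets; this sketch only covers the single-track fit step.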
Based on comprehensive muon simulations, it is found that muon induced background contributes significantly to the OSIRIS sensitivity and its active rejection with a veto system [...]

Axel Müller, Universität Tübingen, Tobias Lachenmaier (Prof. Dr.). Tue, 21 Jun 2022.

Research Advancements in Key Technologies for Space-Based Situational Awareness
https://scholar.archive.org/work/uwwokahz4bchjeeugth2ct5j4u
The space environment has become highly congested due to increasing space debris, seriously threatening the safety of orbiting spacecraft. Space-based situational awareness, as a comprehensive capability of threat knowledge, analysis, and decision-making, is of significant importance for ensuring space security and maintaining normal order. Various space situational awareness systems have been designed and launched. Data acquisition, target recognition, and monitoring constitute the key technologies and make major contributions, with various advanced algorithms explored as technical supports. However, comprehensive reviews of these technologies and their specific algorithms rarely emerge, which hampers the future development of space situational awareness. This paper therefore reviews and analyzes research advancements in key technologies for space-based situational awareness, emphasizing target recognition and monitoring. Many mature and emerging methods are presented for these technologies, along with their application advantages and limitations. In particular, multi-agent and synergetic constellation technologies are identified as promising research prospects for future situational awareness. This paper indicates future directions for the key technologies, aiming to provide references for space-based situational awareness to realize space sustainability.

Beichao Wang, Shuang Li, Jinzhen Mu, Xiaolong Hao, Wenshan Zhu, Jiaqian Hu. Sat, 18 Jun 2022.

Bayesian Fixed-domain Asymptotics for Covariance Parameters in a Gaussian Process Model
https://scholar.archive.org/work/wi3m43pzn5fblhvut64pnxrdy4
Gaussian process models typically contain finite-dimensional parameters in the covariance function that need to be estimated from the data. We study the Bayesian fixed-domain asymptotics for the covariance parameters in a universal kriging model with an isotropic Matérn covariance function, which has many applications in spatial statistics. We show that when the dimension of the domain is less than or equal to three, the joint posterior distribution of the microergodic parameter and the range parameter can be factored independently into the product of their marginal posteriors under fixed-domain asymptotics. The posterior of the microergodic parameter is asymptotically close in total variation distance to a normal distribution with shrinking variance, while the posterior distribution of the range parameter does not converge to any point mass distribution in general. Our theory allows an unbounded prior support for the range parameter and flexible designs of sampling points. We further study the asymptotic efficiency and convergence rates of posterior prediction for the Bayesian kriging predictor with covariance parameters randomly drawn from their posterior distribution. In the special case of the one-dimensional Ornstein-Uhlenbeck process, we derive explicitly the limiting posterior of the range parameter and the posterior convergence rate for asymptotic efficiency in posterior prediction. We verify these asymptotic results in numerical experiments.

Cheng Li. Tue, 14 Jun 2022.

Scalability and robustness of spectral embedding: landmark diffusion is all you need
https://scholar.archive.org/work/e4litrwiuncoxe3k4iwrrmsaju
While spectral embedding is a widely applied dimension reduction technique in various fields, it remains challenging to make it scalable to "big data". Its robustness, on the other hand, is less explored, and only limited theoretical results exist. Motivated by the need to handle such data, we recently proposed a novel spectral embedding algorithm, which we named Robust and Scalable Embedding via Landmark Diffusion (Roseland). In short, we measure the affinity between two points via a set of landmarks, composed of a small number of points, and "diffuse" on the dataset via the landmark set to achieve a spectral embedding. Roseland can be viewed as a generalization of the commonly applied spectral embedding algorithm, the diffusion map (DM), in the sense that it shares various properties of DM. In this paper, we show that Roseland is not only numerically scalable but also preserves geometric properties via its diffusion nature under the manifold setup; that is, we theoretically explore the asymptotic behavior of Roseland under the manifold setup, including handling the U-statistics-like quantities, and provide an L^∞ spectral convergence result with a rate. Moreover, we offer a high-dimensional noise analysis and show that Roseland is robust to noise. We also compare Roseland with other existing algorithms in numerical simulations.

Chao Shen, Hau-Tieng Wu. Fri, 10 Jun 2022.

2022 Review of Data-Driven Plasma Science
https://scholar.archive.org/work/22ybajli5fa33dpewomz6hi7cu
Data science and technology offer transformative tools and methods to science. This review article highlights the latest developments and progress in the interdisciplinary field of data-driven plasma science (DDPS). A large amount of data and machine learning algorithms go hand in hand. Most plasma data, whether experimental, observational or computational, are generated or collected by machines today. It is now becoming impractical for humans to analyze all the data manually. Therefore, it is imperative to train machines to analyze and interpret (eventually) such data as intelligently as humans but far more efficiently in quantity. Despite the recent impressive progress in applications of data science to plasma science and technology, the emerging field of DDPS is still in its infancy. Fueled by some of the most challenging problems such as fusion energy, plasma processing of materials, and fundamental understanding of the universe through observable plasma phenomena, it is expected that DDPS will continue to benefit significantly from the interdisciplinary marriage between plasma science and data science into the foreseeable future.

Rushil Anirudh, Rick Archibald, M. Salman Asif, Markus M. Becker, Sadruddin Benkadda, Peer-Timo Bremer, Rick H.S. Budé, C.S. Chang, Lei Chen, R. M. Churchill, Jonathan Citrin, Jim A Gaffney, Ana Gainaru, Walter Gekelman, Tom Gibbs, Satoshi Hamaguchi, Christian Hill, Kelli Humbird, Sören Jalas, Satoru Kawaguchi, Gon-Ho Kim, Manuel Kirchen, Scott Klasky, John L. Kline, Karl Krushelnick, Bogdan Kustowski, Giovanni Lapenta, Wenting Li, Tammy Ma, Nigel J. Mason, Ali Mesbah, Craig Michoski, Todd Munson, Izumi Murakami, Habib N. Najm, K. Erik J. Olofsson, Seolhye Park, J. Luc Peterson, Michael Probst, Dave Pugmire, Brian Sammuli, Kapil Sawlani, Alexander Scheinker, David P. Schissel, Rob J. Shalloo, Jun Shinagawa, Jaegu Seong, Brian K. Spears, Jonathan Tennyson, Jayaraman Thiagarajan, Catalin M. Ticoş, Jan Trieschmann, Jan van Dijk, Brian Van Essen, Peter Ventzek, Haimin Wang, Jason T. L. Wang, Zhehui Wang, Kristian Wende, Xueqiao Xu, Hiroshi Yamada, Tatsuya Yokoyama, Xinhua Zhang. Tue, 31 May 2022.

Do price trajectory data increase the efficiency of market impact estimation?
https://scholar.archive.org/work/zncctkbs25blvo2hprmxoonvjm
Market impact is an important problem faced by large institutional investors and active market participants. In this paper, we rigorously investigate whether price trajectory data from the metaorder increase the efficiency of estimation, from an asymptotic view of statistical estimation. Perhaps surprisingly, we show that, for popular market models, partial price trajectory data, or even just two points, can outperform established estimation methods (e.g., VWAP-based) asymptotically. We provide theoretical and empirical analyses of this phenomenon, which could be incorporated into practice.

Fengpei Li, Vitalii Ihnatiuk, Ryan Kinnear, Anderson Schneider, Yuriy Nevmyvaka. Thu, 26 May 2022.

Multiscale derivation, analysis and simulation of collective dynamics models: geometrical aspects and applications
https://scholar.archive.org/work/xmg4xzvixfbhdooukuq2wv5bki
This thesis is a contribution to the study of swarming phenomena from the point of view of mathematical kinetic theory. This multiscale approach starts from stochastic individual based (or particle) models and aims at the derivation of partial differential equation models on statistical quantities when the number of particles tends to infinity. This latter class of models is better suited for mathematical analysis in order to reveal and explain large-scale emerging phenomena observed in various biological systems such as flocks of birds or swarms of bacteria. Within this objective, a large part of this thesis is dedicated to the study of a body-attitude coordination model and, through this example, of the influence of geometry on self-organisation. The first part of the thesis deals with the rigorous derivation of partial differential equation models from particle systems with mean-field interactions. After a review of the literature, in particular on the notion of propagation of chaos, a rigorous convergence result is proved for a large class of geometrically enriched piecewise deterministic particle models towards local BGK-type equations. In addition, the method developed is applied to the design and analysis of a new particle-based algorithm for sampling. This first part also addresses the question of the efficient simulation of particle systems using recent GPU routines. The second part of the thesis is devoted to kinetic and fluid models for body-oriented particles. The kinetic model is rigorously derived as the mean-field limit of a particle system. In the spatially homogeneous case, a phase transition phenomenon is investigated which discriminates, depending on the parameters of the model, between a "disordered" dynamics and a self-organised "ordered" dynamics. The fluid (or macroscopic) model was derived as the hydrodynamic limit of the kinetic model a few years ago by Degond et al. 
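The kind of stochastic individual-based model such multiscale derivations start from can be sketched in a few lines. The snippet below is a generic Vicsek-type alignment model of my own, not the body-attitude coordination model studied in the thesis: each particle relaxes its heading towards the mean heading of its neighbours plus angular noise, and a polarisation order parameter rises from near zero to near one as the "ordered" phase sets in.

```python
import numpy as np

# Minimal Vicsek-type alignment sketch: the prototypical individual-based
# model behind kinetic derivations of collective dynamics. (Generic
# illustration, not the body-attitude model; boundary handling and the
# non-periodic distance are simplifications.)
rng = np.random.default_rng(2)

n, steps, dt = 200, 300, 0.05
radius, noise = 0.5, 0.1               # interaction radius, noise strength
pos = rng.uniform(0, 2, size=(n, 2))   # particles in a 2 x 2 box
theta = rng.uniform(-np.pi, np.pi, size=n)

def order_parameter(theta):
    """Polarisation in [0, 1]: 1 means perfect alignment."""
    return float(np.abs(np.mean(np.exp(1j * theta))))

start_order = order_parameter(theta)
for _ in range(steps):
    # mean heading over neighbours within the interaction radius
    d2 = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1)
    nbr = d2 < radius ** 2
    mean_sin = (nbr * np.sin(theta)[None, :]).sum(1) / nbr.sum(1)
    mean_cos = (nbr * np.cos(theta)[None, :]).sum(1) / nbr.sum(1)
    target = np.arctan2(mean_sin, mean_cos)
    # align to the local mean heading, plus angular noise
    theta = target + noise * np.sqrt(dt) * rng.standard_normal(n)
    pos = (pos + dt * np.stack([np.cos(theta), np.sin(theta)], -1)) % 2.0

end_order = order_parameter(theta)
```

Increasing the noise strength relative to the alignment would instead keep the order parameter low, the "disordered" side of the phase transition mentioned above.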
The analytical and numerical study of this model reveals the existence of new self-organised phenomena which are confirmed a [...]

Antoine Diez, Pierre Degond, Sara Merino-Aceituno, EPSRC. Wed, 25 May 2022.

Adaptive and non-intrusive uncertainty quantification for high-dimensional parametric PDEs
https://scholar.archive.org/work/ovslbteoh5fajgq5hrfi57ybwe
This thesis concerns the combination of dependable error control and data-based approximation to derive non-intrusive and reliable algorithms for uncertainty quantification in forward and inverse problems. In particular, Bayesian inverse problems subject to high-dimensional parametric forward models driven by partial differential equations are the main focus. Access to stochastic moments or marginals of the Bayesian posterior typically requires many evaluations of the time-consuming forward model due to slow-converging sampling methods or high-dimensional numerical quadrature. Alternative surrogate modelling of the forward process through polynomial expansions succumbs to the curse of dimensionality, i.e. the exponential dependence of the number of expansion terms with respect to the parameter dimension. Hierarchical tensor representations, in particular the Tensor Train format, are employed to alleviate this exponential scaling under the assumption of a low-rank representability of the sought functions. To reduce the computational complexity of the forward model even further, adaptive strategies based on a posteriori error estimators are employed and investigated with regard to solvability, convergence and stability in the low-rank tensor format. The non-intrusiveness of the presented methods is ensured by linear and non-linear regression techniques that solely rely on pointwise evaluations of the parametric forward model and are equivalent to a Galerkin approximation with high probability if enough samples are used. Both the Bayesian framework under the assumption of Gaussian noise and the computation of the error estimator for the case of a lognormal coefficient field involve the exponentiation of functions naturally accessible in the hierarchical tensor format. For those exponentials, a Galerkin-type method, which yields computable upper and lower bounds for the approximation error for any discrete function in an induced energy norm, is derived.
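The non-intrusive regression idea, in which surrogate coefficients are obtained from pointwise evaluations of the forward model alone, can be sketched in one parameter dimension. This is a toy stand-in of my own for the high-dimensional tensor-train setting; the closed-form "forward model" below is hypothetical, standing in for an expensive parametric PDE solve.

```python
import numpy as np

# Non-intrusive surrogate modelling by regression: approximate a parametric
# forward model u(y) from pointwise evaluations only, via least squares on
# a Legendre polynomial basis. With enough random samples this approximates
# the Galerkin (L2-orthogonal) projection coefficients with high
# probability. (Toy 1-parameter illustration.)
rng = np.random.default_rng(3)

def forward_model(y):
    # an expensive PDE solve stands behind this in practice
    return np.exp(0.8 * y) * np.sin(2.0 * y)

degree, n_samples = 12, 400
y_train = rng.uniform(-1, 1, n_samples)
u_train = forward_model(y_train)

# Design matrix: columns are Legendre polynomials P_0..P_degree at samples.
V = np.polynomial.legendre.legvander(y_train, degree)
coeffs, *_ = np.linalg.lstsq(V, u_train, rcond=None)

def surrogate(y):
    return np.polynomial.legendre.legval(y, coeffs)

y_test = np.linspace(-1, 1, 200)
rel_err = (np.max(np.abs(surrogate(y_test) - forward_model(y_test)))
           / np.max(np.abs(forward_model(y_test))))
```

Once fitted, the surrogate replaces the forward model in sampling or quadrature; in the thesis's setting the flat coefficient vector is replaced by a low-rank tensor-train parametrization over many parameters.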
These results are applied to two different types of real-world par [...]

Nando Farchmin, Technische Universität Berlin, Reinhold Schneider, Martin Eigel, Sebastian Heidenreich. Tue, 17 May 2022.

Topics in conditional causal inference
https://scholar.archive.org/work/bteo4fapincghduqlfm3w3mqo4
With the growth of complex experimental designs and large-scale observational data, causal questions arising in applications are now more targeted and precise. For example, one might ask if the treatment is effective at a particular time point, or if the treatment is effective for a particular individual. To answer many questions of this kind, this thesis concerns conditional causal inference, generally referring to techniques of constructing or interpreting conditional distributions or expectations for inference about a causal effect of interest. This thesis consists of five chapters. In Chapter 1, we first review the classical potential outcomes framework and some basic causal inference methods relevant to this thesis, then provide a summary of the problems and methods studied in the following chapters. In Chapter 2, we consider testing causal effects in complex experimental designs via conditional randomization tests (CRTs). The CRTs we define are randomization tests conditioning on a subset of treatment assignments tailored to the effect of interest. Because many potential outcomes are missing in complex designs, a single CRT is rarely powerful. We develop a general theory for constructing multiple jointly valid CRTs in arbitrary designs. Following this theory, we propose practical methods that can collect and combine statistical evidence in different parts of an experiment to test a global effect of interest. Under a general framework of CRTs, we connect and discuss randomization tests developed for different statistical problems in the literature, which may be of independent interest. The following three chapters concern the problem of estimating conditional average treatment effects (CATEs). CATEs quantify individual-level treatment effects by conditioning on individual covariates. In Chapter 3, we consider estimating CATEs in the presence of high-dimensional covariates. 
We propose a neural network-based dimensionality reduction method that can transform high-dimensional covariates into a low-dimensional a [...]
Yao Zhang, Apollo - University of Cambridge Repository, Mihaela van der Schaar. work_bteo4fapincghduqlfm3w3mqo4. Tue, 17 May 2022.

Machine Learning in Inverse Problems - Learning Regularisation Functionals and Operator Corrections
https://scholar.archive.org/work/6cn2pd2btrfvnouc4jmw6gjawa
In this thesis, we investigate properties of deep neural networks and their application to inverse problems. A successful classical approach to inverse problems is variational regularisation, combining knowledge and modelling of the imaging modality at hand with a regularisation functional that incorporates prior knowledge about solutions to the inverse problem. Following the success of deep neural networks in imaging tasks such as image classification and semantic segmentation, algorithms that leverage the power of neural networks have recently been explored for solving inverse problems. In this thesis, we discuss various approaches to incorporating deep learning into reconstruction algorithms for inverse problems, and in particular into variational approaches. We propose and discuss an algorithm to train a neural network as a regularisation functional. This is achieved by training the network to distinguish unregularised pseudo-inverse reconstructions from ground truth images. The resulting regulariser decreases the Wasserstein distance between reconstructions and ground truth images at an optimal rate. We present computational results for computed tomography (CT) and magnetic resonance imaging (MRI) reconstruction and investigate generalisation properties of the learned regularisation functional. In another line of research, we turn our attention to using neural networks to correct for errors in the forward operator. While an approximate model of the forward operator is available in many applications, this model can deviate from the true behaviour of the imaging modality and introduce artefacts into reconstructions. We train a neural network to correct for these shortcomings by learning a correction from data. The aim is to obtain a corrected operator that can be employed within a variational framework for reconstruction.
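The variational framework with a regularisation functional can be sketched in a few lines of numpy. Everything here is illustrative rather than taken from the thesis: the squared-finite-difference smoothness penalty is a hand-written placeholder standing in for the trained critic network, and the function names and default parameters are assumptions.

```python
import numpy as np

def reconstruct(A, y, reg_grad, lam=0.1, step=0.01, n_iter=500):
    """Minimise ||A x - y||^2 + lam * R(x) by gradient descent,
    given the gradient of a regularisation functional R."""
    x = A.T @ y  # crude initialisation (unregularised back-projection)
    for _ in range(n_iter):
        grad = 2 * A.T @ (A @ x - y) + lam * reg_grad(x)
        x = x - step * grad
    return x

# Placeholder "regulariser": sum of squared finite differences.
# In the learned approach described above, a trained network and its
# automatic gradient would replace this pair of functions.
def smooth_reg(x):
    d = np.diff(x)
    return np.sum(d ** 2)

def smooth_reg_grad(x):
    g = np.zeros_like(x)
    d = np.diff(x)
    g[:-1] -= 2 * d
    g[1:] += 2 * d
    return g
```

With `A` the forward operator and `y` the measured data, the same descent loop applies unchanged once `reg_grad` is backed by a network, which is what makes a learned regulariser drop-in compatible with variational reconstruction.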
We investigate key challenges of this approach and propose a recursive forward-adjoint algorithm to efficiently train an operator correction for photo-acoustic tomography reconstruction.
Sebastian Lunz, Apollo - University of Cambridge Repository, Carola-Bibiane Schönlieb. work_6cn2pd2btrfvnouc4jmw6gjawa. Fri, 06 May 2022.

Gaussian Processes and Statistical Decision-making in Non-Euclidean Spaces
https://scholar.archive.org/work/6jlqvqvjnvab7njzxgnaghfnje
Bayesian learning using Gaussian processes provides a foundational framework for making decisions in a manner that balances what is known with what could be learned by gathering data. In this dissertation, we develop techniques for broadening the applicability of Gaussian processes. This is done in two ways. Firstly, we develop pathwise conditioning techniques for Gaussian processes, which allow one to express posterior random functions as prior random functions plus a dependent update term. We introduce a wide class of efficient approximations built from this viewpoint, which can be randomly sampled once in advance and evaluated at arbitrary locations without any subsequent stochasticity. This key property improves efficiency and makes it simpler to deploy Gaussian process models in decision-making settings. Secondly, we develop a collection of Gaussian process models over non-Euclidean spaces, including Riemannian manifolds and graphs. We derive fully constructive expressions for the covariance kernels of scalar-valued Gaussian processes on Riemannian manifolds and graphs. Building on these ideas, we describe a formalism for defining vector-valued Gaussian processes on Riemannian manifolds. The introduced techniques allow all of these models to be trained using standard computational methods. In total, these contributions make Gaussian processes easier to work with and allow them to be used within a wider class of domains in an effective and principled manner. This, in turn, opens the possibility of applying Gaussian processes to novel decision-making settings.
Alexander Terenin. work_6jlqvqvjnvab7njzxgnaghfnje. Thu, 28 Apr 2022.

The Interplay between Quantum Contextuality and Wigner Negativity
https://scholar.archive.org/work/ozd7f2zrhfh53pex4thoetbtpy
The use of quantum information in technology promises to supersede the so-called classical devices used nowadays. Understanding which features are inherently non-classical is crucial for reaching better-than-classical performance. This thesis focuses on two non-classical behaviours: quantum contextuality and Wigner negativity. The former is a generalisation of nonlocality that can be exhibited by quantum systems. To date, it has mostly been studied in discrete-variable scenarios, where it has in some cases been shown to be necessary and sufficient for quantum advantage. Negativity of the Wigner function, on the other hand, is another unsettling non-classical feature of quantum states, originating from the phase-space formulation of continuous-variable quantum optics. Continuous-variable scenarios offer promising candidates for implementing quantum computations, and Wigner negativity is known to be a necessary resource for quantum speedup with continuous variables. However, contextuality has been little studied and understood in continuous-variable scenarios. We first set out a robust framework for properly treating contextuality in continuous variables. We also quantify contextuality in such scenarios using tools from infinite-dimensional optimisation theory. Building upon this, we show that Wigner negativity is equivalent to contextuality in continuous variables with respect to Pauli measurements, thus establishing a continuous-variable analogue of a celebrated result by Howard et al. We then introduce experimentally friendly witnesses for Wigner negativity of single-mode and multimode quantum states, based on fidelities with Fock states, again using tools from infinite-dimensional optimisation theory. We further extend the range of previously known discrete-variable results linking contextuality and advantage into the new territory of information retrieval.
Pierre-Emmanuel Emeriau. work_ozd7f2zrhfh53pex4thoetbtpy. Tue, 19 Apr 2022.
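As a concrete illustration of Wigner negativity (standard textbook material, not code from the thesis), the Wigner function of the n-photon Fock state has the closed form W_n(x, p) = ((-1)^n / pi) exp(-(x^2 + p^2)) L_n(2(x^2 + p^2)) in the hbar = 1 convention, which is negative at the phase-space origin for every odd photon number:

```python
import numpy as np

def laguerre(n, x):
    """Laguerre polynomial L_n(x) via the standard three-term recurrence:
    (k+1) L_{k+1}(x) = (2k+1-x) L_k(x) - k L_{k-1}(x)."""
    l_prev, l_curr = np.ones_like(np.asarray(x, dtype=float)), 1.0 - x
    if n == 0:
        return l_prev
    for k in range(1, n):
        l_prev, l_curr = l_curr, ((2 * k + 1 - x) * l_curr - k * l_prev) / (k + 1)
    return l_curr

def wigner_fock(n, x, p):
    """Wigner function of the n-photon Fock state (hbar = 1):
    W_n(x, p) = ((-1)^n / pi) * exp(-(x^2 + p^2)) * L_n(2 * (x^2 + p^2))."""
    r2 = x ** 2 + p ** 2
    return ((-1) ** n / np.pi) * np.exp(-r2) * laguerre(n, 2 * r2)
```

For the vacuum (n = 0) this reduces to a nonnegative Gaussian, while the single-photon state attains W_1(0, 0) = -1/pi, the kind of negativity the witnesses above are designed to certify from Fock-state fidelities.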