IA Scholar Query: What are the Odds?: probabilistic programming in Scala.
https://scholar.archive.org/
Internet Archive Scholar query results feed. Language: en. Contact: info@archive.org. Mon, 14 Nov 2022. Generator: fatcat-scholar. Help: https://scholar.archive.org/help. TTL: 1440.

Robust Deep Learning for Autonomous Driving
https://scholar.archive.org/work/gusqv5cmwnaoxah3q3unxr62wi
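The core idea of this entry, the true class probability, is simple to state: confidence read off at the ground-truth label rather than at the arg-max. A minimal sketch with a plain softmax follows; the logits and labels are illustrative, and the thesis's actual contribution, learning to predict TCP without labels via an auxiliary model, is not shown here.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mcp(logits):
    """Maximum class probability: the standard confidence baseline."""
    return max(softmax(logits))

def tcp(logits, true_label):
    """True class probability: softmax mass on the ground-truth class.
    It requires the label, which is why the thesis learns to predict it."""
    return softmax(logits)[true_label]

# A misclassified example: the model prefers class 0, the truth is class 2.
logits = [2.0, 1.5, 1.8]
confidence = mcp(logits)        # stays high despite the error
failure_score = tcp(logits, 2)  # low TCP flags the likely failure
```

On this example MCP remains high while TCP drops, which is the property the abstract claims makes TCP the better failure-prediction signal.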
The last decade's research in artificial intelligence has had a significant impact on the advance of autonomous driving. Yet safety remains a major concern when it comes to deploying such systems in high-risk environments. The objective of this thesis is to develop methodological tools which provide reliable uncertainty estimates for deep neural networks. First, we introduce a new criterion to reliably estimate model confidence: the true class probability (TCP). We show that TCP offers better properties for failure prediction than current uncertainty measures. Since the true class is by essence unknown at test time, we propose to learn the TCP criterion from data with an auxiliary model, introducing a specific learning scheme adapted to this context. The relevance of the proposed approach is validated on image classification and semantic segmentation datasets. Then, we extend our learned-confidence approach to the task of domain adaptation, where it improves the selection of pseudo-labels in self-training methods. Finally, we tackle the challenge of jointly detecting misclassifications and out-of-distribution samples by introducing a new uncertainty measure based on evidential models and defined on the simplex.
Charles Corbière. Mon, 14 Nov 2022.

Detection and Evaluation of Clusters within Sequential Data
https://scholar.archive.org/work/bue3nywa3nbqffglua55yjgbd4
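The model assumption behind this entry, that states within a block share (up to noise) the same transition profile, can be illustrated with a toy grouping of states whose empirical transition distributions are close in total variation. This is only an illustration of the assumption, not the spectral clustering algorithm with optimality guarantees studied in the paper; all names and thresholds below are made up.

```python
import random

def transition_profiles(seq, n_states):
    """Empirical transition distribution out of each state."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(seq, seq[1:]):
        counts[a][b] += 1
    profiles = []
    for row in counts:
        total = sum(row) or 1
        profiles.append([c / total for c in row])
    return profiles

def cluster_states(profiles, tol=0.2):
    """Greedily group states whose profiles lie within `tol` in total variation."""
    clusters = []
    for s, p in enumerate(profiles):
        for cl in clusters:
            rep = profiles[cl[0]]
            if 0.5 * sum(abs(x - y) for x, y in zip(p, rep)) < tol:
                cl.append(s)
                break
        else:
            clusters.append([s])
    return clusters

# Simulate a 4-state Block Markov Chain with blocks {0,1} and {2,3}:
# the next block is chosen by P, the next state uniformly inside that block.
random.seed(0)
blocks, P = [[0, 1], [2, 3]], [[0.9, 0.1], [0.1, 0.9]]
state, seq = 0, []
for _ in range(5000):
    seq.append(state)
    b = 0 if state < 2 else 1
    nb = 0 if random.random() < P[b][0] else 1
    state = random.choice(blocks[nb])

recovered = cluster_states(transition_profiles(seq, 4))
```

States in the same block end up with nearly identical rows of the empirical transition matrix, so the greedy grouping recovers the planted blocks.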
Motivated by theoretical advancements in dimensionality reduction techniques, we use a recent model, called Block Markov Chains, to conduct a practical study of clustering in real-world sequential data. Clustering algorithms for Block Markov Chains possess theoretical optimality guarantees and can be deployed in sparse data regimes. Despite these favorable theoretical properties, a thorough evaluation of these algorithms in realistic settings has been lacking. We address this issue and investigate the suitability of these clustering algorithms in exploratory data analysis of real-world sequential data. In particular, our sequential data is derived from human DNA, written text, animal movement data and financial markets. In order to evaluate the determined clusters, and the associated Block Markov Chain model, we further develop a set of evaluation tools. These tools include benchmarking, spectral noise analysis and statistical model selection tools. An efficient implementation of the clustering algorithm and the new evaluation tools is made available together with this paper. Practical challenges associated with real-world data are encountered and discussed. It is ultimately found that the Block Markov Chain model assumption, together with the tools developed here, can indeed produce meaningful insights in exploratory data analyses despite the complexity and sparsity of real-world data.
Alexander Van Werde, Albert Senen-Cerda, Gianluca Kosmella, Jaron Sanders. Tue, 04 Oct 2022.

Functional or imperative? On pleasant semantics for differentiable programming languages
https://scholar.archive.org/work/uayyzwkezze3fby6zo5wsq7gnu
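One concrete point on the functional end of the design space this article surveys is forward-mode differentiation as a pure value type: no mutable tape, just numbers paired with their derivatives. The sketch below is not from the article; `Dual` and `derivative` are illustrative names for the textbook dual-number construction.

```python
class Dual:
    """Forward-mode AD value: a number paired with its derivative."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        # product rule: (uv)' = u'v + uv'
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)
    __rmul__ = __mul__

def derivative(f, x):
    """Differentiate f at x by seeding the derivative component with 1."""
    return f(Dual(x, 1.0)).dot

# d/dx (x*x + 3x) at x = 2 is 2*2 + 3 = 7
slope = derivative(lambda x: x * x + 3 * x, 2.0)
```

Everything here is immutable and declarative, illustrating the style that makes such programs easy to optimise and rearrange, in contrast with tape-based imperative AD.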
In machine learning (ML), researchers and engineers seem to be at odds. System implementers would prefer models to be declarative, with detailed type information and semantic restrictions that allow models to be optimised, rearranged and parallelised. Yet practitioners show an overwhelming preference for dynamic, imperative languages with mutable state, and much engineering effort is spent bridging the resulting semantic divide. Is there a fundamental conflict? This article explores why imperative and functional styles are used, and how future language designs might get the best of both worlds.
Michael Innes. Wed, 22 Jun 2022.

How Attacker Knowledge Affects Privacy Risks
https://scholar.archive.org/work/kkgpok7kenhjfbyd27u6zdxgeu
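The informedness dimension can be mimicked with a stdlib rejection-sampling sketch (not Privug's actual implementation, which builds on probabilistic-programming tooling): an attacker holds a prior infection rate, conditions on a disclosed aggregate, and reads off the posterior risk for one individual. All parameter values below are illustrative.

```python
import random

def posterior_target_infected(n, rate, trials=100_000, seed=1):
    """P(person 0 infected | at least one of n people infected), estimated
    by sampling the attacker's prior and conditioning on the disclosure."""
    rng = random.Random(seed)
    hits = kept = 0
    for _ in range(trials):
        status = [rng.random() < rate for _ in range(n)]
        if any(status):          # condition on the released aggregate
            kept += 1
            hits += status[0]
    return hits / kept

# How attacker knowledge (the prior rate) changes the posterior risk:
low  = posterior_target_infected(5, 0.05)   # prior 0.05 -> posterior ~0.22
high = posterior_target_infected(5, 0.50)   # prior 0.50 -> posterior ~0.52
```

Relative to their priors, the less-informed attacker (rate 0.05) learns far more from the release than the well-informed one, which is the kind of comparison the paper's framework makes systematic.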
Governments and businesses routinely disclose large amounts of private data on individuals, for data analytics. However, despite attempts by data controllers to anonymise data, attackers frequently deanonymise disclosed data by matching it with their prior knowledge. When is a chosen anonymisation method adequate? For this, a data controller must consider attackers befitting their scenario; how does attacker knowledge affect disclosure risk? We present a multi-dimensional conceptual framework for assessing privacy risks given prior knowledge about data. The framework defines three dimensions: distinctness (of input records), informedness (of attacker), and granularity (of anonymisation program output). We model three well-known types of disclosure risk: identity disclosure, attribute disclosure, and quantitative attribute disclosure. We demonstrate how to apply this framework in a health record privacy scenario: we analyse how informing the attacker with COVID-19 infection rates affects privacy risks. We perform this analysis using Privug, a method that uses probabilistic programming to do standard statistical analysis with Bayesian inference.
CCS Concepts: • Security and privacy → Data anonymization and sanitization; Privacy protections; • Mathematics of computing → Bayesian computation; Information theory; • Applied computing → Health informatics.
Louise Halvorsen, Siv L. Steffensen, Willard Rafnsson, Oksana Kulyk, Raúl Pardo. Mon, 18 Apr 2022.

Spatial analysis of the Healthy Built Environment: an application on Cardiovascular Risk Factors in the canton of Geneva
https://scholar.archive.org/work/xzucj2g4ifbsflu3o24whwb3nm
This research uses a variety of spatial and aspatial methodologies to diagnose the healthiness of different elements of the built environment, advocating for the integration of health into spatial design as a whole in order to address disparities. The thesis identifies inequities in the characteristics of the health-related built environment and in the spatial clustering of health issues among the resident population within the study area. It also contributes to understanding the association of health data with the built environment depending on residence location, at multiple spatial scales, and at global and local levels in the canton of Geneva.
Andrea Salmi. Mon, 07 Mar 2022.

Competition-Level Code Generation with AlphaCode
https://scholar.archive.org/work/e5kzzknnm5g6jgehvzwfuqqohe
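Component (3) of this entry, filtering a large sample pool by program behaviour on the problem's example tests, can be sketched in a few lines. The candidates below stand in for transformer samples; the names and the toy task are illustrative, and the real system samples and filters at a vastly larger scale.

```python
def filter_by_behaviour(candidates, example_tests):
    """Keep only candidate programs whose outputs match the example tests,
    i.e. the behaviour-based step that cuts samples down to submissions."""
    return [name for name, prog in candidates
            if all(prog(x) == y for x, y in example_tests)]

# Toy task: double the input. Two "samples" from a hypothetical model.
candidates = [("doubles", lambda x: 2 * x), ("adds-two", lambda x: x + 2)]
surviving = filter_by_behaviour(candidates, [(1, 2), (3, 6)])
```

Only the candidate that actually doubles its input survives; the abstract's point is that this cheap behavioural check is what turns massive sampling into a small, high-quality set of submissions.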
Programming is a powerful and ubiquitous problem-solving tool. Developing systems that can assist programmers or even generate programs independently could make programming more productive and accessible, yet so far incorporating innovations in AI has proven challenging. Recent large-scale language models have demonstrated an impressive ability to generate code, and are now able to complete simple programming tasks. However, these models still perform poorly when evaluated on more complex, unseen problems that require problem-solving skills beyond simply translating instructions into code. For example, competitive programming problems which require an understanding of algorithms and complex natural language remain extremely challenging. To address this gap, we introduce AlphaCode, a system for code generation that can create novel solutions to these problems that require deeper reasoning. In simulated evaluations on recent programming competitions on the Codeforces platform, AlphaCode achieved on average a ranking of top 54.3% in competitions with more than 5,000 participants. We found that three key components were critical to achieve good and reliable performance: (1) an extensive and clean competitive programming dataset for training and evaluation, (2) large and efficient-to-sample transformer-based architectures, and (3) large-scale model sampling to explore the search space, followed by filtering based on program behavior to a small set of submissions.
Yujia Li, David Choi, Junyoung Chung, Nate Kushman, Julian Schrittwieser, Rémi Leblond, Tom Eccles, James Keeling, Felix Gimeno, Agustin Dal Lago, Thomas Hubert, Peter Choy, Cyprien de Masson d'Autume, Igor Babuschkin, Xinyun Chen, Po-Sen Huang, Johannes Welbl, Sven Gowal, Alexey Cherepanov, James Molloy, Daniel J. Mankowitz, Esme Sutherland Robson, Pushmeet Kohli, Nando de Freitas, Koray Kavukcuoglu, Oriol Vinyals. Tue, 08 Feb 2022.

On Reasonable Space and Time Cost Models for the λ-Calculus
https://scholar.archive.org/work/orewl2uau5crrkrs3nixemlg4e
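The evaluation mechanism at the centre of this thesis, a Krivine-style machine, is compact enough to sketch. Below is the textbook call-by-name machine over de Bruijn terms, not the thesis's space-optimised, pointer-free variant; the tuple encoding of terms is an assumption for illustration.

```python
def krivine(term, env=(), stack=()):
    """Textbook Krivine machine for the call-by-name λ-calculus.
    Terms in de Bruijn notation: ('var', n) | ('lam', body) | ('app', f, a).
    A state is (term, environment, stack); closures are (term, env) pairs."""
    while True:
        tag = term[0]
        if tag == 'app':
            stack = ((term[2], env),) + stack   # push the argument closure
            term = term[1]                      # evaluate the function part
        elif tag == 'lam':
            if not stack:
                return term, env                # weak head normal form
            closure, stack = stack[0], stack[1:]
            env = (closure,) + env              # bind the argument under λ
            term = term[1]
        else:                                   # 'var': enter its closure
            term, env = env[term[1]]

# (λx. x) (λy. y) reduces to the identity in weak head normal form.
ident = ('lam', ('var', 0))
whnf, _ = krivine(('app', ident, ident))
```

Every transition here either pushes, pops, or looks up a closure, which is why the machine's step count and state size are natural candidates for the time and space cost models the thesis analyses.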
The λ-calculus is considered the paradigmatic model for functional programming languages. It comes with a beautiful mathematical theory, which has been studied and improved for more than 80 years. The λ-calculus is based on just one rewriting rule, β-reduction. Although rewriting expressions is common in computer science, this is not the way in which programs are executed by the hardware. Hence, there is a distance between the programming language model and the execution model. This gap is closed by compilers, which translate high-level (functional) programs to low-level executable machine code. While the semantics of programs remains unaltered during compilation, the use of resources, in particular time and space, is more difficult to track. First of all, one should be able to measure the use of resources on the source program. Then, this amount of used resources should be preserved by compilation. Resource consumption is typically defined on Turing machines (TMs), where time is simply the number of steps a Turing machine needs to halt, and space is the maximum number of tape cells used during the computation. We would like to define time and space measures on top of the λ-calculus, and we would like them to be compatible with those of Turing machines. More formally, Slot and van Emde Boas' Invariance Thesis states that a time (respectively, space) cost model is reasonable for a computational model C if there are mutual simulations between TMs and C such that the overhead is polynomial in time (respectively, linear in space). The rationale is that under the Invariance Thesis, complexity classes such as L, P and PSPACE become robust, i.e. machine-independent. More concretely, we can see these simulations as consisting of a compilation phase followed by an execution phase.

The literature on the subject contains many results about time cost models for the λ-calculus. In particular, the number of rewriting steps has been proved to be a reasonable cost model in the majority of the interesting cases, e.g. for call-by-name, call-by-value and call-by-need. For space cost models the situation is different: except for a recent partial result by Forster et al., nothing is known. Indeed, the problem is far more difficult than that of finding a reasonable time measure. The main reason for this difficulty is that the required overhead for the space simulations is linear, not polynomial, i.e. the space consumption should be the same on both sides of the simulations, up to a multiplicative constant factor. This is very difficult to achieve for two different reasons. The first is that typical implementations of the λ-calculus rely on pointers, which introduce an extra, undesired logarithmic overhead. The second is that in the λ-calculus there is no distinction between programs and data. This makes it impossible to account for sub-linear complexity classes if one considers the natural space cost model, i.e. the maximum size of the terms encountered during a reduction, which by definition is at least as big as the input (this is what Forster et al. do).

In this dissertation, we tackle this problem from different perspectives. We start by considering an unusual evaluation mechanism for the λ-calculus, based on Girard's Geometry of Interaction, which was conjectured to be the key ingredient for obtaining a space-reasonable cost model. By a fine complexity analysis of this scheme, based on new variants of non-idempotent intersection types, we disprove this conjecture. Then, we change the target of our analysis. We consider a variant of Krivine's abstract machine, a standard evaluation mechanism for the call-by-name λ-calculus, optimized for space complexity and implemented without any pointers. A fine analysis of the execution of (a refined version of) the encoding of TMs into the λ-calculus allows us to conclude that the space consumed by this machine is a reasonable space cost model. In particular, for the first time we are able to measure sub-linear space complexities as well. Moreover, we transfer this result to the call-by-value case. Finally, we also provide an intersection type system that compositionally characterizes this new reasonable space measure. This is done through a minimal, yet non-trivial, modification of the original de Carvalho type system.

As a disclaimer, this acknowledgment section will mention job-related people only. All the others already know. This thesis would not have been possible without the commitment of Ugo Dal Lago, who has carefully supervised me. Moreover, Beniamino Accattoli has acted as a second, although unofficial, supervisor. They have been terrific mentors. I would also like to thank all my colleagues in Bologna, Damiano Mazza, who hosted me in Paris, and Zhong Shao, who hosted me at Yale. Finally, I thank Dan Ghica and Delia Kesner, who accepted to review this thesis. Of course, I have met many people during these years, and I would like to thank them all.
Gabriele Vanoni.

Abstract book of the XXVII Congresso Nazionale della Società Scientifica FADOI, 21-23 maggio 2022
https://scholar.archive.org/work/i5yulil4mfg73br4q7myi7zpn4
Abstract book of the XXVII National Congress of the Società Scientifica FADOI, 21-23 May 2022.
D. Manfellotto.

L'individualità del parlante nelle scienze fonetiche: applicazioni tecnologiche e forensi
https://scholar.archive.org/work/ninqc7qhvfhrfk6u6jryr2tdba
This volume, which collects a selection of contributions presented at the XVII National Conference of the Associazione Italiana di Scienze della Voce (AISV), focuses on speaker individuality and its possible technological and forensic applications, a topic of growing relevance within the phonetic sciences. As is well known, the language sciences tend to favour research perspectives centred on the analysis of the linguistic system or on variation driven by sociolinguistic factors, such as the speaker's geographic and social background or the situational context. The voice sciences, by contrast, also take as their object of study the material aspects of the production and perception of verbal messages. For this reason, variation at the individual level, far from being a source of 'disturbance' or 'noise', becomes itself a phenomenon of interest. Speaker-related dimensions of variation also acquire central relevance for applied research in speech technology (consider, for example, automatic speech and speaker recognition) and, in particular, in forensic phonetics. The conference, initially planned to take place in person at the University of Zurich, was held online on 4-5 February 2021 because of the restrictions due to the coronavirus pandemic, reaching peaks of more than 150 attendees. The two days were opened by plenary sessions by Helen Fraser (University of New England) and Kirsty McDougall (University of Cambridge) on topics in forensic phonetics, to which the first two contributions of this volume are devoted.
Forensic phonetics was also the subject of the round table entitled "Current trends and issues in forensic phonetics research", which took place at the close of the conference and in which the following speakers took part, moderated by Peter French (in alphabetical order):
s.n. Fri, 31 Dec 2021.

30th Annual Computational Neuroscience Meeting: CNS*2021–Meeting Abstracts
https://scholar.archive.org/work/ozw7g43rjjcijn24ospoo65jmy
One of the goals of neuroscience is to understand the computational principles that describe the formation of behaviorally relevant signals in the brain, as well as how these computations are realized within the constraints of biological networks. Currently, most functional models of neural activity are based on firing rates, while the most relevant signals for inter-neuron communication are spikes. Recently, the framework of predictive coding (Sajikumar et al., 2014 [...]
Tue, 21 Dec 2021.

Scalable Handling of Effects (Dagstuhl Seminar 21292)
https://scholar.archive.org/work/3srpco7rxfepvmosa4gzpzrmr4
Built on solid mathematical foundations, effect handlers offer a uniform and elegant approach to programming with user-defined computational effects. They subsume many widely used programming concepts and abstractions, such as actors, async/await, backtracking, coroutines, generators/iterators, and probabilistic programming. As such, they allow language implementers to target a single implementation of effect handlers, freeing them from having to maintain separate ad hoc implementations of each of the features listed above. Due to their wide applicability, effect handlers are enjoying growing interest in academia and industry. For instance, several effect-handler-oriented research languages are under active development (such as Eff, Frank, and Koka), as are effect handler libraries for mainstream languages (such as C and Java); effect handlers are seeing increasing use in probabilistic programming tools (such as Uber's Pyro); and proposals are in the pipeline to include them natively in low-level languages (such as WebAssembly). Effect handlers are also a key part of Multicore OCaml, which incorporates an efficient implementation of them for uniformly expressing user-definable concurrency models in the language. However, enabling effect handlers to scale requires tackling some hard problems, both in theory and in practice. Inspired by experience of developing, programming with, and reasoning about effect handlers in practice, we identify five key problem areas to be addressed at this Dagstuhl Seminar in order to enable effect handlers to scale: safety, modularity, interoperability, legibility, and efficiency. In particular, we seek answers to the following questions:
- How can we enforce safe interaction between effect handler programs and external resources?
- How can we enable modular use of effect handlers for programming in the large?
- How can we support interoperable effect handler programs written in different languages?
- How can we write legible effect handler programs in a st [...]
Danel Ahman, Amal Ahmed, Sam Lindley, Andreas Rossberg. Wed, 01 Dec 2021.
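The handler idea, a program issuing effect requests that a separate handler interprets, can be approximated with Python generators. This is only a one-shot approximation, well short of the delimited-control semantics real effect handlers have; `handle`, `program`, and the effect names are all illustrative.

```python
def handle(handlers, gen):
    """Drive a generator-based computation, answering each yielded
    (effect, payload) request with the result of the matching handler."""
    try:
        request = next(gen)
        while True:
            effect, payload = request
            request = gen.send(handlers[effect](payload))
    except StopIteration as done:
        return done.value

def program():
    name = yield ("ask", "name")     # perform the 'ask' effect
    yield ("log", "hello " + name)   # perform the 'log' effect
    return name.upper()

# Interpret 'ask' with a constant and 'log' by collecting messages.
messages = []
result = handle({"ask": lambda _: "world", "log": messages.append}, program())
```

The same `program` can be run under different handler dictionaries, which is the uniformity the abstract describes: one mechanism standing in for actors, generators, probabilistic inference, and so on.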