IA Scholar Query: On the Crossing Number of the Hypercube and the Cube Connected Cycles.
https://scholar.archive.org/
Internet Archive Scholar query results feed
Language: en
Contact: info@archive.org
Last updated: Sun, 31 Jul 2022 00:00:00 GMT
Generator: fatcat-scholar
Documentation: https://scholar.archive.org/help
TTL (minutes): 1440

Noncommutative Instantons in Diverse Dimensions
https://scholar.archive.org/work/6q4rq6qjhne5hbr5i46tgfzklq
This is a mini-review about generalized instantons of noncommutative gauge theories in dimensions 4, 6 and 8, with emphasis on their realizations in type II string theory, their geometric interpretations, and their applications to the enumerative geometry of non-compact toric varieties.
Authors: Richard J. Szabo, Michelangelo Tirelli
Published: Sun, 31 Jul 2022 00:00:00 GMT

Exploring Wilderness Characteristics Using Explainable Machine Learning in Satellite Imagery
https://scholar.archive.org/work/bfn2lm4yknbxrmqqey5hmm4s3u
Wilderness areas offer important ecological and social benefits and there are urgent reasons to discover where their positive characteristics and ecological functions are present and able to flourish. We apply a novel explainable machine learning technique to satellite images which show wild and anthropogenic areas in Fennoscandia. Occluding certain activations in an interpretable artificial neural network we complete a comprehensive sensitivity analysis regarding wild and anthropogenic characteristics. This enables us to predict detailed and high-resolution sensitivity maps highlighting these characteristics. Our artificial neural network provides an interpretable activation space increasing confidence in our method. Within the activation space, regions are semantically arranged. Our approach advances explainable machine learning for remote sensing, offers opportunities for comprehensive analyses of existing wilderness, and has practical relevance for conservation efforts.
Authors: Timo T. Stomberg, Taylor Stone, Johannes Leonhardt, Immanuel Weber, Ribana Roscher
Published: Tue, 26 Jul 2022 00:00:00 GMT

Combinatorial Gray codes-an updated survey
https://scholar.archive.org/work/zryp7sxkrbczrguasg4ugmgfee
A combinatorial Gray code for a class of objects is a listing that contains each object from the class exactly once such that any two consecutive objects in the list differ only by a 'small change'. Such listings are known for many different combinatorial objects, including bitstrings, combinations, permutations, partitions, triangulations, but also for objects defined with respect to a fixed graph, such as spanning trees, perfect matchings or vertex colorings. This survey provides a comprehensive picture of the state-of-the-art of the research on combinatorial Gray codes. In particular, it gives an update on Savage's influential survey [C. D. Savage. A survey of combinatorial Gray codes. SIAM Rev., 39(4):605-629, 1997.], incorporating many more recent developments. We also elaborate on the connections to closely related problems in graph theory, algebra, order theory, geometry and algorithms, which embeds this research area into a broader context. Lastly, we collect and propose a number of challenging research problems, thus stimulating new research endeavors.
Authors: Torsten Mütze
Published: Tue, 26 Jul 2022 00:00:00 GMT

When you can't count, sample! Computable entropies beyond equilibrium from basin volumes
https://scholar.archive.org/work/cwdi6sl5j5auxfyeobvd3uwy7m
In statistical mechanics, measuring the number of available states and their probabilities, and thus the system's entropy, enables the prediction of the macroscopic properties of a physical system at equilibrium. This predictive capacity hinges on the knowledge of the a priori probabilities of observing the states of the system, given by the Boltzmann distribution. Unfortunately, the successes of equilibrium statistical mechanics are hard to replicate out of equilibrium, where the a priori probabilities of observing states are in general not known, precluding the naïve application of usual tools. In the last decade, exciting developments have occurred that enable the direct numerical estimation of the entropy and density of states of athermal and non-equilibrium systems, thanks to significant methodological advances in the computation of the volume of high-dimensional basins of attraction. Here, we provide a detailed account of these methods, underscoring the challenges that lie in such estimations, recent progress on the matter, and promising directions for future work.
Authors: Mathias Casiulis, Stefano Martiniani
Published: Sun, 17 Jul 2022 00:00:00 GMT

CELLoGeNe - an Energy Landscape Framework for Logical Networks Controlling Cell Decisions
https://scholar.archive.org/work/ruhbam366netbagl45q3rf43lm
Experimental and computational efforts are constantly made to elucidate mechanisms controlling cell fate decisions during development and reprogramming. One powerful computational method is to consider cell commitment and reprogramming as movements in an energy landscape. Here, we develop Computation of Energy Landscapes of Logical Gene Networks (CELLoGeNe), which maps Boolean implementation of gene regulatory networks (GRNs) into energy landscapes. CELLoGeNe removes inadvertent symmetries in the energy landscapes normally arising from standard Boolean operators. Furthermore, CELLoGeNe provides tools to visualize and stochastically analyze the shapes of multi-dimensional energy landscapes corresponding to epigenetic landscapes for development and reprogramming. We demonstrate CELLoGeNe on two GRNs governing different aspects of induced pluripotent stem cells, identifying experimentally validated attractors and revealing potential reprogramming roadblocks. CELLoGeNe is a general framework that can be applied to various biological systems offering a broad picture of intracellular dynamics otherwise inaccessible with existing methods.
Authors: Emil Andersson, Mattias Sjö, Keisuke Kaji, Victor Olariu
Published: Thu, 14 Jul 2022 00:00:00 GMT

Forest groups I: between Jones' subfactors and R. Thompson's groups
https://scholar.archive.org/work/h7zvflb5nngpnags6ux5q65mou
Vaughan Jones discovered unexpected connections with Richard Thompson's group while attempting to systematically construct conformal field theories (in short CFT) from subfactors. New field theories were created and also Jones' technology: a powerful new method for constructing actions of fraction groups from their underlying category. Numerous applications arose in mathematical physics, operator algebras, group theory but also knot theory and noncommutative probability theory. We outline a program in the vein of Jones' work but where the Thompson group is replaced by a family of groups that we name forest groups. These groups are constructed from forest categories made of planar diagrams. They capture key aspects of the Thompson group but also aim to better connect subfactors with CFT. They are tailor-made for using Jones' technology admitting powerful skein theoretical descriptions. Apart from strengthening Jones' vision our program produces a plethora of explicit groups satisfying interesting and rare properties. In this first article we introduce the general theory of forest categories and their associated forest groups, provide criteria of existence of these groups, construct two canonical actions of them (on a simplicial complex and a totally ordered set), derive explicit presentations, establish a topological finiteness theorem, and finish by giving a large class of explicit examples.
Authors: Arnaud Brothier
Published: Thu, 07 Jul 2022 00:00:00 GMT

Graphical Designs and Gale Duality
https://scholar.archive.org/work/qgmgqrmh3zddzdj3mwd5nblhei
A graphical design is a subset of graph vertices such that the weighted averages of certain graph eigenvectors over the design agree with their global averages. We use Gale duality to show that positively weighted graphical designs in regular graphs are in bijection with the faces of a generalized eigenpolytope of the graph. This connection can be used to organize, compute and optimize designs. We illustrate the power of this tool on three families of Cayley graphs -- cocktail party graphs, cycles, and graphs of hypercubes -- by computing or bounding the smallest designs that average all but the last eigenspace in frequency order.
Authors: Catherine Babecki, Rekha R. Thomas
Published: Tue, 05 Jul 2022 00:00:00 GMT

Integration of Multi-Omic Datasets on Antimicrobial Resistance for Large-Scale Biomedical Data Science
https://scholar.archive.org/work/ghfhkr6e4jfkjau4kfcmblijvq
Antimicrobial resistance (AMR) results in tremendous health risks, causing the World Health Organization to designate it as one of the significant burdens for modern society. Owing to ineffective antibiotics, once everyday surgeries will become life-threatening interventions. Rigorous governmental measurements are supposed to supervise administration of antimicrobials, hence controlling AMR dissemination. The intervention of healthcare stakeholders and responsible application in human and veterinary medicine is urgently required. In this light, wastewater-based epidemiology has been established to examine various environmental factors promoting AMR and monitoring their development population-wide. Antibiotic residuals in human excrements are a significant driver for AMR, and assessing in- and effluent of wastewater treatment plants is evident. Treated wastewater is ultimately released in rivers, lakes, or the sea, elevating AMR from a local to a global health concern. Thus, researchers consider increasingly fresh and salt waters for comprehensive AMR surveys. In this light, recreational waters could be a significant health risk if strained with resistant bacteria. Indeed, freshwater-based epidemiology ascertained hot spots in Asian lakes, underpinning the urgency for timely and consistent AMR surveillance worldwide. However, data consistency is hampered due to a great variety of bioanalytical methods. For this reason, as part of this thesis, we integrated, examined, and evaluated standardized samples from numerous European freshwater lakes. Baseline levels of AMR have been detected, which facilitates future monitoring on a large scale. The results further emphasized that multi-resistant pathogens require alternative therapeutic options beyond conventional antibiotics. Therefore, scientists study antimicrobial peptides (AMPs). To date, several AMPs advanced in clinical trials or gained market maturity. 
The success encouraged researchers to develop advanced machine learning (ML) methods for high-throughput AMP scre [...]
Authors: Sebastian Spänig, Heider, Dominik (Prof. Dr.), Mathematik Und Informatik
Published: Mon, 04 Jul 2022 00:00:00 GMT

Connected Square Network Graphs
https://scholar.archive.org/work/q6hemv4wkrhdzipaujmlwcrwwu
In this study, connected square network graphs are introduced and two different definitions are given. First, connected square network graphs are shown to be Hamiltonian. Further, a labelling algorithm for these graphs is obtained by using a Gray code. Finally, their topological properties are obtained, and conclusions are given.
Authors: Burhan SELÇUK
Published: Thu, 30 Jun 2022 00:00:00 GMT

nD-PointCloud Data Management
https://scholar.archive.org/work/uqp6g5hfkzdjxcfzx4ipzb6pgu
In the Geomatics domain, a point cloud refers to a data set that records the coordinates and other attributes of a huge number of points. Conceptually, each of the attributes can be regarded as a dimension to represent a specific type of information, such as time and Level of Importance (LoI). Drastically increasing collection of high dimensional point clouds raises essential demand for smart and highly efficient data management solutions. However, effective tools are missing. File-based solutions require substantial development of data structures and algorithms. Also, with such solutions, enormous effort has to be made to integrate different data types, formats and libraries. By contrast, state-of-the-art DataBase Management Systems (DBMSs) avoid these issues, because they are initially devised for generic use of data. However, DBMSs still present limitations on efficiently indexing non-uniformly distributed points, supporting continuous LoI, and operating high dimensional data. These problems motivate the PhD research which focuses on developing a new DBMS solution. It is aimed at efficiently managing and querying massive nD point clouds to support different types of applications.
Authors: Haicheng Liu
Published: Tue, 28 Jun 2022 00:00:00 GMT

Self-optimizing neural network in the classification of real valued data
https://scholar.archive.org/work/y2abvkjxz5hwdjjtxccgrgjtjq
The classification of multi-dimensional patterns is one of the most popular and often most challenging problems of machine learning. That is why some new approaches are being tried, expected to improve existing ones. The article proposes a new technique based on the decision network called self-optimizing neural networks (SONN). The proposed approach works on discretized data. Using a special procedure, we assign a feature vector to each element of the real-valued dataset. Later the feature vectors are analyzed, and decision patterns are created using so-called discriminants. We focus on how these discriminants are used and influence the final classifier prediction. Moreover, we also discuss the influence of the neighborhood topology. In the article, we use three different datasets with different properties. All results obtained by derived methods are compared with those obtained with the well-known support vector machine (SVM) approach. The results prove that the proposed solutions give better results than SVM. We can see that the information obtained from a training set is better generalized, and the final accuracy of the classifier is higher.
Authors: Alicja Miniak-Górecka, Krzysztof Podlaski, Tomasz Gwizdałła
Published: Tue, 28 Jun 2022 00:00:00 GMT

Panconnectivity Algorithm for Eisenstein-Jacobi Networks
https://scholar.archive.org/work/gcizwcrhijg6babdy46mukh53q
The cycles in an interconnection network are one of the communication types that are considered as a factor to measure the efficiency and reliability of the networks' topology. The network is said to be panconnected if there are cycles of length l between two nodes u and v, for all l = d(u, v), d(u, v) + 1, d(u, v) + 2, ..., n - 1, where d(u, v) is the shortest distance between u and v in a given network, and n is the total number of nodes in the network. In this paper, we propose an algorithm that generates and proves the panconnectivity of Eisenstein-Jacobi networks by constructing all cycles between any two nodes in the network of length l such that 3 <= l < n. The correctness of the proposed algorithm is given with the time complexity O(n^4).
Authors: Mohammad Awadh, Zaid Hussain, Hesham Almansouri
Published: Mon, 27 Jun 2022 00:00:00 GMT

The Gaussian conditional independence inference problem
https://scholar.archive.org/work/hzmnfg7kmbe77ijyxwv7efiryu
This dissertation deals with Gaussian conditional independence structures and their inference problem. Conditional independence (CI) is a notion from probability and information theory, and "Gaussian" refers to the well-known multivariate normal distribution. The CI relation of a multivariate random variable whose components are indexed by a finite set N contains information about which components I influence the distribution of other components J when the value of yet other components K is known. This relation is written as [I ⫫ J | K] or, for short, (I, J | K). Conditional independence is thus a ternary relation on subvectors of the random variable that encodes complex dependencies among its components. CI relations are studied formally in a branch of artificial intelligence via logical inference rules. Such inference rules take the following form: "if certain conditional independences hold, which (disjunctions of) other independences must also hold?" Knowledge of these rules allows the automatic deduction of information about the dependence structure of observed random variables. The rules that hold for CI relations depend on the type of probability distribution. Binary distributions, for example, satisfy different inference rules than the continuous Gaussian distributions. A multivariate Gaussian random variable is completely determined by its parameters, the mean vector in ℝ^N and the covariance matrix Σ ∈ PD_N. Under this special assumption, the conditional independence statement [I ⫫ J | K] is equivalent to a rank condition on the submatrix of Σ with rows I ∪ K and columns J ∪ K, namely that this matrix has rank |K|.
This criterion allows Gaussian CI to be treated with tools from commutative algebra, since the rank condition can be formulated as the vanishing of a collection of polynomials in the entries of Σ. [...]
Authors: Tobias Boege, Universitäts- Und Landesbibliothek Sachsen-Anhalt, Martin-Luther Universität, Thomas Kahle, Volker Kaibel
Published: Mon, 27 Jun 2022 00:00:00 GMT

Improved bounds for 1-independent percolation on ℤ^n
https://scholar.archive.org/work/6l7sy3wsxfb7lkz2zvuccgfhyq
A 1-independent bond percolation model on a graph G is a probability distribution on the spanning subgraphs of G in which, for all vertex-disjoint sets of edges S_1 and S_2, the states of the edges in S_1 are independent of the states of the edges in S_2. Such a model is said to percolate if the random subgraph has an infinite component with positive probability. In 2012 the first author and Bollobás defined p_max(G) to be the supremum of those p for which there exists a 1-independent bond percolation model on G in which each edge is present in the random subgraph with probability at least p but which does not percolate. A fundamental and challenging problem in this area is to determine the value of p_max(G) when G is the lattice graph ℤ^2. Since p_max(ℤ^n) ≤ p_max(ℤ^{n-1}), it is also of interest to establish the value of lim_{n→∞} p_max(ℤ^n). In this paper we significantly improve the best known upper bound on this limit and obtain better upper and lower bounds on p_max(ℤ^2). In proving these results, we also give an upper bound on the critical probability for a 1-independent model on the hypercube graph to contain a giant component asymptotically almost surely.
Authors: Paul Balister, Tom Johnston, Michael Savery, Alex Scott
Published: Fri, 24 Jun 2022 00:00:00 GMT

Bordered manifolds with torus boundary and the link surgery formula
https://scholar.archive.org/work/hdcrngaydfhb7bc4xozqxhfnlm
In this paper, we develop a theory of bordered 𝐻𝐹^- using the link surgery formula of Manolescu and Ozsváth. We interpret their link surgery complexes as type-D modules over an associative algebra 𝒦, which we introduce. We prove a connected sum formula, which we interpret as an A_∞-tensor product over our algebra 𝒦. Topologically, this connected sum formula may be viewed as a formula for gluing along torus boundary components. We compute several important examples. We show that the dual knot formula of Hedden–Levine and Eftekhary may be interpreted as the DA-bimodule for a particular diffeomorphism of the torus. As another example, if K_1 and K_2 are knots in S^3, and Y is obtained by gluing the complements of K_1 and K_2 together using an orientation reversing diffeomorphism of their boundaries, then our theory may be used to compute 𝐶𝐹^-(Y) from 𝐶𝐹𝐾^∞(K_1) and 𝐶𝐹𝐾^∞(K_2). We additionally compute the type-D modules for rationally framed solid tori.
Authors: Ian Zemke
Published: Tue, 21 Jun 2022 00:00:00 GMT

Discrete time crystals beyond the MBL paradigm
https://scholar.archive.org/work/ucwhtk4wjfeyldq6n7ogk5mp74
Discrete time crystals (DTCs) are systems that, subject to a periodic forcing, respond with a period larger than that of the drive. Breaking the discrete time-translational symmetry of the underlying equations, DTCs maintain an infinite autocorrelation time, avoid ergodicity, and realize a novel nonequilibrium phase of matter. In most previous proposals of DTCs, this peculiar behavior relied on the presence of (strong) disorder. Indeed, according to the celebrated mechanism of many-body localization (MBL), disorder can avert the otherwise generally expected 'heat death' to a featureless infinite temperature state in a driven system. And yet, it has recently been discovered that alternative mechanisms do exist through which thermalization can be avoided or significantly slowed down, such as confinement from long-range interactions, so-called quantum scars, or dynamical localization. This raises a number of natural questions: To what extent is MBL needed to observe nontrivial dynamics? What classifies a dynamics as nontrivial? What mechanisms can stabilize what phenomenologies of time crystallinity? Are DTCs possible in a classical setting and in which sense? In this dissertation, we address these questions proposing and investigating various remarkable notions of DTCs beyond the MBL-paradigm. Our journey across the zoology of time crystallinity embraces both the quantum and the classical realms, and discusses DTCs in their quasi, higher-order, fractional, and classical-stochastic flavours. All these exotic phenomena are encompassed by a unifying framework that we develop. Following this common thread, we justify and emphasise the key elements that, we think, should characterise DTCs, namely their many-body nature and the concept of universality in the nonequilibrium setting. 
Bringing together problems from different fields such as condensed matter physics, statistical physics, dynamical system theory, and epidemiology, we unveil striking ramifications of these remarkable dynamical phases of matter, advance our cur [...]
Authors: Andrea Pizzi, Apollo-University Of Cambridge Repository, Andreas Nunnenkamp, Austen Lamacraft
Published: Fri, 17 Jun 2022 00:00:00 GMT

Improving Schrödinger Equation Implementations with Gray Code for Adiabatic Quantum Computers
https://scholar.archive.org/work/ebnustzinjdy3kqulypiv7lj6e
We reformulate the continuous space Schrödinger equation in terms of spin Hamiltonians. For the kinetic energy operator, the critical concept facilitating the reduction in model complexity is the idea of position encoding. Binary encoding of position produces a Heisenberg-like model and yields exponential improvement in space complexity when compared to classical computing. Encoding with a binary reflected Gray code, and a Hamming distance 2 Gray code yields the additional effect of reducing the spin model down to the XZ and transverse Ising model respectively. We also identify the bijective mapping between diagonal unitaries and the Walsh series, producing the mapping of any real potential to a series of k-local Ising models through the fast Walsh transform. Finally, in a finite volume, we provide some numerical evidence to support the claim that the total time needed for adiabatic evolution is protected by the infrared cutoff of the system. As a result, initial state preparation from a free-field wavefunction to an interacting system is expected to exhibit polynomial time complexity with volume and constant scaling with respect to lattice discretization for all encodings. For the Hamming distance 2 Gray code, the evolution starts with the transverse Hamiltonian before introducing penalties such that the low lying spectrum reproduces the energy levels of the Laplacian. The adiabatic evolution of the penalty Hamiltonian is therefore sensitive to the ultraviolet scale. It is expected to exhibit polynomial time complexity with lattice discretization, or exponential time complexity with respect to the number of qubits given a fixed volume.
Authors: Chia Cheng Chang, Kenneth S. McElvain, Ermal Rrapaj, Yantao Wu
Published: Tue, 14 Jun 2022 00:00:00 GMT

Topologically penalized regression on manifolds
https://scholar.archive.org/work/j3oo4jujknft3jusg5ng7wvlve
We study a regression problem on a compact manifold M. In order to take advantage of the underlying geometry and topology of the data, the regression task is performed on the basis of the first several eigenfunctions of the Laplace-Beltrami operator of the manifold, that are regularized with topological penalties. The proposed penalties are based on the topology of the sub-level sets of either the eigenfunctions or the estimated function. The overall approach is shown to yield promising and competitive performance on various applications to both synthetic and real data sets. We also provide theoretical guarantees on the regression function estimates, on both its prediction error and its smoothness (in a topological sense). Taken together, these results support the relevance of our approach in the case where the targeted function is "topologically smooth".
Authors: Olympio Hacquard
Published: Fri, 10 Jun 2022 00:00:00 GMT

Mining and Learning With Graphs and Tensors
https://scholar.archive.org/work/dtj22tcaa5culibtnexa7giqw4
Data generated in diverse contexts can be modeled as graphs. Examples are numerous, from citation and social networks to the World Wide Web. Many real-world networks are multi-aspect, where multiple types of entities interact with each other via various relations. Also, many of them are dynamic, modeling relationships among entities and their features that evolve over time. These real-world networks with rich side information (e.g., node and edge types, and edge timestamps) are naturally modeled as tensors (i.e., multi-dimensional arrays). Given graphs and tensors, how can we understand them, and utilize them for downstream tasks? Specifically, how can we analyze and model large real-world networks, and gain a better understanding of how they form and evolve? Also, how can we design algorithms that leverage graphs and tensors for important applications such as recommendation and ranking? This thesis focuses on these fundamental problems by developing effective and efficient methods for mining and learning with graphs and tensors. In the first part of the thesis, we focus on addressing important mining and learning tasks for static graphs and tensors. We first propose novel graph-regularized semi-supervised algorithms for estimating node importance in a knowledge graph, which achieve up to 25% higher accuracy than the best baseline. Then we develop distributed frameworks for large-scale tensor factorization, which decompose and summarize large tensors up to 180x faster than existing methods, with near-linear scalability. We also design a meta-learning based approach for automatic graph learning model selection, which is up to 15x more accurate than using popular methods consistently. In addition, we develop a method that explains product recommendations, up to 21% more accurately than the best baseline, by performing personalized inference over a product graph.
In the second part of the thesis, we focus on modeling and reasoning with dynamic graphs and tensors, which represent various types of time-evolving networks and [...]
Authors: Namyong Park
Published: Mon, 06 Jun 2022 00:00:00 GMT

The geometry of integration in text classification RNNs
https://scholar.archive.org/work/rislliikr5exhpjk23qynqyyw4
Despite the widespread application of recurrent neural networks (RNNs) across a variety of tasks, a unified understanding of how RNNs solve these tasks remains elusive. In particular, it is unclear what dynamical patterns arise in trained RNNs, and how those patterns depend on the training dataset or task. This work addresses these questions in the context of a specific natural language processing task: text classification. Using tools from dynamical systems analysis, we study recurrent networks trained on a battery of both natural and synthetic text classification tasks. We find the dynamics of these trained RNNs to be both interpretable and low-dimensional. Specifically, across architectures and datasets, RNNs accumulate evidence for each class as they process the text, using a low-dimensional attractor manifold as the underlying mechanism. Moreover, the dimensionality and geometry of the attractor manifold are determined by the structure of the training dataset; in particular, we describe how simple word-count statistics computed on the training dataset can be used to predict these properties. Our observations span multiple architectures and datasets, reflecting a common mechanism RNNs employ to perform text classification. To the degree that integration of evidence towards a decision is a common computational primitive, this work lays the foundation for using dynamical systems techniques to study the inner workings of RNNs.
Authors: Kyle Aitken, Vinay V. Ramasesh, Ankush Garg, Yuan Cao, David Sussillo, Niru Maheswaranathan
Published: Fri, 03 Jun 2022 00:00:00 GMT