Landmark Ordinal Embedding [article]

Nikhil Ghosh, Yuxin Chen, Yisong Yue
2019 arXiv   pre-print
In this paper, we aim to learn a low-dimensional Euclidean representation from a set of constraints of the form "item j is closer to item i than item k". Existing approaches for this "ordinal embedding" problem require expensive optimization procedures, which cannot scale to handle increasingly larger datasets. To address this issue, we propose a landmark-based strategy, which we call Landmark Ordinal Embedding (LOE). Our approach trades off statistical efficiency for computational efficiency
... exploiting the low-dimensionality of the latent embedding. We derive bounds establishing the statistical consistency of LOE under the popular Bradley-Terry-Luce noise model. Through a rigorous analysis of the computational complexity, we show that LOE is significantly more efficient than conventional ordinal embedding approaches as the number of items grows. We validate these characterizations empirically on both synthetic and real datasets. We also present a practical approach that achieves the "best of both worlds", by using LOE to warm-start existing methods that are more statistically efficient but computationally expensive.
arXiv:1910.12379v1 fatcat:aqflyaxeqvdwthfvjpdxksgctm

Iterative Amortized Inference [article]

Joseph Marino, Yisong Yue, Stephan Mandt
2018 arXiv   pre-print
Inference models are a key component in scaling variational inference to deep latent variable models, most notably as encoder networks in variational auto-encoders (VAEs). By replacing conventional optimization-based inference with a learned model, inference is amortized over data examples and therefore more computationally efficient. However, standard inference models are restricted to direct mappings from data to approximate posterior estimates. The failure of these models to reach fully
... ized approximate posterior estimates results in an amortization gap. We aim toward closing this gap by proposing iterative inference models, which learn to perform inference optimization through repeatedly encoding gradients. Our approach generalizes standard inference models in VAEs and provides insight into several empirical findings, including top-down inference techniques. We demonstrate the inference optimization capabilities of iterative inference models and show that they outperform standard inference models on several benchmark data sets of images and text.
arXiv:1807.09356v1 fatcat:s44n3ii7nzeelobiabdtltgbnu

Minimax Model Learning [article]

Cameron Voloshin, Nan Jiang, Yisong Yue
2021 arXiv   pre-print
M., Jiang, N., and Yue, Y.  ...  ., and Yue, Y. Robust regression for safe exploration in control. In Learning for Dynamics and Control (L4DC), 2020. Liu, Q., Li, L., Tang, Z., and Zhou, D.  ... 
arXiv:2103.02084v1 fatcat:b25bca45o5b6tkg6mfbdgibal4

Competitive Policy Optimization [article]

Manish Prajapat, Kamyar Azizzadenesheli, Alexander Liniger, Yisong Yue, Anima Anandkumar
2020 arXiv   pre-print
A core challenge in policy optimization in competitive Markov decision processes is the design of efficient optimization methods with desirable convergence and stability properties. To tackle this, we propose competitive policy optimization (CoPO), a novel policy gradient approach that exploits the game-theoretic nature of competitive games to derive policy updates. Motivated by the competitive gradient optimization method, we derive a bilinear approximation of the game objective. In contrast,
... ff-the-shelf policy gradient methods utilize only linear approximations, and hence do not capture interactions among the players. We instantiate CoPO in two ways: (i) competitive policy gradient, and (ii) trust-region competitive policy optimization. We theoretically study these methods, and empirically investigate their behavior on a set of comprehensive, yet challenging, competitive games. We observe that they provide stable optimization, convergence to sophisticated strategies, and higher scores when played against baseline policy gradient methods.
arXiv:2006.10611v1 fatcat:xx2a4ybqqfctxmklm7oijrhsma

Batch Policy Learning under Constraints [article]

Hoang M. Le, Cameron Voloshin, Yisong Yue
2019 arXiv   pre-print
., Yue, Y., and Carr, P. Smooth imitation learning for online sequence prediction.  ... 
arXiv:1903.08738v1 fatcat:6rydrj3xcjgylmqvb5sbq72rey

Co-training for Policy Learning [article]

Jialin Song, Ravi Lanka, Yisong Yue, Masahiro Ono
2019 arXiv   pre-print
We study the problem of learning sequential decision-making policies in settings with multiple state-action representations. Such settings naturally arise in many domains, such as planning (e.g., multiple integer programming formulations) and various combinatorial optimization problems (e.g., those with both integer programming and graph-based formulations). Inspired by the classical co-training framework for classification, we study the problem of co-training for policy learning. We present
... ficient conditions under which learning from two views can improve upon learning from a single view alone. Motivated by these theoretical insights, we present a meta-algorithm for co-training for sequential decision making. Our framework is compatible with both reinforcement learning and imitation learning. We validate the effectiveness of our approach across a wide range of tasks, including discrete/continuous control and combinatorial optimization.
arXiv:1907.04484v1 fatcat:lhje4hhvpvcw5bgfkukulsnl2m

Architecture Agnostic Neural Networks [article]

Sabera Talukder, Guruprasad Raghavan, Yisong Yue
2020 arXiv   pre-print
In this paper, we explore an alternate method for synthesizing neural network architectures, inspired by the brain's stochastic synaptic pruning. During a person's lifetime, numerous distinct neuronal architectures are responsible for performing the same tasks. This indicates that biological neural networks are, to some degree, architecture agnostic. However, artificial networks rely on their fine-tuned weights and hand-crafted architectures for their remarkable performance. This contrast begs
... he question: Can we build artificial architecture agnostic neural networks? To ground this study we utilize sparse, binary neural networks that parallel the brain's circuits. Within this sparse, binary paradigm we sample many binary architectures to create families of architecture agnostic neural networks not trained via backpropagation. These high-performing network families share the same sparsity, distribution of binary weights, and succeed in both static and dynamic tasks. In summation, we create an architecture manifold search procedure to discover families of architecture agnostic neural networks.
arXiv:2011.02712v2 fatcat:wm7gml465bfefhg7f2zdtalbnu

A General Method for Amortizing Variational Filtering [article]

Joseph Marino, Milan Cvitkovic, Yisong Yue
2018 arXiv   pre-print
We introduce the variational filtering EM algorithm, a simple, general-purpose method for performing variational inference in dynamical latent variable models using information from only past and present variables, i.e. filtering. The algorithm is derived from the variational objective in the filtering setting and consists of an optimization procedure at each time step. By performing each inference optimization procedure with an iterative amortized inference model, we obtain a computationally
... ficient implementation of the algorithm, which we call amortized variational filtering. We present experiments demonstrating that this general-purpose method improves performance across several deep dynamical latent variable models.
arXiv:1811.05090v1 fatcat:q57icsvogvatlpsh4saju6t55y

Multi-dueling Bandits with Dependent Arms [article]

Yanan Sui, Vincent Zhuang, Joel W. Burdick, Yisong Yue
2017 arXiv   pre-print
In other words, once t > C for some problem-dependent constant C, the regret of INDSELFSPARRING matches information-theoretic bounds up to constant factors (see Yue et al. (2012) for lower bound analysis  ...  Compared to the assumptions of Yue et al. (2012), Approximate Linearity is a stricter requirement than strong stochastic transitivity, and is a complementary requirement to stochastic triangle inequality  ... 
arXiv:1705.00253v1 fatcat:6yynr7sxsfbgbowb2qplhibtuy

Unsupervised Learning of Neurosymbolic Encoders [article]

Eric Zhan, Jennifer J. Sun, Ann Kennedy, Yisong Yue, Swarat Chaudhuri
2021 arXiv   pre-print
We present a framework for the unsupervised learning of neurosymbolic encoders, i.e., encoders obtained by composing neural networks with symbolic programs from a domain-specific language. Such a framework can naturally incorporate symbolic expert knowledge into the learning process and lead to more interpretable and factorized latent representations than fully neural encoders. Also, models learned this way can have downstream impact, as many analysis workflows can benefit from having clean
... rammatic descriptions. We ground our learning algorithm in the variational autoencoding (VAE) framework, where we aim to learn a neurosymbolic encoder in conjunction with a standard decoder. Our algorithm integrates standard VAE-style training with modern program synthesis techniques. We evaluate our method on learning latent representations for real-world trajectory data from animal biology and sports analytics. We show that our approach offers significantly better separation than standard VAEs and leads to practical gains on downstream tasks.
arXiv:2107.13132v1 fatcat:hkxaiu2i3nev3j2f2pc67ckbda

Hierarchical Exploration for Accelerating Contextual Bandits [article]

Yisong Yue, Sue Ann Hong (Carnegie Mellon University), Carlos Guestrin
2012 arXiv   pre-print
Experimental Setting We employ the submodular bandit extension of linear stochastic bandits (Yue & Guestrin, 2011) to model the news recommendation setting.  ...  Simulations We performed simulation evaluations using data collected from a previous user study in personalized news recommendation by (Yue & Guestrin, 2011).  ... 
arXiv:1206.6454v1 fatcat:h2zoninebfhptnfttrpwxqkg5a

Investigating Generalization by Controlling Normalized Margin [article]

Alexander Farhang, Jeremy Bernstein, Kushal Tirumala, Yang Liu, Yisong Yue
2022 arXiv   pre-print
Weight norm w and margin γ participate in learning theory via the normalized margin γ/w. Since standard neural net optimizers do not control normalized margin, it is hard to test whether this quantity causally relates to generalization. This paper designs a series of experimental studies that explicitly control normalized margin and thereby tackle two central questions. First: does normalized margin always have a causal effect on generalization? The paper finds that no – networks can be
... where normalized margin has seemingly no relationship with generalization, counter to the theory of Bartlett et al. (2017). Second: does normalized margin ever have a causal effect on generalization? The paper finds that yes – in a standard training setup, test performance closely tracks normalized margin. The paper suggests a Gaussian process model as a promising explanation for this behavior.
arXiv:2205.03940v1 fatcat:jdf73mrcsfhyri3cot7ngbsbua

Generating Long-term Trajectories Using Deep Hierarchical Networks [article]

Stephan Zheng, Yisong Yue, Patrick Lucey
2017 arXiv   pre-print
We study the problem of modeling spatiotemporal trajectories over long time horizons using expert demonstrations. For instance, in sports, agents often choose action sequences with long-term goals in mind, such as achieving a certain strategic position. Conventional policy learning approaches, such as those based on Markov decision processes, generally fail at learning cohesive long-term behavior in such high-dimensional state spaces, and are only effective when myopic modeling leads to the
... ed behavior. The key difficulty is that conventional approaches are "shallow" models that only learn a single state-action policy. We instead propose a hierarchical policy class that automatically reasons about both long-term and short-term goals, which we instantiate as a hierarchical neural network. We showcase our approach in a case study on learning to imitate demonstrated basketball trajectories, and show that it generates significantly more realistic trajectories compared to non-hierarchical baselines as judged by professional sports analysts.
arXiv:1706.07138v1 fatcat:v3tbnzuzzvc27nv4pw6kn4pcva

Learning Policies for Contextual Submodular Prediction [article]

Stephane Ross, Jiaji Zhou, Yisong Yue, Debadeepta Dey, J. Andrew Bagnell
2013 arXiv   pre-print
Yisong Yue was also supported in part by ONR (PECASE) N000141010672 and ONR Young Investigator Program N00014-08-1-0752. We gratefully thank Martial Hebert for valuable discussions and support.  ...  The first approach (Yue & Joachims, 2008; Yue & Guestrin, 2011; Lin & Bilmes, 2012; Raman et al., 2012) involves identifying the parameterization that best matches the submodular rewards of the training  ...  Personalized News Recommendation We built a stochastic user simulation based on 75 user preferences derived from a user study in (Yue & Guestrin, 2011).  ... 
arXiv:1305.2532v1 fatcat:55o4wscba5bdtcv2jz26t76aq4

Iterative Amortized Policy Optimization [article]

Joseph Marino, Alexandre Piché, Alessandro Davide Ialongo, Yisong Yue
2021 arXiv   pre-print
Policy networks are a central feature of deep reinforcement learning (RL) algorithms for continuous control, enabling the estimation and sampling of high-value actions. From the variational inference perspective on RL, policy networks, when used with entropy or KL regularization, are a form of amortized optimization, optimizing network parameters rather than the policy distributions directly. However, direct amortized mappings can yield suboptimal policy estimates and restricted distributions,
... imiting performance and exploration. Given this perspective, we consider the more flexible class of iterative amortized optimizers. We demonstrate that the resulting technique, iterative amortized policy optimization, yields performance improvements over direct amortization on benchmark continuous control tasks.
arXiv:2010.10670v2 fatcat:hsk5x7qaundofazpzmayd4crjq