39 Hits in 8.2 sec

KNAS: Green Neural Architecture Search [article]

Jingjing Xu, Liang Zhao, Junyang Lin, Rundong Gao, Xu Sun, Hongxia Yang
2021 arXiv   pre-print
Experiments show that KNAS achieves competitive results orders of magnitude faster than "train-then-test" paradigms on image classification tasks.  ...  Furthermore, the extremely low search cost enables its wide applications. The searched network also outperforms the strong baseline RoBERTa-large on two text classification tasks. Codes are available at .  ...  In Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July Falkner, S., Klein, A., and  ... 
arXiv:2111.13293v1 fatcat:fz4p37bk2jgxnn2qfc6ffvvfwu

Unsupervised Text Summarization via Mixed Model Back-Translation [article]

Yacine Jernite
2019 arXiv   pre-print
In this work, we extend the paradigm to the problem of learning a sentence summarization system from unaligned data.  ...  We present several initial models which rely on the asymmetrical nature of the task to perform the first back-translation step, and demonstrate the value of combining the data created by these diverse  ...  In Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015, pages 448-456. Nal Kalchbrenner and Phil Blunsom. 2013.  ... 
arXiv:1908.08566v1 fatcat:nx5i2l3bhbhnjb4n3hzp2gzkai

Score-Based Generative Classifiers [article]

Roland S. Zimmermann, Lukas Schott, Yang Song, Benjamin A. Dunn, David A. Klindt
2021 arXiv   pre-print
Additionally, on natural image datasets, previous results have suggested a trade-off between the likelihood of the data and classification accuracy.  ...  While they do not yet deliver on the promise of adversarial and out-of-domain robustness, they provide a different approach to classification that warrants further research.  ...  Blei, editors, Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015, volume 37 of JMLR Workshop and Conference Proceedings, pages 2256  ... 
arXiv:2110.00473v2 fatcat:6yer6cgkxnbf7kg2m2xmxjtre4

Learning Bilingual Projections of Embeddings for Vocabulary Expansion in Machine Translation

Pranava Swaroop Madhyastha, Cristina España-Bonet
2017 Proceedings of the 2nd Workshop on Representation Learning for NLP  
We integrate these translation options into a standard phrase-based statistical machine translation system and obtain consistent improvements in translation quality on the English-Spanish language pair  ...  Given an out-of-vocabulary source word, the model generates a probabilistic list of possible translations in the target language using the trained bilingual embeddings.  ...  In Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015. pages 748-756. Nizar Habash. 2008.  ... 
doi:10.18653/v1/w17-2617 dblp:conf/rep4nlp/MadhyasthaE17 fatcat:efm2eq2fq5epldcxofj5y3t5xq

Mitigating Catastrophic Forgetting in Scheduled Sampling with Elastic Weight Consolidation in Neural Machine Translation [article]

Michalis Korakakis, Andreas Vlachos
2021 arXiv   pre-print
We also observe that as a side-effect, it worsens performance when the model-generated prefix is correct, a form of catastrophic forgetting.  ...  In this paper, we conduct systematic experiments and find that it ameliorates exposure bias by increasing model reliance on the input sequence.  ...  In Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015, volume 37 of JMLR Workshop and Conference Proceedings, pages 2058-2066. JMLR.org.  ... 
arXiv:2109.06308v1 fatcat:55ij4ulbufbpxlyykwal2eopte

Accelerating Safe Reinforcement Learning with Constraint-mismatched Policies [article]

Tsung-Yen Yang and Justinian Rosca and Karthik Narasimhan and Peter J. Ramadge
2021 arXiv   pre-print
We consider the problem of reinforcement learning when provided with (1) a baseline control policy and (2) a set of constraints that the learner must satisfy.  ...  In our experiments on five different control tasks, our algorithm consistently outperforms several state-of-the-art baselines, achieving 10 times fewer constraint violations and 40% higher reward on average  ...  Acknowledgements The authors would like to thank members of the Princeton NLP Group, the anonymous reviewers, and the area chair for their comments.  ... 
arXiv:2006.11645v3 fatcat:ngboxu47t5fm3gojgnym6yqala

A Survey of Deep Active Learning [article]

Pengzhen Ren, Yun Xiao, Xiaojun Chang, Po-Yao Huang, Zhihui Li, Brij B. Gupta, Xiaojiang Chen, Xin Wang
2021 arXiv   pre-print
This is mainly because, before the rise of DL, traditional machine learning required relatively few labeled samples, so early AL struggled to demonstrate the value it deserves.  ...  Active learning (AL) attempts to maximize the performance gain of the model by labeling the fewest samples.  ...  In Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015 (JMLR Workshop and Conference Proceedings, Vol. 37). JMLR.org, 1861–1869.  ... 
arXiv:2009.00236v2 fatcat:zuk2doushzhlfaufcyhoktxj7e

SurVAE Flows: Surjections to Bridge the Gap between VAEs and Flows [article]

Didrik Nielsen, Priyank Jaini, Emiel Hoogeboom, Ole Winther, Max Welling
2020 arXiv   pre-print
do not provide tractable estimates of the marginal likelihood.  ...  However, they both impose constraints on the models: Normalizing flows use bijective transformations to model densities whereas VAEs learn stochastic transformations that are non-invertible and thus typically  ...  Broader Impact This work constitutes foundational research on generative models/unsupervised learning by providing a unified view on several lines of work and further by introducing new modules that expand  ... 
arXiv:2007.02731v2 fatcat:7cozym4jendrliu2jnppmpprzu

Learning Compressed Transforms with Low Displacement Rank [article]

Anna T. Thomas and Albert Gu and Tri Dao and Atri Rudra and Christopher Ré
2019 arXiv   pre-print
We prove bounds on the VC dimension of multi-layer neural networks with structured weight matrices and show empirically that our compact parameterization can reduce the sample complexity of learning.  ...  We introduce a class of LDR matrices with more general displacement operators, and explicitly learn over both the operators and the low-rank component.  ...  , of DARPA, NIH, ONR, or the U.S.  ... 
arXiv:1810.02309v3 fatcat:hf2tf74hn5hojfmhzo54gt6fhm

Parameter Space Noise for Exploration [article]

Matthias Plappert, Rein Houthooft, Prafulla Dhariwal, Szymon Sidor, Richard Y. Chen, Xi Chen, Tamim Asfour, Pieter Abbeel, Marcin Andrychowicz
2018 arXiv   pre-print
Deep reinforcement learning (RL) methods generally engage in exploratory behavior through noise injection in the action space.  ...  We demonstrate that both off- and on-policy methods benefit from this approach through experimental comparison of DQN, DDPG, and TRPO on high-dimensional discrete action environments as well as continuous  ...  In Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015, pp. 1889-1897, 2015b. URL http://jmlr.org/proceedings/papers/ v37/schulman15.html.  ... 
arXiv:1706.01905v2 fatcat:rdplj3fkebhttapjqhkqpr2gty

High Fidelity Visualization of What Your Self-Supervised Representation Knows About [article]

Florian Bordes, Randall Balestriero, Pascal Vincent
2021 arXiv   pre-print
However, relying only on such downstream tasks can limit our understanding of how much information is retained in the representation of a given input.  ...  We further demonstrate how this model's generation quality is on par with state-of-the-art generative models while being faithful to the representation used as conditioning.  ...  .), Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015, volume 37 of JMLR Workshop and Conference Proceedings, pp. 2256–2265.  ... 
arXiv:2112.09164v1 fatcat:6lk3n7w3crbgpomjdjanm7xhju

Improving the Robustness of Deep Neural Networks via Stability Training

Stephan Zheng, Yang Song, Thomas Leung, Ian Goodfellow
2016 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
In Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015, pages 448–456, 2015. 5 [3] A. Krizhevsky, I. Sutskever, and G. E.  ...  Watch and learn: Semi-supervised learning for object detectors from video. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015. 2 [6] T. Miyato, S.-i.  ... 
doi:10.1109/cvpr.2016.485 dblp:conf/cvpr/ZhengSLG16 fatcat:rhlsyrmek5durapxai52e4ortm

The Statistical Cost of Robust Kernel Hyperparameter Tuning [article]

Raphael A. Meyer, Christopher Musco
2020 arXiv   pre-print
We provide finite-sample guarantees for the problem, characterizing how increasing the complexity of the kernel class increases the complexity of learning kernel hyperparameters.  ...  This paper studies the statistical complexity of kernel hyperparameter tuning in the setting of active regression under adversarial noise.  ...  Acknowledgements The authors would like to thank Xue Chen for valuable discussion in the early stages of this work.  ... 
arXiv:2006.08035v1 fatcat:qrnnbcccyratjcn74mdcbinofa

MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer [article]

Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder
2020 arXiv   pre-print
MAD-X outperforms the state of the art in cross-lingual transfer across a representative set of typologically diverse languages on named entity recognition and causal commonsense reasoning, and achieves  ...  The main goal behind state-of-the-art pre-trained multilingual models such as multilingual BERT and XLM-R is enabling and bootstrapping NLP applications in low-resource languages through zero-shot or few-shot  ...  The work of Ivan Vulić is supported by the ERC Consolidator Grant LEXICAL: Lexical Acquisition Across Languages (no 648909). We thank Laura Rimell for feedback on a draft.  ... 
arXiv:2005.00052v3 fatcat:kymxwptblnfophjkju3ow32bde

Estimating Model Uncertainty of Neural Networks in Sparse Information Form [article]

Jongseok Lee, Matthias Humt, Jianxiang Feng, Rudolph Triebel
2020 arXiv   pre-print
Our exhaustive theoretical analysis and empirical evaluations on various benchmarks show the competitiveness of our approach over the current methods.  ...  The key insight of our work is that the information matrix, i.e. the inverse of the covariance matrix tends to be sparse in its spectrum.  ...  The authors acknowledge the support of Helmholtz Association, the project ARCHES (contract number ZT-0033) and the EUproject AUTOPILOT (contract number 731993).  ... 
arXiv:2006.11631v1 fatcat:2fwwrpi7ere2djavxzcz627xmy
Showing results 1 — 15 out of 39 results