48 Hits in 1.5 sec

Metric Learning for Dynamic Text Classification [article]

Jeremy Wohlwend, Ethan R. Elenberg, Samuel Altschul, Shawn Henry, Tao Lei
2019 arXiv   pre-print
Traditional text classifiers are limited to predicting over a fixed set of labels. However, in many real-world applications the label set is frequently changing. For example, in intent classification, new intents may be added over time while others are removed. We propose to address the problem of dynamic text classification by replacing the traditional, fixed-size output layer with a learned, semantically meaningful metric space. Here the distances between textual inputs are optimized to perform nearest-neighbor classification across overlapping label sets. Changing the label set does not involve removing parameters, but rather simply adding or removing support points in the metric space. Then the learned metric can be fine-tuned with only a few additional training examples. We demonstrate that this simple strategy is robust to changes in the label space. Furthermore, our results show that learning a non-Euclidean metric can improve performance in the low data regime, suggesting that further work on metric spaces may benefit low-resource research.
arXiv:1911.01026v1 fatcat:gfi53kme3vdtfajt2vsm4xbszu
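The core mechanism here, replacing a softmax output layer with nearest-neighbor lookup against per-label support points, is easy to sketch. The snippet below is a minimal illustration, not the authors' implementation; the `encode` stand-in, the toy intent labels, and the 8-dimensional embeddings are all hypothetical:

```python
# Minimal sketch of nearest-support-point classification in a metric space.
import torch

torch.manual_seed(0)

def encode(x: torch.Tensor) -> torch.Tensor:
    # Stand-in for a learned text encoder mapping inputs into the metric space.
    return x  # identity for illustration only

# A few embedded support points per label. Changing the label set just means
# adding or removing entries here; there is no output layer to resize.
support = {
    "refund":  torch.randn(3, 8),   # 3 support points, 8-dim embeddings
    "billing": torch.randn(3, 8),
}
support["shipping"] = torch.randn(3, 8)  # a newly added intent

def classify(x: torch.Tensor) -> str:
    z = encode(x)
    # Predict the label of the closest support point.
    best_label, best_dist = None, float("inf")
    for label, points in support.items():
        d = torch.cdist(z.unsqueeze(0), points).min().item()
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

print(classify(torch.randn(8)))
```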

Structured Pruning of Large Language Models [article]

Ziheng Wang, Jeremy Wohlwend, Tao Lei
2019 arXiv   pre-print
Large language models have recently achieved state of the art performance across a wide variety of natural language tasks. Meanwhile, the size of these models and their latency have significantly increased, which makes their usage costly, and raises an interesting question: do language models need to be large? We study this question through the lens of model compression. We present a novel, structured pruning approach based on low rank factorization and augmented Lagrangian L0 norm regularization. Our structured approach achieves significant inference speedups while matching or outperforming our unstructured pruning baseline at various sparsity levels. We apply our method to state of the art models on the enwiki8 dataset and obtain a 1.19 perplexity score with just 5M parameters, vastly outperforming a model of the same size trained from scratch. We also demonstrate that our method can be applied to language model fine-tuning by pruning the BERT model on several downstream classification benchmarks.
arXiv:1910.04732v1 fatcat:o2daer4ftraalg4jfnvssv6tgq
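A rough sketch of the factorization-plus-L0 idea: each weight matrix is parameterized as a low-rank product whose rank-1 components carry trainable gates, so components whose gates collapse to zero can be dropped at inference. The hard-concrete gate below follows the common relaxation of Louizos et al.; the layer sizes and the omission of the augmented Lagrangian sparsity controller are simplifications, not the paper's code:

```python
# Minimal sketch of L0-gated low-rank factorization for structured pruning.
import torch
import torch.nn as nn

class FactorizedLinear(nn.Module):
    """y = x @ (P * z) @ Q: rank-1 components of the factorization can be
    pruned by driving their gates z to zero during training."""
    def __init__(self, d_in: int, d_out: int, rank: int):
        super().__init__()
        self.P = nn.Parameter(torch.randn(d_in, rank) * 0.02)
        self.Q = nn.Parameter(torch.randn(rank, d_out) * 0.02)
        self.log_alpha = nn.Parameter(torch.zeros(rank))  # gate parameters
        self.beta = 0.66  # hard-concrete temperature

    def sample_gates(self) -> torch.Tensor:
        # Hard-concrete sampling: a differentiable relaxation of binary gates.
        u = torch.rand_like(self.log_alpha).clamp(1e-6, 1 - 1e-6)
        s = torch.sigmoid((u.log() - (1 - u).log() + self.log_alpha) / self.beta)
        # Stretch and clamp so gates can reach exactly 0 or 1.
        return (s * 1.2 - 0.1).clamp(0.0, 1.0)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.sample_gates() if self.training else \
            (torch.sigmoid(self.log_alpha) * 1.2 - 0.1).clamp(0.0, 1.0)
        return x @ (self.P * z) @ self.Q

layer = FactorizedLinear(16, 16, rank=8)
out = layer(torch.randn(4, 16))
print(out.shape)  # torch.Size([4, 16])
```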

Autoregressive Knowledge Distillation through Imitation Learning [article]

Alexander Lin, Jeremy Wohlwend, Howard Chen, Tao Lei
2020 arXiv   pre-print
Our experiments were conducted using Flambé, a PyTorch-based model training and evaluation library (Wohlwend et al., 2019).
arXiv:2009.07253v2 fatcat:2pk5mt46zjemznssyle2zwxua4

ASAPP-ASR: Multistream CNN and Self-Attentive SRU for SOTA Speech Recognition [article]

Jing Pan, Joshua Shapiro, Jeremy Wohlwend, Kyu J. Han, Tao Lei, Tao Ma
2020 arXiv   pre-print
In this paper we present state-of-the-art (SOTA) performance on the LibriSpeech corpus with two novel neural network architectures, a multistream CNN for acoustic modeling and a self-attentive simple recurrent unit (SRU) for language modeling. In the hybrid ASR framework, the multistream CNN acoustic model processes an input of speech frames in multiple parallel pipelines where each stream has a unique dilation rate for diversity. Trained with the SpecAugment data augmentation method, it achieves relative word error rate (WER) improvements of 4% on test-clean and 14% on test-other. We further improve the performance via N-best rescoring using a 24-layer self-attentive SRU language model, achieving WERs of 1.75% on test-clean and 4.46% on test-other.
arXiv:2005.10469v1 fatcat:2w2bphgbjfhzzinnnshf4knnba
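The multistream idea, parallel convolution streams over the same frames with stream-specific dilation rates, can be illustrated briefly. Channel counts, dilations, and the ReLU/concatenation choices below are hypothetical, not the paper's configuration:

```python
# Minimal sketch of a multistream 1-D CNN over speech frames.
import torch
import torch.nn as nn

class MultistreamCNN(nn.Module):
    def __init__(self, feat_dim=80, channels=64, dilations=(1, 2, 3)):
        super().__init__()
        self.streams = nn.ModuleList(
            nn.Conv1d(feat_dim, channels, kernel_size=3,
                      dilation=d, padding=d)   # padding=d keeps length fixed
            for d in dilations
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, feat_dim, time); each stream sees a different
        # temporal receptive field thanks to its unique dilation rate.
        outs = [torch.relu(conv(frames)) for conv in self.streams]
        return torch.cat(outs, dim=1)    # combine streams channel-wise

model = MultistreamCNN()
y = model(torch.randn(2, 80, 100))
print(y.shape)  # torch.Size([2, 192, 100])
```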

Flambé: A Customizable Framework for Machine Learning Experiments

Jeremy Wohlwend, Nicholas Matthews, Ivan Itzcovich
2019 Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations  
Flambé is a machine learning experimentation framework built to accelerate the entire research life cycle. Flambé's main objective is to provide a unified interface for prototyping models, running experiments containing complex pipelines, monitoring those experiments in real-time, reporting results, and deploying a final model for inference. Flambé achieves both flexibility and simplicity by allowing users to write custom code but instantly include that code as a component in a larger system which is represented by a concise configuration file format. We demonstrate the application of the framework through a cutting-edge multistage use case: fine-tuning and distillation of a state of the art pretrained language model used for text classification.
doi:10.18653/v1/p19-3029 dblp:conf/acl/WohlwendMI19 fatcat:42bvlor74vbm3pklpcsdueydoa
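As a generic illustration of the config-driven pipeline idea the abstract describes (this is not Flambé's actual API, just a sketch of registering custom code and instantiating it from a parsed config):

```python
# Generic sketch: user code registered as components, wired up from a config.
registry = {}

def component(cls):
    """Register a user-defined class so a config file can refer to it."""
    registry[cls.__name__] = cls
    return cls

@component
class TextClassifier:
    def __init__(self, hidden_size: int):
        self.hidden_size = hidden_size

    def train(self):
        print(f"training with hidden_size={self.hidden_size}")

# In practice this dict would be parsed from a concise config file.
config = {"component": "TextClassifier", "params": {"hidden_size": 128}}

stage = registry[config["component"]](**config["params"])
stage.train()
```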

ASAPP-ASR: Multistream CNN and Self-Attentive SRU for SOTA Speech Recognition

Jing Pan, Joshua Shapiro, Jeremy Wohlwend, Kyu J. Han, Tao Lei, Tao Ma
2020 Interspeech 2020  
In this paper we present state-of-the-art (SOTA) performance on the LibriSpeech corpus with two novel neural network architectures, a multistream CNN for acoustic modeling and a self-attentive simple recurrent unit (SRU) for language modeling. In the hybrid ASR framework, the multistream CNN acoustic model processes an input of speech frames in multiple parallel pipelines where each stream has a unique dilation rate for diversity. Trained with the SpecAugment data augmentation method, it achieves relative word error rate (WER) improvements of 4% on test-clean and 14% on test-other. We further improve the performance via N-best rescoring using a 24-layer self-attentive SRU language model, achieving WERs of 1.75% on test-clean and 4.46% on test-other.
doi:10.21437/interspeech.2020-2947 dblp:conf/interspeech/PanSWH0M20 fatcat:pfvnsjbilfdzhmadi24doeql5e
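N-best rescoring itself is a small algorithm: the first pass emits N candidate transcripts with scores, a stronger language model rescores them, and an interpolated score picks the winner. Everything below (the toy LM, the hypotheses, and the weight `lam`) is made up for illustration:

```python
# Minimal sketch of second-pass N-best rescoring with a language model.
def rescore(nbest, lm_score, lam=0.5):
    """nbest: list of (hypothesis, first_pass_log_score) pairs."""
    return max(nbest, key=lambda h: h[1] + lam * lm_score(h[0]))

def toy_lm_score(text):
    # Stand-in for a real LM log-probability; rewards familiar bigrams.
    good = {("the", "cat"), ("cat", "sat"), ("sat", "on"),
            ("on", "the"), ("on", "a"), ("a", "mat"), ("the", "mat")}
    words = text.split()
    return sum(1.0 for bigram in zip(words, words[1:]) if bigram in good)

nbest = [("the cat sat on the mat", -12.0),
         ("the cat sat on a mat",   -11.5),
         ("the cats at on the mat", -11.2)]
best, _ = rescore(nbest, toy_lm_score)
print(best)  # LM overturns the first-pass ranking: "the cat sat on a mat"
```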

Metric Learning for Dynamic Text Classification

Jeremy Wohlwend, Ethan R. Elenberg, Sam Altschul, Shawn Henry, Tao Lei
2019 Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)  
Traditional text classifiers are limited to predicting over a fixed set of labels. However, in many real-world applications the label set is frequently changing. For example, in intent classification, new intents may be added over time while others are removed. We propose to address the problem of dynamic text classification by replacing the traditional, fixed-size output layer with a learned, semantically meaningful metric space. Here the distances between textual inputs are optimized to perform nearest-neighbor classification across overlapping label sets. Changing the label set does not involve removing parameters, but rather simply adding or removing support points in the metric space. Then the learned metric can be fine-tuned with only a few additional training examples. We demonstrate that this simple strategy is robust to changes in the label space. Furthermore, our results show that learning a non-Euclidean metric can improve performance in the low data regime, suggesting that further work on metric spaces may benefit low-resource research.
doi:10.18653/v1/d19-6116 dblp:conf/acl-deeplo/WohlwendEAHL19 fatcat:tfzon7rbbzgmvksw4ebywbdoju
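On the non-Euclidean point: one common choice for a learned non-Euclidean metric space is the Poincaré ball, whose geodesic distance is shown below. Whether this matches the paper's exact metric is an assumption on our part:

```python
# Minimal sketch of a hyperbolic (Poincaré-ball) distance.
import math

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points inside the unit ball."""
    sq = lambda x: sum(t * t for t in x)
    diff = [a - b for a, b in zip(u, v)]
    num = 2.0 * sq(diff)
    den = (1.0 - sq(u)) * (1.0 - sq(v)) + eps
    return math.acosh(1.0 + num / den)

# Points near the boundary are far apart even when Euclidean-close,
# which gives the space more room to separate fine-grained labels.
print(poincare_distance([0.0, 0.0], [0.5, 0.0]))   # ~1.0986
print(poincare_distance([0.9, 0.0], [0.95, 0.0]))  # ~0.72, vs 0.05 Euclidean
```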

Feasibility of 3D Reconstruction of Neural Morphology Using Expansion Microscopy and Barcode-Guided Agglomeration

Young-Gyu Yoon, Peilun Dai, Jeremy Wohlwend, Jae-Byum Chang, Adam H. Marblestone, Edward S. Boyden
2017 Frontiers in Computational Neuroscience  
Copyright © 2017 Yoon, Dai, Wohlwend, Chang, Marblestone and Boyden. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
doi:10.3389/fncom.2017.00097 pmid:29114215 pmcid:PMC5660712 fatcat:vn2e3cympba6zoov74jtsgnbbi

Iterative Refinement Graph Neural Network for Antibody Sequence-Structure Co-design [article]

Wengong Jin, Jeremy Wohlwend, Regina Barzilay, Tommi Jaakkola
2022 arXiv   pre-print
Affiliations: CSAIL, Massachusetts Institute of Technology; Eric and Wendy Schmidt Center, Broad Institute of MIT and Harvard.
arXiv:2110.04624v3 fatcat:7lm2oowyvbfnll253oagjsgzky

Structured Pruning of Large Language Models

Ziheng Wang, Jeremy Wohlwend, Tao Lei
2020 Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)   unpublished
Large language models have recently achieved state of the art performance across a wide variety of natural language tasks. Meanwhile, the size of these models and their latency have significantly increased, which makes their usage costly, and raises an interesting question: do language models need to be large? We study this question through the lens of model compression. We present a generic, structured pruning approach by parameterizing each weight matrix using its low-rank factorization, and adaptively removing rank-1 components during training. On language modeling tasks, our structured approach outperforms other unstructured and block-structured pruning baselines at various compression levels, while achieving significant speedups during both training and inference. We also demonstrate that our method can be applied to pruning adaptive word embeddings in large language models, and to pruning the BERT model on several downstream fine-tuning classification benchmarks.
doi:10.18653/v1/2020.emnlp-main.496 fatcat:n4rj2e6carcy3kiuzm3rmv355m

Autoregressive Knowledge Distillation through Imitation Learning

Alexander Lin, Jeremy Wohlwend, Howard Chen, Tao Lei
2020 Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)   unpublished
Our experiments were conducted using Flambé, a PyTorch-based model training and evaluation library (Wohlwend et al., 2019).
doi:10.18653/v1/2020.emnlp-main.494 fatcat:ywgz4k3ewbfinegy3i2l5mrrwe

Introduction to The International Rule of Law [chapter]

2021 The International Rule of Law  
This has led some scholars, most notably Jeremy Waldron, to criticize the conventional approach to the IROL as a 'misconceived' analogy and to propose a new approach to the IROL as benefiting individuals ...
doi:10.4337/9781789907421.00006 fatcat:xzkutew47va7rgzqhvnzbihcny

Beauty-ful Inferiority: Female Subservience in Disney's Beauty and the Beast

Jeremy Chow
2013 LUX  
(Wohlwend 2009, 58) The marketability of Disney products includes an endless supply of film-referenced objects that target a juvenile audience and prey on the purse-strings of parents. ... Wohlwend (2009) explains the omnipotence of the Disney corporation: The entire [Disney] franchise produced $4 billion in global retail sales for 2007, offering a bedazzling collection of pastel products ...
doi:10.5642/lux.201301.07 fatcat:rcg4roowrnazhdbyrk7slvd4my

Investigating the Influence of Dramatic Arts on Young Children's Social and Academic Development in the World of "Jack and the Beanstalk"

Kathryn F Whitmore
2018 Journal for Learning through the Arts  
Jeremy looks Megan right in the eye and says his name very loudly. Megan and Collin laugh and smile through the game. ... Wohlwend (2011) observed kindergarten children engaged in play for an academic year and learned that play allows young children to design and transform text and relational identities.
doi:10.21977/d913119751 fatcat:dxbcosyxpvfibb3wqhmenq2354

Interaction of phospholipid vesicles with smooth metal-oxide surfaces

Gábor Csúcs, Jeremy J. Ramsden
1998 Biochimica et Biophysica Acta - Biomembranes  
Acknowledgements: We thank Mr Hans Krebs for having constructed the Langmuir trough, Mr Max Wohlwend for having designed and built the electronics driving its motors and the feedback circuitry, and Mr Michael ...
doi:10.1016/s0005-2736(97)00209-5 pmid:9556348 fatcat:hzjejjlbfrabbk724yeuv3x7dq
Showing results 1 — 15 out of 48 results