Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models [article]

Ning Ding, Yujia Qin, Guang Yang, Fuchao Wei, Zonghan Yang, Yusheng Su, Shengding Hu, Yulin Chen, Chi-Min Chan, Weize Chen, Jing Yi, Weilin Zhao (+8 others)
2022 arXiv   pre-print
Recent studies have demonstrated that a series of delta tuning methods with distinct tuned parameter selection could achieve performance on a par with full-parameter fine-tuning, suggesting a new promising  ...  In fact, fine-tuning all the parameters of a colossal model and retaining separate instances for different tasks are practically infeasible.  ...  Thanks to all the pioneering researchers who developed the structures, objectives, and delta tuning methods for pre-trained models. Ning Ding is supported by Baidu Scholarship.  ... 
arXiv:2203.06904v2 fatcat:yk2v44f74zbe7hfw4lw2nq7eju

Bi-tuning of Pre-trained Representations [article]

Jincheng Zhong, Ximei Wang, Zhi Kou, Jianmin Wang, Mingsheng Long
2020 arXiv   pre-print
Comprehensive experiments confirm that Bi-tuning achieves state-of-the-art results for fine-tuning tasks of both supervised and unsupervised pre-trained models by large margins (e.g. 10.7% absolute rise  ...  It is common within the deep learning community to first pre-train a deep neural network from a large-scale dataset and then fine-tune the pre-trained model to a specific downstream task.  ...  During the past years, a few fine-tuning methods have been proposed to exploit the inductive bias of pre-trained models: L2-SP drives the weight parameters of target task to the pre-trained values by  ... 
arXiv:2011.06182v1 fatcat:aebujnhitrapdfhcbhvaav5sea

Self-Tuning for Data-Efficient Deep Learning [article]

Ximei Wang, Jinghan Gao, Mingsheng Long, Jianmin Wang
2021 arXiv   pre-print
of fine-tuning a pre-trained model to the target data.  ...  To escape from this dilemma, we present Self-Tuning to enable data-efficient deep learning by unifying the exploration of labeled and unlabeled data and the transfer of a pre-trained model, as well as  ...  Plan of China.  ... 
arXiv:2102.12903v2 fatcat:65f2nikoxbc2tewqz6zrb3ot7e

Neural Networks for Delta Hedging [article]

Guijin Son, Joocheol Kim
2021 arXiv   pre-print
Lastly, we construct NNHedge, a deep learning framework that provides seamless pipelines for model development and assessment for the experiments.  ...  The Black-Scholes model, defined under the assumption of a perfect financial market, theoretically creates a flawless hedging strategy allowing the trader to evade risks in a portfolio of options.  ...  We use two training steps Pre-Training and Fine-Tuning for the SNN_Pretrained model.  ... 
arXiv:2112.10084v1 fatcat:lo2dd3rkgjdmllbcrmtghh3e6a

Fine-Tuning Data Structures for Analytical Query Processing [article]

Amir Shaikhha, Marios Kelepeshis, Mahdi Ghorbani
2021 arXiv   pre-print
This language is designed around the notion of dictionaries, and allows for a more fine-grained choice of its low-level implementation.  ...  The dictionary cost model is learned using a regression model trained over the profiling dataset of dictionary operations on a given hardware architecture.  ...  the prediction of 8 different regression models trained under various methods with operation running times.  ... 
arXiv:2112.13099v1 fatcat:nsusnsmfmjburlisefcbzp6ooy

From novel to familiar: Tuning the brain for metaphors

Eileen R. Cardillo, Christine E. Watson, Gwenda L. Schmidt, Alexander Kranjec, Anjan Chatterjee
2012 NeuroImage  
We investigated the neural career of metaphors in a functional magnetic resonance imaging study using extensively normed new metaphors and simulated the ordinary, gradual experience of metaphor conventionalization  ...  These results support theoretical accounts attributing a role for the right hemisphere in processing novel, low salience figurative meanings, but also show that conventionalization of metaphoric meaning  ...  Acknowledgments We thank Bianca Bromberger and Matt Lehet for their assistance with behavioral data collection.  ... 
doi:10.1016/j.neuroimage.2011.11.079 pmid:22155328 pmcid:PMC3288556 fatcat:a4uxkh2hmrhtfntnbqs5sosv4i

Rapid tuning shifts in human auditory cortex enhance speech intelligibility

Christopher R. Holdgraf, Wendy de Heer, Brian Pasley, Jochem Rieger, Nathan Crone, Jack J. Lin, Robert T. Knight, Frédéric E. Theunissen
2016 Nature Communications  
Spectrotemporal receptive field (STRF) mapping describes the neural response to acoustic features, and has been used to study contextual effects on auditory receptive fields in animal models.  ...  This plasticity reflects increased sensitivity to spectrotemporal features, enhancing the extraction of more speech-like features from a degraded stimulus and providing the physiological basis for the  ...  Acknowledgements We would like to thank Fionnuala Howell and Rebecca Krasnoff for running the behavioural studies, Matar Haller for brainstorming analysis and troubleshooting ideas, Elena Golumbic for  ... 
doi:10.1038/ncomms13654 pmid:27996965 pmcid:PMC5187445 fatcat:a2lnjws4vnawhi7qoz6p3otw5a

Tuning pathological brain oscillations with neurofeedback: a systems neuroscience framework

Tomas Ros, Bernard J. Baars, Ruth A. Lanius, Patrik Vuilleumier
2014 Frontiers in Human Neuroscience  
The present work attempts to bring together various concepts from neurobiology, engineering, and dynamical systems so as to propose a contemporary theoretical framework for the mechanistic effects of NFB  ...  The central thesis put forward is that NFB tunes brain oscillations toward a homeostatic set-point which affords an optimal balance between network flexibility and stability (i.e., self-organised criticality  ...  ACKNOWLEDGMENTS We are grateful to Gil Sharvit, Kallia Apazoglou and Naomi Steiner for helpful comments.  ... 
doi:10.3389/fnhum.2014.01008 pmid:25566028 pmcid:PMC4270171 fatcat:ptdqnjimzncdtd7ytkvyazzboi

Introduction to the Kalman Filter and Tuning its Statistics for Near Optimal Estimates and Cramer Rao Bound [article]

Shyam Mohan M, Naren Naik, R.M.O. Gemson, M.R. Ananthasayanam
2015 arXiv   pre-print
For a Kalman filter design to provide optimal estimates tuning of its statistics namely initial state and covariance, unknown parameters, and state and measurement noise covariances is important.  ...  Simulation studies of a constant signal, a ramp, a spring, mass, damper system with a weak non linear spring constant, longitudinal and lateral motion of an airplane was followed by similar but more involved  ...  It also gives a bound on the estimated parameters which is a check for efficiency of any estimator.  ... 
arXiv:1503.04313v1 fatcat:zviraqyufzdhnipmfl57z7musa

Generalized and Transferable Patient Language Representation for Phenotyping with Limited Data [article]

Yuqi Si, Elmer V Bernstam, Kirk Roberts
2021 arXiv   pre-print
In this work, we propose a multi-task pre-training and fine-tuning approach for learning generalized and transferable patient representations from medical language.  ...  We find multi-task pre-training increases learning efficiency and achieves consistently high performance across the majority of phenotypes.  ...  However, unlike the "pretraining and fine-tuning" approach, which adapts all model parameters, the feature extraction approach would obtain a fixed vector from layers of the language model.  ... 
arXiv:2103.00482v1 fatcat:uqvafuml2rbyxom5yvexelepnm

On Continual Model Refinement in Out-of-Distribution Data Streams [article]

Bill Yuchen Lin, Sida Wang, Xi Victoria Lin, Robin Jia, Lin Xiao, Xiang Ren, Wen-tau Yih
2022 arXiv   pre-print
For benchmarking and analysis, we propose a general sampling algorithm to obtain dynamic OOD data streams with controllable non-stationarity, as well as a suite of metrics measuring various aspects of  ...  Our experiments and detailed analysis reveal the promise and challenges of the CMR problem, supporting that studying CMR in dynamic OOD streams can benefit the longevity of deployed NLP models in production  ...  Introduction Fine-tuning large pre-trained language models (LMs) has become the de facto standard for training models of a variety of tasks in natural language processing (NLP).  ... 
arXiv:2205.02014v1 fatcat:2gpdcjmkkbbjhfkxhp6rczoene

On the Representation Collapse of Sparse Mixture of Experts [article]

Zewen Chi, Li Dong, Shaohan Huang, Damai Dai, Shuming Ma, Barun Patra, Saksham Singhal, Payal Bajaj, Xia Song, Furu Wei
2022 arXiv   pre-print
We conduct extensive experiments on cross-lingual language model pre-training and fine-tuning on downstream tasks.  ...  We also present a comprehensive analysis on the representation and routing behaviors of our models.  ...  Acknowledgement We would like to acknowledge Bo Zheng and Zhiliang Peng for the helpful discussions.  ... 
arXiv:2204.09179v2 fatcat:6przwrggjzg5rgudsbafnc2dwi

CTAL: Pre-training Cross-modal Transformer for Audio-and-Language Representations [article]

Hang Li, Yu Kang, Tianqiao Liu, Wenbiao Ding, Zitao Liu
2021 arXiv   pre-print
tasks on a large amount of audio-and-language pairs: masked language modeling and masked cross-modal acoustic modeling.  ...  Lastly, we demonstrate detailed ablation studies to prove that both our novel cross-modality fusion component and audio-language pre-training methods significantly contribute to the promising results.  ...  Overall, our final pre-training objective is to minimize the sum of the losses above. Fine-Tuning CTAL CTAL is designed to be a generic pre-training model for various audio-language tasks.  ... 
arXiv:2109.00181v1 fatcat:w46shqs27ba27nrb4wv7cjauh4

Meta Transfer Learning for Emotion Recognition [article]

Dung Nguyen, Sridha Sridharan, Duc Thanh Nguyen, Simon Denman, David Dean, Clinton Fookes
2020 arXiv   pre-print
/pre-trained models based transfer learning methods.  ...  To mitigate this challenge, transfer learning performing fine-tuning on pre-trained models has been applied.  ...  It begins with a pre-trained model on the source task and further trains it on the target task. ImageNet pre-trained models are commonly used for fine-tuning.  ... 
arXiv:2006.13211v1 fatcat:vjbvwah3zbetncwxv6uq4zu4hu

Deep Learning Bidirectional LSTM based Detection of Prolongation and Repetition in Stuttered Speech using Weighted MFCC

Sakshi Gupta, Ravi S., Rajesh K., Rajesh Verma
2020 International Journal of Advanced Computer Science and Applications  
The labeled speech samples are parameterized to Weighted MFCC feature vectors. Then extracted features are inputted to the Bidirectional-LSTM network for training and testing of the model.  ...  The effect of different hyper-parameters on classification results is examined. The test results show that the proposed method reaches the best accuracy of 96.67%, as compared to the LSTM model.  ...  It also requires a large number of parameters and data for building and training the model [31].  ... 
doi:10.14569/ijacsa.2020.0110941 fatcat:js6bgnpqirhd7m7s7hev23mqti