71 Hits in 3.2 sec

LSTM: A Search Space Odyssey

Klaus Greff, Rupesh K. Srivastava, Jan Koutník, Bas R. Steunebrink, Jürgen Schmidhuber
2017 IEEE Transactions on Neural Networks and Learning Systems  
This has led to a renewed interest in understanding the role and utility of various computational components of typical LSTM variants.  ...  The hyperparameters of all LSTM variants for each task were optimized separately using random search, and their importance was assessed using the powerful fANOVA framework.  ...  Traditionally this would require a full hyperparameter grid search, whereas here the hyperparameter space can be sampled at random.  ... 
doi:10.1109/tnnls.2016.2582924 pmid:27411231 fatcat:yyodhxbsfjdllgmkam2l7nehlq
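
The Greff et al. snippet above contrasts random sampling of the hyperparameter space with a full grid search. A minimal sketch of that idea follows, assuming a hypothetical search space and a placeholder train_and_evaluate function (neither is taken from the paper):

```python
import math
import random

# Hypothetical LSTM hyperparameter ranges (illustrative only, not the paper's search space).
SPACE = {
    "learning_rate": (1e-4, 1e-2),   # sampled log-uniformly
    "hidden_size":   (32, 512),      # sampled uniformly as an integer
    "dropout":       (0.0, 0.5),     # sampled uniformly
}

def sample_config():
    """Draw one random configuration instead of enumerating a full grid."""
    lo, hi = SPACE["learning_rate"]
    return {
        "learning_rate": 10 ** random.uniform(math.log10(lo), math.log10(hi)),
        "hidden_size": random.randint(*SPACE["hidden_size"]),
        "dropout": random.uniform(*SPACE["dropout"]),
    }

def random_search(train_and_evaluate, n_trials=50):
    """Evaluate n_trials independently sampled configs and keep the best one."""
    best_score, best_cfg = float("-inf"), None
    for _ in range(n_trials):
        cfg = sample_config()
        score = train_and_evaluate(cfg)  # placeholder: train an LSTM variant, return a validation score
        if score > best_score:
            best_score, best_cfg = score, cfg
    return best_cfg, best_score
```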

Deep Recurrent Neural Network for Protein Function Prediction from Sequence [article]

Xueliang Liu
2017 arXiv   pre-print
For proteins, accurate prediction of their functions directly from their primary amino-acid sequences has been a long standing challenge.  ...  The RNN models containing long-short-term-memory (LSTM) units trained on public, annotated datasets from UniProt achieved high performance for in-class prediction of four important protein functions tested  ...  The author would like to acknowledge the Harvard Odyssey Computing Cluster for providing the computational resources for this work.  ... 
arXiv:1701.08318v1 fatcat:5thkvgptlbga7ogqtfoxhotc2y

Composing Text and Image for Image Retrieval - An Empirical Odyssey [article]

Nam Vo, Lu Jiang, Chen Sun, Kevin Murphy, Li-Jia Li, Li Fei-Fei, James Hays
2018 arXiv   pre-print
To tackle this task, we learn a similarity metric between a target image and a source image plus source text, an embedding and composing function such that target image feature is close to the source image  ...  We propose a new way to combine image and text using such function that is designed for the retrieval task.  ...  There are several ways of formulating the concept as a search query, such as a text string, a similar image, or even a sketch, or some combination of the above.  ... 
arXiv:1812.07119v1 fatcat:3lewqrypfnc6fhcx66basrkpjq
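
The Vo et al. snippet above describes learning a composing function that maps a source-image feature plus a text feature close to the target-image feature. A generic PyTorch sketch of that retrieval setup follows; the concatenation-based composer and the in-batch softmax loss are illustrative assumptions, not the paper's proposed method:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConcatComposer(nn.Module):
    """Illustrative composing function: concatenate the source-image and text
    features and project back into the image embedding space."""
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, img_feat: torch.Tensor, txt_feat: torch.Tensor) -> torch.Tensor:
        return self.proj(torch.cat([img_feat, txt_feat], dim=-1))

def retrieval_loss(composed: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Pull each composed (source image + text) feature toward its own target-image
    feature and away from the other targets in the batch."""
    logits = F.normalize(composed, dim=-1) @ F.normalize(target, dim=-1).t()
    labels = torch.arange(composed.size(0), device=composed.device)
    return F.cross_entropy(logits, labels)
```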

Deep Recurrent Neural Network for Protein Function Prediction from Sequence [article]

Xueliang (Leon) Liu
2017 bioRxiv   pre-print
For proteins, accurate prediction of their functions directly from their primary amino-acid sequences has been a long standing challenge.  ...  The RNN models containing long-short-term-memory (LSTM) units trained on public, annotated datasets from UniProt achieved high performance for in-class prediction of four important protein functions tested  ...  The author would like to acknowledge the Harvard Odyssey Computing Cluster for providing the computational resources for this work.  ... 
doi:10.1101/103994 fatcat:6cumbwco5zbspcp3rxye7wsajy

Long Short-Term Memory Neural Network for Financial Time Series [article]

Carmina Fjellström
2022 arXiv   pre-print
With a straightforward trading strategy, comparisons with a randomly chosen portfolio and a portfolio containing all the stocks in the index show that the portfolio resulting from the LSTM ensemble provides  ...  A binary classification problem based on the median of returns is used, and the ensemble's forecast depends on a threshold value, which is the minimum number of LSTMs required to agree upon the result.  ...  Lstm: A search space odyssey. IEEE Transactions on Neural Networks and Learning Systems, 28(10):2222–2232, 2017. [17] J. B. Heaton, N. G. Polson, and J. H. Witte. Deep learning in finance.  ... 
arXiv:2201.08218v1 fatcat:r6vfrzv5jjc3fksecnzblv5fti
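
The Fjellström snippet above says the ensemble's forecast depends on a threshold: the minimum number of LSTMs that must agree on the result of the binary (above/below median return) classification. A minimal sketch of that voting rule follows; the example predictions and threshold are placeholders, not values from the paper:

```python
from typing import Sequence

def ensemble_forecast(votes: Sequence[int], threshold: int) -> int:
    """Predict class 1 only if at least `threshold` of the individual LSTM
    classifiers voted 1; otherwise predict class 0."""
    return 1 if sum(votes) >= threshold else 0

# Hypothetical example: 7 LSTMs, at least 5 must agree for a positive call.
votes = [1, 1, 0, 1, 1, 0, 1]
print(ensemble_forecast(votes, threshold=5))  # -> 1
```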

Language Modeling with Deep Transformers

Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney
2019 Interspeech 2019  
We show that well configured Transformer models outperform our baseline models based on the shallow stack of LSTM recurrent neural network layers.  ...  However, in autoregressive setup, as is the case for language modeling, the amount of information increases along the position dimension, which is a positional signal by its own.  ...  Hyper-parameters in Transformers The Transformer architecture is a new search space Odyssey [32].  ... 
doi:10.21437/interspeech.2019-2225 dblp:conf/interspeech/IrieZSN19 fatcat:n2e7sisi6bealgl6ro7nm7m3li

Language Modeling with Deep Transformers [article]

Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney
2019 arXiv   pre-print
We show that well configured Transformer models outperform our baseline models based on the shallow stack of LSTM recurrent neural network layers.  ...  However, in autoregressive setup, as is the case for language modeling, the amount of information increases along the position dimension, which is a positional signal by its own.  ...  This work has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 694537, project "SEQCLAS") and from a  ... 
arXiv:1905.04226v1 fatcat:56bo4zykqfellm46eyxvmhi33u

Quantity doesn't buy quality syntax with neural language models [article]

Marten van Schijndel, Aaron Mueller, Tal Linzen
2019 arXiv   pre-print
A comparison to GPT and BERT, Transformer-based models trained on billions of words, reveals that these models perform even more poorly than our LSTMs in some constructions.  ...  We find that gains from increasing network size are minimal beyond a certain point.  ...  LSTM: A search space odyssey. IEEE Transactions on Neural Networks and Learning Systems, 28(10):2222-2232.  ... 
arXiv:1909.00111v1 fatcat:sge565sgajhzpbcmm5r74xz4ly

HeterPS: Distributed Deep Learning With Reinforcement Learning Based Scheduling in Heterogeneous Environments [article]

Ji Liu, Zhihua Wu, Dianhai Yu, Yanjun Ma, Danlei Feng, Minxu Zhang, Xinxuan Wu, Xuefeng Yao, Dejing Dou
2022 arXiv   pre-print
To efficiently train a DNN model using the heterogeneous computing resources, we propose a distributed framework, i.e., Paddle-Heterogeneous Parameter Server (Paddle-HeterPS), composed of a distributed  ...  architecture and a Reinforcement Learning (RL)-based scheduling method.  ...  Lstm: A search space odyssey. IEEE  ...  Li, M., Andersen, D. G., Park, J. W., Smola, A.  ... 
arXiv:2111.10635v2 fatcat:bhfnem6gqzetnffma4sihfqbsa

Large Scale Subject Category Classification of Scholarly Papers With Deep Attentive Neural Networks

Bharath Kandimalla, Shaurya Rohatgi, Jian Wu, C. Lee Giles
2021 Frontiers in Research Metrics and Analytics  
Subject category classification is a prerequisite for bibliometric studies, organizing scientific publications for domain knowledge extraction, and facilitating faceted searches for digital library search  ...  We also determine the subject category distribution in CiteSeerX by classifying a random sample of one million academic papers.  ...  LSTM: a search space odyssey. IEEE Trans. Neural Networks Learn. Syst.  ... 
doi:10.3389/frma.2020.600382 pmid:33870061 pmcid:PMC8025978 fatcat:7j5axyejf5fyrekzhcfy7wduhi

Recurrent Neural Networks for anomaly detection in the Post-Mortem time series of LHC superconducting magnets [article]

Maciej Wielgosz, Andrzej Skoczeń, Matej Mertik
2017 arXiv   pre-print
This paper presents a model based on Deep Learning algorithms of LSTM and GRU for facilitating anomaly detection in Large Hadron Collider superconducting magnets.  ...  We used high resolution data available in the Post Mortem database to train a set of models and chose the best possible set of their hyper-parameters.  ...  Schmidhuber, LSTM: A Search Space Odyssey (2015). arXiv:1503.04069. [16] S. Hochreiter, J.  ... 
arXiv:1702.00833v1 fatcat:m5mecqmh7rgaxjryuhcc3hlv4u

Simple Recurrent Units for Highly Parallelizable Recurrence

Tao Lei, Yu Zhang, Sida I. Wang, Hui Dai, Yoav Artzi
2018 Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing  
In this work, we propose the Simple Recurrent Unit (SRU), a light recurrent unit that balances model capacity and scalability.  ...  SRU achieves 5-9x speed-up over cuDNN-optimized LSTM on classification and question answering datasets, and delivers stronger results than LSTM and convolutional models.  ...  A special thanks to Hugh Perkins for his support on the experimental environment setup and Runqi Yang for answering questions about his code.  ... 
doi:10.18653/v1/d18-1477 dblp:conf/emnlp/LeiZWDA18 fatcat:55hgrm6vjjbejanbzk5o435bke

Underwater Target Noise Recognition and Classification Technology based on Multi-Classes Feature Fusion

Shaokang Zhang, Chao Wang, Qindong Sun
2020 Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University  
In order to solve this problem, a multi-layer LSTM underwater acoustic target noise feature extraction model is established using the long short-term memory network.  ...  The information features such as the time-domain envelope of target noise, DEMON line spectrum and Mel frequency cepstrum coefficient are extracted, and a subset of multi-class features is constructed.  ...  LSTM: A Search Space Odyssey[J]. IEEE Trans on Neural Networks and Learning Systems, 2017, 28(10): 2222-2232 [19] Alex Graves. Learning Precise Timing with LSTM Recurrent Networks[J].  ... 
doi:10.1051/jnwpu/20203820366 fatcat:7xwoci6to5hjra2itohtxrxoc4

SentimentArcs: A Novel Method for Self-Supervised Sentiment Analysis of Time Series Shows SOTA Transformers Can Struggle Finding Narrative Arcs [article]

Jon Chun
2021 arXiv   pre-print
A large ensemble of diverse models provides a synthetic ground truth for self-supervised learning. Novel metrics jointly optimize an exhaustive search across every possible corpus:model combination.  ...  This paper introduces SentimentArcs, a new self-supervised time series sentiment analysis methodology that addresses the two main limitations of traditional supervised sentiment analysis: limited labeled  ...  Literary scholars are averse to the very concept of a single 'ground truth,' and this is why SentimentArcs is designed to search the problem space of all corpus:model combinations exhaustively, to provide  ... 
arXiv:2110.09454v1 fatcat:2lmkzp3suvegjizvxyrieppiza

Improving the Gating Mechanism of Recurrent Neural Networks [article]

Albert Gu, Caglar Gulcehre, Tom Le Paine, Matt Hoffman, Razvan Pascanu
2020 arXiv   pre-print
Empirically, our simple gating mechanisms robustly improve the performance of recurrent models on a range of applications, including synthetic memorization tasks, sequential image classification, language  ...  LSTM: A search space odyssey. IEEE transactions on neural networks and learning systems, 28(10):2222-2232, 2016. Gulcehre, C., Moczulski, M., Denil, M., and Bengio, Y. Noisy activation functions.  ...  The agent navigates a 3D world using observations from a first person camera. The task has three phases. In phase 1, the agent must search for a colored cue.  ... 
arXiv:1910.09890v2 fatcat:qd4x7mjajfa75hvjrh76zxhbky
Showing results 1 — 15 out of 71 results