
Multi-timescale Representation Learning in LSTM Language Models [article]

Shivangi Mahto, Vy A. Vo, Javier S. Turek, Alexander G. Huth
2021 arXiv   pre-print
Experiments then showed that LSTM language models trained on natural English text learn to approximate this theoretical distribution.  ...  In this work, we derived a theory for how the memory gating mechanism in long short-term memory (LSTM) language models can capture power law decay.  ...  MULTI-TIMESCALE LANGUAGE MODELS TIMESCALE OF INFORMATION We are interested in understanding how LSTM language models capture dependencies across time.  ... 
arXiv:2009.12727v2 fatcat:zwl5z77mf5gefogimp6yve7tkq

Interpretable multi-timescale models for predicting fMRI responses to continuous natural speech [article]

Shailee Jain, Vy A Vo, Shivangi Mahto, Amanda LeBel, Javier S Turek, Alexander G Huth
2020 bioRxiv   pre-print
In this work we construct interpretable multi-timescale representations by forcing individual units in an LSTM LM to integrate information over specific temporal scales.  ...  language models (LMs).  ...  Huth also holds a position at Caseforge, Inc., whose products were used in the fMRI experiment.  ... 
doi:10.1101/2020.10.02.324392 fatcat:xqlm26rimnfbrak2jsz5sa3ydy

SyntaxNet Models for the CoNLL 2017 Shared Task [article]

Chris Alberti, Daniel Andor, Ivan Bogatyy, Michael Collins, Dan Gillick, Lingpeng Kong, Terry Koo, Ji Ma, Mark Omernick, Slav Petrov, Chayut Thanapirom, Zora Tung (+1 others)
2017 arXiv   pre-print
This system, which we call "ParseySaurus," uses the DRAGNN framework [Kong et al, 2017] to combine transition-based recurrent parsing and tagging with character-based word representations.  ...  On the v1.3 Universal Dependencies Treebanks, the new system outperforms the publicly available, state-of-the-art "Parsey's Cousins" models by 3.47% absolute Labeled Accuracy Score (LAS) across 52 treebanks  ...  Instead of modeling each word explicitly, they allow the model to learn a hierarchical "multi-timescale" representation of the input, where each layer corresponds to a (learned) larger timescale.  ... 
arXiv:1703.04929v1 fatcat:xkiwgfhp6bhb3ha6h6gwvi55ca

Multi-Timescale Long Short-Term Memory Neural Network for Modelling Sentences and Documents

Pengfei Liu, Xipeng Qiu, Xinchi Chen, Shiyu Wu, Xuanjing Huang
2015 Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing  
In this paper, we propose a multi-timescale long short-term memory (MT-LSTM) neural network to model long texts. MT-LSTM partitions the hidden states of the standard LSTM into several groups.  ...  Thus, MT-LSTM can model very long documents as well as short sentences. Experiments on four benchmark datasets show that our model outperforms the other neural models in text classification task.  ...  In this paper, we propose a multi-timescale long short-term memory (MT-LSTM) to capture the valuable information with different timescales.  ... 
doi:10.18653/v1/d15-1280 dblp:conf/emnlp/LiuQCWH15 fatcat:dggi5afy4feqtaw2mzaazioygu

Mapping the Timescale Organization of Neural Language Models [article]

Hsiang-Yun Sherry Chien, Jinhan Zhang, Christopher J. Honey
2021 arXiv   pre-print
Therefore, we applied tools developed in neuroscience to map the "processing timescales" of individual units within a word-level LSTM language model.  ...  neural language models.  ...  How do humans and neural language models encode such multi-scale context information?  ... 
arXiv:2012.06717v2 fatcat:hlzybkpmnbcylbjw3sajzupgpm

Multi-scale discrepancy adversarial network for cross-corpus speech emotion recognition

Wanlu Zheng, Wenming Zheng, Yuan Zong
2021 Virtual Reality & Intelligent Hardware  
In each timescale, the domain discriminator and the feature extractor compete against each other to learn features that minimize the discrepancy between the two domains by fooling the discriminator.  ...  Methods: This paper introduces a novel multi-scale discrepancy adversarial (MSDA) network for conducting multiple timescales domain adaptation for cross-corpus SER, i.e., integrating domain discriminators  ...  By projecting data onto the learned transfer component, an out-of-sample generalization representation can be learned in the subspace.  ... 
doi:10.1016/j.vrih.2020.11.006 fatcat:tnzdsoivvfbmlavj7i6rjqtzyq

A Single Long Short-Term Memory Network can Predict Rainfall-Runoff at Multiple Timescales

Mani Manavalan, Naresh Babu Bynagari
2015 Zenodo  
In this research, we propose a pair of Multi-Timescale LSTM (MTS-LSTM) frameworks that jointly forecast multiple timescales inside a single model.  ...  Compared with naive forecasts that use a distinct LSTM for each timescale, the multi-timescale designs are computationally more efficient, with no loss of correctness.  ...  This study represents one step toward the development of operational hydrologic approaches derived from the deep learning model.  ... 
doi:10.5281/zenodo.5622588 fatcat:kr3yrow3wbbjfgepzehimgu52y

3G structure for image caption generation

Aihong Yuan, Xuelong Li, Xiaoqiang Lu
2019 Neurocomputing  
In this paper, we propose a 3-gated model which fuses the global and local image features together for the task of image caption generation.  ...  With the latter two gates, the relationship between image and text can be well explored, which improves the performance of the language part as well as the multi-modal embedding part.  ...  Acknowledgement This work was supported in part by the National Natural Science Foun- References  ... 
doi:10.1016/j.neucom.2018.10.059 fatcat:alb6cwbg65ayfpuekk7fwhzize

Continuous Learning in a Hierarchical Multiscale Neural Network

Thomas Wolf, Julien Chaumond, Clement Delangue
2018 Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)  
We reformulate the problem of encoding a multi-scale representation of a sequence in a language model by casting it in a continuous learning framework.  ...  We propose a hierarchical multi-scale language model in which short time-scale dependencies are encoded in the hidden state of a lower-level recurrent neural network while longer time-scale dependencies  ...  As a consequence, we would like our model to encode information in a multi-scale hierarchical representation where 1. short time-scale dependencies can be encoded in fast-updated neural activations (hidden  ... 
doi:10.18653/v1/p18-2001 dblp:conf/acl/WolfCD18 fatcat:adke423kuneujkwuughjtp7suy

Deep Learning Based Text Classification: A Comprehensive Review [article]

Shervin Minaee, Nal Kalchbrenner, Erik Cambria, Narjes Nikzad, Meysam Chenaghlu, Jianfeng Gao
2021 arXiv   pre-print
Deep learning based models have surpassed classical machine learning based approaches in various text classification tasks, including sentiment analysis, news categorization, question answering, and natural  ...  In this paper, we provide a comprehensive review of more than 150 deep learning based models for text classification developed in recent years, and discuss their technical contributions, similarities,  ...  The Multi-Timescale LSTM (MT-LSTM) neural network [18] is also designed to model long texts, such as sentences and documents, by capturing valuable information with different timescales.  ... 
arXiv:2004.03705v3 fatcat:al5hstylsbhfpldvokuvlpomam

Chat Discrimination for Intelligent Conversational Agents with a Hybrid CNN-LMTGRU Network

Dennis Singh Moirangthem, Minho Lee
2018 Proceedings of The Third Workshop on Representation Learning for NLP  
In order to address this issue and to realize such smart hybrid dialogue systems, we develop a model to discriminate user utterance between task-oriented and chit-chat conversations.  ...  We introduce a hybrid of convolutional neural network (CNN) and a lateral multiple timescale gated recurrent units (LMTGRU) that can represent multiple temporal scale dependencies for the discrimination  ...  Deep learning based models have achieved great success in many NLP tasks, including learning distributed word, sentence and document representation (Mikolov et al., 2013; Le and Mikolov, 2014), parsing  ... 
doi:10.18653/v1/w18-3004 dblp:conf/rep4nlp/MoirangthemL18 fatcat:pj4qiefowzgvhoh4dgvno7ep4q

Action-Agnostic Human Pose Forecasting [article]

Hsu-kuang Chiu, Ehsan Adeli, Borui Wang, De-An Huang, Juan Carlos Niebles
2018 arXiv   pre-print
To this end, we propose a new recurrent neural network for modeling the hierarchical and multi-scale characteristics of the human dynamics, denoted by triangular-prism RNN (TP-RNN).  ...  Our model captures the latent hierarchical structure embedded in temporal human pose sequences by encoding the temporal dependencies with different time-scales.  ...  This architecture is able to learn the latent representation of natural language sequences in different hierarchies (e.g., words, phrases, and sentences) to build character-level language models for predicting  ... 
arXiv:1810.09676v1 fatcat:pms3wo6iyvbsrh2vkcdqjfdgza

Crossmodal Language Grounding in an Embodied Neurocognitive Model [article]

Stefan Heinrich, Yuan Yao, Tobias Hinz, Zhiyuan Liu, Thomas Hummel, Matthias Kerzel, Cornelius Weber, Stefan Wermter
2020 arXiv   pre-print
In this paper, we present a neurocognitive model for language grounding which reflects bio-inspired mechanisms such as an implicit adaptation of timescales as well as end-to-end multimodal abstraction.  ...  The model analysis shows that crossmodally integrated representations are sufficient for acquiring language merely from sensory input through interaction with objects in an environment.  ...  Zhiyuan Liu, Cornelius Weber, and Stefan Wermter helped in writing and revising the paper.  ... 
arXiv:2006.13546v1 fatcat:ok7lhtpparg3ni33bxagjuoyae

Learning Molecular Dynamics with Simple Language Model built upon Long Short-Term Memory Neural Network [article]

Sun-Ting Tsai, En-Jui Kuo, Pratyush Tiwary
2020 arXiv   pre-print
Specifically, we use a character-level language model based on LSTM.  ...  We show that the model can not only capture the Boltzmann statistics of the system but also reproduce kinetics at a large spectrum of timescales.  ...  We also thank Deepthought2, MARCC and XSEDE (projects CHE180007P and CHE180027P) for computational resources used in this work.  ... 
arXiv:2004.12360v2 fatcat:bnhcbqbennaxdfwfx7rglep7le

Temporal Pyramid Recurrent Neural Network

Qianli Ma, Zhenxi Lin, Enhuan Chen, Garrison Cottrell
Learning long-term and multi-scale dependencies in sequential data is a challenging task for recurrent neural networks (RNNs).  ...  In this way, TP-RNN can explicitly learn multi-scale dependencies with multi-scale input sequences of different layers, and shorten the input sequence and gradient feedback paths of each layer.  ...  The work described in this paper was partially funded by the National Natural Science Foundation of China (Grant Nos. 61502174, 61872148), the Natural Science Foundation of Guangdong Province (Grant Nos  ... 
doi:10.1609/aaai.v34i04.5947 fatcat:emjvnmlq2fg5ffxkj2pkkkzxba