59 hits

An Empirical Exploration of Skip Connections for Sequential Tagging [article]

Huijia Wu, Jiajun Zhang, Chengqing Zong
2016 arXiv   pre-print
Based on these novel skip connections, we successfully train deep stacked bidirectional LSTM models and obtain state-of-the-art results on CCG supertagging and comparable results on POS tagging.  ...  In this paper, we empirically explore the effects of various kinds of skip connections in stacked bidirectional LSTMs for sequential tagging.  ...  The core idea of Long Short-Term Memory networks is to replace (1) with the following equation: $c_t = f(x_t, h_{t-1}) + c_{t-1}$ (4), where $c_t$ is the internal state of the memory cell, which is designed  ...
arXiv:1610.03167v1 fatcat:mx4vxvvtonh23exgz2ear5m5ki
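
The additive cell update quoted in the snippet is the core reason LSTMs (and the skip connections this paper studies) ease gradient flow: the identity path through the cell state lets gradients pass across time steps. Below is a minimal PyTorch sketch of Eq. (4); the class name, the choice of a single tanh layer for f, and the omitted output gate are illustrative assumptions, not the paper's implementation.

import torch
import torch.nn as nn

class AdditiveCell(nn.Module):
    """Sketch of the update c_t = f(x_t, h_{t-1}) + c_{t-1}: a candidate
    is computed from the input and previous hidden state, then added to
    the previous cell state, so gradients flow through the identity path."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.f = nn.Linear(input_size + hidden_size, hidden_size)

    def forward(self, x_t, h_prev, c_prev):
        # f(x_t, h_{t-1}): any differentiable function; one tanh layer here.
        cand = torch.tanh(self.f(torch.cat([x_t, h_prev], dim=-1)))
        c_t = cand + c_prev    # additive (skip) update from Eq. (4)
        h_t = torch.tanh(c_t)  # expose the state (output gate omitted)
        return h_t, c_t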

Supertagging With LSTMs

Ashish Vaswani, Yonatan Bisk, Kenji Sagae, Ryan Musa
2016 Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies  
necessary for the long-range syntactic information encoded in supertags.  ...  In this paper we present new state-of-the-art performance on CCG supertagging and parsing. Our model outperforms existing approaches by an absolute gain of 1.5%.  ...  In this paper, we show that bidirectional Long Short-Term Memory recurrent neural networks (bi-LSTMs) (Graves, 2013; Zaremba et al., 2014), which can use information from the entire sentence, are a natural  ...
doi:10.18653/v1/n16-1027 dblp:conf/naacl/VaswaniBSM16 fatcat:egctav6bnnhhnalchvzju2akea
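
For readers who want the shape of such a model, here is a minimal bi-LSTM supertagger sketch in PyTorch. The embedding size, hidden size, and two-layer depth are illustrative assumptions, not the configuration reported in the paper.

import torch.nn as nn

class BiLSTMSupertagger(nn.Module):
    """Minimal bi-LSTM tagger: each word's supertag distribution is
    predicted from forward and backward context over the whole sentence."""
    def __init__(self, vocab_size, n_supertags, emb=128, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.bilstm = nn.LSTM(emb, hidden, num_layers=2,
                              bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, n_supertags)

    def forward(self, token_ids):  # (batch, seq_len)
        h, _ = self.bilstm(self.embed(token_ids))
        return self.out(h)         # (batch, seq_len, n_supertags)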

A Dynamic Window Neural Network for CCG Supertagging [article]

Huijia Wu, Jiajun Zhang, Chengqing Zong
2016 arXiv   pre-print
We use this approach to demonstrate state-of-the-art CCG supertagging performance on the standard test set.  ...  Combinatory Categorial Grammar (CCG) supertagging is the task of assigning a lexical category to each word in a sentence. Almost all previous methods use fixed context window sizes as input features.  ...  Conclusion We presented a dynamic window approach for CCG supertagging. Our model uses logistic gates to filter the context window surrounding the center word.  ...
arXiv:1610.02749v1 fatcat:jbkyzog4yrapnes7akrfu7drna

A Dynamic Window Neural Network for CCG Supertagging

Huijia Wu, Jiajun Zhang, Chengqing Zong
2017 Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence and the Twenty-Eighth Innovative Applications of Artificial Intelligence Conference  
We use this approach to demonstrate state-of-the-art CCG supertagging performance on the standard test set.  ...  Combinatory Categorial Grammar (CCG) supertagging is the task of assigning a lexical category to each word in a sentence. Almost all previous methods use fixed context window sizes to encode input tokens.  ...  Then we observe that the gating mechanism of long short-term memory (LSTM) blocks, especially the input gate, can determine when input enters the block.  ...
doi:10.1609/aaai.v31i1.10992 fatcat:twpy7hsapbgdbj6wqyokociavm
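
The "logistic gates to filter the context window" idea can be sketched as a sigmoid gate on each neighbor embedding, conditioned on the center word, so the effective window size adapts per word. Everything below (gate inputs, pooling, names) is my assumption about one plausible reading, not the paper's exact model.

import torch
import torch.nn as nn

class GatedWindow(nn.Module):
    """Scale each neighbor embedding in a fixed maximal window by a
    logistic gate conditioned on the center word, approximating a
    per-word dynamic context window."""
    def __init__(self, emb_dim):
        super().__init__()
        self.gate = nn.Linear(2 * emb_dim, 1)

    def forward(self, window_embs, center_emb):
        # window_embs: (batch, window, emb_dim); center_emb: (batch, emb_dim)
        center = center_emb.unsqueeze(1).expand_as(window_embs)
        gates = torch.sigmoid(self.gate(torch.cat([window_embs, center], dim=-1)))
        return (gates * window_embs).flatten(1)  # gated features, concatenated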

Deconstructing Supertagging into Multi-Task Sequence Prediction

Zhenqi Zhu, Anoop Sarkar
2019 Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)  
Our experimental results show that our multi-task approach significantly improves TAG supertagging, with a new state-of-the-art accuracy score of 91.39% on the Penn Treebank supertagging dataset.  ...  Supertagging is a sequence prediction task where each word is assigned a complex syntactic structure called a supertag.  ...  In Chapter 3, we review the basics of neural networks and further explain RNNs, long short-term memory (LSTM) networks, character embeddings, GloVe word embeddings, and highway connections as they are used  ...
doi:10.18653/v1/k19-1002 dblp:conf/conll/ZhuS19 fatcat:lpnyd63yp5bgjhnyw4hoe6ia4q

Deep multi-task learning with low level tasks supervised at lower layers

Anders Søgaard, Yoav Goldberg
2016 Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)  
We present experiments in syntactic chunking and CCG supertagging, coupled with the additional task of POS tagging.  ...  We present a multi-task learning architecture with deep bidirectional RNNs, where supervision for different tasks can happen at different layers.  ...  Deep bi-RNNs We use a specific flavor of recurrent neural networks (RNNs) (Elman, 1990) called long short-term memory networks (LSTMs) (Hochreiter and Schmidhuber, 1997).  ...
doi:10.18653/v1/p16-2038 dblp:conf/acl/SogaardG16 fatcat:4xzuehnq3fbx5obww65x3c3fpm
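
The architecture is easy to picture in code: a low-level task (POS tagging) is supervised from a lower bi-LSTM layer, and a higher-level task (CCG supertagging) from the layer above it. A sketch in PyTorch, with layer count and sizes as assumptions rather than the paper's settings:

import torch.nn as nn

class CascadedMTL(nn.Module):
    """'Low-level tasks supervised at lower layers': POS tags are
    predicted from the first bi-LSTM layer's output, CCG supertags
    from the second layer's output."""
    def __init__(self, emb_dim, hidden, n_pos, n_ccg):
        super().__init__()
        self.lower = nn.LSTM(emb_dim, hidden, bidirectional=True,
                             batch_first=True)
        self.upper = nn.LSTM(2 * hidden, hidden, bidirectional=True,
                             batch_first=True)
        self.pos_head = nn.Linear(2 * hidden, n_pos)
        self.ccg_head = nn.Linear(2 * hidden, n_ccg)

    def forward(self, embedded):  # (batch, seq_len, emb_dim)
        h1, _ = self.lower(embedded)
        h2, _ = self.upper(h1)
        return self.pos_head(h1), self.ccg_head(h2)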

Shortcut Sequence Tagging [article]

Huijia Wu, Jiajun Zhang, Chengqing Zong
2017 arXiv   pre-print
Adding shortcut connections across different layers is a common way to ease the training of stacked networks. However, extra shortcuts make the recurrent step more complicated.  ...  Based on this architecture, we obtain a 6% relative improvement over the state of the art on the CCGbank supertagging dataset. We also get comparable results on the POS tagging task.  ...  The horizontal hierarchy of LSTMs with bidirectional processing can remember long-range dependencies without affecting short-term storage.  ...
arXiv:1701.00576v1 fatcat:qwljvbwpc5gkxcohvafcjvlnta

Hierarchically-Refined Label Attention Network for Sequence Labeling

Leyang Cui, Yue Zhang
2019 Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)  
For better representing label sequences, we investigate a hierarchically-refined label attention network, which explicitly leverages label embeddings and captures potential long-term label dependencies by  ...  Results on POS tagging, NER, and CCG supertagging show that the proposed model not only improves overall tagging accuracy with a similar number of parameters, but also significantly speeds up the training  ...  In contrast, BiLSTM-LAN can capture potential long-term dependencies and better determine the supertags based on global label information.  ...
doi:10.18653/v1/d19-1422 dblp:conf/emnlp/CuiZ19 fatcat:ak44s6mdgvbb5g5rsoyra2aohi
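
One hierarchically-refined label-attention step can be sketched as word states attending over a learned label-embedding table, with the attended label summary folded back into the word representation. The single-head, unscaled form below is a simplification of the paper's multi-head design.

import torch
import torch.nn as nn

class LabelAttentionLayer(nn.Module):
    """Word states attend over label embeddings; the expected label
    embedding is concatenated back onto each word state (so the
    output dimension doubles)."""
    def __init__(self, hidden, n_labels):
        super().__init__()
        self.label_emb = nn.Parameter(torch.randn(n_labels, hidden))

    def forward(self, word_states):  # (batch, seq_len, hidden)
        scores = word_states @ self.label_emb.T   # attention logits
        attn = torch.softmax(scores, dim=-1)      # per-word label distribution
        label_summary = attn @ self.label_emb     # expected label embedding
        return torch.cat([word_states, label_summary], dim=-1)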

Keystroke dynamics as signal for shallow syntactic parsing [article]

Barbara Plank
2016 arXiv   pre-print
To test this hypothesis, we explore labels derived from keystroke logs as an auxiliary task in a multi-task bidirectional Long Short-Term Memory (bi-LSTM) network.  ...  We obtain promising results on two shallow syntactic parsing tasks: chunking and CCG supertagging.  ...  Bidirectional Long Short-Term Memory Models Our model is a hierarchical bi-LSTM, as illustrated in Figure 5.  ...
arXiv:1610.03321v1 fatcat:54fntg4zcfgprl32tttxnemzpu

LSTM Shift-Reduce CCG Parsing

Wenduan Xu
2016 Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing  
We describe a neural shift-reduce parsing model for CCG, factored into four unidirectional LSTMs and one bidirectional LSTM.  ...  This factorization allows the linearization of the complete parsing history, and results in a highly accurate greedy parser that outperforms all previous beam-search shift-reduce parsers for CCG.  ...  In this paper, we present a neural architecture for shift-reduce CCG parsing based on long short-term memories (LSTMs; Hochreiter and Schmidhuber, 1997).  ...
doi:10.18653/v1/d16-1181 dblp:conf/emnlp/Xu16 fatcat:raympzlaf5fdhltqx7jaeh7mwy
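
A factored parser state of this kind is typically realized as separate LSTMs over the stack, buffer, and action history, whose final states are concatenated to score the next shift-reduce action. The three-factor sketch below is a loose illustration of that pattern, not the paper's exact four-unidirectional-plus-one-bidirectional factorization.

import torch
import torch.nn as nn

class FactoredParserState(nn.Module):
    """Score shift-reduce actions from separate LSTM encodings of the
    stack, the buffer, and the action history."""
    def __init__(self, dim, n_actions):
        super().__init__()
        self.stack_lstm = nn.LSTM(dim, dim, batch_first=True)
        self.buffer_lstm = nn.LSTM(dim, dim, batch_first=True)
        self.action_lstm = nn.LSTM(dim, dim, batch_first=True)
        self.score = nn.Linear(3 * dim, n_actions)

    def forward(self, stack, buffer, actions):   # each: (batch, len, dim)
        s = self.stack_lstm(stack)[0][:, -1]      # last state of each factor
        b = self.buffer_lstm(buffer)[0][:, -1]
        a = self.action_lstm(actions)[0][:, -1]
        return self.score(torch.cat([s, b, a], dim=-1))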

German and French Neural Supertagging Experiments for LTAG Parsing

Tatiana Bladier, Andreas van Cranenburgh, Younes Samih, Laura Kallmeyer
2018 Proceedings of ACL 2018, Student Research Workshop  
We present ongoing work on data-driven parsing of German and French with Lexicalized Tree Adjoining Grammars. We use a supertagging approach combined with deep learning.  ...  of n-best supertagging for French and German.  ...  For the recurrent layer we use either bidirectional Long Short Term Memory (LSTM) or Gated Recurrent Units (GRU).  ... 
doi:10.18653/v1/p18-3009 dblp:conf/acl/BladierCSK18 fatcat:fjhbm4pv2bdixmckta3t55yzzy

End-to-End Graph-Based TAG Parsing with Neural Networks

Jungo Kasai, Robert Frank, Pauli Xu, William Merrill, Owen Rambow
2018 Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)  
Our best end-to-end parser, which jointly performs supertagging, POS tagging, and parsing, outperforms the previously reported best results by more than 2.2 LAS and UAS points.  ...  Deep Highway BiLSTM The core of the supertagging model is a deep bidirectional Long Short-Term Memory network (Graves and Schmidhuber, 2005).  ...  Supertagging Model Recent work has explored neural network models for supertagging in TAG and CCG (Xu et al., 2015; Lewis et al., 2016; Vaswani et al., 2016; Xu, 2016), and has shown that such models  ...
doi:10.18653/v1/n18-1107 dblp:conf/naacl/KasaiFXMR18 fatcat:sntxt3ijgjgene7xacibzril7i
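
A highway connection around a bi-LSTM layer mixes the layer's output with its input through a learned transform gate, which is what makes deep stacks of such layers trainable. A minimal sketch follows; the gate placement is a common variant, not necessarily the paper's exact formulation.

import torch
import torch.nn as nn

class HighwayBiLSTMLayer(nn.Module):
    """One highway-connected bi-LSTM layer: output = t * h + (1 - t) * x,
    where t is a learned transform gate. Requires an even `dim` so the
    bidirectional output matches the input size."""
    def __init__(self, dim):
        super().__init__()
        self.bilstm = nn.LSTM(dim, dim // 2, bidirectional=True,
                              batch_first=True)
        self.gate = nn.Linear(dim, dim)

    def forward(self, x):                    # (batch, seq_len, dim)
        h, _ = self.bilstm(x)
        t = torch.sigmoid(self.gate(x))      # transform gate in [0, 1]
        return t * h + (1 - t) * x           # highway mix of output and input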

End-to-end Graph-based TAG Parsing with Neural Networks [article]

Jungo Kasai and Robert Frank and Pauli Xu and William Merrill and Owen Rambow
2018 arXiv   pre-print
Our best end-to-end parser, which jointly performs supertagging, POS tagging, and parsing, outperforms the previously reported best results by more than 2.2 LAS and UAS points.  ...  Deep Highway BiLSTM The core of the supertagging model is a deep bidirectional Long Short-Term Memory network (Graves and Schmidhuber, 2005).  ...  Supertagging Model Recent work has explored neural network models for supertagging in TAG and CCG (Xu et al., 2015; Lewis et al., 2016; Vaswani et al., 2016; Xu, 2016), and has shown that such models  ...
arXiv:1804.06610v3 fatcat:eoy6lpducbcxhksg555qmqwlte

Valency-Augmented Dependency Parsing

Tianze Shi, Lillian Lee
2018 Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing  
Feature Extraction We adopt bidirectional long short-term memory networks (bi-LSTMs; Hochreiter and Schmidhuber, 1997) as our feature extractors, since they have proven successful in a variety of syntactic  ...
doi:10.18653/v1/d18-1159 dblp:conf/emnlp/ShiL18 fatcat:p33x22qewzdqrkryzwlarlyebm

Shift-Reduce Constituent Parsing with Neural Lookahead Features

Jiangming Liu, Yue Zhang
2017 Transactions of the Association for Computational Linguistics  
Compared with chart-based models, they leverage richer features by extracting history information from a parser stack, which consists of a sequence of non-local constituents.  ...  In particular, we build a bidirectional LSTM model, which leverages full sentence information to predict the hierarchy of constituents that each word starts and ends.  ...  Long Short-Term Memory (LSTM) mitigates the vanishing gradient problem in RNN training by introducing gates (input $i$, forget $f$, and output $o$) and a cell memory vector $c$.  ...
doi:10.1162/tacl_a_00045 fatcat:xg5gceyifrflxh7kostxall2va
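
The gates named in the snippet can be written out directly; torch.nn.LSTMCell implements these same equations, so the sketch below is purely expository.

import torch
import torch.nn as nn

class LSTMCellSketch(nn.Module):
    """The standard gated update: input gate i, forget gate f, output
    gate o, candidate g, and cell memory c."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.linear = nn.Linear(input_size + hidden_size, 4 * hidden_size)

    def forward(self, x_t, h_prev, c_prev):
        z = self.linear(torch.cat([x_t, h_prev], dim=-1))
        i, f, o, g = z.chunk(4, dim=-1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        c_t = f * c_prev + i * torch.tanh(g)  # gated cell memory
        h_t = o * torch.tanh(c_t)             # gated output
        return h_t, c_t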
Showing results 1–15 of 59.