
Combining Textual and Speech Features in the NLI Task Using State-of-the-Art Machine Learning Techniques

Pavel Ircing, Jan Švec, Zbyněk Zajíc, Barbora Hladká, Martin Holub
2017 Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications  
Our best-performing method is based on a set of feed-forward neural networks whose hidden-layer outputs are combined using a softmax layer.  ...  Combining the input data of two different modalities led to a rather dramatic improvement in classification performance.  ...  In fact, this compact feature set carries a large amount of statistical information about the huge number of n-grams encoded in the language models, which are built from smoothed linear combinations of n-grams.  ... 
doi:10.18653/v1/w17-5021 dblp:conf/bea/IrcingSZHH17 fatcat:7jpnwcsyzvcgra76d7er77s72a
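
As a rough illustration of the architecture described in the snippet above (several feed-forward networks whose hidden-layer outputs feed a shared softmax layer), here is a minimal numpy sketch. The two input modalities, all layer sizes, and the random weights are invented for illustration and are not taken from the paper; only the forward pass is shown.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ffnn_hidden(x, W, b):
    """Hidden-layer output of one small feed-forward network (tanh activation)."""
    return np.tanh(x @ W + b)

rng = np.random.default_rng(0)
n_classes = 11          # e.g. the native-language labels to predict (assumed)
dims = [300, 100]       # two modalities, e.g. textual and speech features (sizes invented)
hidden = 64

# one small FFNN per modality
nets = [(rng.normal(scale=0.1, size=(d, hidden)), np.zeros(hidden)) for d in dims]
# a shared softmax layer over the concatenated hidden-layer outputs
W_out = rng.normal(scale=0.1, size=(hidden * len(dims), n_classes))
b_out = np.zeros(n_classes)

def predict(features):
    """features: one vector per modality, in the same order as `nets`."""
    h = np.concatenate([ffnn_hidden(x, W, b) for x, (W, b) in zip(features, nets)])
    return softmax(h @ W_out + b_out)

probs = predict([rng.normal(size=300), rng.normal(size=100)])
print(probs.shape, probs.sum())   # (11,) 1.0
```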

A Comparison between Count and Neural Network Models Based on Joint Translation and Reordering Sequences

Andreas Guta, Tamer Alkhouli, Jan-Thorsten Peter, Joern Wuebker, Hermann Ney
2015 Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing  
We investigate the performance of n-gram models with modified Kneser-Ney smoothing and of feed-forward and recurrent neural network architectures when estimated on JTR sequences, and compare them to the operation  ...  They are constructed in a simple manner while capturing multiple alignments and empty words. JTR sequences can be used to train a variety of models.  ...  Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of DARPA.  ... 
doi:10.18653/v1/d15-1165 dblp:conf/emnlp/GutaAPWN15 fatcat:2frute2wmjat5gddafb34ztdlm
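
The snippet contrasts count-based n-gram models with modified Kneser-Ney smoothing against neural estimators. As background for the count-based side, below is a minimal sketch of interpolated Kneser-Ney for bigrams with a single absolute discount (the simpler, non-modified variant); the toy corpus and the discount of 0.75 are illustrative, and this is not the paper's implementation. Modified Kneser-Ney extends this by using three count-dependent discounts instead of one.

```python
from collections import Counter

def kneser_ney_bigram(corpus, discount=0.75):
    """Return p(w | prev) under interpolated Kneser-Ney with one absolute discount."""
    bigrams = Counter(zip(corpus, corpus[1:]))
    unigram_follow = Counter()   # number of distinct left contexts per word
    context_types = Counter()    # number of distinct continuations per context
    context_tokens = Counter()   # total bigram count per context
    for (prev, w), c in bigrams.items():
        unigram_follow[w] += 1
        context_types[prev] += 1
        context_tokens[prev] += c
    total_bigram_types = len(bigrams)

    def prob(w, prev):
        # continuation probability: how likely is w to appear as a novel continuation?
        p_cont = unigram_follow[w] / total_bigram_types
        if context_tokens[prev] == 0:
            return p_cont
        discounted = max(bigrams[(prev, w)] - discount, 0) / context_tokens[prev]
        lam = discount * context_types[prev] / context_tokens[prev]
        return discounted + lam * p_cont

    return prob

p = kneser_ney_bigram("the cat sat on the mat the cat ran".split())
print(p("cat", "the"), p("mat", "the"))
```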

Comparison of feedforward and recurrent neural network language models

M. Sundermeyer, I. Oparin, J.-L. Gauvain, B. Freiberg, R. Schlüter, H. Ney
2013 2013 IEEE International Conference on Acoustics, Speech and Signal Processing  
Research on language modeling for speech recognition has increasingly focused on the application of neural networks.  ...  In addition, we propose a simple and efficient method to normalize language model probabilities across different vocabularies, and we show how to speed up training of recurrent neural networks by parallelization  ...  Contrasting these results, in [9] a 10-gram feedforward NNLM performed even slightly better in perplexity than a recurrent network on a large-scale English to French translation task.  ... 
doi:10.1109/icassp.2013.6639310 dblp:conf/icassp/SundermeyerOGFSN13 fatcat:enuwtrs5qjcrtg6rvoiao6cauq

Convolutional Neural Network Language Models

Ngoc-Quan Pham, Germán Kruszewski, Gemma Boleda
2016 Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing  
Second, we gain some understanding of the behavior of the model, showing that CNNs in language act as feature detectors at a high level of abstraction, like in Computer Vision, and that the model can profitably  ...  First, we show that CNNs achieve 11-26% better absolute performance than feed-forward neural language models, demonstrating their potential for language representation even in sequential tasks.  ...  We gratefully acknowledge the support of NVIDIA Corporation with the donation of the GPUs used in our research.  ... 
doi:10.18653/v1/d16-1123 dblp:conf/emnlp/PhamKB16 fatcat:whwx7amqerg6boqcyningkzbry
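
To make the "feature detector" reading of CNN language models concrete, here is a toy numpy forward pass: convolution filters slide over a window of word embeddings (acting like learned n-gram detectors), ReLU and max-over-time pooling summarize their activations, and a softmax produces a next-word distribution. Vocabulary size, filter width, and all weights are made up; this is a sketch of the general idea, not the model from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab, emb_dim, n_filters, width, context = 1000, 32, 64, 3, 10

E = rng.normal(scale=0.1, size=(vocab, emb_dim))           # word embeddings
filters = rng.normal(scale=0.1, size=(n_filters, width, emb_dim))
W_out = rng.normal(scale=0.1, size=(n_filters, vocab))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def next_word_distribution(context_ids):
    x = E[context_ids]                                      # (context, emb_dim)
    # slide each filter over the sequence of embeddings
    n_pos = len(context_ids) - width + 1
    feature_maps = np.empty((n_filters, n_pos))
    for f in range(n_filters):
        for t in range(n_pos):
            feature_maps[f, t] = np.sum(filters[f] * x[t:t + width])
    pooled = np.maximum(feature_maps, 0).max(axis=1)        # ReLU + max-over-time pooling
    return softmax(pooled @ W_out)

dist = next_word_distribution(rng.integers(0, vocab, size=context))
print(dist.shape, dist.sum())                               # (1000,) ~1.0
```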

Long Short-Term Memory for Japanese Word Segmentation [article]

Yoshiaki Kitagawa, Mamoru Komachi
2018 arXiv   pre-print
The experimental results indicate that the proposed model achieves state-of-the-art accuracy on various Japanese corpora.  ...  However, in contrast to Chinese, Japanese includes several character types, such as hiragana, katakana, and kanji, that produce orthographic variations and increase the difficulty of word segmentation.  ...  We thank the anonymous reviewers for suggestions and comments, which helped in improving the paper.  ... 
arXiv:1709.08011v3 fatcat:hktze3aflvhfvc645tedlpnkye
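
A compact sketch, in the spirit of the abstract, of a character-level LSTM segmenter that augments each character embedding with a character-type feature (hiragana / katakana / kanji / other) and predicts B(egin)/I(nside) tags. The Unicode-range type detector, all dimensions, and the untrained random weights are illustrative assumptions, not the authors' exact model.

```python
import numpy as np

rng = np.random.default_rng(2)

def char_type(ch):
    """Coarse character-type feature: 0 = hiragana, 1 = katakana, 2 = kanji, 3 = other."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x309F:
        return 0
    if 0x30A0 <= cp <= 0x30FF:
        return 1
    if 0x4E00 <= cp <= 0x9FFF:
        return 2
    return 3

char_dim, type_dim, hidden = 16, 4, 32
E_char = {}                                            # character embeddings, created lazily
E_type = rng.normal(scale=0.1, size=(4, type_dim))     # one embedding per character type

def embed(ch):
    if ch not in E_char:
        E_char[ch] = rng.normal(scale=0.1, size=char_dim)
    return np.concatenate([E_char[ch], E_type[char_type(ch)]])

in_dim = char_dim + type_dim
W = rng.normal(scale=0.1, size=(4, hidden, in_dim + hidden))   # LSTM gates: i, f, o, g
b = np.zeros((4, hidden))
W_tag = rng.normal(scale=0.1, size=(hidden, 2))                # output labels: B, I

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def segment(sentence):
    """Tag each character B (starts a word) or I (continues a word)."""
    h, c, labels = np.zeros(hidden), np.zeros(hidden), []
    for ch in sentence:
        z = np.concatenate([embed(ch), h])
        i, f, o = (sigmoid(W[k] @ z + b[k]) for k in range(3))
        g = np.tanh(W[3] @ z + b[3])
        c = f * c + i * g
        h = o * np.tanh(c)
        labels.append("BI"[int(np.argmax(h @ W_tag))])
    return labels

print(segment("日本語の単語分割"))   # untrained weights, so the tags are arbitrary
```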

Bangla↔English Machine Translation Using Attention-based Multi-Headed Transformer Model

Argha Chandra Dhar, Arna Roy, M. A. H. Akhand, Md Abdus Samad Kamal, Nazmul Siddique
2021 Journal of Computer Science  
The attention-based multi-headed transformer model is considered in this study because of its ability to process inputs in parallel.  ...  A transformer model consisting of encoders and decoders is adapted by tuning different parameters (especially the number of heads) to identify the best-performing model for Bangla to English and vice versa  ...  In contrast, MT resources for the Bangla language are very limited even though it is a major world language, ranked fifth globally with 228 million native speakers and the first language of Bangladesh  ... 
doi:10.3844/jcssp.2021.1000.1010 fatcat:j75zuuchevdw3p2xlmkf226jne
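
A minimal numpy sketch of the multi-headed attention building block the entry refers to: each head performs scaled dot-product attention over the whole input sequence, and all heads run independently, which is where the parallelism in input processing comes from. Model width, head count, and weights here are arbitrary and not the configuration used in the paper.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_self_attention(X, Wq, Wk, Wv, Wo, n_heads):
    """X: (seq_len, d_model). Each head attends over the whole sequence."""
    seq_len, d_model = X.shape
    d_head = d_model // n_heads
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    heads = []
    for h in range(n_heads):
        s = slice(h * d_head, (h + 1) * d_head)
        scores = Q[:, s] @ K[:, s].T / np.sqrt(d_head)   # scaled dot-product
        heads.append(softmax(scores) @ V[:, s])
    return np.concatenate(heads, axis=-1) @ Wo           # (seq_len, d_model)

rng = np.random.default_rng(3)
d_model, n_heads, seq_len = 64, 8, 12
Wq, Wk, Wv, Wo = (rng.normal(scale=0.1, size=(d_model, d_model)) for _ in range(4))
out = multi_head_self_attention(rng.normal(size=(seq_len, d_model)), Wq, Wk, Wv, Wo, n_heads)
print(out.shape)   # (12, 64)
```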

Word Translation Prediction for Morphologically Rich Languages with Bilingual Neural Networks

Ke M. Tran, Arianna Bisazza, Christof Monz
2014 Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)  
Translating into morphologically rich languages is a particularly difficult problem in machine translation due to the high degree of inflectional ambiguity in the target language, often only poorly captured  ...  by existing word translation models.  ...  We would like to thank Ekaterina Garmash for helping with the error analysis of the English-Russian translations.  ... 
doi:10.3115/v1/d14-1175 dblp:conf/emnlp/TranBM14 fatcat:3gcimp732ndzhm465rxo7gmaia

Compressing Neural Language Models by Sparse Word Representations

Yunchuan Chen, Lili Mou, Yan Xu, Ge Li, Zhi Jin
2016 Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)  
In the experiments, the number of parameters in our model grows only very slowly, almost imperceptibly, with the vocabulary size.  ...  In this paper, we propose to compress neural language models by sparse word representations.  ...  propose to use a feedforward neural network (FFNN) to replace the multinomial parameter estimation in n-gram models.  ... 
doi:10.18653/v1/p16-1022 dblp:conf/acl/ChenMXLJ16 fatcat:6hplbnnkvjgvhppuces4j5alke
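
One way to read the compression idea in the title: keep dense embeddings only for the most frequent words and represent each rare word as a sparse combination of those common embeddings, so the parameter count grows by only a few coefficients per rare word. The sketch below illustrates that general scheme; the greedy matching-pursuit fit, the sizes, and the random base matrix are assumptions for illustration, not the authors' exact formulation.

```python
import numpy as np

rng = np.random.default_rng(4)
n_common, dim, k = 500, 64, 8            # dense embeddings only for 500 common words
B = rng.normal(size=(n_common, dim))     # base (common-word) embedding matrix

def sparse_code(target, B, k):
    """Greedy matching-pursuit sketch: approximate `target` with at most k base embeddings."""
    residual, support = target.copy(), []
    for _ in range(k):
        support.append(int(np.argmax(np.abs(B @ residual))))
        coeffs, *_ = np.linalg.lstsq(B[support].T, target, rcond=None)
        residual = target - B[support].T @ coeffs
    code = np.zeros(len(B))
    code[support] = coeffs
    return code                           # only k numbers need to be stored per rare word

rare_word_vec = rng.normal(size=dim)
code = sparse_code(rare_word_vec, B, k)
approx = B.T @ code
print(np.count_nonzero(code), np.linalg.norm(rare_word_vec - approx))
```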

Compressing Neural Language Models by Sparse Word Representations [article]

Yunchuan Chen, Lili Mou, Yan Xu, Ge Li, Zhi Jin
2016 arXiv   pre-print
In the experiments, the number of parameters in our model grows only very slowly, almost imperceptibly, with the vocabulary size.  ...  In this paper, we propose to compress neural language models by sparse word representations.  ...  propose to use a feedforward neural network (FFNN) to replace the multinomial parameter estimation in n-gram models.  ... 
arXiv:1610.03950v1 fatcat:cz4qbnfrrbam5grap3fyiau2fy

Monolingual and Cross-Lingual Intent Detection without Training Data in Target Languages

Jurgita Kapočiūtė-Dzikienė, Askars Salimbajevs, Raivis Skadiņš
2021 Electronics  
Due to recent DNN advancements, many NLP problems can be effectively solved using transformer-based models and supervised data. Unfortunately, such data is not available in some languages.  ...  in the target language.  ...  The type of the performed research validation is the analysis.  ... 
doi:10.3390/electronics10121412 fatcat:tjwfqbmimnghzecwy4jh72bvja

Sentence-Level Classification Using Parallel Fuzzy Deep Learning Classifier

Junaid Qadir
2021 figshare.com  
Then, we used the Mamdani Fuzzy System (MFS) as a fuzzy classifier to classify the outcomes of the two deep learning models used (CNN+FFNN) into three classes: Neutral, Negative, and Positive.  ...  In this study, a new Fuzzy Deep Learning Classifier (FDLC) is suggested for improving the performance of data-sentiment classification.  ...  In contrast, the accuracy of DL algorithms rises as the volume of data grows.  ... 
doi:10.6084/m9.figshare.13687864.v1 fatcat:34hpmm67kvhghilwjyaak64z3q
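
A toy sketch of a Mamdani-style fusion step of the kind the snippet describes: positivity scores from two classifiers (stand-ins for the CNN and FFNN outputs) are fuzzified with triangular membership functions, combined through a small rule base, and defuzzified by centroid into Negative / Neutral / Positive. The membership functions and rules are invented for illustration, not the ones used in the study.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function peaking at b with support [a, c]."""
    return np.maximum(np.minimum((x - a) / (b - a + 1e-9), (c - x) / (c - b + 1e-9)), 0.0)

# fuzzy sets over each model's "positivity" score in [0, 1]
IN_SETS = {"low": (-0.01, 0.0, 0.5), "med": (0.0, 0.5, 1.0), "high": (0.5, 1.0, 1.01)}
# fuzzy sets over the fused output, also on [0, 1]
OUT_SETS = {"Negative": (-0.01, 0.0, 0.5), "Neutral": (0.25, 0.5, 0.75), "Positive": (0.5, 1.0, 1.01)}
y = np.linspace(0.0, 1.0, 201)           # discretised output universe

def fuzzify(score):
    return {name: tri(score, *p) for name, p in IN_SETS.items()}

def mamdani_fuse(score_cnn, score_ffnn):
    a, b = fuzzify(score_cnn), fuzzify(score_ffnn)
    # illustrative rule base: min for AND, max for aggregation
    strength = {"Negative": min(a["low"], b["low"]),
                "Neutral":  min(a["med"], b["med"]),
                "Positive": min(a["high"], b["high"])}
    clipped = [np.minimum(tri(y, *OUT_SETS[k]), strength[k]) for k in OUT_SETS]
    aggregated = np.max(clipped, axis=0)
    if aggregated.sum() == 0:
        return "Neutral"
    centroid = (y * aggregated).sum() / aggregated.sum()   # centroid defuzzification
    return max(OUT_SETS, key=lambda k: tri(centroid, *OUT_SETS[k]))

print(mamdani_fuse(0.9, 0.8), mamdani_fuse(0.1, 0.2), mamdani_fuse(0.6, 0.4))
```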

Does Commonsense help in detecting Sarcasm? [article]

Somnath Basu Roy Chowdhury, Snigdha Chaturvedi
2021 arXiv   pre-print
For this, we incorporate commonsense knowledge into the prediction process using a graph convolution network with pre-trained language model embeddings as input.  ...  It is a challenging task requiring a deep understanding of language, context, and world knowledge.  ...  We perform an array of analysis experiments to identify where the commonsense infused model outperforms the baseline and where it fails.  ... 
arXiv:2109.08588v1 fatcat:b7twlw2jzvhmvnf57sdxtxptla
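
A minimal numpy sketch of the graph-convolution operation mentioned in the snippet, in the standard Kipf-and-Welling form H' = ReLU(D^-1/2 (A + I) D^-1/2 H W), with node features standing in for pre-trained language-model embeddings. The toy graph, the 768-dimensional inputs, and the random weights are assumptions for illustration only.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])                   # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)

rng = np.random.default_rng(5)
n_nodes, emb_dim, hidden = 6, 768, 128               # e.g. 6 concept nodes, BERT-sized inputs
A = np.zeros((n_nodes, n_nodes))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]:  # toy commonsense graph (a chain)
    A[i, j] = A[j, i] = 1.0

H0 = rng.normal(size=(n_nodes, emb_dim))              # stand-in for pre-trained LM embeddings
W1 = rng.normal(scale=0.1, size=(emb_dim, hidden))
H1 = gcn_layer(A, H0, W1)
print(H1.shape)                                       # (6, 128)
```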

A Survey On Neural Word Embeddings [article]

Erhan Sezerer, Selma Tekir
2021 arXiv   pre-print
Finally, we describe benchmark datasets in word embeddings' performance evaluation and downstream tasks along with the performance results of/due to word embeddings.  ...  The study of meaning in natural language processing (NLP) relies on the distributional hypothesis where language elements get meaning from the words that co-occur within contexts.  ...  P(w_1, ..., w_n) = P(w_1) P(w_2 | w_1) P(w_3 | w_2, w_1) ... P(w_n | w_{n-1}, ..., w_1) = ∏_{i=1}^{n} P(w_i | w_{i-1}, ..., w_1)   (2)   In traditional language modeling, the next word's probability is calculated based on the statistics of n-gram occurrences. n-grams are  ... 
arXiv:2110.01804v1 fatcat:rfxwasxwivdvzn6iukbpvvmnai
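
To make the reconstructed chain-rule factorisation above concrete, the sketch below applies the usual bigram (first-order Markov) truncation and estimates each conditional P(w_i | w_{i-1}) from raw counts. The corpus and sentence are toy examples.

```python
from collections import Counter

corpus = "<s> the cat sat on the mat </s> <s> the cat ran </s>".split()
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

def sentence_probability(words):
    """P(w_1..w_n) ~ prod_i P(w_i | w_{i-1}), with maximum-likelihood bigram estimates."""
    p = 1.0
    for prev, w in zip(words, words[1:]):
        p *= bigrams[(prev, w)] / unigrams[prev]
    return p

print(sentence_probability("<s> the cat ran </s>".split()))   # 1 * 2/3 * 1/2 * 1/1 = 1/3
```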

A Dynamic Speaker Model for Conversational Interactions

Hao Cheng, Hao Fang, Mari Ostendorf
2019 Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT)  
Initial model training is unsupervised, using context-sensitive language generation as an objective, with the context being the conversation history.  ...  Characterizing these differences can be useful in human-computer interaction, as well as analysis of human-human conversations.  ...  The conclusions and findings are those of the authors and do not necessarily reflect the views of sponsors.  ... 
doi:10.18653/v1/n19-1284 dblp:conf/naacl/00020O19 fatcat:ivlqmevaubahfbsm2kocpj4tpe

Sentence-Level Classification Using Parallel Fuzzy Deep Learning Classifier

Fatima Es-sabery, Abdellatif Hair, Junaid Qadir, Beatriz Sainz-de-Abajo, Begoña García-Zapirain, Isabel de la Torre-Díez
2021 IEEE Access  
|D| to quickly look up the vector representation of each n-gram character in E_m. Then, for each tweet t with |t| n-gram characters, an embedding matrix M = [V_nc1; V_nc2; ...; V_nci; ...; V_nc|t|] has been  ...  In contrast, the features are extracted automatically in the case of DL, which is the key advantage of DL models over ML techniques.  ...  Table 17, we note that our FDLC is practically always capable of achieving higher average accuracy with a very low mean standard deviation.  ... 
doi:10.1109/access.2021.3053917 fatcat:fgdxu2pgxff2vgdlyxdtwz6o4y
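
The notation in the snippet describes stacking, for each tweet, the embedding vector of every character n-gram into a matrix M = [V_nc1; V_nc2; ...; V_nc|t|] looked up from a table E_m. Below is a small sketch of that lookup; the trigram order, the hashing-based vocabulary, and all sizes are illustrative assumptions rather than details from the paper.

```python
import numpy as np

rng = np.random.default_rng(6)
n, dim, vocab_size = 3, 50, 2048                       # character trigrams, invented sizes
E_m = rng.normal(scale=0.1, size=(vocab_size, dim))    # n-gram character embedding table

def char_ngrams(text, n=3):
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def embedding_matrix(tweet):
    """Stack one embedding row per character n-gram: M = [V_nc1; V_nc2; ...; V_nc|t|]."""
    ids = [hash(g) % vocab_size for g in char_ngrams(tweet)]   # toy hashed lookup; a real
    return E_m[ids]                                            # system would use a learned vocabulary

M = embedding_matrix("this movie was great")
print(M.shape)    # (number of character trigrams, 50)
```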
Showing results 1 — 15 out of 152 results