Combining Textual and Speech Features in the NLI Task Using State-of-the-Art Machine Learning Techniques
2017
Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications
Our best performing method is based on a set of feed-forward neural networks whose hidden-layer outputs are combined using a softmax layer. ...
Combining the input data of two different modalities led to a rather dramatic improvement in classification performance. ...
In fact, this compact feature set encodes a large amount of statistical information about the many n-grams underlying the language models, which consist of smoothed linear combinations of n-gram probabilities. ...
doi:10.18653/v1/w17-5021
dblp:conf/bea/IrcingSZHH17
fatcat:7jpnwcsyzvcgra76d7er77s72a
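The snippet above describes an ensemble in which hidden-layer outputs of modality-specific feed-forward networks are merged through a shared softmax layer. Below is a minimal PyTorch sketch of that idea; the layer sizes, module names, and the two-modality setup are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class TwoModalityFFNN(nn.Module):
    """Two feed-forward branches (e.g., textual and speech features); their hidden
    representations are concatenated and fed to a single softmax classification
    layer. All dimensions are illustrative."""
    def __init__(self, text_dim=300, speech_dim=128, hidden=256, n_classes=11):
        super().__init__()
        self.text_branch = nn.Sequential(nn.Linear(text_dim, hidden), nn.ReLU())
        self.speech_branch = nn.Sequential(nn.Linear(speech_dim, hidden), nn.ReLU())
        self.classifier = nn.Linear(2 * hidden, n_classes)

    def forward(self, text_feats, speech_feats):
        h = torch.cat([self.text_branch(text_feats),
                       self.speech_branch(speech_feats)], dim=-1)
        return self.classifier(h)  # train with nn.CrossEntropyLoss (log-softmax inside)

model = TwoModalityFFNN()
logits = model(torch.randn(4, 300), torch.randn(4, 128))
probs = torch.softmax(logits, dim=-1)  # per-class (e.g., per-L1) probabilities
```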
A Comparison between Count and Neural Network Models Based on Joint Translation and Reordering Sequences
2015
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
We investigate the performances of ngram models with modified Kneser-Ney smoothing, feed-forward and recurrent neural network architectures when estimated on JTR sequences, and compare them to the operation ...
They are constructed in a simple manner while capturing multiple alignments and empty words. JTR sequences can be used to train a variety of models. ...
Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of DARPA. ...
doi:10.18653/v1/d15-1165
dblp:conf/emnlp/GutaAPWN15
fatcat:2frute2wmjat5gddafb34ztdlm
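For readers who want a feel for the count-based baseline in the entry above, the sketch below fits an interpolated Kneser-Ney n-gram model with NLTK on toy JTR-like token sequences. The token names are invented, and NLTK's interpolated Kneser-Ney is only a stand-in for the modified Kneser-Ney models (typically built with SRILM or KenLM) used in the paper.

```python
from nltk.lm import KneserNeyInterpolated
from nltk.lm.preprocessing import padded_everygram_pipeline

# Toy "JTR-like" sequences: target words interleaved with reordering/alignment
# tokens (purely illustrative; real JTR sequences come from word-aligned bitext).
corpus = [
    ["der", "mann", "<step_fwd>", "geht", "<align>", "nach", "hause"],
    ["die", "frau", "<step_fwd>", "liest", "<align>", "ein", "buch"],
]

order = 3
train_ngrams, vocab = padded_everygram_pipeline(order, corpus)
lm = KneserNeyInterpolated(order)
lm.fit(train_ngrams, vocab)

# Probability of the next JTR token given a bigram context
print(lm.score("geht", ["mann", "<step_fwd>"]))
```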
Comparison of feedforward and recurrent neural network language models
2013
2013 IEEE International Conference on Acoustics, Speech and Signal Processing
Research on language modeling for speech recognition has increasingly focused on the application of neural networks. ...
In addition, we propose a simple and efficient method to normalize language model probabilities across different vocabularies, and we show how to speed up training of recurrent neural networks by parallelization ...
Contrasting these results, in [9] a 10-gram feedforward NNLM performed even slightly better in perplexity than a recurrent network on a large-scale English to French translation task. ...
doi:10.1109/icassp.2013.6639310
dblp:conf/icassp/SundermeyerOGFSN13
fatcat:enuwtrs5qjcrtg6rvoiao6cauq
Convolutional Neural Network Language Models
2016
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
Second, we gain some understanding of the behavior of the model, showing that CNNs in language act as feature detectors at a high level of abstraction, like in Computer Vision, and that the model can profitably ...
First, we show that CNNs achieve 11-26% better absolute performance than feed-forward neural language models, demonstrating their potential for language representation even in sequential tasks. ...
We gratefully acknowledge the support of NVIDIA Corporation with the donation of the GPUs used in our research. ...
doi:10.18653/v1/d16-1123
dblp:conf/emnlp/PhamKB16
fatcat:whwx7amqerg6boqcyningkzbry
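As a rough illustration of a convolutional language model like the one described above, the sketch below applies a 1-D convolution over a window of word embeddings to predict the next word. It is a generic CNN-over-embeddings model with assumed dimensions, not the architecture evaluated by Pham et al.

```python
import torch
import torch.nn as nn

class TinyCNNLM(nn.Module):
    """Predict the next word from a fixed-length context using 1-D convolutions
    acting as n-gram-like feature detectors over word embeddings (illustrative)."""
    def __init__(self, vocab_size=10000, emb=128, channels=256, kernel=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.conv = nn.Conv1d(emb, channels, kernel_size=kernel, padding=kernel // 2)
        self.pool = nn.AdaptiveMaxPool1d(1)   # keep the strongest feature per channel
        self.out = nn.Linear(channels, vocab_size)

    def forward(self, context_ids):                    # (batch, context_len)
        x = self.embed(context_ids).transpose(1, 2)    # (batch, emb, context_len)
        h = torch.relu(self.conv(x))
        h = self.pool(h).squeeze(-1)                   # (batch, channels)
        return self.out(h)                             # next-word logits

lm = TinyCNNLM()
logits = lm(torch.randint(0, 10000, (8, 16)))  # batch of 8 contexts of 16 tokens
```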
Long Short-Term Memory for Japanese Word Segmentation
[article]
2018
arXiv
pre-print
The experimental results indicate that the proposed model achieves state-of-the-art accuracy with respect to various Japanese corpora. ...
However, in contrast to Chinese, Japanese includes several character types, such as hiragana, katakana, and kanji, that produce orthographic variations and increase the difficulty of word segmentation. ...
We thank anonymous reviewers for suggestions and comments, which helped in improving the paper. ...
arXiv:1709.08011v3
fatcat:hktze3aflvhfvc645tedlpnkye
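Word segmentation of the kind described above is commonly cast as character-level sequence tagging (e.g., B/I labels marking word beginnings). The sketch below is a generic bidirectional LSTM tagger over character embeddings, not the specific model of the paper; the sizes and label convention are assumptions.

```python
import torch
import torch.nn as nn

class BiLSTMSegmenter(nn.Module):
    """Tag each character as B (begins a word) or I (inside a word)."""
    def __init__(self, n_chars=8000, emb=64, hidden=128, n_tags=2):
        super().__init__()
        self.embed = nn.Embedding(n_chars, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_tags)

    def forward(self, char_ids):                # (batch, seq_len)
        h, _ = self.lstm(self.embed(char_ids))  # (batch, seq_len, 2*hidden)
        return self.out(h)                      # per-character B/I logits

model = BiLSTMSegmenter()
tags = model(torch.randint(0, 8000, (2, 20))).argmax(-1)  # predicted B/I tags
```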
Bangla↔English Machine Translation Using Attention-based Multi-Headed Transformer Model
2021
Journal of Computer Science
The attention-based multi-headed transformer model has been considered in this study due to its significant features of parallelism in input processing. ...
A transformer model consisting of encoders and decoders is adapted by tuning different parameters (especially the number of heads) to identify the best performing model for Bangla to English and vice versa ...
In contrast, MT resources for the Bangla language are very limited, even though it is a major world language, ranked fifth globally with 228 million native speakers and the first language of Bangladesh ...
doi:10.3844/jcssp.2021.1000.1010
fatcat:j75zuuchevdw3p2xlmkf226jne
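The entry above tunes the number of attention heads in an encoder-decoder transformer. A minimal sketch of how the head count is exposed as a hyperparameter in PyTorch's built-in transformer follows; the dimensions and the small grid of head counts are illustrative, not the paper's settings.

```python
import torch
import torch.nn as nn

d_model = 512
src = torch.randn(10, 2, d_model)   # (source_len, batch, d_model)
tgt = torch.randn(12, 2, d_model)   # (target_len, batch, d_model)

# The "number of heads" is the nhead argument; it must divide d_model evenly.
for nhead in (2, 4, 8):
    model = nn.Transformer(d_model=d_model, nhead=nhead,
                           num_encoder_layers=2, num_decoder_layers=2)
    out = model(src, tgt)            # (target_len, batch, d_model)
    print(nhead, out.shape)
```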
Word Translation Prediction for Morphologically Rich Languages with Bilingual Neural Networks
2014
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Translating into morphologically rich languages is a particularly difficult problem in machine translation due to the high degree of inflectional ambiguity in the target language, often only poorly captured by existing word translation models. ...
We would like to thank Ekaterina Garmash for helping with the error analysis of the English-Russian translations. ...
doi:10.3115/v1/d14-1175
dblp:conf/emnlp/TranBM14
fatcat:3gcimp732ndzhm465rxo7gmaia
Compressing Neural Language Models by Sparse Word Representations
2016
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
In the experiments, the number of parameters in our model increases very slowly with the growth of the vocabulary size, which is almost imperceptible. ...
In this paper, we propose to compress neural language models by sparse word representations. ...
propose to use a feedforward neural network (FFNN) to replace the multinomial parameter estimation in n-gram models. ...
doi:10.18653/v1/p16-1022
dblp:conf/acl/ChenMXLJ16
fatcat:6hplbnnkvjgvhppuces4j5alke
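The core idea in the entry above is to represent infrequent words as sparse linear combinations over a small base vocabulary of common words, so the parameter count barely grows with vocabulary size. A toy numpy sketch of that representation is given below; it only shows the lookup, not the authors' procedure for learning the sparse codes.

```python
import numpy as np

rng = np.random.default_rng(0)

n_base, emb_dim = 1000, 64            # common words keep dense embeddings
base_embeddings = rng.normal(size=(n_base, emb_dim))

def rare_word_embedding(sparse_code):
    """A rare word is stored only as a sparse code over the base vocabulary,
    e.g. {base_word_index: weight}; its embedding is the weighted sum."""
    vec = np.zeros(emb_dim)
    for idx, w in sparse_code.items():
        vec += w * base_embeddings[idx]
    return vec

# Hypothetical sparse code for one rare word: 4 non-zeros instead of 64 floats.
code = {12: 0.7, 305: 0.2, 511: 0.08, 940: 0.02}
print(rare_word_embedding(code).shape)   # (64,)
```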
Compressing Neural Language Models by Sparse Word Representations
[article]
2016
arXiv
pre-print
In the experiments, the number of parameters in our model increases very slowly with the growth of the vocabulary size, which is almost imperceptible. ...
In this paper, we propose to compress neural language models by sparse word representations. ...
propose to use a feedforward neural network (FFNN) to replace the multinomial parameter estimation in n-gram models. ...
arXiv:1610.03950v1
fatcat:cz4qbnfrrbam5grap3fyiau2fy
Monolingual and Cross-Lingual Intent Detection without Training Data in Target Languages
2021
Electronics
Due to recent DNN advancements, many NLP problems can be effectively solved using transformer-based models and supervised data. Unfortunately, such data is not available in some languages. ...
in the target language. ...
The type of research validation performed is analysis. ...
doi:10.3390/electronics10121412
fatcat:tjwfqbmimnghzecwy4jh72bvja
Sentence-Level Classification Using Parallel Fuzzy Deep Learning Classifier
2021
figshare.com
Then, we used the Mamdani Fuzzy System (MFS) as a fuzzy classifier to classify the outcomes of the two deep learning models used (CNN and FFNN) into three classes: Neutral, Negative, and Positive. ...
In this study, a new Fuzzy Deep Learning Classifier (FDLC) is suggested for improving the performance of data-sentiment classification. ...
In contrast, the accuracy of DL algorithms rises as the volume of data grows. ...
doi:10.6084/m9.figshare.13687864.v1
fatcat:34hpmm67kvhghilwjyaak64z3q
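To make the fusion step above concrete, here is a tiny hand-rolled Mamdani-style inference over two assumed inputs (a CNN sentiment score and an FFNN sentiment score in [0, 1]), with triangular membership functions, min for rule firing, max aggregation, and centroid defuzzification. The membership functions and rules are invented for illustration and are not the paper's MFS configuration.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function on scalar or array x."""
    return np.maximum(np.minimum((x - a) / (b - a + 1e-9),
                                 (c - x) / (c - b + 1e-9)), 0.0)

def mamdani_sentiment(cnn_score, ffnn_score):
    # Fuzzify the two model scores into low / high degrees (assumed shapes).
    low  = lambda s: tri(s, 0.0, 0.0, 0.5)
    high = lambda s: tri(s, 0.5, 1.0, 1.0)

    # Rule firing strengths (min = fuzzy AND).
    fire_neg = min(low(cnn_score),  low(ffnn_score))     # both low  -> Negative
    fire_pos = min(high(cnn_score), high(ffnn_score))    # both high -> Positive
    fire_neu = max(min(low(cnn_score), high(ffnn_score)),
                   min(high(cnn_score), low(ffnn_score)))  # disagreement -> Neutral

    # Output universe and consequent membership functions.
    y = np.linspace(0.0, 1.0, 101)
    neg_mf = tri(y, 0.0, 0.0, 0.5)
    neu_mf = tri(y, 0.25, 0.5, 0.75)
    pos_mf = tri(y, 0.5, 1.0, 1.0)

    # Clip each consequent by its firing strength and aggregate with max.
    agg = np.maximum.reduce([np.minimum(fire_neg, neg_mf),
                             np.minimum(fire_neu, neu_mf),
                             np.minimum(fire_pos, pos_mf)])

    # Centroid defuzzification, then map the crisp value to a class label.
    crisp = float(np.sum(y * agg) / (np.sum(agg) + 1e-9))
    return ["Negative", "Neutral", "Positive"][int(np.digitize(crisp, [1/3, 2/3]))]

print(mamdani_sentiment(0.9, 0.8))   # -> "Positive"
```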
Does Commonsense help in detecting Sarcasm?
[article]
2021
arXiv
pre-print
For this, we incorporate commonsense knowledge into the prediction process using a graph convolution network with pre-trained language model embeddings as input. ...
It is a challenging task requiring a deep understanding of language, context, and world knowledge. ...
We perform an array of analysis experiments to identify where the commonsense infused model outperforms the baseline and where it fails. ...
arXiv:2109.08588v1
fatcat:b7twlw2jzvhmvnf57sdxtxptla
A Survey On Neural Word Embeddings
[article]
2021
arXiv
pre-print
Finally, we describe benchmark datasets in word embeddings' performance evaluation and downstream tasks along with the performance results of/due to word embeddings. ...
The study of meaning in natural language processing (NLP) relies on the distributional hypothesis where language elements get meaning from the words that co-occur within contexts. ...
$P(w_1, \ldots, w_T) = P(w_1)\,P(w_2 \mid w_1)\,P(w_3 \mid w_2, w_1) \cdots P(w_T \mid w_{T-1}, \ldots, w_1) = \prod_{t=1}^{T} P(w_t \mid w_{t-1}, \ldots, w_1)$ (2) In traditional language modeling, the next word's probability is calculated based on the statistics of n-gram occurrences. n-grams are ...
arXiv:2110.01804v1
fatcat:rfxwasxwivdvzn6iukbpvvmnai
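The reconstructed equation (2) above factorizes a sentence probability with the chain rule; a count-based n-gram model then approximates each conditional by relative frequencies of short histories. A minimal unsmoothed bigram maximum-likelihood estimate in plain Python, purely for illustration:

```python
from collections import Counter

sentences = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]

unigrams, bigrams = Counter(), Counter()
for sent in sentences:
    padded = ["<s>"] + sent + ["</s>"]
    unigrams.update(padded)
    bigrams.update(zip(padded, padded[1:]))

def p_bigram(word, prev):
    """P(word | prev) ~= count(prev, word) / count(prev)  (unsmoothed MLE)."""
    return bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0

# P(the cat sat) ~= P(the|<s>) * P(cat|the) * P(sat|cat) * P(</s>|sat)
p = 1.0
for prev, word in zip(["<s>", "the", "cat", "sat"], ["the", "cat", "sat", "</s>"]):
    p *= p_bigram(word, prev)
print(p)   # 3/3 * 2/3 * 1/2 * 2/2 = 1/3 on this toy corpus
```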
A Dynamic Speaker Model for Conversational Interactions
2019
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT)
Initial model training is unsupervised, using context-sensitive language generation as an objective, with the context being the conversation history. ...
Characterizing these differences can be useful in human-computer interaction, as well as analysis of human-human conversations. ...
The conclusions and findings are those of the authors and do not necessarily reflect the views of sponsors. ...
doi:10.18653/v1/n19-1284
dblp:conf/naacl/00020O19
fatcat:ivlqmevaubahfbsm2kocpj4tpe
Sentence-Level Classification Using Parallel Fuzzy Deep Learning Classifier
2021
IEEE Access
|D| to speedily look up the vector representation of each character n-gram in $E_m$. Then, for each tweet with t character n-grams, an embedding matrix $M = [V_{nc_1}; V_{nc_2}; \ldots; V_{nc_i}; \ldots; V_{nc_{|t|}}]$ has been ...
In contrast, in the case of DL the features are extracted automatically, which is a key advantage of DL models over classical ML techniques. ...
Table 17, we note that our FDLC is practically always capable of achieving higher average accuracy with a very low mean standard deviation. ...
doi:10.1109/access.2021.3053917
fatcat:fgdxu2pgxff2vgdlyxdtwz6o4y
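The first snippet in this entry describes looking up character n-gram vectors in an embedding table and stacking them into a per-tweet matrix M. A minimal PyTorch sketch of that lookup follows; the dictionary size, dimensions, and the trigram featurization are assumptions made for illustration.

```python
import torch
import torch.nn as nn

emb_dim, dict_size = 100, 5000                 # |D| entries in the n-gram dictionary
embedding = nn.Embedding(dict_size, emb_dim)   # plays the role of the table E_m

def char_ngrams(text, n=3):
    return [text[i:i + n] for i in range(len(text) - n + 1)]

# Hypothetical id lookup for each character trigram of a tweet.
tweet = "great movie!"
ngram_to_id = {g: i % dict_size
               for i, g in enumerate(sorted(set(char_ngrams(tweet))))}
ids = torch.tensor([ngram_to_id[g] for g in char_ngrams(tweet)])

M = embedding(ids)   # stacks V_{nc_1}, ..., V_{nc_|t|}; shape (|t|, emb_dim)
print(M.shape)
```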
Showing results 1 — 15 out of 152 results