Uncertainty in Neural Network Word Embedding: Exploration of Threshold for Similarity
[article]
2018
arXiv
pre-print
Word embedding, especially with its recent developments, promises a quantification of the similarity between terms. ...
We first observe and quantify the uncertainty of word embedding models with respect to the similarity value. ...
Uncertainty of Similarity: In this section we make a series of practical observations on word embeddings and the similarities computed from them. ...
arXiv:1606.06086v2
fatcat:pgzs6wouh5cdtm56cchesjsrqu
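The uncertainty this abstract describes can be illustrated with a minimal sketch: train several independent embedding models and measure how much a word pair's cosine similarity fluctuates across runs. Everything here is illustrative, not from the paper — the random matrices stand in for real word2vec/GloVe training runs, and the 0.4 threshold is an arbitrary example value.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two vectors.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def similarity_uncertainty(models, w1, w2):
    """Mean and standard deviation of a word pair's similarity
    across independently trained embedding models."""
    sims = [cosine(m[w1], m[w2]) for m in models]
    return float(np.mean(sims)), float(np.std(sims))

# Stand-in for several training runs: random 50-d embeddings for a
# 100-word vocabulary (a real study would retrain word2vec/GloVe).
rng = np.random.default_rng(0)
models = [rng.normal(size=(100, 50)) for _ in range(10)]

mean_sim, std_sim = similarity_uncertainty(models, w1=3, w2=7)

# Two words count as 'reliably similar' only if the mean similarity
# clears the threshold by more than the observed uncertainty.
threshold = 0.4
reliably_similar = mean_sim - std_sim > threshold
```

The point of the sketch: a single model's similarity value is only meaningful relative to the run-to-run variance, which is what motivates choosing a threshold above the noise band.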
A Financial Service Chatbot based on Deep Bidirectional Transformers
[article]
2020
arXiv
pre-print
We investigated two uncertainty metrics, information entropy and variance of dropout sampling in BERT, followed by mixed-integer programming to optimize decision thresholds. ...
The proposed approach can be useful for industries seeking similar in-house solutions in their specific business domains. ...
We thank colleagues in Vanguard Retail Group (IT/Digital, Customer Care) for their pioneering effort collecting and curating all the data used in our approach. ...
arXiv:2003.04987v1
fatcat:pa6brq5avnb33hkvs74kpbdsgu
Explaining Financial Uncertainty through Specialized Word Embeddings
2020
ACM/IMS Transactions on Data Science
As a baseline, we use an existing dictionary of financial uncertainty triggers; furthermore, we retrieve related terms in specialized word embedding models to automatically expand this dictionary. ...
To explore this field, we use term weighting methods to detect linguistic uncertainty in a large dataset of financial disclosures. ...
ACKNOWLEDGMENTS We thank the anonymous reviewers for their helpful comments. ...
doi:10.1145/3343039
fatcat:noxcctneczf3noe56oxxjez5ti
AVA: A Financial Service Chatbot Based on Deep Bidirectional Transformers
2021
Frontiers in Applied Mathematics and Statistics
We investigated two uncertainty metrics, information entropy and variance of dropout sampling in BERT, followed by mixed-integer programming to optimize decision thresholds. ...
The proposed approach can be useful for industries seeking similar in-house solutions in their specific business domains. ...
We use mixed-integer optimization to find a threshold for human escalation of a user query based on the mean prediction and the uncertainty of the prediction. ...
doi:10.3389/fams.2021.604842
fatcat:2c3olripsndttkiuwasgutivza
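The escalation logic described in this record — predictive entropy and dropout-sampling variance compared against optimized thresholds — can be sketched in a few lines. This is a simplified stand-in: the probability samples below are synthetic rather than real BERT dropout passes, the thresholds are example values (the paper optimizes them with mixed-integer programming), and the function names are invented for illustration.

```python
import numpy as np

def predictive_uncertainty(prob_samples):
    """Entropy of the mean prediction and variance of the top-class
    probability across T stochastic (dropout) forward passes."""
    mean_p = prob_samples.mean(axis=0)                   # (num_classes,)
    entropy = float(-(mean_p * np.log(mean_p + 1e-12)).sum())
    top = int(mean_p.argmax())
    variance = float(prob_samples[:, top].var())
    return mean_p, entropy, variance

def route(prob_samples, entropy_thresh, var_thresh):
    """Answer automatically only when both uncertainty metrics fall
    below their thresholds; otherwise escalate to a human agent."""
    mean_p, h, v = predictive_uncertainty(prob_samples)
    if h > entropy_thresh or v > var_thresh:
        return "escalate"
    return f"class_{int(mean_p.argmax())}"

# Stand-in for T=20 dropout passes of a 5-way intent classifier.
rng = np.random.default_rng(1)
confident = np.tile([0.9, 0.04, 0.03, 0.02, 0.01], (20, 1))
noisy = rng.dirichlet(np.ones(5), size=20)

decision_a = route(confident, entropy_thresh=1.0, var_thresh=0.05)  # answered
decision_b = route(noisy, entropy_thresh=1.0, var_thresh=0.05)      # escalated
```

A near-uniform predictive distribution drives the entropy toward log(5) ≈ 1.6, which trips the escalation branch even when the per-sample variance alone would not.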
Exploring Confidence Measures for Word Spotting in Heterogeneous Datasets
[article]
2019
arXiv
pre-print
In this paper, we explore different metrics for quantifying the confidence of a CNN in its predictions, specifically on the retrieval problem of word spotting. ...
We investigate four different approaches that are either based on the network's attribute estimations or make use of a surrogate model. ...
Introduction: Word spotting is a powerful tool for exploring handwritten document collections. ...
arXiv:1903.10930v1
fatcat:3fmpjfx4wbh5dkyztuyitflxhq
Clustering Chinese Product Features with Multilevel Similarity
[chapter]
2015
Lecture Notes in Computer Science
To handle different levels of connections between co-referred product features, we consider three similarity measures, namely the literal similarity, the word embedding-based semantic similarity and the ...
This paper presents an unsupervised hierarchical clustering approach for grouping co-referred features in Chinese product reviews. ...
To approach this, we explore three levels of similarities, namely the literal similarity, the semantic similarity based on word embeddings and the contextual similarity based on explanatory evaluations ...
doi:10.1007/978-3-319-25816-4_28
fatcat:dtxb4q5xmbc4vokvk2o6ixve3u
Word2Box: Learning Word Representation Using Box Embeddings
[article]
2021
arXiv
pre-print
Learning vector representations for words is one of the most fundamental topics in NLP, capable of capturing syntactic and semantic relationships useful in a variety of downstream NLP tasks. ...
We demonstrate improved performance on various word similarity tasks, particularly on less common words, and perform a qualitative analysis exploring the additional unique expressivity provided by Word2Box ...
In this work, we introduce WORD2BOX, a region-based embedding for words where each word is represented by an n-dimensional hyperrectangle or "box". ...
arXiv:2106.14361v1
fatcat:lzpcx7qgzrhqxf2plmin6jgray
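The region-based representation this record describes — each word as an n-dimensional hyperrectangle — can be sketched with a hard intersection volume. Note this is a toy: actual Word2Box learns box parameters with a smoothed, differentiable intersection, and the normalization by the smaller box's volume here is just one illustrative scoring choice.

```python
import numpy as np

def box_volume(lo, hi):
    # Volume of an axis-aligned box; zero if any side is empty.
    sides = np.clip(hi - lo, 0.0, None)
    return float(np.prod(sides))

def box_intersection_score(box_a, box_b):
    """Similarity of two words as the volume of the intersection of
    their boxes, normalized by the smaller box's volume."""
    lo = np.maximum(box_a[0], box_b[0])
    hi = np.minimum(box_a[1], box_b[1])
    inter = box_volume(lo, hi)
    smaller = min(box_volume(*box_a), box_volume(*box_b))
    return inter / smaller if smaller > 0 else 0.0

# Toy 2-d boxes (learned boxes would live in a higher-dimensional space).
bank = (np.array([0.0, 0.0]), np.array([2.0, 2.0]))
river = (np.array([1.0, 1.0]), np.array([3.0, 3.0]))
piano = (np.array([5.0, 5.0]), np.array([6.0, 6.0]))

s_related = box_intersection_score(bank, river)    # overlapping regions
s_unrelated = box_intersection_score(bank, piano)  # disjoint regions
```

Unlike a point embedding, a box can partially contain another, which is what gives the representation its extra expressivity for asymmetric relations.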
Multimodal Word Distributions
[article]
2019
arXiv
pre-print
Word embeddings provide point representations of words containing useful semantic information. ...
We introduce multimodal word distributions formed from Gaussian mixtures, for multiple word meanings, entailment, and rich uncertainty information. ...
Acknowledgements We thank NSF IIS-1563887 for support. ...
arXiv:1704.08424v2
fatcat:xtuohjrfyrhq7nzsgrriztz4hm
Multimodal Word Distributions
2017
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Word embeddings provide point representations of words containing useful semantic information. ...
We introduce multimodal word distributions formed from Gaussian mixtures, for multiple word meanings, entailment, and rich uncertainty information. ...
Acknowledgements We thank NSF IIS-1563887 for support. ...
doi:10.18653/v1/p17-1151
dblp:conf/acl/AthiwaratkunW17
fatcat:42me3nofqra35bab5pluqqoatu
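The Gaussian-mixture representation in the two records above can be sketched with the expected likelihood kernel between mixtures: the weighted sum of pairwise Gaussian inner products. This toy uses spherical covariances in 2-d and invented mixtures; the actual model learns means, variances, and weights from corpora.

```python
import numpy as np

def gaussian_overlap(mu1, var1, mu2, var2):
    """Inner product of two spherical Gaussians,
    N(mu1; mu2, (var1 + var2) I)."""
    d = mu1.size
    v = var1 + var2
    diff = mu1 - mu2
    return float(np.exp(-0.5 * diff @ diff / v) / (2 * np.pi * v) ** (d / 2))

def mixture_similarity(word_a, word_b):
    """Expected likelihood kernel between two Gaussian-mixture word
    representations: pairwise component overlaps weighted by the
    mixture probabilities."""
    total = 0.0
    for p, mu_i, var_i in word_a:
        for q, mu_j, var_j in word_b:
            total += p * q * gaussian_overlap(mu_i, var_i, mu_j, var_j)
    return total

# Toy mixtures: 'rock' has two senses (music, stone); 'stone' and
# 'jazz' each overlap only one of them.
rock = [(0.5, np.array([0.0, 0.0]), 0.5), (0.5, np.array([4.0, 4.0]), 0.5)]
stone = [(1.0, np.array([4.2, 4.0]), 0.5)]
jazz = [(1.0, np.array([0.1, 0.0]), 0.5)]

sim_rock_stone = mixture_similarity(rock, stone)
sim_rock_jazz = mixture_similarity(rock, jazz)
```

A mixture component per word sense is what lets one vectorial object stay close to both 'stone' and 'jazz' at once, while the component variances carry the uncertainty information the abstract mentions.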
Fluent: An AI Augmented Writing Tool for People who Stutter
[article]
2021
arXiv
pre-print
Fluent embodies a novel active-learning-based method of identifying words an individual might struggle to pronounce. Such words are highlighted in the interface. ...
On hovering over any such word, Fluent presents a set of alternative words which have similar meaning but are easier to speak. The user is free to accept or ignore these suggestions. ...
Phonetic embeddings map each word to its corresponding vector representation based on the constituting phonemes. Words with similar pronunciation will be closer to each other in the embedding space. ...
arXiv:2108.09918v1
fatcat:4dfrbmkphnhw5ermsmn4zkrs3a
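The phoneme-based embedding idea in this record can be illustrated with a deliberately crude stand-in: a bag-of-phonemes vector, under which words sharing phonemes score higher cosine similarity. The phoneme inventory and word transcriptions below are tiny invented examples; a real system would use a learned embedding over full ARPAbet-style transcriptions.

```python
import numpy as np

PHONEMES = ["AH", "B", "K", "L", "S", "T"]  # tiny illustrative inventory

def phonetic_embedding(phones):
    """Bag-of-phonemes count vector: a crude stand-in for a learned
    phoneme-based word embedding."""
    vec = np.zeros(len(PHONEMES))
    for p in phones:
        vec[PHONEMES.index(p)] += 1.0
    return vec

def pron_similarity(a, b):
    # Cosine similarity of two phonetic embeddings: higher means
    # more similar pronunciation.
    va, vb = phonetic_embedding(a), phonetic_embedding(b)
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

# Words with shared phonemes land closer in the space.
cat = ["K", "AH", "T"]
cut = ["K", "AH", "T"]   # same phoneme bag in this toy transcription
bus = ["B", "AH", "S"]
```

Combining such pronunciation distances with a semantic embedding is what lets a tool rank alternatives that are both close in meaning and easier to speak.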
Detecting Emerging Symptoms of COVID-19 using Context-based Twitter Embeddings
[article]
2020
arXiv
pre-print
In this paper, we present an iterative graph-based approach for the detection of symptoms of COVID-19, the pathology of which seems to be evolving. ...
More generally, the method can be applied to finding context-specific words and texts (e.g. symptom mentions) in large imbalanced corpora (e.g. all tweets mentioning #COVID-19). ...
If all words for a given depth are explored, the top m words corresponding to that depth are selected based on similarity to CEmb. ...
arXiv:2011.03983v1
fatcat:dg6dxblwyvagxhxdmolw4eidoy
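The top-m selection step quoted in this record can be sketched directly: score each candidate word by cosine similarity to a context centroid embedding (CEmb) and keep the m best. The 3-d embeddings and candidate words below are invented for illustration, not data from the paper.

```python
import numpy as np

def top_m_by_centroid(candidates, cemb, m):
    """Keep the m candidate words whose embeddings are most similar
    (cosine) to the context centroid embedding CEmb."""
    def cos(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
    ranked = sorted(candidates.items(), key=lambda kv: -cos(kv[1], cemb))
    return [word for word, _ in ranked[:m]]

# Toy 3-d embeddings: 'fever' and 'cough' point along the symptom
# centroid direction, 'lockdown' does not.
cemb = np.array([1.0, 0.0, 0.0])
cands = {
    "fever": np.array([0.9, 0.1, 0.0]),
    "cough": np.array([0.8, 0.2, 0.1]),
    "lockdown": np.array([0.0, 1.0, 0.2]),
}
selected = top_m_by_centroid(cands, cemb, m=2)
```

Iterating this selection depth by depth, with the kept words reseeding the graph expansion, is what makes the method usable on large imbalanced corpora.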
Out-of-Distribution Detection using Multiple Semantic Label Representations
[article]
2019
arXiv
pre-print
Deep Neural Networks are powerful models that attained remarkable results on a variety of tasks. ...
However, it is not clear how a network will act when it is fed with an out-of-distribution example. In this work, we consider the problem of out-of-distribution detection in neural networks. ...
Our model is based on word embedding representations. ...
arXiv:1808.06664v3
fatcat:zwtomwj54vbsfifzdljb7ph67i
Simultaneous Learning of Pivots and Representations for Cross-Domain Sentiment Classification
2020
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)
A series of approaches depend on the pivot features that behave similarly for polarity prediction in both domains. ...
Cross-domain sentiment classification aims to leverage useful knowledge from a source domain to mitigate the supervision sparsity in a target domain. ...
Acknowledgments This work was supported by the National Key R&D Program of China (2017YFC1502003) and Natural Science Foundation of China (61772299 and 71690231). ...
doi:10.1609/aaai.v34i05.6336
fatcat:3wbotv24w5gs5oeucrowtbspju
Frequency discrimination in budgerigars (Melopsittacus undulatus): Effects of tone duration and tonal context
2000
Journal of the Acoustical Society of America
FDLs in budgerigars for 20-ms tones embedded in a sequence of six other tones were similar to FDLs measured for tones of the same frequency presented in isolation. ...
Moreover, there was no effect of introducing trial-by-trial variation in the location of the frequency change in the seven-tone complexes for budgerigars, a condition for which humans showed a large decrement ...
In experiment 3, we explored the effects of a surrounding tonal context on discrimination of frequency change in a (24-ms) tone burst. ...
doi:10.1121/1.428651
pmid:10830387
fatcat:lr3hntbnmzccfegvegr2fhx46m
Analyzing the Role of Model Uncertainty for Electronic Health Records
[article]
2019
arXiv
pre-print
In light of this, we investigate the role of model uncertainty methods in the medical domain. ...
Meanwhile, the presence of significant variability in patient-specific predictions and optimal decisions motivates the need for capturing model uncertainty. ...
For this analysis, we focus on the free-text clinical notes found in the EHR. For each word in the notes vocabulary, we have an associated embeddings distribution formulated as a multivariate Normal. ...
arXiv:1906.03842v2
fatcat:qlaxzgwl6raitbgqt7l6jhg4rm
Showing results 1 — 15 out of 46,848 results