1,850 Hits in 4.2 sec

Query Focused Multi-document Summarisation of Biomedical Texts [article]

Diego Molla, Christopher Jones, Vincent Nguyen
2020 arXiv   pre-print
Our overall framework implements Query focused multi-document extractive summarisation by applying either a classification or a regression layer to the candidate sentence embeddings and to the comparison  ...  We observe the best results when BERT is used to obtain the word embeddings, followed by an LSTM layer to obtain sentence embeddings.  ...  Acknowledgements Research by Vincent Nguyen is supported by the Australian Research Training Program and the CSIRO Postgraduate Scholarship.  ... 
arXiv:2008.11986v1 fatcat:2byseq2senhc7ezlojpot3z2bq
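
A minimal sketch of the kind of pipeline this result describes (BERT token embeddings pooled by an LSTM into a sentence embedding, followed by a classification or regression layer); the model name, dimensions, and single-label head are illustrative assumptions, not the authors' exact configuration.

# Sketch only: BERT token embeddings -> LSTM -> sentence embedding -> classifier.
# Model name, hidden sizes and the classification head are assumptions for illustration.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class SentenceClassifier(nn.Module):
    def __init__(self, bert_name="bert-base-uncased", hidden=256, n_classes=2):
        super().__init__()
        self.bert = AutoModel.from_pretrained(bert_name)
        self.lstm = nn.LSTM(self.bert.config.hidden_size, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)  # swap for nn.Linear(hidden, 1) to get a regression layer

    def forward(self, input_ids, attention_mask):
        token_embeds = self.bert(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        _, (h_n, _) = self.lstm(token_embeds)     # final LSTM state as the sentence embedding
        return self.head(h_n[-1])

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = tokenizer(["Is aspirin effective for headaches?"], return_tensors="pt")
logits = SentenceClassifier()(batch["input_ids"], batch["attention_mask"])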

On the impressive performance of randomly weighted encoders in summarization tasks [article]

Jonathan Pilault, Jaehong Park, Christopher Pal
2020 arXiv   pre-print
We hypothesize that random projections of an input text have enough representational power to encode the hierarchical structure of sentences and semantics of documents.  ...  Using a trained decoder to produce abstractive text summaries, we empirically demonstrate that architectures with untrained randomly initialized encoders perform competitively with respect to the equivalent  ...  This work was partially supported by the IVADO Excellence Scholarship and the Canada First Research Excellence Fund.  ... 
arXiv:2002.09084v1 fatcat:kra4gsrabjhxvcmbqi2mmnbzdi
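
A toy illustration of the claim in this result, assuming a randomly initialized (and never trained) bidirectional LSTM used frozen to produce sentence embeddings; the vocabulary size, dimensions, and mean pooling are arbitrary choices.

# Sketch: an untrained, randomly initialized encoder used as a frozen sentence embedder.
# Vocabulary, sizes and pooling are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
embed = nn.Embedding(num_embeddings=30_000, embedding_dim=300)
encoder = nn.LSTM(300, 512, batch_first=True, bidirectional=True)
for p in list(embed.parameters()) + list(encoder.parameters()):
    p.requires_grad_(False)  # never trained: the random projections stay fixed

token_ids = torch.randint(0, 30_000, (1, 20))    # one sentence of 20 token ids
outputs, _ = encoder(embed(token_ids))
sentence_embedding = outputs.mean(dim=1)         # mean-pool to a fixed-size vector
print(sentence_embedding.shape)                  # torch.Size([1, 1024])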

What you can cram into a single vector: Probing sentence embeddings for linguistic properties [article]

Alexis Conneau, German Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni
2018 arXiv   pre-print
"Downstream" tasks, often based on sentence classification, are commonly used to evaluate the quality of sentence representations.  ...  We introduce here 10 probing tasks designed to capture simple linguistic features of sentences, and we use them to study embeddings generated by three different encoders trained in eight distinct ways,  ...  Classification performed by a MLP with sigmoid nonlinearity, taking pre-learned sentence embeddings as input (see Appendix for details and logistic regression results).  ... 
arXiv:1805.01070v2 fatcat:dfhb2n4vojczrg7kziykkmguze
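
The probing setup quoted above (an MLP with sigmoid nonlinearity on top of pre-learned sentence embeddings) can be sketched roughly as follows; the embedding dimension, hidden size, and number of classes are assumptions.

# Sketch of a probing classifier: a small MLP with a sigmoid nonlinearity,
# trained on frozen, pre-learned sentence embeddings (dimensions are assumed).
import torch.nn as nn

probe = nn.Sequential(
    nn.Linear(4096, 200),   # 4096-d sentence embedding -> hidden layer
    nn.Sigmoid(),           # sigmoid nonlinearity, as in the quoted setup
    nn.Linear(200, 6),      # e.g. 6 classes for a probing task such as sentence-length bins
)
# The sentence encoder stays frozen; only the probe parameters are updated with a
# standard cross-entropy loss on (sentence_embedding, probing_label) pairs.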

More Than Words: Towards Better Quality Interpretations of Text Classifiers [article]

Muhammad Bilal Zafar, Philipp Schmidt, Michele Donini, Cédric Archambeau, Felix Biessmann, Sanjiv Ranjan Das, Krishnaram Kenthapadi
2021 arXiv   pre-print
These issues have led to the adoption of methods like SHAP and Integrated Gradients to explain classification decisions by assigning importance scores to input tokens.  ...  The large size and complex decision mechanisms of state-of-the-art text classifiers make it difficult for humans to understand their predictions, leading to a potential lack of trust by the users.  ...  up new benchmarks for text classification.  ... 
arXiv:2112.12444v1 fatcat:p7glpcebxfbufnscwtfomrwqiq
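
As a rough illustration of one attribution method named in this result, here is a bare-bones Integrated Gradients approximation for token importance scores; the assumption that the classifier accepts an `inputs_embeds` argument and returns `.logits` follows the HuggingFace interface, not necessarily the authors' setup.

# Sketch: Riemann-sum approximation of Integrated Gradients for token importance.
# Assumes model(inputs_embeds=...) returns an object with .logits, as HuggingFace classifiers do.
import torch

def integrated_gradients(model, input_embeds, baseline_embeds, target_class, steps=50):
    accumulated = torch.zeros_like(input_embeds)
    for alpha in torch.linspace(0.0, 1.0, steps):
        point = (baseline_embeds + alpha * (input_embeds - baseline_embeds)).detach()
        point.requires_grad_(True)
        score = model(inputs_embeds=point).logits[0, target_class]
        accumulated += torch.autograd.grad(score, point)[0]
    # Scale average gradients by the input-baseline difference, then collapse the
    # embedding dimension to get one importance score per token.
    return ((input_embeds - baseline_embeds) * accumulated / steps).sum(dim=-1)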

Cross-topic Argument Mining from Heterogeneous Sources Using Attention-based Neural Networks [article]

Christian Stab and Tristan Miller and Iryna Gurevych
2018 arXiv   pre-print
In this paper, we propose a new sentential annotation scheme that is reliably applicable by crowd workers to arbitrary Web texts.  ...  Despite its usefulness for this task, most current approaches to argument mining are designed for use only with specific text types and fall short when applied to heterogeneous texts.  ...  applied by untrained annotators.  ... 
arXiv:1802.05758v1 fatcat:wg7joeh2tndonirmmiwupm23iu

What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties

Alexis Conneau, German Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni
2018 Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)  
"Downstream" tasks, often based on sentence classification, are commonly used to evaluate the quality of sentence representations.  ...  We introduce here 10 probing tasks designed to capture simple linguistic features of sentences, and we use them to study embeddings generated by three different encoders trained in eight distinct ways,  ...  Classification performed by a MLP with sigmoid nonlinearity, taking pre-learned sentence embeddings as input (see Appendix for details and logistic regression results).  ... 
doi:10.18653/v1/p18-1198 dblp:conf/acl/BaroniBLKC18 fatcat:tqjfd266snfyngrftmmvse2qce

Term Extraction via Neural Sequence Labeling a Comparative Evaluation of Strategies Using Recurrent Neural Networks

Maren Kucza, Jan Niehues, Thomas Zenkel, Alex Waibel, Sebastian Stüker
2018 Interspeech 2018  
To do so we have worked with different kinds of recurrent neural networks and word embeddings.  ...  describe how one can build state-of-the-art term extraction systems with this single-stage technique and compare different network types and topologies and also examine the influence of the type of input embedding  ...  These embeddings are usually trained by building a classifier for an auxiliary classification task, such as skip-grams [16].  ...
doi:10.21437/interspeech.2018-2017 dblp:conf/interspeech/KuczaNZWS18 fatcat:7lucz7dbvvfqvh6n6derbvs2aq
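
A compressed sketch of the single-stage sequence-labeling idea in this result: a recurrent network over word embeddings emits one label per token (e.g. a BIO tag marking term spans). The tag set and dimensions are illustrative assumptions.

# Sketch: term extraction as neural sequence labeling with a bidirectional LSTM.
# Embedding source (e.g. pre-trained skip-gram vectors) and sizes are assumptions.
import torch
import torch.nn as nn

class TermTagger(nn.Module):
    def __init__(self, vocab_size=50_000, emb_dim=300, hidden=200, n_tags=3):  # B / I / O
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)     # could be initialized from skip-gram vectors
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.tagger = nn.Linear(2 * hidden, n_tags)        # one label per token

    def forward(self, token_ids):
        hidden_states, _ = self.rnn(self.embed(token_ids))
        return self.tagger(hidden_states)                  # (batch, seq_len, n_tags)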

Language Modeling Teaches You More Syntax than Translation Does: Lessons Learned Through Auxiliary Task Analysis [article]

Kelly W. Zhang, Samuel R. Bowman
2019 arXiv   pre-print
We make a fair comparison between the tasks by holding constant the quantity and genre of the training data, as well as the LSTM architecture.  ...  each sentence in a running text.  ...  sentence-level classification tasks.  ... 
arXiv:1809.10040v2 fatcat:prdxahnek5dyxnej3kcj2exyii

Discourse Coherence in the Wild: A Dataset, Evaluation and Methods [article]

Alice Lai, Joel Tetreault
2018 arXiv   pre-print
We analyze these performance differences and discuss patterns we observed in low coherence texts in four domains.  ...  To address this, we present a new corpus of real-world texts (GCDC) as well as the first large-scale evaluation of leading discourse coherence algorithms.  ...  Minority Class Classification: One application of a coherence classification system would be to provide feedback to writers by flagging text that is not very coherent.  ...
arXiv:1805.04993v1 fatcat:2heidfd265eihngdqpuwz33nvy

Discourse Coherence in the Wild: A Dataset, Evaluation and Methods

Alice Lai, Joel Tetreault
2018 Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue  
We analyze these performance differences and discuss patterns we observed in low coherence texts in four domains.  ...  To address this, we present a new corpus of real-world texts (GCDC) as well as the first large-scale evaluation of leading discourse coherence algorithms.  ...  Minority Class Classification: One application of a coherence classification system would be to provide feedback to writers by flagging text that is not very coherent.  ...
doi:10.18653/v1/w18-5023 dblp:conf/sigdial/LaiT18 fatcat:etcfk3hs5vfozc62ul557gmthm

Transformer over Pre-trained Transformer for Neural Text Segmentation with Enhanced Topic Coherence [article]

Kelvin Lo, Yuan Jin, Weicong Tan, Ming Liu, Lan Du, Wray Buntine
2021 arXiv   pre-print
Given the sentence embeddings, the upper-level transformer is trained to recover the segmentation boundaries as well as the topic labels of each sentence.  ...  It consists of two components: bottom-level sentence encoders using pre-trained transformers, and an upper-level transformer-based segmentation model based on the sentence embeddings.  ...  Sentence Classification at the Upper Level Once the sentence embeddings are obtained, we train a transformer model at the upper level of the architecture to classify 1) whether each sentence is the segment  ... 
arXiv:2110.07160v1 fatcat:hmrqbe4i3ncg7nxgkaplnrhbqq
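
A rough sketch of the two-level architecture described above: a pre-trained transformer produces one embedding per sentence, and an upper-level transformer over the sentence sequence predicts, per sentence, whether it opens a new segment. The [CLS] pooling, layer counts, and boundary head are assumptions.

# Sketch of a two-level segmenter: sentence embeddings from a pre-trained encoder,
# then an upper-level Transformer with a per-sentence boundary classifier.
# The [CLS] pooling, layer counts and head sizes are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

encoder = AutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def embed_sentences(sentences):
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    return encoder(**batch).last_hidden_state[:, 0]         # [CLS] vector per sentence

upper = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=768, nhead=8, batch_first=True), num_layers=2)
boundary_head = nn.Linear(768, 2)                            # boundary vs. non-boundary per sentence

sent_embeds = embed_sentences(["First sentence.", "Second sentence.", "A new topic starts here."])
logits = boundary_head(upper(sent_embeds.unsqueeze(0)))      # (1, n_sentences, 2)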

Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models [article]

Jize Cao, Zhe Gan, Yu Cheng, Licheng Yu, Yen-Chun Chen, Jingjing Liu
2020 arXiv   pre-print
Models such as ViLBERT, LXMERT and UNITER have significantly lifted state of the art across a wide range of V+L benchmarks with joint image-text pre-training.  ...  decipher the inner workings of multimodal pre-training (e.g., the implicit knowledge garnered in individual attention heads, the inherent cross-modal alignment learned through contextualized multimodal embeddings  ...  Inspired by BERT [9] , a common practice for pre-training V+L models is to first encode image regions and sentence words into a common embedding space, then use multiple Transformer layers to learn image-text  ... 
arXiv:2005.07310v2 fatcat:bzpkniisubggzgtxlskstyh47a
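
The "common practice" quoted above (project image-region features and word embeddings into one space, then run joint Transformer layers) can be sketched roughly like this; the region-feature dimension, vocabulary size, and layer counts are assumptions.

# Sketch: joint image-text encoding in a shared embedding space plus Transformer layers.
# The 2048-d region features (e.g. from an object detector), text vocabulary and layer
# counts are illustrative assumptions.
import torch
import torch.nn as nn

d_model = 768
region_proj = nn.Linear(2048, d_model)          # image-region features -> common space
word_embed = nn.Embedding(30_000, d_model)      # word ids -> common space
joint_encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d_model, nhead=12, batch_first=True), num_layers=4)

regions = torch.randn(1, 36, 2048)              # 36 detected regions for one image
token_ids = torch.randint(0, 30_000, (1, 16))   # one 16-token caption
joint_input = torch.cat([region_proj(regions), word_embed(token_ids)], dim=1)
contextualized = joint_encoder(joint_input)     # (1, 36 + 16, 768) cross-modal embeddings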

BERT-Based Sentiment Analysis: A Software Engineering Perspective [article]

Himanshu Batra, Narinder Singh Punn, Sanjay Kumar Sonbhadra, Sonali Agarwal
2021 arXiv   pre-print
., BERT, RoBERTa, ALBERT, etc.) have displayed better results in the text classification task.  ...  Following this context, the present research explores different BERT-based models to analyze the sentences in GitHub comments, Jira comments, and Stack Overflow posts.  ...  Pagliardini, M., Gupta, P., Jaggi, M.: Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features.  ... 
arXiv:2106.02581v3 fatcat:grp55d3j3zbf3pg7ri2urechuu
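
For orientation, a minimal way to run a BERT-family classifier over developer comments is the HuggingFace pipeline API; the checkpoint below is an illustrative assumption, not one of the models evaluated in this result.

# Sketch: sentiment classification of software-engineering texts with a BERT-family model.
# The checkpoint name is an assumed, generic choice; the paper compares several variants.
from transformers import pipeline

classifier = pipeline("sentiment-analysis",
                      model="distilbert-base-uncased-finetuned-sst-2-english")
comments = [
    "This fix finally closes the memory leak, great work!",
    "The build has been broken for two days and nobody cares.",
]
for comment, result in zip(comments, classifier(comments)):
    print(result["label"], round(result["score"], 3), comment)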

Text Ranking and Classification using Data Compression [article]

Nitya Kasturi, Igor L. Markov
2021 arXiv   pre-print
Text affinity scores derived from compressed sizes can be used for classification and ranking tasks, but their success depends on the compression tools used.  ...  A well-known but rarely used approach to text categorization uses conditional entropy estimates computed using data compression tools.  ...  Halford, "Text classification by data compression," 2021, https://maxhalford.github.  ... 
arXiv:2109.11577v2 fatcat:l2ffdj6pyfdozbav23plhs6fri
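
The compression-based affinity idea in this result is easy to sketch: a document is scored against each class by how many extra bytes it costs to compress after that class's training text, and fewer extra bytes means higher affinity. The use of zlib and the tiny example corpora are assumptions.

# Sketch: compression-based text classification. A candidate text is assigned to the
# class whose concatenated training text compresses it with the least extra bytes,
# a simple stand-in for a conditional-entropy estimate. zlib is an assumed tool choice.
import zlib

def affinity(class_text: str, candidate: str) -> int:
    base = len(zlib.compress(class_text.encode()))
    joint = len(zlib.compress((class_text + " " + candidate).encode()))
    return joint - base            # fewer extra bytes = closer to this class

classes = {
    "sports": "the team won the match after a late goal in the second half",
    "finance": "the stock fell after the quarterly earnings report missed estimates",
}
candidate = "shares dropped when the company reported weak earnings"
print(min(classes, key=lambda c: affinity(classes[c], candidate)))  # likely "finance", given the shared vocabulary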

Fine-grained Sentiment Analysis with Faithful Attention [article]

Ruiqi Zhong, Steven Shao, Kathleen McKeown
2019 arXiv   pre-print
While the general task of textual sentiment classification has been widely studied, much less research looks specifically at sentiment between a specified source and target.  ...  Thus, we directly trained the model's attention with human rationales and improved our model performance by a robust 4~8 points on all tasks we defined on our data sets.  ...  The final sentence representation used for classification is then a weighted sum, $z = \sum_{i=1}^{n} \hat{A}(i)\,h_i$, to be fed into a fully connected layer with softmax activation for classification; i.e. $y = \mathrm{softmax}$  ...
arXiv:1908.06870v1 fatcat:4whgfjuw25codjv3ttpd27rhrq
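
The reconstructed formula above (a weighted sum of hidden states fed through a fully connected softmax layer) corresponds to only a few lines of code; the dimensions and the random attention scores are assumptions for illustration.

# Sketch of the quoted classification head: attention weights A_hat over hidden states h_i,
# a weighted sum z, then a fully connected layer with softmax. Dimensions are assumptions;
# in the paper the attention is trained against human rationales, random here for illustration.
import torch
import torch.nn.functional as F

n, d, n_classes = 12, 256, 3
h = torch.randn(n, d)                  # hidden states h_1 .. h_n for one sentence
attn_scores = torch.randn(n)           # unnormalized attention scores
A_hat = F.softmax(attn_scores, dim=0)  # attention distribution over tokens
z = (A_hat.unsqueeze(1) * h).sum(dim=0)            # z = sum_i A_hat(i) * h_i
fc = torch.nn.Linear(d, n_classes)
y = F.softmax(fc(z), dim=-1)           # class probabilities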
Showing results 1 — 15 out of 1,850 results