Enriching feature engineering for short text samples by language time series analysis
2020
EPJ Data Science
In this case study, we are extending feature engineering approaches for short text samples by integrating techniques which have been introduced in the context of time series classification and signal processing ...
The resulting language time series can be characterised by collections of established time series feature extraction algorithms from time series analysis and signal processing. ...
Funding This research was supported by the Faculty of Engineering of the University of Auckland.
Availability of data and materials Data are available from [17] and [61]. ...
doi:10.1140/epjds/s13688-020-00244-9
fatcat:aichvkviebgf3ov5oxoujl4qdu
ETM: Enrichment by topic modeling for automated clinical sentence classification to detect patients' disease history
2020
Journal of Intelligent Information Systems
The ETM enriches text representation by incorporating probability distributions generated by an unsupervised algorithm into it. ...
This study proposes the ETM (enrichment by topic modeling) algorithm, based on latent Dirichlet allocation, to smoothen the semantic representations of short sentences. ...
There are two reasons for this: (1) Feature engineering in the Crest and ETM approaches has been proposed especially for the short text classification problem. (2) The trained word vectors are not rich ...
doi:10.1007/s10844-020-00605-w
fatcat:ho7ujxwy3fhijbyiiko66wkw5i
Fault Classification Method for Power Dispatching Log Based on Text Mining
2018
DEStech Transactions on Engineering and Technology Research
In power grid scheduling, as the basis for the dispatcher to record the operation status of the power grid, the scheduling log usually uses short text to record the current state of the power grid, accident ...
After obtaining the TF-IDF feature of the text and creating the Word2Vec feature model, we compare three kinds of text classification algorithms, the nearest neighbor algorithm, Naive Bayes and support ...
Acknowledgement This work was partially supported by the science and technology project of Zhejiang Electric Power Corporation (Grant No. 5211NB160006). ...
doi:10.12783/dtetr/pmsms2018/24915
fatcat:oyxq7wq6wvaltasl6dv5r4xeaq
Research on Sentiment Analysis Model of Short Text Based on Deep Learning
2022
Scientific Programming
neural network features in deep learning and learning the short text by combining shallow learning and deep learning. ...
potential emotional features of short texts. ...
Learning features enrich the textual feature representation of short texts. In the near future, deep learning has also been widely used in sentiment analysis of short texts. ...
doi:10.1155/2022/2681533
fatcat:ndquxxc2sjgj5kig4rea5u7ppu
Integration of Text and Graph-based Features for Detecting Mental Health Disorders from Voice
[article]
2022
arXiv
pre-print
In this paper, two methods are used to enrich voice analysis for depression detection: graph transformation of voice signals, and natural language processing of the transcript based on representational ...
The results of experiments with the DAIC-WOZ dataset suggest that integration of text-based voice classification and learning from low level and graph-based voice signal features can improve the detection ...
Mapping of Signals to Graphs Mapping from a time series to a complex network was proposed by (Lacasa et al., 2008). ...
arXiv:2205.07006v1
fatcat:vlbsimfgondx3lktyywilwvcaa
Capturing non-functional properties through model interlinking
2014
2014 IEEE 27th Canadian Conference on Electrical and Computer Engineering (CCECE)
Then, using texts that are associated with the elements and through a semantics-enabled textual analysis process, the model elements will be semantically annotated with related ontological concepts. ...
Researchers have argued that connecting intentional variability models such as goal models with feature variability models in a target domain can enrich feature models with valuable quality and non-functional ...
Table 1 shows sample supporting texts for "3G" and "WiFi" features, and "WLAN" and "Cellular" tasks. ...
doi:10.1109/ccece.2014.6901063
dblp:conf/ccece/NoorianBD14
fatcat:x7yezxqumvaubhgbxdanmqb34e
Short Text Classification Improved by Feature Space Extension
[article]
2019
arXiv
pre-print
The difference between classifying short text and long documents is that short text is of shortness and sparsity. ...
With the explosive development of mobile Internet, short text has been applied extensively. ...
However, short text has a series of features, such as shortness, sparsity, lack of semantic and contextual information [1-2]. It brings challenges for traditional methods to achieve good performance. ...
arXiv:1904.01313v1
fatcat:k6gfczizc5dfzgns3xfzkocnry
Text Mining for News and Blogs Analysis
[chapter]
2016
Encyclopedia of Machine Learning and Data Mining
models, time series, and stream mining methods. ...
Many mining methods therefore enrich the text by, for example, the contents of referenced URLs (e.g. Abel et al., 2011). ...
doi:10.1007/978-1-4899-7502-7_833-1
fatcat:ywxnluhwdfd6ldwcxjqp4x2ed4
The Paralinguistic Function of Emojis in Twitter Communication
2019
Zenodo
A manual content analysis was then conducted to ascertain the paralinguistic and emotional features of the emojis used in these tweets. ...
We present our characterization of emoji usage in Twitter and discuss implications for the design of Twitter and other text-based communication tools. ...
The paralinguistic features of spoken language are primarily auditory and visual in nature, and verbal text is neither auditory nor visual. ...
doi:10.5281/zenodo.3298638
fatcat:py5ja6b5mfcobdyxdf5tjomaka
A Study of Multilingual Toxic Text Detection Approaches under Imbalanced Sample Distribution
2021
Information
Multilingual characteristics, lack of annotated data, and imbalanced sample distribution are the three main challenges for toxic comment analysis in a multilingual setting. ...
Two models, multilingual bidirectional encoder representation from transformers (MBERT) and XLM-RoBERTa (XLM-R), are employed for pre-training through Masking Language Modeling (MLM) and Translation Language ...
Deep neural networks, on the other hand, can address this challenge by capturing the text semantic information from raw text data, without manual feature engineering and also boost the detection performance ...
doi:10.3390/info12050205
fatcat:q54pa3gxmvcr7o7emv4s3xpvqq
Clustering of semantically enriched short texts
2018
Journal of Intelligent Information Systems
In addition, we test the possibilities of improving the quality of clustering ultra-short texts by means of enriching them semantically. ...
The paper is devoted to the issue of clustering small sets of very short texts. ...
Acknowledgments We would like to thank three anonymous referees for their valuable and constructive comments, which helped us to improve the quality of this article. ...
doi:10.1007/s10844-018-0541-4
fatcat:eipabygtdrdr3ji7wqth6vic4a
Wikipedia-based Semantic Interpretation for Natural Language Processing
2009
The Journal of Artificial Intelligence Research
Here we propose a novel method, called Explicit Semantic Analysis (ESA), for fine-grained semantic interpretation of unrestricted natural language texts. ...
We evaluate the effectiveness of our method on text categorization and on computing the degree of semantic relatedness between fragments of natural language text. ...
Lee and Brandon Pincombe for making available their document similarity data. ...
doi:10.1613/jair.2669
fatcat:mwcky2jqx5e6zimhzgsbh5rffa
Serial Expression Analysis: a web tool for the analysis of serial gene expression data
2010
Nucleic Acids Research
We have created the SEA (Serial Expression Analysis) suite to provide a complete web-based resource for the analysis of serial transcriptomics data. ...
Serial transcriptomics experiments investigate the dynamics of gene expression changes associated with a quantitative variable such as time or dosage. ...
designed for short series. ...
doi:10.1093/nar/gkq488
pmid:20525784
pmcid:PMC2896172
fatcat:k32krdkke5di7o66rlh77lfcmq
Application of Semantic Tagging to Academic Paper Services
2017
The International Journal of Engineering and Science
If detailed explanation on the tagged words can also be viewed at the same time while reading a paper, in addition, readers' convenience and legibility can be improved simultaneously. ...
Among 10 papers in 5 subjects, those with improved keyword matching accounted for 70%, 60%, 50%, 60% and 80% respectively. ...
Ontology engineering in Semantic Web is primarily supported by languages such as RDF, RDFS and OWL [3]. ...
doi:10.9790/1813-0601011216
fatcat:ksl623zprvg3lm3ovktmatkyea
Analysis of the Error Pattern of HMM based Bangla ASR
2020
International Journal of Image Graphics and Signal Processing
Finally, the results are analyzed to get the error pattern needed for future development. ... fast by sending voice and getting the result in the text format. ...
Research on ASR by machine has attracted much attention over the last few decades. Bengali is largely spoken all over the world. ...
ACKNOWLEDGEMENTS The authors would like to thank Ahsanullah University of Science and Technology for supporting this work. ...
doi:10.5815/ijigsp.2020.01.01
fatcat:6yfrohf47fewzmwasgiqqo3xgy