Word Embeddings for Information Extraction from Tweets

Surajit Dasgupta, Abhash Kumar, Dipankar Das, Sudip Kumar Naskar, Sivaji Bandyopadhyay
2016 Forum for Information Retrieval Evaluation  
Our method uses vector space word embeddings to extract information from microblogs (tweets) related to disaster scenarios, and can be replicated across various domains.  ...  This paper describes our approach on "Information Extraction from Microblogs Posted during Disasters" as an attempt in the shared task of the Microblog Track at Forum for Information Retrieval Evaluation  ...  To extract the location information, we used the geo-location attribute from the tweets and the Stanford NER tagger 6 to extract location names from the tweet text.  ... 
dblp:conf/fire/DasguptaKDNB16 fatcat:ooojppcgk5f55l4zzwcj65wcly

AMRITA_CEN@FIRE 2016: Code-Mix Entity Extraction for Hindi-English and Tamil-English Tweets

Remmiya Devi G, Veena P. V, M. Anand Kumar, Soman K. P
2016 Forum for Information Retrieval Evaluation  
Extraction of such information serves as the basis for the most preliminary task in Natural Language Processing called Entity extraction.  ...  The work is submitted as a part of Shared task on Code Mix Entity Extraction for Indian Languages (CMEE-IL) at Forum for Information Retrieval Evaluation (FIRE) 2016.  ...  ACKNOWLEDGMENT We would like to thank organizers of Forum for Information Retrieval Evaluation 2016 for organizing the task. We would also like to thank the organizers of the CMEE-IL task.  ... 
dblp:conf/fire/GVMP16 fatcat:gccds65b3jc5hfswpbhp4pj7ru

Semantic Wide and Deep Learning for Detecting Crisis-Information Categories on Social Media [chapter]

Grégoire Burel, Hassan Saif, Harith Alani
2017 Lecture Notes in Computer Science  
Automatically identifying the category of information (e.g., reports on affected individuals, donations and volunteers) contained in these posts is vital for their efficient handling and consumption by  ...  In this paper, we introduce Sem-CNN; a wide and deep Convolutional Neural Network (CNN) model designed for identifying the category of information contained in crisis-related social media content.  ...  embeddings and semantic embeddings initialised from extracted concepts.  ... 
doi:10.1007/978-3-319-68288-4_9 fatcat:ntetx4pzizgolpp6sskimipjve

CEN@Amrita FIRE 2016: Context based Character Embeddings for Entity Extraction in Code-Mixed Text

Srinidhi Skanda V, Shivkaran Singh, Remmiya Devi G, Veena P. V, M. Anand Kumar, Soman K. P
2016 Forum for Information Retrieval Evaluation  
The tweets in code mix are written in English mixed with Hindi or Tamil. In this work, Entity Extraction system is implemented for both Hindi-English and Tamil-English code-mix tweets.  ...  These words were further split into characters. Embedding vectors of these characters are appended with the I-O-B tags and used for training the system.  ...  ACKNOWLEDGMENT We would like to give thanks to the task organizer, Forum for Information Retrieval Evaluation. We also thank organizers of CMEE-IL task.  ... 
dblp:conf/fire/VSGVMP16 fatcat:gdjynp5l5bf3hbl7jrtm2bsua4

Detecting Arabic Offensive Language in Microblogs Using Domain-Specific Word Embeddings and Deep Learning

Khulood O. Aljuhani, Khaled H. Alyoubi, Fahd S. Alotaibi
2022 Tehnički glasnik  
This paper proposes a deep learning approach that utilizes the bidirectional long short-term memory (BiLSTM) model and domain-specific word embeddings extracted from an Arabic offensive dataset.  ...  The results showed the highest performance accuracy of 0.93% with the BiLSTM model trained using a combination of domain-specific and agnostic-domain word embeddings.  ...  word embeddings extracted from an Arabic offensive corpus.  ... 
doaj:a64f4b001e0246babbfb91c74079767f fatcat:yszynytvuneb7pez4yts7p4g4y

Keyphrase Extraction from Disaster-related Tweets

Jishnu Ray Chowdhury, Cornelia Caragea, Doina Caragea
2019 The World Wide Web Conference on - WWW '19  
While keyphrase extraction has received considerable attention in recent years, relatively few studies exist on extracting keyphrases from social media platforms such as Twitter, and even fewer for extracting  ...  Previously, joint training of two different layers of a stacked Recurrent Neural Network for keyword discovery and keyphrase extraction had been shown to be effective in extracting keyphrases from general  ...  We also thank NSF for support from the grants IIS-1526542, IIS-1423337, IIS-1652674, and CMMI-1541155.  ... 
doi:10.1145/3308558.3313696 dblp:conf/www/ChowdhuryCC19 fatcat:fgir6ach3bcbhkz7lyvuj7hvxa

On Identifying Hashtags in Disaster Twitter Data [article]

Jishnu Ray Chowdhury, Cornelia Caragea, Doina Caragea
2020 arXiv   pre-print
Tweet hashtags have the potential to improve the search for information during disaster events.  ...  To facilitate progress on automatic identification (or extraction) of disaster hashtags for Twitter data, we construct a unique dataset of disaster-related tweets annotated with hashtags useful for filtering  ...  We also thank NSF for support from the grants IIS-1526542, IIS-1423337, IIS-1652674, and CMMI-1541155.  ... 
arXiv:2001.01323v1 fatcat:h5erzrtwpbdxno52yijjuy6gkq

SIEVE: Helping developers sift wheat from chaff via cross-platform analysis

Agus Sulistya, Gede Artha Azriadi Prana, Abhishek Sharma, David Lo, Christoph Treude
2019 Empirical Software Engineering  
Our approach is based on transfer representation learning and word embeddings, leveraging information extracted from a source platform which contains rich domain-related content.  ...  We first build a word embeddings model as a representation learned from the source platform, and use the model to improve the performance of knowledge extraction tasks in the target platform.  ...  For each sentence or tweet in the dataset, we tokenize it into words. 2. For each word, we look up its weight from the word embeddings model.  ... 
doi:10.1007/s10664-019-09775-w fatcat:mss5xk2givcjle4av2qmrok5fu

SIEVE: Helping Developers Sift Wheat from Chaff via Cross-Platform Analysis [article]

Agus Sulistya, Gede Artha Azriadi Prana, Abhishek Sharma, David Lo, Christoph Treude
2018 arXiv   pre-print
Our approach is based on transfer representation learning and word embeddings, leveraging information extracted from a source platform which contains rich domain-related content.  ...  We first build a word embeddings model as a representation learned from the source platform, and use the model to improve the performance of knowledge extraction tasks in the target platform.  ...  For each sentence or tweet in the dataset, we tokenize it into words. 2. For each word, we look up its weight from the word embeddings model.  ... 
arXiv:1810.13144v1 fatcat:nvkrpmtq4zd4zhwn57nccctr74

ECNU at SemEval-2017 Task 4: Evaluating Effective Features on Machine Learning Methods for Twitter Message Polarity Classification

Yunxiao Zhou, Man Lan, Yuanbin Wu
2017 Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)  
We investigated several traditional Natural Language Processing (NLP) features, domain specific features and word embedding features together with supervised machine learning methods to address this task  ...  Software for Internet of Things (ZF1213) and NSFC (61402175).  ...  Acknowledgements This research is supported by grants from Science and Technology Commission of Shanghai Municipality (14DZ2260800 and 15ZR1410700), Shanghai Collaborative Innovation Center of Trustworthy  ... 
doi:10.18653/v1/s17-2137 dblp:conf/semeval/ZhouLW17 fatcat:6dgemujijrhxlkiwilsvvph3kq

On Identifying Hashtags in Disaster Twitter Data

Jishnu Ray Chowdhury, Cornelia Caragea, Doina Caragea
2020 Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence and the Twenty-Eighth Innovative Applications of Artificial Intelligence Conference  
Tweet hashtags have the potential to improve the search for information during disaster events.  ...  To facilitate progress on automatic identification (or extraction) of disaster hashtags for Twitter data, we construct a unique dataset of disaster-related tweets annotated with hashtags useful for filtering  ...  We also thank NSF for support from the grants IIS-1526542, IIS-1423337, IIS-1652674, and CMMI-1541155.  ... 
doi:10.1609/aaai.v34i01.5387 fatcat:y7ntuyaw5naepd2y6c5dh4ly5a

Predicting Information Diffusion on Twitter a Deep Learning Neural Network Model Using Custom Weighted Word Features [chapter]

Amit Kumar Kushwaha, Arpan Kumar Kar, P. Vigneswara Ilavarasan
2020 Lecture Notes in Computer Science  
Our framework first extracts the words, then creates a matrix of these words using the sequences in the tweet text.  ...  The results of the proposed CWWE are compared to a pre-trained GloVe word embedding. For experimentation, we created a corpus of size 230,000 tweets posted by more than 45,000 users in 6 months.  ...  Once every tweet loops through the above 3 steps we extract words from all the tweets and create a matrix with all the words from the corpus as columns.  ... 
doi:10.1007/978-3-030-44999-5_38 fatcat:ftko2naewfhxfdelj36qnz4jjq

Keyphrase Extraction Using Deep Recurrent Neural Networks on Twitter

Qi Zhang, Yang Wang, Yeyun Gong, Xuanjing Huang
2016 Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing  
from tweets.  ...  Different from previous studies, which are usually focused on automatically extracting keyphrases from documents or articles, in this study, we considered the problem of automatically extracting keyphrases  ...  Acknowledgement The authors wish to thank the anonymous reviewers for their helpful comments.  ... 
doi:10.18653/v1/d16-1080 dblp:conf/emnlp/ZhangWGH16 fatcat:na2oqckrtndg5ka3aj5jqif4na

Tracing State-Level Obesity Prevalence from Sentence Embeddings of Tweets: A Feasibility Study [article]

Xiaoyi Zhang, Rodoniki Athanasiadou, Narges Razavian
2019 arXiv   pre-print
In response, we introduce a deep learning approach that uses hashtags as a form of supervision and learns tweet embeddings for extracting informative textual features.  ...  Previous public health studies based on Twitter data have largely relied on keyword-matching or topic models for clustering relevant tweets.  ...  by extracting features from word and tweet embeddings  ... 
arXiv:1911.11324v2 fatcat:o4ubgnizd5av5h6kg3phakezyy

Surface and Deep Features Ensemble for Sentiment Analysis of Arabic Tweets

Nora Al-Twairesh, Hadeel Al-Negheimish
2019 IEEE Access  
Previous research on the SA of tweets mainly focused on manually extracting features from the text.  ...  In this paper, we propose to learn sentiment-specific word embeddings from Arabic tweets and use them in the Arabic Twitter sentiment classification.  ...  We use these three models to extract sentiment-specific word embeddings from Arabic tweets, as described in Section III.  ... 
doi:10.1109/access.2019.2924314 fatcat:ceyngnnfkvcbdh74urcmnw4qvm