
Johns Hopkins or johnny-hopkins: Classifying Individuals versus Organizations on Twitter

Zach Wood-Doughty, Praateek Mahajan, Mark Dredze
2018 Proceedings of the Second Workshop on Computational Modeling of People's Opinions, Personality, and Emotions in Social Media  
We present a method that relies solely on the account profile, allowing for the classification of individuals versus organizations based on a single tweet.  ...  Previous work presented a method for determining if an account was an individual or organization based on account profile and a collection of tweets.  ...  As it is often easier to collect a large amount of noisy data than a small amount of gold-standard data, such an approach could be widely applicable to studies of Twitter users' emotions and personalities  ... 
doi:10.18653/v1/w18-1108 dblp:conf/acl-peoples/Wood-DoughtyMD18 fatcat:m2lrz7gbqvg5fgpjrhbnldefsq

Large Scale Linguistic Processing of Tweets to Understand Social Interactions among Speakers of Less Resourced Languages: The Basque Case

Joseba Fernandez de Landa, Rodrigo Agerri, Iñaki Alegria
2019 Information  
Social networks like Twitter are increasingly important in the creation of new ways of communication.  ...  to segment communities based on demographic characteristics and to discover how they interact or relate to them.  ...  Acknowledgments: We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan V GPU used for this research.  ... 
doi:10.3390/info10060212 fatcat:uhbdcxapzvaehftqx6zzbtnq3a

Demographic Dialectal Variation in Social Media: A Case Study of African-American English [article]

Su Lin Blodgett, Lisa Green, Brendan O'Connor
2016 arXiv   pre-print
We also provide an ensemble classifier for language identification which eliminates this disparity and release a new corpus of tweets containing AAE-like language.  ...  We conduct a case study of dialectal language in online conversational text by investigating African-American English (AAE) on Twitter.  ...  Thus for every tweet, the gold standard included one or more labeled edges, all rooted in a single token.  ... 
arXiv:1608.08868v1 fatcat:7hy2jzldwfhcvndz3bla2piiga

Using Noisy Self-Reports to Predict Twitter User Demographics [article]

Zach Wood-Doughty, Paiheng Xu, Xiao Liu, Mark Dredze
2020 arXiv   pre-print
Twitter) numerous studies have inferred demographics automatically.  ...  Despite errors inherent in automated supervision, we produce models with good performance when measured on gold standard self-report survey data.  ...  While the self-reports are noisy, our collected datasets are large enough that they support better demographic inference models on held-out, gold-standard labels.  ... 
arXiv:2005.00635v1 fatcat:2vdz2orxk5gwnfiomvtvzfftny

Geocoding Without Geotags: A Text-based Approach for reddit [article]

Keith Harrigian
2018 arXiv   pre-print
In this paper, we introduce the first geolocation inference approach for reddit, a social media platform where user pseudonymity has thus far made supervised demographic inference difficult to implement  ...  In particular, we design a text-based heuristic schema to generate ground truth location labels for reddit users in the absence of explicitly geotagged data.  ...  Models trained on reddit data (third and fourth rows) outperform the baseline for Twitter data sets in nearly all cases (see text for caveats).  ... 
arXiv:1810.03067v1 fatcat:e2s5rysmavaivnahzby3dter34

Stereotypical gender actions can be extracted from web text

Amaç Herdağdelen, Marco Baroni
2011 Journal of the American Society for Information Science and Technology  
With high recall, we obtained a Spearman correlation of 0.47 between corpus-based predictions and a human gold standard, and an area under the ROC curve of 0.76 when predicting the polarity of the gold  ...  We conclude that it is feasible to use natural text (and a Twitter-derived corpus in particular) in order to augment commonsense repositories with the stereotypical gender expectations of actions.  ...  Gold standard As our gold standard dataset, we randomly sampled 702 phrases from the set of actions detected in the ETC Twitter corpus and represented in OMCS.  ... 
doi:10.1002/asi.21579 fatcat:uyquopfhobfghdneyweylzly5q

Towards Augmenting Lexical Resources for Slang and African American English

Alyssa Hwang, William R. Frey, Kathleen R. McKeown
2020 Workshop on NLP for Similar Languages, Varieties and Dialects  
Amazon Mechanical Turk and expert evaluations show that clusters with low precision can still be considered high quality, and we propose the new Cluster Split Score as a metric for machine-generated clusters  ...  Since high-quality clusters would contain related words, we could also infer the meaning of an unfamiliar word based on the meanings of words clustered with it.  ...  Acknowledgments We would like to thank Emily Allaway, Elsbeth Turcan, and Shinya Kondo from Columbia University for their assistance. We also thank the reviewers for their time and helpful feedback.  ... 
dblp:conf/vardial/HwangFM20 fatcat:jwxeirvx2ve67ai4c4lwyrjrzq

Demographic Inference and Representative Population Estimates from Multilingual Social Media Data

Zijian Wang, Scott Hale, David Ifeoluwa Adelani, Przemyslaw Grabowicz, Timo Hartman, Fabian Flöck, David Jurgens
2019 The World Wide Web Conference on - WWW '19  
To learn demographic attributes, we create a new multimodal deep neural architecture for joint classification of age, gender, and organization-status of social media users that operates in 32 languages  ...  In a large experiment over multilingual heterogeneous European regions, we show that our demographic inference and bias correction together allow for more accurate estimates of populations and make a significant  ...  DEMOGRAPHIC INFERENCE We propose a new demographic inference model for three attributes: gender, age, and a binary organization indicator ("is-organization").  ... 
doi:10.1145/3308558.3313684 dblp:conf/www/WangHAGHFJ19 fatcat:4m7lcud37fee3nt3mrfoibuu5y

Social Proof: The Impact of Author Traits on Influence Detection

Sara Rosenthal, Kathy McKeown
2016 Proceedings of the First Workshop on NLP and Computational Social Science  
We then use the personal traits predicted by this classifier to predict the influence of contributors in a Wikipedia Talk Page corpus.  ...  Our research thus provides evidence for the theory of social proof.  ...  The Blogger corpus is annotated for age and gender while the LiveJournal corpus provides the date of birth for each poster. We use these annotations as gold labels for predicting age and gender.  ...
doi:10.18653/v1/w16-5604 dblp:conf/acl-nlpcss/RosenthalM16 fatcat:vcg3oghny5givfrrtid6c7on3y

Unified Representation of Twitter and Online News Using Graph and Entities

Munira Syed, Daheng Wang, Meng Jiang, Oliver Conway, Vishal Juneja, Sriram Subramanian, Nitesh V. Chawla
2021 Frontiers in Big Data  
To improve consumer engagement and satisfaction, online news services employ strategies for personalizing and recommending articles to their users based on their interests.  ...  We evaluate our framework on a downstream task of identifying related pairs of news articles and tweets with promising results.  ...  These evaluation measures require a gold standard reference summaries provided by humans with which to compare the target summary.  ... 
doi:10.3389/fdata.2021.699070 pmid:34514380 pmcid:PMC8432963 fatcat:2wrdyqz3areklhwic72lnbgmz4

TwitPersonality: Computing Personality Traits from Tweets Using Word Embeddings and Supervised Learning

Giulio Carducci, Giuseppe Rizzo, Diego Monti, Enrico Palumbo, Maurizio Morisio
2018 Information  
a good conversion of the model while analyzing their Twitter posts towards the personality traits extracted from the survey.  ...  In this paper, we propose a supervised learning approach to compute personality traits by only relying on what an individual tweets about publicly.  ...  Acknowledgments: A special thanks to the myPersonality project for having shared with us the dataset used for training our learning model and conducting the experimentation.  ...
doi:10.3390/info9050127 fatcat:solvstmvsbgahdc3sr3glh6hxq

Named Entity Recognition in Social Media Data

Archisha Sharma, Shruti Shreya, Shrishail Terni
2022 International Journal for Research in Applied Science and Engineering Technology  
The proposed models have been tested on certain datasets extracted from social media networking sites such as Twitter, Facebook, etc. and their evaluated performance has been compared to the models proposed  ...  Through this paper, different methods proposed for the purpose of social media data extraction using Named Entity Recognition, have been studied in detail and a comparison has been provided for the same  ...  The suggested work intends to develop and annotate a massive, brand new Arabic news sentiment corpus from Twitter due to the unavailability of an acceptable Arabic news sentiment corpus.  ...
doi:10.22214/ijraset.2022.46181 fatcat:7t2mmuqgqzawpemwz5zl2u7wie

Temporal Orientation of Tweets for Predicting Income of Users

Mohammed Hasanuzzaman, Sabyasachi Kamila, Mandeep Kaur, Sriparna Saha, Asif Ekbal
2017 Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)  
We quantify a user's overall temporal orientation based on their distribution of tweets, and use it to build a predictive model of income.  ...  Our analysis uncovers a correlation between future temporal orientation and income. Finally, we measure the predictive power of future temporal orientation on income by performing regression.  ...  Acknowledgments The ADAPT Centre for Digital Content Technology is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.  ... 
doi:10.18653/v1/p17-2104 dblp:conf/acl/HasanuzzamanKKS17 fatcat:pcurzubanzb33n26aqobpqhhua

Fine-Grained Temporal Orientation and its Relationship with Psycho-Demographic Correlates

Sabyasachi Kamila, Mohammed Hasanuzzaman, Asif Ekbal, Pushpak Bhattacharyya, Andy Way
2018 Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)  
A deep Bi-directional Long Short Term Memory (BLSTM) is used for the tweet classification task. Our tweet classifier achieves an accuracy of 78.27% when tested on a manually created test set.  ...  In this paper, we propose a very first study to demonstrate the association between the sentiment view of the temporal orientation of the users and their different psycho-demographic attributes by analyzing  ...  The empirical evidence shows that the method performs reasonably well. • We create a gold-standard temporal orientation tweet corpus. • We define a way to find a novel association between the sentiment  ... 
doi:10.18653/v1/n18-1061 dblp:conf/naacl/KamilaHEBW18 fatcat:rtleyxr575gghbfpgectfespoi

MoralStrength: Exploiting a Moral Lexicon and Embedding Similarity for Moral Foundations Prediction [article]

Oscar Araque, Lorenzo Gatti, Kyriaki Kalimeri
2019 arXiv   pre-print
Such findings pave the way for further research, allowing for an in-depth understanding of moral narratives in text for a wide range of social issues.  ...  Moreover, for each lemma it provides with a crowdsourced numeric assessment of Moral Valence, indicating the strength with which a lemma is expressing the specific value.  ...  A valid answer is one that lies within 1.5 standard deviations from the valence mean of [36], for each specific gold word.  ...
arXiv:1904.08314v1 fatcat:67r4yo2eevfsrlrbpu2bhmjxwu