Detecting offensive tweets via topical feature discovery over a large scale twitter corpus
2012
Proceedings of the 21st ACM international conference on Information and knowledge management - CIKM '12
Our approach exploits linguistic regularities in profane language via statistical topic modeling on a huge Twitter corpus, and detects offensive tweets using these automatically generated features. ...
In this paper, we propose a novel semi-supervised approach for detecting profanity-related offensive content in Twitter. ...
of seed profane words) and law-abiding twitterers (i.e., twitterers who rarely use seed offensive words) over a large tweet corpus using a list of pre-defined offensive seed words; we then learn topic ...
doi:10.1145/2396761.2398556
dblp:conf/cikm/XiangFWHR12
fatcat:f347mar4tjaflmwctjbhpxj2vi
Towards Measuring Adversarial Twitter Interactions against Candidates in the US Midterm Elections
[article]
2020
arXiv
pre-print
We then develop a new technique for detecting tweets with toxic content that are directed at any specific candidate. Such technique allows us to more accurately quantify adversarial interactions towards ...
We gather a new dataset consisting of 1.7 million tweets involving candidates, one of the largest corpora focusing on political discourse. ...
This research is supported by NSF research grants CNS-1704527 and IIS-1665169, as well as a Cornell Tech Digital Life Initiative Doctoral Fellowship. ...
arXiv:2005.04411v1
fatcat:xmmqrejb6bcrxm72r4w3rdiicu
EARS (earthquake alert and report system)
2014
Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '14
Detected events are automatically broadcasted by our system via a dedicated Twitter account and by email notifications. ...
We then apply a burst detection algorithm in order to promptly identify outbreaking seismic events. ...
bursts fits well with the need to identify large scale and small scale events. ...
doi:10.1145/2623330.2623358
dblp:conf/kdd/AvvenutiCMMT14
fatcat:5rkxhjlx7jdsdn4bshxfrxqmia
Sifting signal from noise: A new perspective on the meaning of tweets about the "big game"
2014
New Media & Society
A good deal of Twitter research focuses on event-detection using algorithms that rely on key words and tweet density. ...
Conceptualizing subcontexts as a socio-technical place advances the framing of Twitter event-detection from principally computational to deeply contextual. ...
Many others have been part of discussions about Twitter data like this over the past 3 years, and we apologize for those omissions. ...
doi:10.1177/1461444814541783
fatcat:wdzbhpah4na3hn2jky4jigzd6e
An Information Retrieval Approach to Building Datasets for Hate Speech Detection
[article]
2021
arXiv
pre-print
We share a new benchmark dataset for hate speech detection on Twitter that provides broader coverage of hate than prior datasets. ...
Building a benchmark dataset for hate speech detection presents various challenges. ...
This research was supported in part by Wipro (HELIOS), the Knight Foundation, the Micron Foundation, and Good Systems (https://goodsystems.utexas.edu), a UT Austin Grand Challenge to develop responsible ...
arXiv:2106.09775v3
fatcat:56cg2t7nwbfe3lwdqw7z2eqjoy
Sentiment Analysis for Fake News Detection
2021
Electronics
This has led to sentiment analysis, the part of text analytics in charge of determining the polarity and strength of sentiments expressed in a text, to be used in fake news detection approaches, either ...
In this article, we study the different uses of sentiment analysis in the detection of fake news, with a discussion of the most relevant elements and shortcomings, and the requirements that should be met ...
In addition, they considered two features that aimed to capture when the sentiment of a tweet matched the overall sentiment of the topic hypothesizing that tweets that had similar sentiments to the topic ...
doi:10.3390/electronics10111348
fatcat:p34nbmtkzrcqrowu24nmu4axnq
A Quantitative Approach to Understanding Online Antisemitism
[article]
2019
arXiv
pre-print
In this paper, we present a large-scale, quantitative study of online antisemitism. ...
We extract semantic embeddings from our corpus of posts and demonstrate how automated techniques can discover and categorize the use of antisemitic terminology. ...
They also assess which features of tweets contribute more on the detection task, finding that character n-grams along with a gender feature provide the best performance. Del Vigna et al. ...
arXiv:1809.01644v2
fatcat:wo2jcz7sgvebjp6sngfb3rmtpm
Utilising Wikipedia for Text Mining Applications
2016
SIGIR Forum
category taxonomies • Topical scores corresponding to each tweet obtained via topic modelling • Twitter-specific features obtained using the Twitter API. The fundamental constituent of the technique ...
The Twitter-specific features show second best performance which confirms the fact that twitter-specific features are important over twitter for sharing information, while Topic-specific shows the least ...
same time proposing a technique on top of Wikipedia hyperlink structure to determine context of a tweet. ...
doi:10.1145/2888422.2888449
fatcat:lck3kkxoazcj5powaqhjs6epty
Contrastive Learning of Sociopragmatic Meaning in Social Media
[article]
2022
arXiv
pre-print
To bridge this gap, we propose a novel framework for learning task-agnostic representations transferable to a wide range of sociopragmatic tasks (e.g., emotion, hate speech, humor, sarcasm). ...
predictive features for hate speech detection on twitter. ...
Sarcasm detection on twitter: A behavioral modeling approach. ...
arXiv:2203.07648v2
fatcat:6zmhiogvirdlznoaqonyuesc54
Towards Understanding the Information Ecosystem Through the Lens of Multiple Web Communities
[article]
2019
arXiv
pre-print
Then, we follow a data-driven cross-platform quantitative approach to analyze billions of posts from Twitter, Reddit, 4chan's /pol/, and Gab, to shed light on: 1) how news and memes travel from one Web ...
Our analysis reveals that fringe Web communities like 4chan's /pol/ and The_Donald subreddit have a disproportionate influence on mainstream communities like Twitter with regard to the dissemination of ...
By extracting a variety of features (user-related, timing-related, content-related and sentiment-related features) from a large corpus of tweets they demonstrate that they can distinguish promoted campaigns ...
arXiv:1911.10517v1
fatcat:piuwv7zv7zghlof5tqhuhnukla
Blackmarket-Driven Collusion on Online Media: A Survey
2021
ACM/IMS Transactions on Data Science
We believe that collusive entity detection is a newly emerging topic in anomaly detection and cyber-security research in general, and the current survey will provide readers with an easy-to-access and ...
comprehensive list of methods, tools, and resources proposed so far for detecting and analyzing collusive entities on online media. ...
Twitter is a microblogging service where users write tweets about topics such as politics, sport, cooking, and fashion. Twitter has three types of appraisals: retweets, likes, and followers. ...
doi:10.1145/3517931
fatcat:7fvgujegh5hohdiemsok6kzviq
ETHOS: an Online Hate Speech Detection Dataset
[article]
2021
arXiv
pre-print
This phenomenon is primarily fostered by offensive comments, either during user interaction or in the form of a posted multimedia context. ...
A robust and reliable system for detecting and preventing the uploading of relevant content will have a significant impact on our digitally interconnected society. ...
The data was gathered again via the Twitter API, filtering tweets containing HS words submitted to Hatebase.org. ...
arXiv:2006.08328v2
fatcat:ppg2phh4nber3p42pgbgpyfmrq
An NLP-Powered Human Rights Monitoring Platform
2020
Expert Systems with Applications: X
organisations that are not of a scale that they can afford their own department dedicated to this task. ...
Highlights • A practical system for human rights monitoring combining NLP and crowdsourcing • Mining social media offers signals for human rights abuses in addition to reports • Deep learning outperforms ...
An offensive content detection model was proposed by Chen et al. (2012) to detect offensive language in social media. ...
doi:10.1016/j.eswax.2020.100023
fatcat:wuqiko3wr5hkdjo5bkgm4cvrvi
The Origin and Value of Disagreement Among Data Labelers: A Case Study of the Individual Difference in Hate Speech Annotation
[article]
2021
arXiv
pre-print
scale for distilling the process of how annotators label a hate speech corpus. ...
We tested this scale with 170 annotators in a hate speech annotation task. ...
Detecting offensive tweets via topical feature discovery over a large scale twitter corpus. ...
arXiv:2112.04030v1
fatcat:xtqe55o2c5ambh2jhgofxsjjma
From Symbols to Embeddings: A Tale of Two Representations in Computational Social Science
[article]
2021
arXiv
pre-print
However, these large-scale and multi-modal data also present researchers with a great challenge: how to represent data effectively to mine the meanings we want in CSS? ...
statistics of these applications, we unearth the strength of each kind of representations and discover the tendency that embedding-based representations are emerging and obtaining increasing attention over ...
and a large corpus of texts. ...
arXiv:2106.14198v1
fatcat:dvy5awnfuvbnnkzusjl5wbhfki