Harnessing Folksonomies for Resource Classification [article]

Arkaitz Zubiaga
2012 arXiv   pre-print
Parts of the research in this chapter have been published in Zubiaga et al. (2009d), Zubiaga et al. (2009c) and Zubiaga et al. (2011a).  ...  Parts of the research in this chapter have been published in Zubiaga et al. (2009b) and Zubiaga et al. (2009a).  ...  To the best of our knowledge, the work we presented (Zubiaga et al., 2009d) is the first to carry out real classification experiments using social tags.  ... 
arXiv:1204.6521v1 fatcat:7wgyv5wy6zat3khv5mhu3bolmi

"Harnessing folksonomies for resource classification" by Arkaitz Zubiaga with Danielle H. Lee as coordinator

Arkaitz Zubiaga
2012 ACM SIGWEB Newsletter  
It has been shown that these tags facilitate retrieval of resources not only for the annotators themselves but also for the whole community.  ... 
doi:10.1145/2246063.2246067 fatcat:xnihozby3nf2zpqiearksgow6m

Automated Fact-Checking: A Survey [article]

Xia Zeng, Amani S. Abumansour, Arkaitz Zubiaga
2021 arXiv   pre-print
Preliminary research splitting the three-way classification into two binary classifications [Zeng and Zubiaga, 2021] is likely to help avoid such errors.  ... 
arXiv:2109.11427v1 fatcat:6ezha6y5svf4zkq7brodmd3vke

Enhancing Navigation on Wikipedia with Social Tags [article]

Arkaitz Zubiaga
2012 arXiv   pre-print
Social tagging has become an interesting approach to improve search and navigation over the current Web, since it aggregates the tags added by different users to the same resource in a collaborative way. The result is a list of weighted tags describing the resource. Combined with a classical taxonomic classification system such as Wikipedia's, social tags can enhance document navigation and search. On the one hand, social tags suggest alternative ways of navigating, including pivot-browsing, popularity-driven navigation, and filtering. On the other hand, they provide new metadata, sometimes not covered by the documents' content, that can substantially improve document search. In this work, the inclusion of an interface to add user-defined tags describing Wikipedia articles is proposed as a way to improve article navigation and retrieval. As a result, a prototype that applies tags to Wikipedia is proposed in order to evaluate its effectiveness.
arXiv:1202.5469v1 fatcat:mkgmsxw26zev3aokc3t4f5qzda

Cross-lingual Hate Speech Detection using Transformer Models [article]

Teodor Tiţa, Arkaitz Zubiaga
2021 arXiv   pre-print
Hate speech detection within a cross-lingual setting represents a paramount area of interest for all medium and large-scale online platforms. Failing to properly address this issue on a global scale has already led over time to morally questionable real-life events, human deaths, and the perpetuation of hate itself. This paper illustrates the capabilities of fine-tuned altered multi-lingual Transformer models (mBERT, XLM-RoBERTa) regarding this crucial social data science task with cross-lingual training from English to French, vice-versa and each language on its own, including sections about iterative improvement and comparative error analysis.
arXiv:2111.00981v1 fatcat:jvo6ad5bevbjxac46ws3ku5r7a

Real-Time Classification of Twitter Trends [article]

Arkaitz Zubiaga, Damiano Spina, Raquel Martínez, Víctor Fresno
2014 arXiv   pre-print
Those parameters were found suitable for another text classification task (Zubiaga, Martínez, and Fresno, 2009), i.e., classification of web pages using metadata from social media.  ...  about social trends of their interest, but also feed third parties with different interests: for instance, news media can be interested in early discovery of social trends associated with breaking news (Zubiaga  ... 
arXiv:1403.1451v1 fatcat:77lj3mteonhjfb36fg5p6gp5b4

Towards Detecting Rumours in Social Media [article]

Arkaitz Zubiaga, Maria Liakata, Rob Procter, Kalina Bontcheva, Peter Tolmie
2015 arXiv   pre-print
This is even more important in cases where users tend to pass on false information more often than real facts, as occurred with Hurricane Sandy in 2012 (Zubiaga and Ji 2014).  ... 
arXiv:1504.04712v1 fatcat:m5ve7gqf3bg33j6kqlfvy5u6mu

Sexism Identification in Tweets and Gabs using Deep Neural Networks [article]

Amikul Kalra, Arkaitz Zubiaga
2021 arXiv   pre-print
Through anonymisation and accessibility, social media platforms have facilitated the proliferation of hate speech, prompting increased research in developing automatic methods to identify these texts. This paper explores the classification of sexism in text using a variety of deep neural network architectures such as Long Short-Term Memory networks (LSTMs) and Convolutional Neural Networks (CNNs). These networks are used in conjunction with transfer learning in the form of Bidirectional Encoder Representations from Transformers (BERT) and DistilBERT models, along with data augmentation, to perform binary and multiclass sexism classification on the dataset of tweets and gabs from the sEXism Identification in Social neTworks (EXIST) task in IberLEF 2021. The models are seen to perform comparably to those from the competition, with the best performances seen using BERT and a multi-filter CNN model. Data augmentation further improves these results for the multi-class classification task. This paper also explores the errors made by the models and discusses the difficulty in automatically classifying sexism due to the subjectivity of the labels and the complexity of natural language used in social media.
arXiv:2111.03612v1 fatcat:jhwntmhj2zewrcaryluwh66zji

Mining Social Media for Newsgathering: A Review [article]

Arkaitz Zubiaga
2019 arXiv   pre-print
Social media is becoming an increasingly important data source for learning about breaking news and for following the latest developments of ongoing news. This is in part possible thanks to the existence of mobile devices, which allow anyone with access to the Internet to post updates from anywhere, leading in turn to a growing presence of citizen journalism. Consequently, social media has become a go-to resource for journalists during the process of newsgathering. Use of social media for newsgathering is however challenging, and suitable tools are needed in order to facilitate access to useful information for reporting. In this paper, we provide an overview of research in data mining and natural language processing for mining social media for newsgathering. We discuss five different areas that researchers have worked on to mitigate the challenges inherent to social media newsgathering: news discovery, curation of news, validation and verification of content, newsgathering dashboards, and other tasks. We outline the progress made so far in the field, summarise the current challenges, and discuss future directions in the use of computational journalism to assist with social media newsgathering. This review is relevant to computer scientists researching news in social media as well as to interdisciplinary researchers interested in the intersection of computer science and journalism.
arXiv:1804.03540v2 fatcat:s4ia7c5hfjbvzp3nxjiicnnfde

All-in-one: Multi-task Learning for Rumour Verification [article]

Elena Kochkina, Maria Liakata, Arkaitz Zubiaga
2018 arXiv   pre-print
(Zubiaga et al., 2018b).  ...  Previous work on stance classification (Lukasik et al., 2016; Kochkina et al., 2017; Zubiaga et al., 2017; Zubiaga et al., 2018b) has explored the use of sequential classifiers, treating the task as  ... 
arXiv:1806.03713v1 fatcat:57fzdkqtlvaclon7as63ysrjta

Euskahaldun: Euskararen Aldeko Martxa Baten Sare Sozialetako Islaren Bilketa eta Analisia [article]

Arkaitz Zubiaga
2015 arXiv   pre-print
This work is motivated by the dearth of research that deals with social media content created from the Basque Country or written in the Basque language. While social fingerprints during events have been analysed in numerous other locations and languages, this article aims to fill this gap so as to initiate a much-needed research area within the Basque scientific community. To this end, we describe the methodology we followed to collect tweets posted during the quintessential exhibition race in support of the Basque language. We also present the results of the analysis of these tweets. Our analysis shows that the most eventful moments lead to spikes in tweeting activity. Furthermore, we emphasize the importance of having an official account for the event in question, which helps improve the visibility of the event in the social network as well as the dissemination of information to the Basque community. Along with the official account, journalists and news organisations play a crucial role in the diffusion of information.
arXiv:1508.05812v1 fatcat:t65t7szyzvaidlsrgdlf4bsj2a

Analyzing Tag Distributions in Folksonomies for Resource Classification [article]

Arkaitz Zubiaga, Raquel Martínez, Víctor Fresno
2012 arXiv   pre-print
Recent research has shown the usefulness of social tags as a data source to feed resource classification. Little is known about the effect of settings on folksonomies created on social tagging systems. In this work, we consider the settings of social tagging systems to further understand tag distributions in folksonomies. We analyze in depth the tag distributions on three large-scale social tagging datasets, and analyze the effect on a resource classification task. To this end, we study the appropriateness of applying weighting schemes based on the well-known TF-IDF for resource classification. We show the great importance of settings in altering tag distributions. Among those settings, tag suggestions produce very different folksonomies, which condition the success of the employed weighting schemes. Our findings and analyses are relevant for researchers studying tag-based resource classification, user behavior in social networks, the structure of folksonomies and tag distributions, as well as for developers of social tagging systems in search of an appropriate setting.
arXiv:1202.5477v1 fatcat:qw6ct6etljhu3omct4tyoewpv4

Leveraging Aspect Phrase Embeddings for Cross-Domain Review Rating Prediction [article]

Aiqi Jiang, Arkaitz Zubiaga
2018 arXiv   pre-print
Online review platforms are a popular way for users to post reviews expressing their opinions towards a product or service, and they are valuable for other users and companies to find out the overall opinions of customers. These reviews tend to be accompanied by a rating, where the star rating has become the most common approach for users to give their feedback in a quantitative way, generally as a Likert scale of 1-5 stars. In other social media platforms like Facebook or Twitter, an automated review rating prediction system can be useful to determine the rating that a user would have given to the product or service. Existing work on review rating prediction focuses on specific domains, such as restaurants or hotels. This, however, ignores the fact that some review domains which are less frequently rated, such as dentists, lack sufficient data to build a reliable prediction model. In this paper, we experiment on 12 datasets pertaining to 12 different review domains of varying levels of popularity to assess the performance of predictions across different domains. We introduce a model that leverages aspect phrase embeddings extracted from the reviews, which enables the development of both in-domain and cross-domain review rating prediction systems. Our experiments show that both of our review rating prediction systems outperform all other baselines. The cross-domain review rating prediction system is particularly significant for the least popular review domains, where leveraging training data from other domains leads to remarkable improvements in performance. The in-domain review rating prediction system is instead more suitable for popular review domains, given that a model built from training data pertaining to the target domain is more suitable when this data is abundant.
arXiv:1811.05689v1 fatcat:7742lznberesza4t2w33fesupm

Towards generalisable hate speech detection: a review on obstacles and solutions [article]

Wenjie Yin, Arkaitz Zubiaga
2021 arXiv   pre-print
Hate speech is one type of harmful online content which directly attacks or promotes hate towards a group or an individual member based on their actual or perceived aspects of identity, such as ethnicity, religion, and sexual orientation. With online hate speech on the rise, its automatic detection as a natural language processing task is gaining increasing interest. However, it is only recently that it has been shown that existing models generalise poorly to unseen data. This survey paper attempts to summarise how generalisable existing hate speech detection models are, reasons about why hate speech models struggle to generalise, sums up existing attempts at addressing the main obstacles, and then proposes directions of future research to improve generalisation in hate speech detection.
arXiv:2102.08886v1 fatcat:7uemuqrehnfvnbenp657hv3fpy

SWSR: A Chinese Dataset and Lexicon for Online Sexism Detection [article]

Aiqi Jiang, Xiaohan Yang, Yang Liu, Arkaitz Zubiaga
2021 arXiv   pre-print
Our dataset is publicly available through Zenodo and can be downloaded completely by using the following citation: Aiqi Jiang, Xiaohan Yang, Yang Liu, & Arkaitz Zubiaga. (2021).  ... 
arXiv:2108.03070v1 fatcat:7pnrrr54xrd6jc6hhacglht43e