
Algorithmic Fairness Datasets: the Story so Far [article]

Alessandro Fabris, Stefano Messina, Gianmaria Silvello, Gian Antonio Susto
2022 arXiv   pre-print
Unfortunately, the algorithmic fairness community suffers from a collective data documentation debt caused by a lack of information on specific resources (opacity) and scatteredness of available information  ...  Finally, we analyze these datasets from the perspective of five important data curation topics: anonymization, consent, inclusivity, sensitive attributes, and transparency.  ...  Acknowledgements The authors would like to thank the following researchers and dataset creators for the useful feedback on the data briefs: Alain Barrat, Luc Behaghel, Asia Biega, Marko Bohanec, Chris  ... 
arXiv:2202.01711v2 fatcat:5hf4a42pubc5vnt7tw3al4m5bq

A Systematic Survey of Online Data Mining Technology Intended for Law Enforcement

Matthew Edwards, Awais Rashid, Paul Rayson
2015 ACM Computing Surveys  
As more and more crime takes on a digital aspect, law enforcement bodies must tackle an online environment generating huge volumes of data.  ...  Such technologies must be well-designed and rigorously grounded, yet no survey of the online data-mining literature exists which examines their techniques, applications and rigour.  ...  [Wang et al. 2012c] use Twitter as a source of general crime prediction, drawing on automatic semantic analysis, event extraction and geographical information systems to map crime hotspots.  ... 
doi:10.1145/2811403 fatcat:qpvfebejpfgp5bh3cgykaoumze

Table of Contents

2019 2019 International Conference on Advancements in Computing (ICAC)  
The knowledge store gains information and expands its knowledge from the internet by crawling websites.  ...  The 'User Report Information' is included conditions of the related train and can be shared among the other interested parties through our System.  ... 
doi:10.1109/icac49085.2019.9129879 fatcat:gm5mw7qn5ncoxjr6klcl3x4uju

Open challenges for data stream mining research

Georg Krempl, Myra Spiliopoulou, Jerzy Stefanowski, Indre Žliobaite, Dariusz Brzeziński, Eyke Hüllermeier, Mark Last, Vincent Lemaire, Tino Noack, Ammar Shaker, Sonja Sievi
2014 SIGKDD Explorations  
ABSTRACT We discuss the most important database research advances, industry developments, role of relational and NoSQL databases, Computing Reality, Data Curation, Cloud Computing, Tamr and Jisto startups  ...  Streaming data can be considered as one of the main sources of what is called big data.  ...  Part of this work was funded by the German Research Foundation, projects SP 572/11-1 (IMPRINT) and HU 1284/5-1, the Academy of Finland grant 118653 (ALGODAN), and the Polish National Science Center grants  ... 
doi:10.1145/2674026.2674028 fatcat:y3bozzeohveibgxb5wmiwfcogm

Processing Social Media Messages in Mass Emergency: A Survey [article]

Muhammad Imran, Carlos Castillo, Fernando Diaz, Sarah Vieweg
2015 arXiv   pre-print
We examine the particularities of this setting, and then methodically examine a series of key sub-problems ranging from the detection of events to the creation of actionable and useful summaries.  ...  Processing social media messages to obtain such information, however, involves solving multiple challenges including: handling information overload, filtering credible information, and prioritizing different  ...  People post situation-sensitive information on social media related to what they experience, witness, and/or hear from other sources [Hughes and Palen 2009].  ... 
arXiv:1407.7071v3 fatcat:e7mcvae5freddaus7ndolygeti

Understanding And Mapping Big Data

Rajendra Akerkar, Guillermo Vega-Gorgojo, Grunde Løvoll, Stephane Grumbach, Aurelien Faravelon, Rachel Finn, Kush Wadhwa, Anna Donovan, Lorenzo Bigagli
2015 Zenodo  
Understanding and mapping big data. Deliverable D1.1 BYTE Project.  ...  The technical challenges arise from data acquisition and data curation to data analysis and data visualization.  ...  Predictive policing uses historical crime data to automatically discover trends and patterns in the data.  ... 
doi:10.5281/zenodo.49161 fatcat:wz3cwet3wfbmvfzucivu3t64eq

A Review of Computer Vision Methods in Network Security [article]

Jiawei Zhao, Rahat Masood, Suranga Seneviratne
2020 arXiv   pre-print
However, such methods are more based on statistical features extracted from sources such as binaries, emails, and packet flows.  ...  Next, we review a set of such commercial products for which public information is available and explore how computer vision methods are effectively used in those products.  ...  . random attacks, targeted attacks, multi-source attack, and port scans).  ... 
arXiv:2005.03318v1 fatcat:pcng7535obec3l6fejkllbi3ii

Journalism as usual: The use of social media as a newsgathering tool in the coverage of the Iranian elections in 2009

Megan Knight
2012 Journal of Media Practice  
Frenemy Google is often dubbed the frenemy of news organisations: half friend and half enemy.  ...  Many media organisations are uncomfortable that Google can index and link with impunity yet they value the traffic it creates. (Chapter 13)  ...  linked to the original table of crime reports.  ... 
doi:10.1386/jmpr.13.1.61_1 fatcat:abhz6rqlffdzdklr25gm5d4anq

Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases [article]

Gerhard Weikum, Luna Dong, Simon Razniewski, Fabian Suchanek
2021 arXiv   pre-print
Over the last decade, large-scale knowledge bases, also known as knowledge graphs, have been automatically constructed from web contents and text sources, and have become a key asset for search engines  ...  On top of this, the article discusses the automatic extraction of entity-centric properties.  ...  We also appreciate the sustained encouragement and support by our editors Surajit Chaudhuri, Joe Hellerstein and Ihab Ilyas.  ... 
arXiv:2009.11564v2 fatcat:vh2lqfmhhbcwpf6dcsej3hhvgy

Linked Data: Evolving the Web into a Global Data Space

Tom Heath, Christian Bizer
2011 Synthesis Lectures on the Semantic Web Theory and Technology  
The book discusses patterns for publishing Linked Data, describes deployed Linked Data applications and examines their architecture.  ...  This book gives an overview of the principles of Linked Data as well as the Web of Data that has emerged through the application of these principles.  ...  This enables applications to automatically take advantage of new data sources as they become available on the Web of Data. 2.  ... 
doi:10.2200/s00334ed1v01y201102wbe001 fatcat:y5qflrlwqzd5jhryuazzcoyfdu

By Hook or by Crook: Exposing the Diverse Abuse Tactics of Technical Support Scammers [article]

Bharat Srinivasan, Athanasios Kountouras, Najmeh Miramirkhani, Monjur Alam, Nick Nikiforakis, Manos Antonakakis, Mustaque Ahamad
2017 arXiv   pre-print
Thus, investigation of search-and-ad abuse provides new insights into TSS tactics and helps detect previously unknown abuse infrastructure that facilitates these scams.  ...  Our study period of 8 months uncovered over 9,000 TSS domains, of both passive and aggressive types, with minimal overlap between sets that are reached via organic search results and sponsored ads.  ...  The URI component of the ADs and SRs are then inserted into the ADC (AD crawling) and SRC (SR crawling) queues respectively, which then coordinate with the ACM to gather more information about them, as  ... 
arXiv:1709.08331v1 fatcat:tvxw4xq5sja3hksnagwhpucdde

A First Look at the Crypto-Mining Malware Ecosystem

Sergio Pastrana, Guillermo Suarez-Tangil
2019 Proceedings of the Internet Measurement Conference on - IMC '19  
Our analysis pipeline applies both static and dynamic analysis to extract information from the samples, such as wallet identifiers and mining pools.  ...  CCS CONCEPTS • Security and privacy → Malware and its mitigation; • Social and professional topics → Malware / spyware crime; • General and reference → Measurement.  ...  The opinions, findings, and conclusions or recommendations expressed are those of the authors and do not necessarily reflect those of any of the funders.  ... 
doi:10.1145/3355369.3355576 dblp:conf/imc/PastranaS19 fatcat:4hdozcislrgipa4pkeptlotj5u

From social data mining to forecasting socio-economic crises

D. Helbing, S. Balietti
2011 The European Physical Journal Special Topics  
and economic systems. Describe requirements for efficient large-scale scientific data mining of anonymized social and economic data. Formulate strategies how to collect stylized facts extracted from large  ...  the storage, processing, evaluation, and publication of social and economic data.  ...  The authors are grateful for financial support by the Future and Emerging Technologies programme FP7-COSI-ICT of the European Commission through the project Visioneer (grant no.: 248438).  ... 
doi:10.1140/epjst/e2011-01401-8 pmid:32215190 pmcid:PMC7088654 fatcat:qgixn26btng2flz4bqplqqgede

On the Opportunities and Risks of Foundation Models [article]

Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch (+102 others)
2021 arXiv   pre-print
This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical  ...  Though foundation models are based on standard deep learning and transfer learning, their scale results in new emergent capabilities, and their effectiveness across so many tasks incentivizes homogenization  ...  Fernando Pereira, Vinodkumar Prabhakaran, Colin Raffel, Marten van Schijndel, Ludwig Schmidt, Yoav Shoham, Madalsa Singh, Megha Srivastava, Jacob Steinhardt, Emma Strubell, Qian Yang, Luke Zettlemoyer, and  ... 
arXiv:2108.07258v2 fatcat:yktkv4diyrgzzfzqlpvaiabc2m

A Similarity-based Machine Learning Approach for Detection of Software Clones

Abdullah M. Sheneamer
2021 Expert systems with applications  
As a result, an enormous amount of unstructured data is created that demands much time and effort to organize, search or manipulate.  ...  Intelligent classification of text document in a resource-constrained language (like Bengali) is challenging due to unavailability of linguistic resources, intelligent NLP tools, and larger text corpora  ...  Acknowledgements This work was supported by the Establishment of CUET IT Business Incubator Project, BHTPA, ICT Division, Bangladesh for the research on "Automatic Bengali Document Categorization based  ... 
doi:10.1016/j.eswa.2021.115394 fatcat:44sqcdpj7nfvjoa33u4dbjcpmi