Link spamming Wikipedia for profit

Andrew G. West, Jian Chang, Krishna Venkatasubramanian, Oleg Sokolsky, Insup Lee
2011 Proceedings of the 8th Annual Collaboration, Electronic messaging, Anti-Abuse and Spam Conference on - CEAS '11  
Creating and analyzing the first Wikipedia link spam corpus, we find that existing spam strategies perform quite poorly in this regard.  ...  The status quo spamming model relies on link persistence to accumulate exposures, a strategy that fails given the diligence of the Wikipedia community.  ...  Before enabling HTML nofollow for outgoing links, Wikipedia was often link spammed for search-engine optimization (SEO) purposes [48].  ...
doi:10.1145/2030376.2030394 dblp:conf/ceas/WestCVSL11 fatcat:gb3z74ahpjaa3lwuhlwohfcb2a

Autonomous link spam detection in purely collaborative environments

Andrew G. West, Avantika Agrawal, Phillip Baker, Brittney Exline, Insup Lee
2011 Proceedings of the 7th International Symposium on Wikis and Open Collaboration - WikiSym '11  
., link spam). The collaborative encyclopedia, Wikipedia, is the basis for our analysis.  ...  In this work, a spam corpus is extracted from over 235,000 link additions to English Wikipedia. From this, 40+ features are codified and analyzed.  ...  Further, the authors would like to thank Andrew King (UPenn, Ph.D. student) and Oleg Sokolsky (UPenn, professor) for their support throughout this project.  ... 
doi:10.1145/2038558.2038574 dblp:conf/wikis/WestABEL11 fatcat:ctpvhvpuc5dyljvzuarl5z6yu4

Spamming for Science: Active Measurement in Web 2.0 Abuse Research [chapter]

Andrew G. West, Pedram Hayati, Vidyasagar Potdar, Insup Lee
2012 Lecture Notes in Computer Science  
In this work two such experiments serve as case studies: One testing a novel link spam model on Wikipedia and another using blackhat software to target blog comments and forums.  ...  Terrell of UPenn's Office of the General Counsel is thanked for his guidance. Any opinions expressed in this work do not necessarily reflect the sentiments of those acknowledged here.  ...
doi:10.1007/978-3-642-34638-5_9 fatcat:ikgjg6jzvngf3keuxbovehccf4

Organizing the vision for web 2.0

Arnaud Gorgeon, E. Burton Swanson
2009 Proceedings of the 5th International Symposium on Wikis and Open Collaboration - WikiSym '09  
We imported the revision history from Wikipedia, and analyzed and categorized the edits that were performed and the users that contributed to the article.  ...  In this paper, we examine the evolution of Web 2.0, a buzzword that is now part of the discourse of a broad community, and look at its entry in Wikipedia over the three years since its inception in March  ...  for these three years from Wikipedia.  ... 
doi:10.1145/1641309.1641337 dblp:conf/wikis/GorgeonS09 fatcat:qdl2gu4urnfufmgvgswb2yhaaa

Quality-biased ranking of web documents

Michael Bendersky, W. Bruce Croft, Yanlei Diao
2011 Proceedings of the fourth ACM international conference on Web search and data mining - WSDM '11  
These content-based features are easy to compute, store and retrieve, even for large web collections.  ...  Accordingly, instead of using a single estimate for document quality, we consider multiple content-based features that are directly integrated into a state-of-the-art retrieval method.  ...  for profit or commercial advantage and that copies bear this notice and the full citation on the first page.  ...
doi:10.1145/1935826.1935849 dblp:conf/wsdm/BenderskyCD11 fatcat:px4c5ms4onamvg5eqsowa3vyhm

In Code, We Trust? Regulation and Emancipation in Cyberspace

Zhu Chenwei
2004 SCRIPTed: A Journal of Law, Technology & Society  
Wikipedia, "Spamming", @ Chris Beasley, "What is Google-Watch?", @  ...  An ultra-free market system lashes human beings into insatiable animals greedy for monetary profits.  ... 
doi:10.2966/scrip.010404.585 fatcat:onxv3amjnbeqtpjntd6i7uaftq

Detecting Wikipedia vandalism via spatio-temporal analysis of revision metadata

Andrew G. West, Sampath Kannan, Insup Lee
2010 Proceedings of the Third European Workshop on System Security - EUROSEC '10  
Blatantly unproductive edits undermine the quality of the collaboratively-edited encyclopedia, Wikipedia.  ...  In this paper, we leverage the spatio-temporal properties of revision metadata to detect vandalism on Wikipedia.  ...  [5] reviewed the evolution (temporal) of intra-page-link topology (spatial).  ... 
doi:10.1145/1752046.1752050 dblp:conf/eurosec/WestKL10 fatcat:qqg5krjvkze5pnotrtvprzgnha

Shame to be sham

Fiana Raiber, Kevyn Collins-Thompson, Oren Kurland
2013 Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval - SIGIR '13  
We report a novel annotation effort performed with the ClueWeb09 benchmark where pages were labeled as being spam, sham, or legitimate content.  ...  Sham documents are often ranked artificially high in response to certain queries, but also may contain some useful information and cannot be considered as absolute spam.  ...  We thank the reviewers for their comments. This work was supported by and carried out at the Technion-Microsoft Electronic Commerce Research Center.  ... 
doi:10.1145/2484028.2484135 dblp:conf/sigir/RaiberCK13 fatcat:3yo6kjb6kbdfvb54vwfch64l4e

A Survey on Adversarial Information Retrieval on the Web [article]

Saad Farooq
2020 arXiv   pre-print
This survey paper discusses different forms of malicious techniques that can affect how an information retrieval model retrieves documents for a query and their remedies.  ...  Number of Advertisements: Spammers also create pages for the purpose of generating profits, so they often put too many advertisements on their page.  ...  Wikipedia is a good example of wikis.  ...
arXiv:1911.11060v3 fatcat:vifymsujfjfkhpwm6saq37uvye

Enabling trust in crowd labor relations through identity sharing

Jörn Klinger, Matthew Lease
2011 Proceedings of the American Society for Information Science and Technology  
While online Crowdsourcing marketplaces provide a powerful avenue for facilitating new forms of information-driven micro-labor, their practical value is significantly reduced by worker "spam" and employer  ...  By providing a vehicle for identity sharing, the prototype provides the foundation for a future user study of employers and workers engaged in known-identity crowd labor relationships.  ...  Another popular method for fighting spam is the use of "gold" standard data.  ...
doi:10.1002/meet.2011.14504801257 fatcat:gvhmdy7fuzhdrhfqbhpnpnoucq


Reem Swuaileh, Mucahid Kutlu, Nihal Fathima, Tamer Elsayed, Matthew Lease
2016 Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval - SIGIR '16  
For IR researchers, we expect ArabicWeb16 to support various research areas: ad-hoc search, question answering, filtering, cross-dialect search, dialect detection, entity search, blog search, and spam  ...  We thank the Texas Advanced Computing Center (TACC) at the University of Texas at Austin for computing resources enabling this research.  ...  While we filtered blacklisted pages in selection of seeds, we intentionally did not filter out spams. Leaving spam present in ArabicWeb16 makes it useful for (Arabic) spam detection research.  ...
doi:10.1145/2911451.2914677 dblp:conf/sigir/SuwailehKFEL16 fatcat:pq6zmj7bhzgpzh2cxfondx7omy

Vandalism detection in Wikipedia

Sara Javanmardi, David W. McDonald, Cristina V. Lopes
2011 Proceedings of the 7th International Symposium on Wikis and Open Collaboration - WikiSym '11  
The application of machine learning techniques holds promise for developing efficient online algorithms for better tools to assist users in vandalism detection.  ...  We show the results of our classifier in the PAN Wikipedia dataset.  ...  Mola-Velasco for his feedback on implementing textual features, and Martin Potthast for his support.  ... 
doi:10.1145/2038558.2038573 dblp:conf/wikis/JavanmardiML11 fatcat:xxqwnftrefcf5fbgserbnbl54a

Retrieval and feedback models for blog feed search

Jonathan L. Elsas, Jaime Arguello, Jamie Callan, Jaime G. Carbonell
2008 Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '08  
We perform an in-depth analysis of the behavior of pseudorelevance feedback for this task and develop a novel query expansion technique using the link structure in Wikipedia.  ...  This query expansion technique provides significant and consistent performance improvements for this task, yielding a 22% and 14% improvement in MAP over the unexpanded query for our baseline and federated  ...  Spam was an issue for both PRF.FEED and PRF.ENTRY, but more of a problem for PRF.ENTRY.  ... 
doi:10.1145/1390334.1390394 dblp:conf/sigir/ElsasACC08 fatcat:bfrwvgs54ffbdcqntsij6cpxr4

Defacing the map: Cartographic vandalism in the digital commons [article]

Andrea Ballatore
2014 arXiv   pre-print
of reported incidents in WikiMapia and OpenStreetMap, a typology of this kind of vandalism is outlined, including play, ideological, fantasy, artistic, and industrial carto-vandalism, as well as carto-spam  ...  Automatic techniques are particularly important to counter carto-spam, which is one of the most threatening forms of carto-vandalism because of its for-profit motive.  ...  Wikified maps, free gazetteers, mash-ups, and open geo-knowledge bases create an inter-linked ecosystem of geospatial commons (Ballatore et al., 2013).  ...
arXiv:1404.3341v1 fatcat:gm2k5dvelbhsjikl7sqzm3rtcu

Community Kernels Detection in OSN using SVM Clustering and Classification

Rahul Nema, Anjana Pandey
2015 International Journal of Computer Applications  
There are various techniques implemented for the detection of community kernels in OSN.  ...  Here in this paper a new and efficient technique for the detection of community kernels in large OSN using a combinatorial method of support vector machine based clustering and classification of Community  ...  Boykin & Roychowdhury [14] propose an automated anti-spam tool that exploits the properties of social networks to distinguish between unsolicited commercial email (spam) and messages associated with  ...
doi:10.5120/19869-1854 fatcat:ei5vsjgxe5gwhlflbrw67rqbba