Mining the UK Web Archive for Semantic Change Detection

Adam Tsakalidis, Marya Bazzi, Mihai Cucuringu, Pierpaolo Basile, Barbara McGillivray (+2 others); The Alan Turing Institute, London, United Kingdom; University of Warwick, Coventry, United Kingdom; University of Oxford, Oxford, United Kingdom; University of Bari, Bari, Italy
2019 Proceedings - Natural Language Processing in a Deep Learning World  
Semantic change detection (i.e., identifying words whose meaning has changed over time) started emerging as a growing area of research over the past decade, with important downstream applications in natural  ...  In this work, we aim to mitigate these issues by (a) releasing a new labelled dataset of more than 47K word vectors trained on the UK Web Archive over a short time-frame (2000-2013); (b) proposing a variant  ...  Acknowledgments This work was supported by The Alan Turing Institute under the EPSRC grant EP/N510129/1 and the seed funding grant SF099.  ... 
doi:10.26615/978-954-452-056-4_139 dblp:conf/ranlp/TsakalidisBCBM19 fatcat:vgfs3axxybe7rfgnce74lvxn6a

Detecting Off-Topic Pages in Web Archives [chapter]

Yasmin AlNoamany, Michele C. Weigle, Michael L. Nelson
2015 Lecture Notes in Computer Science  
In this paper, we address the problems of detecting off-topic pages in Web archive collections.  ...  Web archives have become a significant repository of our recent history and cultural heritage. Archival integrity and accuracy is a precondition for future cultural research.  ...  Below, we outline some of the approaches that have been used for mining the past Web using data in Web archives.  ... 
doi:10.1007/978-3-319-24592-8_17 fatcat:wzct6nr2hnfgjbg23hzb37slvi

Large-Scale Multimedia Retrieval and Mining [Guest editors' introduction]

Rong Yan, Benoit Huet, Rahul Sukthankar
2011 IEEE Multimedia  
event detection, landmark detection, image annotation, musical content mining, and cloud computing.  ...  To overcome this drawback, Wu and Hoi propose an online semantics-preserving, metric-learning algorithm for enhancing BoW by minimizing the semantic loss.  ... 
doi:10.1109/mmul.2011.11 fatcat:vy7eqmqqlbeifh4gibkoerx7ie

Incorporating terminology evolution for query translation in text retrieval with association rules

Amal C. Kaluarachchi, Aparna S. Varde, Srikanta Bedathur, Gerhard Weikum, Jing Peng, Anna Feldman
2010 Proceedings of the 19th ACM international conference on Information and knowledge management - CIKM '10  
When these archives cover long spans of time, the terminology within them could undergo significant changes.  ...  Time-stamped documents such as newswire articles, blog posts and other web-pages are often archived online.  ...  We use the classical Apriori algorithm to mine association rules, for which we define transactions with respect to the text archives.  ... 
doi:10.1145/1871437.1871730 dblp:conf/cikm/KaluarachchiVBWPF10 fatcat:zeew7ywctfcfhasutkoiiov5je

Tracking entities in web archives

Marc Spaniol, Gerhard Weikum
2012 Proceedings of the 21st international conference companion on World Wide Web - WWW '12 Companion  
The LAWA project (Longitudinal Analytics of Web Archive data) is developing an Internet-based experimental testbed for large-scale data analytics on Web archive collections.  ...  In this paper, we highlight our research on entity-level analytics in Web archive data, which lifts Web analytics from plain text to the entity-level by detecting named entities, resolving ambiguous names  ...  Acknowledgements This work is supported by the 7th Framework IST programme of the European Union through the focused research project (STREP) on Longitudinal Analytics of Web Archive data (LAWA) under  ... 
doi:10.1145/2187980.2188030 dblp:conf/www/SpaniolW12 fatcat:zo2mbfifzbhmbnfxqod5trhl6i

Report on the First International Workshop on Database Preservation (PresDB'07)

Vassilis Christophides, Peter Buneman
2007 SIGMOD record  
This is possible because of Chronos' ability to detect, describe and manage the semantic and structural changes in the production database schema between any two subsequent executions of the archiving  ...  Thus, the main challenge is to study common formalisms to express the structure and semantics of a database-centric environment as well as to devise frameworks for responding to change.  ... 
doi:10.1145/1324185.1324197 fatcat:2kfwgeff3vhhfiomhflpird4ta


N. Anastopoulou, M. Kavouras, M. Kokla, E. Tomai
2021 The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences  
Semantic information extraction in geospatial-oriented approaches is further used for semantic analysis, search, and retrieval.  ...  For the better understanding of the above-mentioned information, semantic networks are used as a powerful visualization tool of the links among concepts – locations – emotions.  ...  ACKNOWLEDGEMENTS The research work was supported by the Hellenic Foundation for Research and Innovation (H.F.R.I.) under the First Call for H.F.R.I.  ... 
doi:10.5194/isprs-archives-xliii-b4-2021-31-2021 fatcat:q7efkmge7beejiebtymh46656u

Blog Preservation: Current Challenges and a New Paradigm [chapter]

Vangelis Banos, Nikos Baltas, Yannis Manolopoulos
2013 Lecture Notes in Business Information Processing  
We argue that current web archiving solutions are not able to capture the dynamic and continuously evolving nature of blogs, their network and social structure as well as the exchange of concepts and ideas  ...  Blogging is yet another popular and prominent application in the era of Web 2.0.  ...  We would also like to thank all BlogForever project partners for their invaluable contributions to the project.  ... 
doi:10.1007/978-3-642-40654-6_3 fatcat:3o4swju4abal5ak4dfn6daxvqi

The ARCOMEM Architecture for Social- and Semantic-Driven Web Archiving

Thomas Risse, Elena Demidova, Stefan Dietze, Wim Peters, Nikolaos Papailiou, Katerina Doka, Yannis Stavrakas, Vassilis Plachouras, Pierre Senellart, Florent Carpentier, Amin Mantrach, Bogdan Cautis (+2 others)
2014 Future Internet  
The constantly growing amount of Web content and the success of the Social Web lead to increasing needs for Web archiving. These needs go beyond the pure preservation of Web pages.  ...  Due to the size of the Web, the traditional "collect-all" strategy is in many cases not the best method to build Web archives.  ...  Conflicts of Interest Thomas Risse and Wim Peters are co-editors of the Special Issue on Archiving Community Memories.  ... 
doi:10.3390/fi6040688 fatcat:jm7aicz6trfadnamhqnfsltcvy

Protein Data Bank Japan (PDBj): updated user interfaces, resource description framework, analysis tools for large structures

Akira R. Kinjo, Gert-Jan Bekker, Hirofumi Suzuki, Yuko Tsuchiya, Takeshi Kawabata, Yasuyo Ikegawa, Haruki Nakamura
2016 Nucleic Acids Research  
While maintaining the archive in collaboration with other wwPDB partners, PDBj also provides a wide range of services and tools for analyzing structures and functions of proteins.  ...  We herein outline the updated web user interfaces together with RESTful web services and the backend relational database that support the former.  ...  T. thanks Shigeru Endo for helping with the ProMode Elastic service.  ... 
doi:10.1093/nar/gkw962 pmid:27789697 pmcid:PMC5210648 fatcat:pan6qdrpe5bknfgkot6blgnp6u

Literature Retrieval and Mining in Bioinformatics: State of the Art and Challenges

Andrea Manconi, Eloisa Vargiu, Giuliano Armano, Luciano Milanesi
2012 Advances in Bioinformatics  
The world has widely changed in terms of communicating, acquiring, and storing information.  ...  In this paper, after recalling the main topics concerning information retrieval, we present a survey on the main works on literature retrieval and mining in bioinformatics.  ...  (RBIN064YAT 003), and the European "SHIWA" projects.  ... 
doi:10.1155/2012/573846 pmid:22778730 pmcid:PMC3388278 fatcat:gjemmg4e5jg3zkxyfzfknbisfa

Big Humanities Data Workshop at IEEE Big Data 2013

Tobias Blanke, Mark Hedges, Richard Marciano
2014 D-Lib Magazine  
The curious identity of Michael Field and the semantic web, presented by John Simpson, University of Alberta, Canada.  ...  A working example is proposed where semantic web ontologies reveal a lack of nuances in dealing with the complex relationships between names and people.  ... 
doi:10.1045/january2014-blanke fatcat:kcjuz5wmdvdqbbsjqjvilemcty

Using Web Archives to Enrich the Live Web Experience Through Storytelling

Yasmin AlNoamany
2013 Bulletin of IEEE Technical Committee on Digital Libraries  
Content from web archives can be used to fill in the gaps in the live web about the evolution of the story of an important event. Every story is made up of a sequence of events.  ...  In this research, events are exemplified through corresponding web pages from the live web and web archives, (semi-)automatically discovered, arranged in a narrative structure ordered by time, and replayed  ...  We thank Kris Carpenter Negulescu (Internet Archive) for access to the anonymized Wayback Machine logs.  ... 
dblp:journals/tcdl/AlNoamany13 fatcat:avs7k5fnkbfpbbory4os5dmqjm

The past issue of the web

Helen Hockx-Yu
2011 Proceedings of the 3rd International Web Science Conference on - WebSci '11  
the web archiving agenda.  ...  The paper argues for closer collaboration with the main stream web science research community and the use of technology developed for the live web, such as visualisation and data analytics, to advance  ...  Web analysis, Web crawling and mining, event and topic detection and consolidation, and multimedia content mining.  ... 
doi:10.1145/2527031.2527050 dblp:conf/websci/Hockx-Yu11 fatcat:kdbhuomqk5an5grouch4kwe2xe

eScience and archiving for space science

Timothy E. Eastman, Kirk D. Borne, James L. Green, Edwin J. Grayzeck, Robert E. McGuire, Donald M. Sawyer
2005 Data Science Journal  
A confluence of new technologies (internet, XML and Web Services, broadband networking, high-speed computation, distributed Grid computing, ontologies and semantic representation) is dramatically changing  ...  The need for this Data-Model-HPC-Sensor synergism derives from the following set of drivers.  ...  SOLUTIONS • Distributed data environments • Grid Services (interoperability; semantic web) • eScience, virtual observatories, data grids • Knowledge discovery, data mining • Data archive standards • Sensor  ... 
doi:10.2481/dsj.4.67 fatcat:epc4a6apojcjrliksf6q54pybe