7,815 Hits in 5.9 sec

Robust web page segmentation for mobile terminal using content-distances and page layout information

Gen Hattori, Keiichiro Hoashi, Kazunori Matsumoto, Fumiaki Sugaya
2007 Proceedings of the 16th international conference on World Wide Web - WWW '07  
In this paper, we propose a hybrid segmentation method which segments Web pages based on both the content-distance calculated by the previous scheme, and a novel approach which utilizes Web page layout  ...  Therefore, a method to reconstruct PC-optimized Web pages for mobile phone users is essential.  ...  In previous work, we have proposed a Web page segmentation method, which divides a Web page into two or more small objects in order to make it viewable on a mobile phone [1].  ... 
doi:10.1145/1242572.1242622 dblp:conf/www/HattoriHMS07 fatcat:isjefmuyzfasfia5o2ng5sqjfe

A High-Level Representation of the Navigation Behavior of Website Visitors

Alicia Huidobro, Raúl Monroy, Bárbara Cervantes
2022 Applied Sciences  
In this paper, we present a method for representing the navigation behavior of an entire class of website visitors in a moderately small graph, aiming to ease the task of web analysis, especially in marketing  ...  Then, we replace those rules with a symbol that is given a representative name and use it to obtain a shrinked representation of a session.  ...  We also thank NIC Mexico for providing the data used in this research. Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/app12136711 fatcat:bfrxd2smaff5vmlxfireynx2f4

Automated classification and localization of daily deal content from the Web

John Cuzzola, Jelena Jovanović, Ebrahim Bagheri, Dragan Gašević
2015 Applied Soft Computing  
Our approach is based on a semi-supervised method that uses sentence-level features of daily deal information on a given Web page.  ...  and extract individual deals from within a complex Web page.  ...  "iterative shrinking and dividing" strategy.  ... 
doi:10.1016/j.asoc.2015.02.029 fatcat:gc34mhlg45fzlfiprmheqxkgam

ICDAR 2011 Robust Reading Competition - Challenge 1: Reading Text in Born-Digital Images (Web and Email)

D. Karatzas, S. Robles Mestre, J. Mas, F. Nourbakhsh, P. Pratim Roy
2011 2011 International Conference on Document Analysis and Recognition  
Challenge 1 is focused on the extraction of text from born-digital images, specifically from images found in Web pages and emails.  ...  In this paper we present the results of the challenge for all three tasks, and make an open call for continuous participation outside the context of ICDAR 2011.  ...  INTRODUCTION Images are frequently used in Web pages and email messages to embed textual information.  ... 
doi:10.1109/icdar.2011.295 dblp:conf/icdar/KaratzasMMNR11 fatcat:pp3kcsp3fbhqpjwki5uomvpbpa

A Semantic-Based Framework for Summarization and Page Segmentation in Web Mining [chapter]

Alessio Leoncini, Fabio Sangiacomo, Paolo Gastaldo, Rodolfo Zunino
2012 Theory and Applications for Advanced Text Mining  
Web page segmentation Website pages are designed for visual interaction, and typically include a number of visual segments conveying heterogeneous contents.  ...  Web page segmentation methods apply heuristic algorithms, and mainly rely on the Document Object Model (DOM) tree structure that is associated to a web resource.  ... 
doi:10.5772/51178 fatcat:wxiu7ncg2ncd7azhbndz4c4si4

A Visual Technique for Web Pages Comparison

María Alpuente, Daniel Romero
2009 Electronic Notes in Theoretical Computer Science  
This allows us to translate the web page to a normalized form where groups of HTML tags are mapped into a common canonical one.  ...  In this work, we present a technique for recognizing and comparing the visual structural information of Web pages. The technique is based on a classification of the set of HTML tags which is guided by  ...  By means of two compression functions we obtain a visual representative for the original Web page.  ... 
doi:10.1016/j.entcs.2009.03.002 fatcat:d7uiitocs5bfxhtz6u4lzjaotu

ICDAR 2013 Robust Reading Competition

Dimosthenis Karatzas, Faisal Shafait, Seiichi Uchida, Masakazu Iwamura, Lluis Gomez i Bigorda, Sergi Robles Mestre, Joan Mas, David Fernandez Mota, Jon Almazan Almazan, Lluis Pere de las Heras
2013 2013 12th International Conference on Document Analysis and Recognition  
Challenge 1 is focused on the extraction of text from born-digital images, specifically from images found in Web pages and emails.  ...  In this paper we present the results of the challenge for all three tasks, and make an open call for continuous participation outside the context of ICDAR 2011.  ...  INTRODUCTION Images are frequently used in Web pages and email messages to embed textual information.  ... 
doi:10.1109/icdar.2013.221 dblp:conf/icdar/KaratzasSUIBMMMAH13 fatcat:l2crnmcmmbh6biiprudkopbsea

Enhancing web page readability for non-native readers

Chen-Hsiang Yu, Robert C. Miller
2010 Proceedings of the 28th international conference on Human factors in computing systems - CHI '10  
In this paper, we focus on the presentation of content and propose a new transformation method, Jenga Format, to enhance web page readability.  ...  Readers face many obstacles on today's Web, including distracting content competing for the user's attention and other factors interfering with comfortable reading.  ...  ACKNOWLEDGMENTS We thank all the participants' help and reviewers' valuable suggestions. This work is supported in part by Quanta Computer as part of the TParty project.  ... 
doi:10.1145/1753326.1753709 dblp:conf/chi/YuM10 fatcat:g2y5v42dk5eb7in5jfrt6lkupa

A Realistic Dataset for Performance Evaluation of Document Layout Analysis

Apostolos Antonacopoulos, David Bridson, Christos Papadopoulos, Stefan Pletschacher
2009 2009 10th International Conference on Document Analysis and Recognition  
There is a significant need for a realistic dataset on which to evaluate layout analysis methods and examine their performance in detail.  ...  Ground truth is efficiently created using a new semi-automated tool and stored in a new comprehensive XML representation, the PAGE format.  ...  Acknowledgement The authors would like to thank Dimosthenis Karatzas for his significant contributions to previous versions of the Aletheia ground-truthing tool.  ... 
doi:10.1109/icdar.2009.271 dblp:conf/icdar/AntonacopoulosBPP09 fatcat:bsv54ehuyjc23g7rixqjrdlzlq

CaSePer: An efficient model for personalized web page change detection based on segmentation

K.S. Kuppusamy, G. Aghila
2014 Journal of King Saud University: Computer and Information Sciences  
The change detection is micro-managed by introducing web page segmentation. The web page change detection process is made efficient by having it perform a dual-step process.  ...  Because of the increased dynamism of web pages, it would be difficult for the user to identify the changes manually.  ...  These larger segments are then sub-divided using the densitometer.  ... 
doi:10.1016/j.jksuci.2013.02.001 fatcat:avuy3cbi2ncebe6vmjejp2tcoe

The Skincare project, an interactive deep learning system for differential diagnosis of malignant skin lesions. Technical Report [article]

Daniel Sonntag, Fabrizio Nunnari, Hans-Jürgen Profitlich
2020 arXiv   pre-print
ISIC allows for differential diagnosis, a ranked list of eight diagnoses, that is used to plan treatments in the common setting of diagnostic ambiguity.  ...  However, the main contribution is a diagnostic and decision support system in dermatology for patients and doctors, an interactive deep learning system for differential diagnosis of malignant skin lesions  ...  This research is partly supported by H2020 and BMBF. See our project page about Skincare http://medicalcps.dfki.de/?page_id=1056 and the Interactive Machine Learning Lab (http://iml.dfki.de).  ... 
arXiv:2005.09448v1 fatcat:ofwcqevb5fagvaxf6cxrtbtb24

Web content adaptation for mobile device: A fuzzy-based approach

2012 Knowledge Management & E-Learning: An International Journal  
Experimental results demonstrate that our method enhances Web content analysis and adaptation on the mobile Internet.  ...  While HTML will continue to be used to develop Web content, how to effectively and efficiently transform HTML-based content automatically into formats suitable for mobile devices remains a challenge.  ...  Acknowledgement This work is supported by National Science Council, Taiwan under grants NSC98-2511-S-008-006-MY3, NSC98-2511-S-008-007-MY3, and NSC 100-2511-S-146-001.  ... 
doi:10.34105/j.kmel.2012.04.010 fatcat:khb7egtjsnaxrpftrc4lpzyxb4

The Weed Plant Detection

Geetha V., Gomathy CK, Y.Padmini Reddy, Haripriya V.
2021 International Journal of Engineering and Advanced Technology  
Optical sensors changes to detect vary weed densities and species which can have mapped using GPS data.  ...  Weeds are extracted from the pictures that are using the image processing and therefore the report by the form features.  ...  Dr.C K Gomathy, Article: A Semantic Quality of Web Service Information Retrieval Techniques Using Bin Rank A Cloud Monitoring Framework Perform in Web Services, International Journal of Scientific Research  ... 
doi:10.35940/ijeat.d2454.0410421 fatcat:6ciexbiktvaldnwpwwukvl4vd4

Web-based algorithm animation

Marc Najork
2001 Proceedings of the 38th conference on Design automation - DAC '01  
JCAT augments the expressive power of Web pages for publishing passive multimedia information with a full-fledged interactive algorithm animation system.  ...  Web browsers at the appropriate page.  ...  or more views; and finally, creating web pages that make use of the algorithm and views.  ... 
doi:10.1145/378239.379013 dblp:conf/dac/Najork01 fatcat:g333hhwlcfbxnoh5amo53t25zq

Counteracting Dark Web Text-Based CAPTCHA with Generative Adversarial Learning for Proactive Cyber Threat Intelligence [article]

Ning Zhang, Mohammadreza Ebrahimi, Weifeng Li, Hsinchun Chen
2022 arXiv   pre-print
To eliminate the need for human involvement, the proposed framework utilizes Generative Adversarial Network (GAN) to counteract dark web background noise and leverages an enhanced character segmentation  ...  This framework encompasses a novel generative method to recognize dark web text-based CAPTCHA with noisy background and variable character length.  ...  ACKNOWLEDGMENTS This material is based upon work supported by the National Science Foundation (NSF) under the following grants: SaTC-1936370, CICI-1917117, and SFS-1921485.  ... 
arXiv:2201.02799v2 fatcat:nepnavt6onaf7ncinm2x6ar5ey