De Roncesvalles a San Millán. Dibujos de Valentín Carderera y Jaume Serra i Gibert

Antón Pombo
2010 Ad Limina  
Serra i Gibert (1865).  ...  Within the preceding framework, we must place the contributions of Jaume Serra i Gibert (1834-1877) and Valentín Carderera y Solano (1796-1880), both collaborators of Pedro de Madrazo who, in August 1865  ... 
doaj:798faa0badea42ef83c016b92bbd4077 fatcat:tei46hsssvel7mkowtoqd2wcze

Location Sensitive Image Retrieval and Tagging [article]

Raul Gomez, Jaume Gibert, Lluis Gomez, Dimosthenis Karatzas
2020 arXiv   pre-print
People from different parts of the globe describe objects and concepts in distinct manners. Visual appearance can thus vary across different geographic locations, which makes location a relevant contextual information when analysing visual data. In this work, we address the task of image retrieval related to a given tag conditioned on a certain location on Earth. We present LocSens, a model that learns to rank triplets of images, tags and coordinates by plausibility, and two training strategies to balance the location influence in the final ranking. LocSens learns to fuse textual and location information of multimodal queries to retrieve related images at different levels of location granularity, and successfully utilizes location information to improve image tagging.
arXiv:2007.03375v1 fatcat:5yd2fzjflzbrdnkd4ew2f3pqdi

Exploring Hate Speech Detection in Multimodal Publications [article]

Raul Gomez, Jaume Gibert, Lluis Gomez, Dimosthenis Karatzas
2019 arXiv   pre-print
In this work we target the problem of hate speech detection in multimodal publications formed by a text and an image. We gather and annotate a large scale dataset from Twitter, MMHS150K, and propose different models that jointly analyze textual and visual information for hate speech detection, comparing them with unimodal detection. We provide quantitative and qualitative results and analyze the challenges of the proposed task. We find that, even though images are useful for the hate speech detection task, current multimodal models cannot outperform models analyzing only text. We discuss why and open the field and the dataset for further research.
arXiv:1910.03814v1 fatcat:lx3qhyvce5hn3drszus77on54a

Self-Supervised Learning from Web Data for Multimodal Retrieval [article]

Raul Gomez, Lluis Gomez, Jaume Gibert, Dimosthenis Karatzas
2019 arXiv   pre-print
Self-Supervised learning from multimodal image and text data allows deep neural networks to learn powerful features with no need of human annotated data. Web and Social Media platforms provide a virtually unlimited amount of this multimodal data. In this work we propose to exploit this freely available data to learn a multimodal image and text embedding, aiming to leverage the semantic knowledge learnt in the text domain and transfer it to a visual model for semantic image retrieval. We demonstrate that the proposed pipeline can learn from images with associated text without supervision and analyze the semantic structure of the learnt joint image and text embedding space. We perform a thorough analysis and performance comparison of five different state of the art text embeddings in three different benchmarks. We show that the embeddings learnt with Web and Social Media data have competitive performances over supervised methods in the text based image retrieval task, and we clearly outperform state of the art in the MIRFlickr dataset when training in the target data. Further, we demonstrate how semantic multimodal image retrieval can be performed using the learnt embeddings, going beyond classical instance-level retrieval problems. Finally, we present a new dataset, InstaCities1M, composed of Instagram images and their associated texts that can be used for fair comparison of image-text embeddings.
arXiv:1901.02004v1 fatcat:wpibqwyf2rax7ltrahjnw6vvxy

Extended Labeled Faces in-the-Wild (ELFW): Augmenting Classes for Face Segmentation [article]

Rafael Redondo, Jaume Gibert
2020 arXiv   pre-print
Existing face datasets often lack sufficient representation of occluding objects, which can hinder recognition, but also supply meaningful information to understand the visual context. In this work, we introduce Extended Labeled Faces in-the-Wild (ELFW), a dataset supplementing with additional face-related categories -- and also additional faces -- the originally released semantic labels in the vastly used Labeled Faces in-the-Wild (LFW) dataset. Additionally, two object-based data augmentation techniques are deployed to synthetically enrich under-represented categories which, in benchmarking experiments, reveal that not only segmenting the augmented categories improves, but also the remaining ones benefit.
arXiv:2006.13980v1 fatcat:l4o5zqacdrab3n6swrca4vocye

Learning to Learn from Web Data through Deep Semantic Embeddings [article]

Raul Gomez, Lluis Gomez, Jaume Gibert, Dimosthenis Karatzas
2018 arXiv   pre-print
In this paper we propose to learn a multimodal image and text embedding from Web and Social Media data, aiming to leverage the semantic knowledge learnt in the text domain and transfer it to a visual model for semantic image retrieval. We demonstrate that the pipeline can learn from images with associated text without supervision and perform a thorough analysis of five different text embeddings in three different benchmarks. We show that the embeddings learnt with Web and Social Media data have competitive performances over supervised methods in the text based image retrieval task, and we clearly outperform state of the art in the MIRFlickr dataset when training in the target data. Further we demonstrate how semantic multimodal image retrieval can be performed using the learnt embeddings, going beyond classical instance-level retrieval problems. Finally, we present a new dataset, InstaCities1M, composed of Instagram images and their associated texts that can be used for fair comparison of image-text embeddings.
arXiv:1808.06368v1 fatcat:m4dtjcxevnfmdgz7z47rzulv3e

Graph embedding in vector spaces by node attribute statistics

Jaume Gibert, Ernest Valveny, Horst Bunke
2012 Pattern Recognition  
Graph-based representations are of broad use and applicability in pattern recognition. They exhibit, however, a major drawback with regards to the processing tools that are available in their domain. Graph embedding into vector spaces is a growing field among the structural pattern recognition community which aims at providing a feature vector representation for every graph, and thus enables classical statistical learning machinery to be used on graph-based input patterns. In this work, we propose a novel embedding methodology for graphs with continuous node attributes and unattributed edges. The approach presented in this paper is based on statistics of the node labels and the edges between them, based on their similarity to a set of representatives. We specifically deal with an important issue of this methodology, namely, the selection of a suitable set of representatives. In an experimental evaluation, we empirically show the advantages of this novel approach in the context of different classification problems using several databases of graphs.
doi:10.1016/j.patcog.2012.01.009 fatcat:ubz2fdx7p5hmjp56luihghyeqa

AI in the media and creative industries [article]

Giuseppe Amato, Malte Behrmann, Frédéric Bimbot, Ander Garcia, Joost Geurts, Jaume Gibert, Guillaume Gravier, Antoine Liutkus, Andrew Perkis, Emmanuel Vincent
2019 arXiv   pre-print
Thanks to the Big Data revolution and increasing computing capacities, Artificial Intelligence (AI) has made an impressive revival over the past few years and is now omnipresent in both research and industry. The creative sectors have always been early adopters of AI technologies and this continues to be the case. As a matter of fact, recent technological developments keep pushing the boundaries of intelligent systems in creative applications: the critically acclaimed movie "Sunspring", ... in 2016, was entirely written by AI technology, and the first-ever Music Album, called "Hello World", produced using AI has been released this year. Simultaneously, the exploratory nature of the creative process is raising important technical challenges for AI such as the ability for AI-powered techniques to be accurate under limited data resources, as opposed to the conventional "Big Data" approach, or the ability to process, analyse and match data from multiple modalities (text, sound, images, etc.) at the same time. The purpose of this white paper is to understand future technological advances in AI and their growing impact on creative industries. This paper addresses the following questions: Where does AI operate in creative Industries? What is its operative role? How will AI transform creative industries in the next ten years? This white paper aims to provide a realistic perspective of the scope of AI actions in creative industries, proposes a vision of how this technology could contribute to research and development works in such context, and identifies research and development challenges.
arXiv:1905.04175v1 fatcat:r6w6bord75flli72j5vpv3vvky

Learning from #Barcelona Instagram data what Locals and Tourists post about its Neighbourhoods [article]

Raul Gomez, Lluis Gomez, Jaume Gibert, Dimosthenis Karatzas
2018 arXiv   pre-print
Massive tourism is becoming a big problem for some cities, such as Barcelona, due to its concentration in some neighborhoods. In this work we gather Instagram data related to Barcelona consisting of image-caption pairs and, using the text as a supervisory signal, we learn relations between images, words and neighborhoods. Our goal is to learn which visual elements appear in photos when people are posting about each neighborhood. We perform a language separate treatment of the data and show that it can be extrapolated to a tourists and locals separate analysis, and that tourism is reflected in Social Media at a neighborhood level. The presented pipeline allows analyzing the differences between the images that tourists and locals associate to the different neighborhoods. The proposed method, which can be extended to other cities or subjects, proves that Instagram data can be used to train multi-modal (image and text) machine learning models that are useful to analyze publications about a city at a neighborhood level. We publish the collected dataset, InstaBarcelona, and the code used in the analysis.
arXiv:1808.06369v1 fatcat:26s2i52z3rd4tmzfw47y7ahuvu

Graph of Words Embedding for Molecular Structure-Activity Relationship Analysis [chapter]

Jaume Gibert, Ernest Valveny, Horst Bunke
2010 Lecture Notes in Computer Science  
Structure-Activity relationship analysis aims at discovering chemical activity of molecular compounds based on their structure. In this article we make use of a particular graph representation of molecules and propose a new graph embedding procedure to solve the problem of structure-activity relationship analysis. The embedding is essentially an arrangement of a molecule in the form of a vector by considering frequencies of appearing atoms and frequencies of covalent bonds between them. Results on two benchmark databases show the effectiveness of the proposed technique in terms of recognition accuracy while avoiding high operational costs in the transformation.
doi:10.1007/978-3-642-16687-7_9 fatcat:mwsh5zy6snc6xa6rjpcjqmdu7q
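
Note: the abstract above describes the embedding only in words (a vector of atom frequencies and of bond frequencies between atom types). The following is a minimal sketch of that idea under stated assumptions, not the authors' implementation: the function name `label_frequency_embedding`, its parameters, and the input layout (a dict of node labels plus a list of undirected edges) are all hypothetical, and bond types are ignored.

```python
from collections import Counter
from itertools import combinations_with_replacement


def label_frequency_embedding(node_labels, edges, alphabet):
    """Map a labelled graph to a fixed-length vector of counts.

    node_labels: dict mapping node id -> label (e.g. an atom symbol)
    edges:       iterable of undirected (u, v) node-id pairs
    alphabet:    all labels that may occur; fixes the vector layout
    (All names here are illustrative assumptions.)
    """
    # Count how often each label appears among the nodes.
    node_counts = Counter(node_labels.values())

    # Count edges per unordered pair of endpoint labels.
    edge_counts = Counter()
    for u, v in edges:
        pair = tuple(sorted((node_labels[u], node_labels[v])))
        edge_counts[pair] += 1

    # Layout: one entry per label, then one entry per unordered label pair.
    labels = sorted(alphabet)
    vector = [node_counts[a] for a in labels]
    vector += [edge_counts[p] for p in combinations_with_replacement(labels, 2)]
    return vector


# Water: one oxygen bonded to two hydrogens.
water = label_frequency_embedding(
    {0: "O", 1: "H", 2: "H"}, [(0, 1), (0, 2)], ["H", "O"]
)
print(water)  # [2, 1, 0, 2, 0] -> #H, #O, #H-H, #H-O, #O-O
```

Because the layout depends only on the label alphabet, every graph maps to a vector of the same length, which is what allows standard vector-based classifiers to be applied afterwards.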

Learning to Learn from Web Data Through Deep Semantic Embeddings [chapter]

Raul Gomez, Lluis Gomez, Jaume Gibert, Dimosthenis Karatzas
2019 Lecture Notes in Computer Science  
In this paper we propose to learn a multimodal image and text embedding from Web and Social Media data, aiming to leverage the semantic knowledge learnt in the text domain and transfer it to a visual model for semantic image retrieval. We demonstrate that the pipeline can learn from images with associated text without supervision and perform a thorough analysis of five different text embeddings in three different benchmarks. We show that the embeddings learnt with Web and Social Media data have competitive performances over supervised methods in the text based image retrieval task, and we clearly outperform state of the art in the MIRFlickr dataset when training in the target data. Further we demonstrate how semantic multimodal image retrieval can be performed using the learnt embeddings, going beyond classical instance-level retrieval problems. Finally, we present a new dataset, InstaCities1M, composed of Instagram images and their associated texts that can be used for fair comparison of image-text embeddings.
doi:10.1007/978-3-030-11024-6_40 fatcat:crzepesrz5bglj3bmunkldk6ey

Graph Embedding Based on Nodes Attributes Representatives and a Graph of Words Representation [chapter]

Jaume Gibert, Ernest Valveny
2010 Lecture Notes in Computer Science  
Although graph embedding has recently been used to extend statistical pattern recognition techniques to the graph domain, some existing embeddings are usually computationally expensive as they rely on classical graph-based operations. In this paper we present a new way to embed graphs into vector spaces by first encapsulating the information stored in the original graph under another graph representation by clustering the attributes of the graphs to be processed. This new representation makes the association of graphs to vectors an easy step by just arranging both node attributes and the adjacency matrix in the form of vectors. To test our method, we use two different databases of graphs whose node attributes are of different nature. A comparison with a reference method shows that this new embedding is better in terms of classification rates, while being much faster.
doi:10.1007/978-3-642-14980-1_21 fatcat:63ksqkcwl5f3rpf35neyqp6dmi

EMBEDDING OF GRAPHS WITH DISCRETE ATTRIBUTES VIA LABEL FREQUENCIES

JAUME GIBERT, ERNEST VALVENY, HORST BUNKE
2013 International journal of pattern recognition and artificial intelligence  
Gibert, E. Valveny and H. Bunke  ... 
doi:10.1142/s0218001413600021 fatcat:mhfocovaprgt3or4jaqa7kno7i

A kernel-based approach to document retrieval

Albert Gordo, Jaume Gibert, Ernest Valveny, Marçal Rusiñol
2010 Proceedings of the 8th IAPR International Workshop on Document Analysis Systems - DAS '10  
In this paper we tackle the problem of document image retrieval by combining a similarity measure between documents and the probability that a given document belongs to a certain class. The membership probability to a specific class is computed using Support Vector Machines in conjunction with similarity measure based kernel applied to structural document representations. In the presented experiments, we use different document representations, both visual and structural, and we apply them to a database of historical documents. We show how our method based on similarity kernels outperforms the usual distance-based retrieval.
doi:10.1145/1815330.1815379 dblp:conf/das/GordoGVR10 fatcat:gxleqdnwibbrhihnpozsxavewm

Learning from #Barcelona Instagram Data What Locals and Tourists Post About Its Neighbourhoods [chapter]

Raul Gomez, Lluis Gomez, Jaume Gibert, Dimosthenis Karatzas
2019 Lecture Notes in Computer Science  
Massive tourism is becoming a big problem for some cities, such as Barcelona, due to its concentration in some neighborhoods. In this work we gather Instagram data related to Barcelona consisting of image-caption pairs and, using the text as a supervisory signal, we learn relations between images, words and neighborhoods. Our goal is to learn which visual elements appear in photos when people are posting about each neighborhood. We perform a language separate treatment of the data and show that it can be extrapolated to a tourists and locals separate analysis, and that tourism is reflected in Social Media at a neighborhood level. The presented pipeline allows analyzing the differences between the images that tourists and locals associate to the different neighborhoods. The proposed method, which can be extended to other cities or subjects, proves that Instagram data can be used to train multi-modal (image and text) machine learning models that are useful to analyze publications about a city at a neighborhood level. We publish the collected dataset, InstaBarcelona, and the code used in the analysis.
doi:10.1007/978-3-030-11024-6_41 fatcat:edtpyjbil5eedn6smpkhxqwyiy