Cross-Modal Retrieval in the Cooking Context: Learning Semantic Text-Image Embeddings
[article]
2018
arXiv
pre-print
In this paper, we propose a cross-modal retrieval model aligning visual and textual data (like pictures of dishes and their recipes) in a shared representation space. ...
Designing powerful tools that support cooking activities has rapidly gained popularity due to the massive amounts of available data, as well as recent advances in machine learning that are capable of analyzing ...
within the Investissements d'Avenir program under reference ANR-11-LABX-65. ...
arXiv:1804.11146v1
fatcat:kihkqzsbqbebdkcuvcqdrq34ie
Cross-Modal Retrieval in the Cooking Context
2018
The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval - SIGIR '18
In this paper, we propose a cross-modal retrieval model aligning visual and textual data (like pictures of dishes and their recipes) in a shared representation space. ...
Designing powerful tools that support cooking activities has rapidly gained popularity due to the massive amounts of available data, as well as recent advances in machine learning that are capable of analyzing ...
within the Investissements d'Avenir program under reference ANR-11-LABX-65. ...
doi:10.1145/3209978.3210036
dblp:conf/sigir/CarvalhoCPSTC18
fatcat:lue266vpufhpveg6shnbgw3lfm
TextTopicNet - Self-Supervised Learning of Visual Features Through Embedding Images on Semantic Text Spaces
[article]
2018
arXiv
pre-print
We show that adequate visual features can be learned efficiently by training a CNN to predict the semantic textual context in which a particular image is more probable to appear as an illustration. ...
Our experiments demonstrate state-of-the-art performance in image classification, object detection, and multi-modal retrieval compared to recent self-supervised or naturally-supervised approaches. ...
We gratefully acknowledge the support of the NVIDIA Corporation with the donation of the Titan X Pascal GPU used for this research. ...
arXiv:1807.02110v1
fatcat:3qe3xgsuzfem5j5doiak5bexeq
Images and Recipes: Retrieval in the Cooking Context
2018
2018 IEEE 34th International Conference on Data Engineering Workshops (ICDEW)
Recent advances in the machine learning community have allowed different use cases to emerge, such as its association with domains like cooking, which created computational cuisine. ...
In this paper, we tackle the picture-recipe alignment problem, having as target application the large-scale retrieval task (finding a recipe given a picture, and vice versa). ...
In this paper, we are interested in smart retrieval between recipe component modalities (namely recipe texts and cooked dish pictures) in the cooking context. ...
doi:10.1109/icdew.2018.00035
dblp:conf/icde/CarvalhoCPSC18
fatcat:43smvsvqzvdj7dwd7susx7z6wm
Food recognition and recipe analysis: integrating visual content, context and external knowledge
[article]
2018
arXiv
pre-print
the restaurant context as emerging directions. ...
as the exploration and retrieval of food-related information. ...
Cross-modal recipe modeling and retrieval. Modeling the cross-modal correlation between recipes and images has multiple applications in recognition and retrieval. ...
arXiv:1801.07239v1
fatcat:kbcpto5iznhkddvdklwxxbtehm
Efficient Deep Feature Calibration for Cross-Modal Joint Embedding Learning
[article]
2021
arXiv
pre-print
This paper introduces a two-phase deep feature calibration framework for efficient learning of semantics enhanced text-image cross-modal joint embedding, which clearly separates the deep feature calibration ...
We leverage wideResNet50 to extract and encode the image category semantics to help semantic alignment of the learned recipe and image embeddings in the joint latent space. ...
in terms of both image-to-recipe and recipe-to-image cross-modal retrieval performance. ...
arXiv:2108.00705v1
fatcat:cggupnupfbehbfhzxdlcx3hp4m
Learning TFIDF Enhanced Joint Embedding for Recipe-Image Cross-Modal Retrieval Service
2021
IEEE Transactions on Services Computing
We present a Multi-modal Semantics enhanced Joint Embedding approach (MSJE) for learning a common feature space between the two modalities (text and image), with the ultimate goal of providing high-performance ...
cross-modal retrieval services. ...
The first author Zhongwei Xie has performed this work as a two-year visiting PhD student at Georgia Institute of Technology (2019-2021), under the support from China Scholarship Council (CSC) and Wuhan ...
doi:10.1109/tsc.2021.3098834
fatcat:p6qstgiejbe53p7gnyl2mrfxce
Variational Recurrent Sequence-to-Sequence Retrieval for Stepwise Illustration
[chapter]
2020
Lecture Notes in Computer Science
Unlike most cross-modal methods, we generate an image vector corresponding to the latent topic obtained from combining the text semantics and context. ...
This new task extends the traditional cross-modal retrieval, where each image-text pair is treated independently ignoring broader context. ...
Related Work. Our work is related to: cross-modal retrieval, story picturing, variational recurrent neural networks, and cooking recipe datasets.
Cross-Modal Retrieval. ...
doi:10.1007/978-3-030-45439-5_4
fatcat:bjd23a7kfnednokmphtjpg6ttm
R²GAN: Cross-Modal Recipe Retrieval With Generative Adversarial Network
2019
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
The motivation of using GAN is twofold: learning compatible cross-modal features in an adversarial way, and explanation of search results by showing the images generated from recipes. ...
Furthermore, empowered by the generated images, a two-level ranking loss in both embedding and image spaces is considered. ...
Acknowledgement The work described in this paper was fully supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (CityU 11203517). ...
doi:10.1109/cvpr.2019.01174
dblp:conf/cvpr/ZhuNCH19
fatcat:o2h5oqmohzcpdd2plfimfj5mxy
MCEN: Bridging Cross-Modal Gap between Cooking Recipes and Dish Images with Latent Variable Model
[article]
2020
arXiv
pre-print
In this paper, we focus on the task of cross-modal retrieval between food images and cooking recipes. ...
We present Modality-Consistent Embedding Network (MCEN) that learns modality-invariant representations by projecting images and texts to the same embedding space. ...
Acknowledge We would like to thank the reviewers for their detailed comments and constructive suggestions. ...
arXiv:2004.01095v1
fatcat:bgxe4ogkobeqth5uldyettrwiq
Cross-Modal Food Retrieval: Learning a Joint Embedding of Food Images and Recipes with Semantic Consistency and Attention Mechanism
[article]
2021
arXiv
pre-print
In this paper, we investigate cross-modal retrieval between food images and cooking recipes. ...
The goal is to learn an embedding of images and recipes in a common feature space, such that the corresponding image-recipe embeddings lie close to one another. ...
Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of National Research Foundation, Singapore. ...
arXiv:2003.03955v3
fatcat:aqy7vykr5favzdfhhamkz3k6wa
Learning Joint Embedding with Modality Alignments for Cross-Modal Retrieval of Recipes and Food Images
[article]
2021
arXiv
pre-print
This paper presents a three-tier modality alignment approach to learning text-image joint embedding, coined as JEMA, for cross-modal retrieval of cooking recipes and food images. ...
The third modality alignment incorporates two types of cross-modality alignments as the auxiliary loss regularizations to further reduce the alignment errors in the joint learning of the two modality-specific ...
The first author has performed this work as a two-year visiting PhD student at Georgia Institute of Technology (2019-2021), under the support from China Scholarship Council (CSC) and Wuhan University of ...
arXiv:2108.03788v1
fatcat:6vi5ileyq5cidk2pimegn4clfq
Learning Cross-Modal Embeddings for Cooking Recipes and Food Images
2017
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
We postulate that these embeddings will provide a basis for further exploration of the Recipe1M dataset and food and cooking in general. ...
In this paper, we introduce Recipe1M, a new large-scale, structured corpus of over 1m cooking recipes and 800k food images. ...
and the European Regional Development Fund (ERDF). ...
doi:10.1109/cvpr.2017.327
dblp:conf/cvpr/SalvadorHAMOW017
fatcat:dganhxaqebdrhfnvngdziivxe4
Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning
[article]
2021
arXiv
pre-print
Cross-modal recipe retrieval has recently gained substantial attention due to the importance of food in people's lives, as well as the availability of vast amounts of digital cooking recipes and food images ...
In this work, we revisit existing approaches for cross-modal recipe retrieval and propose a simplified end-to-end model based on well established and high performing encoders for text and images. ...
Cross-Modal Recipe Retrieval. Learning cross-modal embeddings for images and text is currently an active research area [19, 15, 18]. ...
arXiv:2103.13061v1
fatcat:smg4gd3hevgxtgg2f6swyvlt3a
Out of context: Computer systems that adapt to, and learn from, context
2000
IBM Systems Journal
These operations may be dependent on time, place, weather, user preferences, or the history of interaction. In other words, context. But what, exactly, is context? ...
We look at perspectives from software agents, sensors, and embedded devices, and also contrast traditional mathematical and formal approaches. We see how each treats the problem of context and discuss the implications ...
The availability of semantic knowledge bases such as WordNet 27 also encourages partial understanding of context expressed in natural language text. ...
doi:10.1147/sj.393.0617
fatcat:3roilp7exvbk3d3nteukkdyu3u