Adversarial Multimodal Representation Learning for Click-Through Rate Prediction [article]

Xiang Li, Chao Wang, Jiwei Tan, Xiaoyi Zeng, Dan Ou, Bo Zheng
2020 pre-print
For better user experience and business effectiveness, Click-Through Rate (CTR) prediction has been one of the most important tasks in E-commerce.  ...  We propose a novel Multimodal Adversarial Representation Network (MARN) for the CTR prediction task.  ... 
doi:10.1145/3366423.3380163 arXiv:2003.07162v1 fatcat:lfgwi2neibhvbhhoc7kt5xlj74

ACE-BERT: Adversarial Cross-modal Enhanced BERT for E-commerce Retrieval [article]

Boxuan Zhang, Chao Wei, Yan Jin, Weiru Zhang
2021 arXiv   pre-print
With the pre-trained enhanced BERT as the backbone network, ACE-BERT further adopts adversarial learning by adding a domain classifier to ensure the distribution consistency of different modality representations  ...  We propose a novel Adversarial Cross-modal Enhanced BERT (ACE-BERT) for efficient E-commerce retrieval. In detail, ACE-BERT leverages the patch features and pixel features as image representation.  ...  For offline evaluation on the retrieval task, we sample the test set from the users' click-through records in next days.  ... 
arXiv:2112.07209v1 fatcat:fa3fvsvgojeopginqgdijxt3aq

Dual Adversarial Variational Embedding for Robust Recommendation [article]

Qiaomin Yi, Ning Yang, Philip S. Yu
2021 arXiv   pre-print
In this paper, we propose a novel model called Dual Adversarial Variational Embedding (DAVE) for robust recommendation, which can provide personalized noise reduction for different users and items, and  ...  Robust recommendation aims at capturing the true preference of users from noisy data, for which two lines of methods have been proposed.  ...  F , and use RMSprop to learn the discriminators D_u and D_v, where the learning rate is set to 0.0001.  ... 
arXiv:2106.15779v1 fatcat:3776g6w36rfjnnsmhpu3dgfuha

COBRA: Data-Efficient Model-Based RL through Unsupervised Object Discovery and Curiosity-Driven Exploration [article]

Nicholas Watters, Loic Matthey, Matko Bosnjak, Christopher P. Burgess, Alexander Lerchner
2019 arXiv   pre-print
Data efficiency and robustness to task-irrelevant perturbations are long-standing challenges for deep reinforcement learning algorithms.  ...  Subsequently, it can learn a variety of tasks through model-based search in very few steps and excel on structured hold-out tests of policy robustness.  ...  Acknowledgments We would like to thank Matt Botvinick, Tiago Ramalho, Tejas Kulkarni, and Csaba Szepesvari for helpful discussions and insights.  ... 
arXiv:1905.09275v2 fatcat:gvkoisskqnadvghggtc7i3xnxy

Interest-Related Item Similarity Model Based on Multimodal Data for Top-N Recommendation [article]

Junmei Lv, Bin Song, Jie Guo, Xiaojiang Du, Mohsen Guizani
2019 arXiv   pre-print
However, due to the representation gap between different modalities, it is intractable to effectively use unstructured multimodal data to improve the efficiency of recommendation systems.  ...  The multimodal feature learning module adds a knowledge sharing unit among different modalities. Then IRN learns the interest relevance between the target item and different historical items respectively.  ...  The representation learning of item content is essential for recommendation, including the representation learning of single-modal data [8]-[10] and multimodal data [11], [12].  ... 
arXiv:1902.05566v1 fatcat:w6sa5dnjvremzhyhujfk3d5ce4

Multimodal Fusion with BERT and Attention Mechanism for Fake News Detection [article]

Nguyen Manh Duc Tuan, Pham Quang Nhat Minh
2021 arXiv   pre-print
In this paper, we present a novel method for detecting fake news by fusing multimodal features derived from textual and visual data.  ...  Specifically, we used a pre-trained BERT model to learn text features and a VGG-19 model pre-trained on the ImageNet dataset to extract image features.  ...  We trained the model with 10 epochs, batch size of 256, and the Adam optimizer with the learning rate of 1e-4. B.  ... 
arXiv:2104.11476v2 fatcat:cgsejtbhvbcfbjcofvsllftijm

Interest-Related Item Similarity Model Based on Multimodal Data for Top-N Recommendation

Junmei Lv, Bin Song, Jie Guo, Xiaojiang Du, Mohsen Guizani
2019 IEEE Access  
However, due to the representation gap between different modalities, it is intractable to effectively use unstructured multimodal data to improve the efficiency of recommendation systems.  ...  The multimodal feature learning module adds knowledge sharing unit among different modalities. Then, IRN learns the interest relevance between target item and different historical items respectively.  ...  The representation learning of item content is essential for recommendation, including the representation learning of single-modal data [8]-[10] and multimodal data [11], [12].  ... 
doi:10.1109/access.2019.2893355 fatcat:xxhijrlfrjhb3fofkwgqicbcv4

Personalized News Recommendation: Methods and Challenges [article]

Chuhan Wu, Fangzhao Wu, Yongfeng Huang, Xing Xie
2022 arXiv   pre-print
Next, we introduce the public datasets and evaluation methods for personalized news recommendation.  ...  We first review the techniques for tackling each core problem in a personalized news recommender system and the challenges they face.  ...  In addition to the context features mentioned above, several methods also explore to use weather [177], click-through rate (CTR) [20], and fact/opinion bias [121] to enrich the representations of  ... 
arXiv:2106.08934v3 fatcat:iagqsw73hrehxaxpvpydvtr26m

The Future of Misinformation Detection: New Perspectives and Trends [article]

Bin Guo, Yasan Ding, Lina Yao, Yunji Liang, Zhiwen Yu
2019 arXiv   pre-print
We first give a brief review of the literature history of MID, based on which we present several new research challenges and techniques of it, including early detection, detection by multimodal data fusion  ...  Finally, we give our own views on the open issues and future research directions of MID, such as model adaptivity/generality to new events, embracing of novel machine learning models, explanatory detection  ...  Deep learning-based methods prevailingly learn the latent depth representation of misinformation through neural networks. Though much effort has been made on MID in the past years, there are still numerous  ... 
arXiv:1909.03654v1 fatcat:34h2os2pzrbm3kqluk5uajtr6i

A Word is Worth A Thousand Dollars: Adversarial Attack on Tweets Fools Stock Prediction [article]

Yong Xie, Dakuo Wang, Pin-Yu Chen, Jinjun Xiong, Sijia Liu, Sanmi Koyejo
2022 arXiv   pre-print
More and more investors and machine learning models rely on social media (e.g., Twitter and Reddit) to gather real-time information and sentiment to predict stock price movements.  ...  In this paper, we experiment with a variety of adversarial attack configurations to fool three stock prediction victim models.  ...  Learning phrase representations using RNN encoder-decoder for statistical machine translation.  ... 
arXiv:2205.01094v2 fatcat:4qbgtea3c5ddjhxgk6brjv4vfy

Video-to-Video Synthesis [article]

Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Guilin Liu, Andrew Tao, Jan Kautz, Bryan Catanzaro
2018 arXiv   pre-print
In this paper, we propose a novel video-to-video synthesis approach under the generative adversarial learning framework.  ...  Through carefully-designed generator and discriminator architectures, coupled with a spatio-temporal adversarial objective, we achieve high-resolution, photorealistic, temporally coherent video results  ...  We also thank Lisa Rhee and Miss Ketsuki for allowing us to use their dance videos for training. We thank William S. Peebles for proofreading the paper.  ... 
arXiv:1808.06601v2 fatcat:fkaob5ol4bdglfgr2mfyhhekfa

A Cross-Media Advertising Design and Communication Model Based on Feature Subspace Learning

Shanshan Li, Gengxin Sun
2022 Computational Intelligence and Neuroscience  
space, this paper proposes a discriminative feature subspace learning model based on Low-Rank Representation (LRR), which explores the local structure of samples through Low-Rank Representation and uses  ...  different modalities, so that the learned shared subspace is more discriminative; meanwhile, it proposes realizing cross-modal retrieval by the deep convolutional generative adversarial network, using  ...  relations in click data.  ... 
doi:10.1155/2022/5874722 fatcat:qsh45riw7vfhfjrvorwwamf7ne

Benchmarking Multimodal AutoML for Tabular Data with Text Fields [article]

Xingjian Shi, Jonas Mueller, Nick Erickson, Mu Li, Alexander J. Smola
2021 arXiv   pre-print
We consider the use of automated supervised learning systems for data tables that not only contain numeric/categorical columns, but one or more text fields as well.  ...  Our publicly-available benchmark enables researchers to comprehensively evaluate their own methods for supervised learning with numeric, categorical, and text features.  ...  B.3 Neural Network Optimization All text/multimodal neural networks are trained with the slanted triangular learning rate scheduler [79] with initial learning rate set to 0.0, the maximal learning rate  ... 
arXiv:2111.02705v1 fatcat:kvnyjxgkqbdbpedbgat433v5uu

Design and Implementation of Continuous Authentication Mechanism Based on Multimodal Fusion Mechanism

Jianfeng Guan, Xuetao Li, Ying Zhang, Karl Andersson
2021 Security and Communication Networks  
Most of the current authentication mechanisms adopt the "one-time authentication," which authenticate users for initial access.  ...  In this case, after an illegal user completes authentication through identity forgery or a malicious user completes authentication by hijacking a legitimate user, his or her behaviour will become uncontrollable  ...  Currently, lots of multimodal continuous authentications are proposed in smartphone, IoT [49] [50] [51]. The key points of multimodal fusion continuous authentication are the association, unified representation  ... 
doi:10.1155/2021/6669429 fatcat:4buksf4fzbek7nqa72zmiqojxe

Learning from Multi-domain Artistic Images for Arbitrary Style Transfer [article]

Zheng Xu, Michael Wilber, Chen Fang, Aaron Hertzmann, Hailin Jin
2019 arXiv   pre-print
Besides the traditional content and style representation based on deep features and statistics for textures, we use adversarial networks to regularize the generation of stylized images.  ...  Our adversarial network learns the intrinsic property of image styles from large-scale multi-domain artistic images.  ...  We use Adam optimizer with prediction method [YSX*18] with learning rate 2e-4 and parameter β1 = 0.5, β2 = 0.9.  ... 
arXiv:1805.09987v2 fatcat:5pgj2p4dibavbbqjqog7yiyx4y