
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections [article]

Chenliang Li, Haiyang Xu, Junfeng Tian, Wei Wang, Ming Yan, Bin Bi, Jiabo Ye, Hehong Chen, Guohai Xu, Zheng Cao, Ji Zhang, Songfang Huang (+3 others)
2022 arXiv   pre-print
To address these problems, mPLUG introduces an effective and efficient vision-language architecture with novel cross-modal skip-connections, which creates inter-layer shortcuts that skip a certain number  ...  This paper presents mPLUG, a new vision-language foundation model for both cross-modal understanding and generation.  ...  Effectiveness and Efficiency To validate the effectiveness and efficiency of our proposed cross-modal skip-connected network, we conduct in-depth analysis on different stride values and various cross-modal  ... 
arXiv:2205.12005v2 fatcat:cck3km3syjdytc5so2gzglucni

Great Expectations: Unsupervised Inference of Suspense, Surprise and Salience in Storytelling [article]

David Wilmot
2022 arXiv   pre-print
Narrative theory methods (rules and procedures) are applied to the knowledge built into deep learning models to directly infer suspense, surprise, and salience in stories.  ...  Crucial to creating dramatic and exciting stories are surprise and suspense. The thesis trains a series of deep learning models solely by reading stories, a self-supervised (or unsupervised) system.  ...  ., 2022) has a more efficient skipping attention for combining modalities.  ... 
arXiv:2206.09708v1 fatcat:k4oefywyxvgn5gdtedyvr5mbpi

Great Expectations: Unsupervised inference of suspense, surprise and salience in storytelling [article]

David Wilmot, University Of Edinburgh
Recent advances in machine learning (often called deep learning) have substantially improved performance on many language-related tasks, including story comprehension and story writing.  ...  It is difficult because all these elements require a strong comprehension of the characters and their motivations, places, changes over time, and the cause/effect of complex interactions.  ...  ., 2022) has a more efficient skipping attention for combining modalities.  ... 
doi:10.7488/era/2426 fatcat:revj6whctvectezehhkifksj3i