mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections
[article]
2022
arXiv
pre-print
To address these problems, mPLUG introduces an effective and efficient vision-language architecture with novel cross-modal skip-connections, which creates inter-layer shortcuts that skip a certain number ...
This paper presents mPLUG, a new vision-language foundation model for both cross-modal understanding and generation. ...
Effectiveness and Efficiency To validate the effectiveness and efficiency of our proposed cross-modal skip-connected network, we conduct in-depth analysis on different stride values and various cross-modal ...
arXiv:2205.12005v2
fatcat:cck3km3syjdytc5so2gzglucni
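The snippets above describe mPLUG's cross-modal skip-connections only at a high level (inter-layer shortcuts that skip a number of layers, tuned via a stride value). As a rough illustration only, here is a minimal sketch, not the authors' implementation: it assumes a standard transformer text encoder attending to pre-computed image features, and fuses the two modalities only every `stride`-th layer; all module names and hyper-parameters are assumptions chosen for the example.

    # Minimal sketch of stride-based cross-modal fusion (illustrative, not mPLUG's code).
    import torch
    import torch.nn as nn

    class FusionLayer(nn.Module):
        def __init__(self, dim: int, heads: int = 8):
            super().__init__()
            self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            self.norm1, self.norm2, self.norm3 = nn.LayerNorm(dim), nn.LayerNorm(dim), nn.LayerNorm(dim)

        def forward(self, text, image=None):
            # Text self-attention is always applied.
            t = self.norm1(text)
            text = text + self.self_attn(t, t, t)[0]
            # Cross-modal attention runs only when image features are passed in,
            # i.e. on the layers selected by the stride below.
            if image is not None:
                text = text + self.cross_attn(self.norm2(text), image, image)[0]
            return text + self.ffn(self.norm3(text))

    class SkipConnectedEncoder(nn.Module):
        def __init__(self, dim: int = 256, depth: int = 6, stride: int = 2):
            super().__init__()
            self.stride = stride
            self.layers = nn.ModuleList(FusionLayer(dim) for _ in range(depth))

        def forward(self, text, image):
            for i, layer in enumerate(self.layers):
                # Fuse vision into text only every `stride`-th layer; the other
                # layers skip the more expensive cross-modal attention.
                fuse = (i % self.stride == self.stride - 1)
                text = layer(text, image if fuse else None)
            return text

    # Usage: batch of 2, 16 text tokens, 50 image patches, hidden size 256.
    enc = SkipConnectedEncoder()
    out = enc(torch.randn(2, 16, 256), torch.randn(2, 50, 256))
    print(out.shape)  # torch.Size([2, 16, 256])

Skipping cross-modal attention on most layers is what the stride analysis in the paper varies; the sketch only shows the shape of that trade-off, not the reported configuration.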
Great Expectations: Unsupervised Inference of Suspense, Surprise and Salience in Storytelling
[article]
2022
arXiv
pre-print
Narrative theory methods (rules and procedures) are applied to the knowledge built into deep learning models to directly infer suspense, surprise, and salience in stories. ...
Surprise and suspense are crucial to creating dramatic and exciting stories. The thesis trains a series of deep learning models solely by reading stories, a self-supervised (or unsupervised) approach. ...
., 2022) has a more efficient skipping attention for combining modalities. ...
arXiv:2206.09708v1
fatcat:k4oefywyxvgn5gdtedyvr5mbpi
Great Expectations: Unsupervised inference of suspense, surprise and salience in storytelling
[article]
2022
Recent advances in machine learning (often called deep learning) have substantially improved performance on many language-related tasks, including story comprehension and story writing. ...
It is difficult because all these elements require a strong comprehension of the characters and their motivations, places, changes over time, and the cause/effect of complex interactions. ...
., 2022) has a more efficient skipping attention for combining modalities. ...
doi:10.7488/era/2426
fatcat:revj6whctvectezehhkifksj3i