Frame-wise Cross-modal Match for Video Moment Retrieval [article]

Haoyu Tang, Jihua Zhu, Meng Liu, Zan Gao, Zhiyong Cheng
2020 arXiv   pre-print
In this paper, we propose an Attentive Cross-modal Relevance Matching (ACRM) model which predicts the temporal boundaries based on an interaction modeling between two modalities.  ...  In addition, an attention module is introduced to automatically assign higher weights to query words with richer semantic cues, which are considered to be more important for finding relevant video contents  ...  learn video and query attentions, which are used to localize the moment.  ... 
arXiv:2009.10434v1 fatcat:nhgdwujupjfxpi3v5e4xe5f4q4

Modality Shifting Attention Network for Multi-Modal Video Question Answering

Junyeong Kim, Minuk Ma, Trung Pham, Kyungsu Kim, Chang D. Yoo
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
network (HRN) that predicts the answer using an attention mechanism on both modalities.  ...  MSAN decomposes the task into two sub-tasks: (1) localization of temporal moment relevant to the question, and (2) accurate prediction of the answer based on the localized moment.  ...  In the first example, the model utilizes video modality to localize the moment (α > 0.5), and then uses subtitle modality to predict the answer (β < 0.5).  ... 
doi:10.1109/cvpr42600.2020.01012 dblp:conf/cvpr/KimMPKY20 fatcat:cobov5g4nbdthjj3r4x6i5bp3a

Modality Shifting Attention Network for Multi-modal Video Question Answering [article]

Junyeong Kim, Minuk Ma, Trung Pham, Kyungsu Kim, Chang D. Yoo
2020 arXiv   pre-print
network (HRN) that predicts the answer using an attention mechanism on both modalities.  ...  MSAN decomposes the task into two sub-tasks: (1) localization of temporal moment relevant to the question, and (2) accurate prediction of the answer based on the localized moment.  ...  This lowers the IoU, but increases the coverage which helps to include the ground truth moment. Table 3. Comparison with the state-of-the-art method on TVQA dataset.  ... 
arXiv:2007.02036v1 fatcat:m3ogiq565bahljejhatvn75ceq

Weakly-Supervised Completion Moment Detection using Temporal Attention [article]

Farnoosh Heidarivincheh, Majid Mirmehdi, Dima Damen
2019 arXiv   pre-print
Given both complete and incomplete sequences, of the same action, we learn temporal attention, along with accumulated completion prediction from all frames in the sequence.  ...  This has potential applications from surveillance to assistive living and human-robot interactions.  ...  Acknowledgements The 1st author wishes to thank the University of Bristol for partial funding of her studies. Public datasets were used in this work.  ... 
arXiv:1910.09920v1 fatcat:fpntrvqxnzattorsg3r7xxsg7y

A GRU-Based Method for Predicting Intention of Aerial Targets

Fei Teng, Yafei Song, Gang Wang, Peng Zhang, Liuxing Wang, Zongteng Zhang, Thippa Reddy G
2021 Computational Intelligence and Neuroscience  
Depending only on a single moment to take inference, the traditional combat intention recognition method is neither scientific nor effective enough.  ...  In order to further shorten the time for intention recognition and with a certain predictive effect, an air combat characteristic prediction module is introduced before intention recognition to establish  ...  vector h_t output by the BiGRU network at moment t is input to the attention mechanism layer to obtain the initial state vector S_t.  ... 
doi:10.1155/2021/6082242 pmid:34764992 pmcid:PMC8577955 fatcat:63c2xwflejhtbjxdolb4r2ir7y

Video Moment Retrieval via Natural Language Queries [article]

Xinli Yu, Mohsen Malmir, Cynthia He, Yue Liu, Rex Wu
2020 arXiv   pre-print
Second, we also propose to use a multi-task training objective consisting of a moment segmentation task, start/end distribution prediction, and a start/end location regression task.  ...  We have verified that start/end predictions are noisy due to annotator disagreement and joint training with the moment segmentation task can provide richer information since frames inside the target clip are  ...  The moment segmentation task refers to the prediction of whether each time stamp belongs to the ground-truth clip or not.  ... 
arXiv:2009.02406v2 fatcat:6k2ryffpvnf6xkv3znhtb4ljry

Second guesses and the attention-switching model for successiveness discrimination

Lorraine G. Allan
1975 Perception & Psychophysics  
The present note demonstrates that Baron's (1971) second-guess data are not inconsistent with the basic assumptions of Kristofferson's attention-switching model.  ...  Using this argument as a basis, Baron states that moment theories predict that the probability of a correct second guess following an incorrect first response has to be independent of the temporal separation  ...  From Figure 2 it can be seen that moment theory predicts for all values of at.  ... 
doi:10.3758/bf03203999 fatcat:vhyuubce5famdnoxfejkxpeiiu

VLG-Net: Video-Language Graph Matching Network for Video Grounding [article]

Mattia Soldan, Mengmeng Xu, Sisi Qu, Jesper Tegner, Bernard Ghanem
2021 arXiv   pre-print
Finally, moment candidates are created using masked moment attention pooling by fusing the moment's enriched snippet features.  ...  We demonstrate superior performance over state-of-the-art grounding methods on three widely used datasets for temporal localization of moments in videos with language queries: ActivityNet-Captions, TACoS  ...  Masked attention pooling lists all possible moment candidates, and a Multi-Layer Perceptron (MLP) computes each moment's score to rank them as final predictions.  ... 
arXiv:2011.10132v2 fatcat:z4skoauwy5bmjnxjwjqs7mvnti

I'll be there for you

Tilman Dingler, Martin Pielot
2015 Proceedings of the 17th International Conference on Human-Computer Interaction with Mobile Devices and Services - MobileHCI '15  
users return to their attentive state within 5 minutes.  ...  By collecting more than 55,000 messages from 42 mobile phone users over the course of two weeks, we were able to predict people's attentiveness through their mobile phone usage with close to 80% accuracy  ...  For each of these states, we then ran the classifier and predicted the participant's attentiveness.  ... 
doi:10.1145/2785830.2785840 dblp:conf/mhci/DinglerP15 fatcat:fh6xgm44hvei5l3v4pgr4tbl54

VLANet: Video-Language Alignment Network for Weakly-Supervised Video Moment Retrieval [article]

Minuk Ma, Sunjae Yoon, Junyeong Kim, Youngjoon Lee, Sunghun Kang, Chang D. Yoo
2020 arXiv   pre-print
Video Moment Retrieval (VMR) is a task to localize the temporal moment in untrimmed video specified by natural language query.  ...  To leverage the weak supervision, contrastive learning is used which predicts higher scores for the correct video-query pairs than for the incorrect pairs.  ...  Due to the limited space, only some proposals are visualized. The color indicates the attention strength. The top-2 predicted moments are visualized with the temporal boundaries.  ... 
arXiv:2008.10238v1 fatcat:gh5kbplaubd25jjg2jsyfwkqzi

Deep Neural Network Model Forecasting for Financial and Economic Market

Fan Chen
2022 Journal of Mathematics  
First, the proposed model processes the input of characteristic variables of multiple series (market macrodynamic series and multiseed series) and uses an attention mechanism to fuse the input variables  ...  The financial markets have higher liquidity and volatility as compared to traditional financial markets.  ...  the current state output prediction time j finally by softmax function, normalization of e_tj operation, so as to obtain the weight factor of market state at each historical moment to the current forecast  ... 
doi:10.1155/2022/8146555 doaj:696a08d7c47a4d1f9e88809b83d87b4e fatcat:hqetzclnrfgf5evqwc7gcuqaz4

A3T-GCN: Attention Temporal Graph Convolutional Network for Traffic Forecasting [article]

Jiawei Zhu, Yujiao Song, Ling Zhao, Haifeng Li
2020 arXiv   pre-print
Moreover, the attention mechanism was introduced to adjust the importance of different time points and assemble global temporal information to improve prediction accuracy.  ...  In this study, an attention temporal graph convolutional network (A3T-GCN) traffic forecasting method was proposed to simultaneously capture global temporal dynamics and spatial correlations.  ...  The attention mechanism was introduced to re-weight the influence of historical traffic states and thus to capture the global variation trends of traffic state.  ... 
arXiv:2006.11583v1 fatcat:n5bx23yrhrainc5krpjdptev3i

Span-based Localizing Network for Natural Language Video Localization [article]

Hao Zhang, Aixin Sun, Wei Jing, Joey Tianyi Zhou
2020 arXiv   pre-print
Through extensive experiments on three benchmark datasets, we show that the proposed VSLNet outperforms the state-of-the-art methods; and adopting span-based QA framework is a promising direction to solve  ...  Given an untrimmed video and a text query, natural language video localization (NLVL) is to locate a matching span from the video that semantically corresponds to the query.  ...  We further study the error patterns of predicted moment lengths, as shown in Figure 9. The differences between moment lengths of ground truths and predicted results are measured.  ... 
arXiv:2004.13931v2 fatcat:ifau7knpgrdgfgpc7xl7nty63m

A Closer Look at Temporal Sentence Grounding in Videos: Datasets and Metrics [article]

Yitian Yuan, Xiaohan Lan, Long Chen, Wei Liu, Xin Wang, Wenwu Zhu
2021 arXiv   pre-print
Meanwhile, we further introduce a new evaluation metric "dR@n,IoU@m" to calibrate the basic IoU scores by penalizing more on the over-long moment predictions and reduce the inflating performance caused  ...  To this end, we propose to re-organize two widely-used TSGV datasets (Charades-STA and ActivityNet Captions), and deliberately Change the moment annotation Distribution of the test split to make it different  ...  Due to its profound significance, the TSGV task has received unprecedented attention over the last few years - a surge of datasets [1, 6, 13, 20] and state-of-the-art (SOTA) methods [3, 4, 5, 7, 8, 9  ... 
arXiv:2101.09028v2 fatcat:tlvfxoxr4nfcbetnaqc4adxs5a

Development of Modified LSTM Model for Reservoir Capacity Prediction in Huanggang Reservoir, Fujian, China

Bibo Dai, Jiangbin Wang, Xiao Gu, Chunyan Xu, Xin Yu, Haosheng Zhang, Canming Yuan, Wen Nie, Zizheng Guo
2022 Geofluids  
In order to accurately understand the Huanggang Reservoir capacity change, we develop a new hydrological prediction model based on the LSTM (Long Short-Term Memory) method, which is used to predict the  ...  In this modified model, we choose to input multidimensional factors, two fully connected layers, selecting the optimal number of the hidden neurons, the optimizer, and adding the attention mechanism.  ...  The input gate determines how much input data of the network at the current moment is saved in the unit state; the forget gate determines how much of the unit state at the previous moment is saved to the  ... 
doi:10.1155/2022/2891029 fatcat:alo7y5fwtnhbfadonxevnck3bu