399,131 Hits in 5.9 sec

Learning to Localize Actions from Moments [article]

Fuchen Long and Ting Yao and Zhaofan Qiu and Xinmei Tian and Jiebo Luo and Tao Mei
2020 arXiv   pre-print
In this paper, we introduce a new transfer learning design that learns action localization for a large set of action categories, using only action moments from the categories of interest and temporal  ...  More remarkably, we train AherNet to localize actions from 600 categories by leveraging action moments in Kinetics-600 and temporal annotations from 200 classes in ActivityNet v1.3.  ...  to support action localization learning.  ... 
arXiv:2008.13705v1 fatcat:pmgj2u7ulzdbnc6vfmqccl7hyy

Attending to Distinctive Moments: Weakly-Supervised Attention Models for Action Localization in Video

Lei Chen, Mengyao Zhai, Greg Mori
2017 IEEE International Conference on Computer Vision Workshops (ICCVW)  
Weak supervision gathered from sports websites is provided in the form of an action taking place in a video clip, without specification of the person performing the action.  ...  We present a method for utilizing weakly supervised data for action localization in videos. We focus on sports video analysis, where videos contain scenes of multiple people.  ...  Conclusion We demonstrated that attention models can be used to select distinctive frames for learning action localization from weakly supervised data.  ... 
doi:10.1109/iccvw.2017.47 dblp:conf/iccvw/0023ZM17 fatcat:6y7rwk4qdvejrmeyvd2tlunqvy

Tripping through time: Efficient Localization of Activities in Videos [article]

Meera Hahn, Asim Kadav, James M. Rehg, Hans Peter Graf
2020 arXiv   pre-print
Furthermore, TripNet uses reinforcement learning to efficiently localize relevant activity clips in long videos, by learning how to intelligently skip around the video.  ...  Localizing moments in untrimmed videos via language queries is a new and interesting task that requires the ability to accurately ground language into video.  ...  In the figure, we show the sequential list of actions the agent takes in order to temporally localize a moment in the video.  ... 
arXiv:1904.09936v5 fatcat:4bvoeildkna2xejzdsvcjo6nhm

Global citizenship education through curriculum-as-relations

Eun-Ji Amy Kim
2021 Prospect: Quarterly Review of Comparative Education  
Authentic critical-translocal learning through the strategy of comparison offers an alternative view of global-local relations as "articulated moments created by situated praxis".  ...  This article reviews the GCED discourses conceptualizing global competence as instrumental action and a binary view of global-local relations.  ...  Curriculum offers pedagogy and activities that examine global-local issues as cause-and-effect transactions (e.g., How would local action impact global issues? Or global issues impact local action?).  ... 
doi:10.1007/s11125-021-09554-w pmid:34024944 pmcid:PMC8122212 fatcat:yugxyejd5rg3loljhcyvmmp4gm

Multi-Moments in Time: Learning and Interpreting Models for Multi-Action Video Understanding [article]

Mathew Monfort, Bowen Pan, Kandan Ramakrishnan, Alex Andonian, Barry A McNamara, Alex Lascelles, Quanfu Fan, Dan Gutfreund, Rogerio Feris, Aude Oliva
2021 arXiv   pre-print
Towards this goal, we present the Multi-Moments in Time dataset (M-MiT), which includes over two million action labels for over one million three-second videos.  ...  This multi-label dataset introduces novel challenges on how to train and analyze models for multi-action detection.  ...  These class activation maps (CAMs) can be thought of as a method for visualizing the learned attention model of the network and have been shown to localize actions occurring in videos [26].  ... 
arXiv:1911.00232v4 fatcat:u245hymiwjbd7mymr27ukmt4ty

Weak Supervision and Referring Attention for Temporal-Textual Association Learning [article]

Zhiyuan Fang, Shu Kong, Zhe Wang, Charless Fowlkes, Yezhou Yang
2020 arXiv   pre-print
Therefore we provide a Weak-Supervised alternative with our proposed Referring Attention mechanism to learn temporal-textual association (dubbed WSRA).  ...  The principle in our designed mechanism is to fully exploit 1) the weak supervision by considering informative and discriminative cues from intra-video segments anchored with the textual query, 2) multiple  ...  Video-level Attention Modeling Prior weakly supervised methods for action localization propose to generate a weight vector to localize action labels among the video snippets in a bottom-up manner [36,  ... 
arXiv:2006.11747v2 fatcat:bpqa6chthfgjhatmsgqq5t2dym

'LEARNING MOMENTS' AS INSPECTABLE PHENOMENA OF INQUIRY IN A SECOND LANGUAGE CLASSROOM

Ricardo Moutinho, Andrew P. Carlin
2021 Problems of Education in the 21st Century  
In this paper, a praxiological approach to observe learning moments is proposed.  ...  as learning moments.  ...  This study uses video data from the project "Interactions in  ... 
doi:10.33225/pec/21.79.80 fatcat:qbxtksetdveblfwqhx3cbxzxye

The Elements of Temporal Sentence Grounding in Videos: A Survey and Future Directions [article]

Hao Zhang, Aixin Sun, Wei Jing, Joey Tianyi Zhou
2022 arXiv   pre-print
...  natural language video localization (NLVL) or video moment retrieval (VMR), aims to retrieve a temporal moment that semantically corresponds to a language query from an untrimmed video.  ...  As the background, we present a common structure of functional components in TSGV, in a tutorial style: from feature extraction from raw video and language query, to answer prediction of the target moment  ...  Moment localization is to learn a mapping l_{θ1}: (V, Q) → m, i.e., retrieving a moment m corresponding to the caption C_i from video V.  ... 
arXiv:2201.08071v1 fatcat:2k2if6dsyveinec2dmmujcmhkq

Conceptual Changes and Methodological Challenges: On Language and Learning from a Conversation Analytic Perspective on SLA [chapter]

Simona Pekarek Doehler
2010 Conceptualising 'Learning' in Applied Linguistics  
Within socioculturally and socio-interactionally oriented research, attention has shifted away from an understanding of language learning/acquisition as an intra-psychological, cognitive process enclosed  ...  The last two decades have witnessed the increasing influence, across several disciplines in the human sciences, of research that challenges established conceptions of learning and of language.  ...  From an emic perspective, we witness how a previous moment of interaction is treated as a moment of learning, and how the present interlocutor is treated as a person designed to witness that learning.  ... 
doi:10.1057/9780230289772_7 fatcat:rmiebo2rszgefoga2aojw4trzq

Learning Sample Importance for Cross-Scenario Video Temporal Grounding [article]

Peijun Bao, Yadong Mu
2022 arXiv   pre-print
To this end, we propose a novel method called Debiased Temporal Language Localizer (DebiasTLL) to prevent the model from naively memorizing the biases and enforce it to ground the query sentence based  ...  The task of temporal grounding aims to locate a video moment in an untrimmed video, given a sentence query.  ...  The first model is designed to learn the video moment bias and predict the localization results only from the visual modality.  ... 
arXiv:2201.02848v1 fatcat:mge6vtradfaxdm2yds6rfhzloq

Progressive Localization Networks for Language-based Moment Localization [article]

Qi Zheng, Jianfeng Dong, Xiaoye Qu, Xun Yang, Yabing Wang, Pan Zhou, Baolong Liu, Xun Wang
2022 arXiv   pre-print
In this fashion, the later stages are able to absorb the previously learned information, thus facilitating the more fine-grained localization.  ...  To this end, we propose a novel multi-stage Progressive Localization Network (PLN) which progressively localizes the target moment in a coarse-to-fine manner.  ...  For temporal action localization, it aims to temporally localize segments whose action labels are within a pre-defined list of actions.  ... 
arXiv:2102.01282v2 fatcat:7i2dm6t2y5bzzjgoz6wmd67bja

Interventional Video Grounding with Dual Contrastive Learning [article]

Guoshun Nan, Rui Qiao, Yao Xiao, Jun Liu, Sicong Leng, Hao Zhang, Wei Lu
2021 arXiv   pre-print
Video grounding aims to localize a moment from an untrimmed video for a given textual query.  ...  Then, we present a simple yet effective method to approximate the unobserved confounder as it cannot be directly sampled from the dataset. 2) Meanwhile, we introduce a dual contrastive learning approach  ...  Acknowledgments We would like to thank the anonymous reviewers for their helpful comments.  ... 
arXiv:2106.11013v2 fatcat:orjpyddcfjcyrhaanxx7yqc7ni

Intelligent Games for Education - An Intention Monitoring Approach based on Dynamic Bayesian Network [article]

Irene Cheng, Feng Chen, Saul Daniel Rodrigues, Oscar Garcia Pañella, Lluis Vicent, Anup Basu
2010 Eurographics State of the Art Reports  
In order to provide an engaging and interactive environment, each game in the system has a local student module constructed based on a Dynamic Bayesian Network.  ...  The information collected from student interaction with computer games is used to update a student module that reports a student's current level of knowledge, making adaptive tutoring and assessment with  ...  In the local student module of the Balance game, there are three nodes inherited from the global student module and a number of action (Evidence) nodes ( Figure 6 ).  ... 
doi:10.2312/eged.20101012 fatcat:nk67gmzpjfejfgnwali3kuafmm

Multi-Source Video Domain Adaptation with Temporal Attentive Moment Alignment [article]

Yuecong Xu, Jianfei Yang, Haozhi Cao, Keyu Wu, Min Wu, Rui Zhao, Zhenghua Chen
2021 arXiv   pre-print
...  feature moments.  ...  TAMAN further constructs robust global temporal features by attending to dominant domain-invariant local temporal features with high local classification confidence and low disparity between global and  ...  The trade-off weights for the moment-based spatial and temporal feature discrepancies, λ_df and λ_dt, are set to 0.005 and 0.01. All experiments are conducted using two NVIDIA RTX 2080 Ti GPUs.  ... 
arXiv:2109.09964v2 fatcat:qaozikkiqjagtas544pajv4dge

Qiniu Submission to ActivityNet Challenge 2018 [article]

Xiaoteng Zhang, Yixin Bao, Feiyun Zhang, Kai Hu, Yicheng Wang, Liang Zhu, Qinzhu He, Yining Lin, Jie Shao, Yao Peng
2018 arXiv   pre-print
We also propose new non-local-based models for further improvement on the recognition accuracy.  ...  In this paper, we introduce our submissions for the tasks of trimmed activity recognition (Kinetics) and trimmed event recognition (Moments in Time) for Activitynet Challenge 2018.  ...  Firstly, non-local operations would be important for relation learning, but global operations may be unnecessary. If position i is far away from j, then f(x_i, x_j) ≈ 0.  ... 
arXiv:1806.04391v1 fatcat:a4tlvtf7evfc3o5ns5rpdacgsq
Showing results 1 — 15 out of 399,131 results