365,731 Hits in 4.5 sec

Multi‐label learning based target detecting from multi‐frame data

Mengqing Mei, Fazhi He
2021 IET Image Processing  
Owing to progress in multi-frame time-series data from video satellites, target detection from space-borne satellite videos has become possible.  ...  This paper treats target detection from time-series data as a multi-label problem, since the scenes contain several different kinds of background objects and targets of interest.  ...  Ltd. for acquiring and providing the data used in this study, and the IEEE GRSS Image Analysis and Data Fusion Technical Committee. ORCID Fazhi He https://orcid.org/0000-0001-7016-3698  ...
doi:10.1049/ipr2.12271 fatcat:rjzwkbcdtzhkxkb4arocv7djqi

Learning From Multi-Frame Data [article]

Patrick Wieschollek (Universität Tübingen), Hendrik P. A. Lensch (Prof. Dr.)
2020
Multi-frame data-driven methods bear the promise that aggregating multiple observations leads to better estimates of target quantities than a single (still) observation.  ...  This thesis examines how data-driven approaches such as deep neural networks should be constructed to improve over single-frame-based counterparts.  ...  The primary objective of this thesis is to design deep learning based models to handle multi-frame data more naturally.  ... 
doi:10.15496/publikation-51349 fatcat:gpnsx7adbrgizh2bwtrd4pexgq

FML: Face Model Learning from Videos [article]

Ayush Tewari, Florian Bernard, Pablo Garrido, Gaurav Bharaj, Mohamed Elgharib, Hans-Peter Seidel, Patrick Pérez, Michael Zollhöfer, Christian Theobalt
2019 arXiv   pre-print
Our face model is learned using only corpora of in-the-wild video clips collected from the Internet. This virtually endless source of training data enables learning of a highly general 3D face model.  ...  Most existing methods rely on data-driven priors that are built from limited 3D face scans.  ...  Multi-frame Consistent Face Model Learning We propose a novel network for consistent multi-frame face model learning.  ... 
arXiv:1812.07603v2 fatcat:mdnemyu7xjf5lbhszg5i3e53fu

FML: Face Model Learning From Videos

Ayush Tewari, Florian Bernard, Pablo Garrido, Gaurav Bharaj, Mohamed Elgharib, Hans-Peter Seidel, Patrick Pérez, Michael Zollhöfer, Christian Theobalt
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
We propose multi-frame self-supervised training of a deep network based on in-the-wild video data for jointly learning a face model and 3D face reconstruction.  ...  Our face model is learned using only corpora of in-the-wild video clips collected from the Internet. This virtually endless source of training data enables learning of a highly general 3D face model.  ...  Multi-frame Consistent Face Model Learning We propose a novel network for consistent multi-frame face model learning.  ... 
doi:10.1109/cvpr.2019.01107 dblp:conf/cvpr/TewariB0BESPZT19 fatcat:6gf5b75bkzbldhzbyqnun4okzm

Detecting and Localizing 3D Object Classes using Viewpoint Invariant Reference Frames

Matthew Toews, Tal Arbel
2007 IEEE 11th International Conference on Computer Vision (ICCV)
We present a new, iterative learning algorithm to determine an optimal viewpoint invariant reference frame from training images in a data-driven manner.  ...  We compare multi-view and viewpoint invariant representations trained and tested on the same data, where the viewpoint invariant approach results in fewer false positive detections and higher average precision  ...  Ideally, fully unsupervised learning could derive an optimal invariant reference frame from data; however, unsupervised learning is a challenging task even in the single-viewpoint case.  ...
doi:10.1109/iccv.2007.4408832 dblp:conf/iccv/ToewsA07 fatcat:si4cwyi24bfhpjoltdr45bnuwu

MultiNet++: Multi-Stream Feature Aggregation and Geometric Loss Strategy for Multi-Task Learning [article]

Sumanth Chennupati, Ganesh Sistu, Senthil Yogamani, Samir A Rawashdeh
2019 arXiv   pre-print
In this work, we propose a multi-stream multi-task network to take advantage of using feature representations from preceding frames in a video sequence for joint learning of segmentation, depth, and motion  ...  Current work on multi-task learning networks focuses on processing a single input image, and there is no known implementation of multi-task learning handling a sequence of images.  ...  Conclusion We introduced an efficient way of constructing MultiNet++, a multi-task learning network that operates on multiple streams of input data.  ...
arXiv:1904.08492v2 fatcat:ob6bo36oardubot35ejs7inabi

Multi-modal Affect Analysis using standardized data within subjects in the Wild [article]

Sachihiro Youoku, Takahisa Yamamoto, Junya Saito, Akiyoshi Uchida, Xiaoyu Mi, Ziqiang Shi, Liu Liu, Zhongling Liu, Osafumi Nakayama, Kentaro Murase
2021 arXiv   pre-print
Furthermore, the above features were learned using multi-modal data such as image features, AU, head pose, and gaze. In the validation set, our model achieved a facial expression score of 0.546.  ...  Therefore, after learning the common features for each frame, we constructed a facial expression estimation model and valence-arousal model using time-series data after combining the common features and  ...  Then, image features and audio features are combined, and a multi-frame model was generated by learning multiple-frame data that combines the intermediate features and the standardized intermediate features  ...
arXiv:2107.03009v3 fatcat:4o5w47c6yrfajp6fuubleufswi

Multi-Task Learning of Generalizable Representations for Video Action Recognition [article]

Zhiyu Yao, Yunbo Wang, Mingsheng Long, Jianmin Wang, Philip S Yu, Jiaguang Sun
2020 arXiv   pre-print
Based on these findings, we present a multi-task learning paradigm for video classification.  ...  the discrepancy of the multi-task features in a self-supervised manner.  ...  Specifically, we present the Reversed Two-Stream Networks (Rev2Net), which is trained in a multi-task learning framework with self-supervision from the multi-modality data.  ...
arXiv:1811.08362v2 fatcat:o35bbhtrcncyvdulm7htm7we2m

Issue Framing in Online Discussion Fora [article]

Mareike Hartmann and Tallulah Jansen and Isabelle Augenstein and Anders Søgaard
2019 arXiv   pre-print
We explore to what extent models trained to detect issue frames in newswire and social media can be transferred to the domain of discussion fora, using a combination of multi-task and adversarial training, assuming only unlabeled training data in the target domain.  ...  Adversarial Learning: Ganin and Lempitsky (2015) proposed adversarial learning for domain adaptation that can exploit unlabeled data from the target domain.  ...
arXiv:1904.03969v2 fatcat:rhwsnvlp6fcm5gi7z3b6t367lq

Issue Framing in Online Discussion Fora

Mareike Hartmann, Tallulah Jansen, Isabelle Augenstein, Anders Søgaard
2019 Proceedings of the 2019 Conference of the North  
We explore to what extent models trained to detect issue frames in newswire and social media can be transferred to the domain of discussion fora, using a combination of multi-task and adversarial training, assuming only unlabeled training data in the target domain.  ...  Adversarial Learning: Ganin and Lempitsky (2015) proposed adversarial learning for domain adaptation that can exploit unlabeled data from the target domain.  ...
doi:10.18653/v1/n19-1142 dblp:conf/naacl/HartmannJAS19 fatcat:vvwhfzjqsngrlkpal6bk7fcgwi

Collaborative Attention Mechanism for Multi-View Action Recognition [article]

Yue Bai, Zhiqiang Tao, Lichen Wang, Sheng Li, Yu Yin, Yun Fu
2020 arXiv   pre-print
Multi-view action recognition (MVAR) leverages complementary temporal information from different views to improve the learning performance.  ...  It paves a novel way to leverage attention information and enhances the multi-view representation learning.  ...  multi-view learning based on temporal data.  ... 
arXiv:2009.06599v2 fatcat:gzwmxgsoebfnnlr3mnrqfgne2a

Watching the World Go By: Representation Learning from Unlabeled Videos [article]

Daniel Gordon, Kiana Ehsani, Dieter Fox, Ali Farhadi
2020 arXiv   pre-print
Prior work uses artificial data augmentation techniques such as cropping and color jitter, which can only affect the image in superficial ways and are not aligned with how objects actually change, e.g.  ...  Networks learn to ignore the augmentation noise and extract semantically meaningful representations.  ...  mask flips the booleans of each point in the mask.  ...  method using Multi-Pair on video data and the Multi-Frame learning procedure as Video Noise Contrastive Estimation (VINCE).  ...
arXiv:2003.07990v2 fatcat:qzohn3hyargr5gutsi2jpopkwi

Multi Modal RGB D Action Recognition with CNN LSTM Ensemble Deep Network

D. Srihari, P. V.
2020 International Journal of Advanced Computer Science and Applications  
Human action recognition has transformed from a video-processing problem into a multi-modal machine learning problem.  ...  This proposed framework can learn both temporal and spatial dynamics in both RGB and depth modal action data.  ...  In this paper, we propose to develop a hybrid recurrent CNN-based deep learning framework for multi-modal action recognition from RGB and depth data.  ...
doi:10.14569/ijacsa.2020.0111284 fatcat:h63esrv6pfhljkzt7xdy6ygypa

MultiNet++: Multi-Stream Feature Aggregation and Geometric Loss Strategy for Multi-Task Learning

Sumanth Chennupati, Ganesh Sistu, Senthil Yogamani, Samir A Rawashdeh
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
In this work, we propose a multi-stream multi-task network to take advantage of using feature representations from preceding frames in a video sequence for joint learning of segmentation, depth, and motion  ...  Current work on multi-task learning networks focuses on processing a single input image, and there is no known implementation of multi-task learning handling a sequence of images.  ...  Conclusion We introduced an efficient way of constructing MultiNet++, a multi-task learning network that operates on multiple streams of input data.  ...
doi:10.1109/cvprw.2019.00159 dblp:conf/cvpr/ChennupatiSYR19 fatcat:bpnqthy5wzh4xanjrukvbobiqi

Improved Accented Speech Recognition Using Accent Embeddings and Multi-task Learning

Abhinav Jain, Minali Upreti, Preethi Jyothi
2018 Interspeech 2018  
We propose a multi-task architecture that jointly learns an accent classifier and a multi-accent acoustic model.  ...  In this work, we explore how to use accent embeddings and multi-task learning to improve speech recognition for accented speech.  ...  models where the interpolation coefficients are learned from data [10].  ...
doi:10.21437/interspeech.2018-1864 dblp:conf/interspeech/JainUJ18 fatcat:mwaeo4e7vjdufoxbfpojnos6km
Showing results 1 — 15 out of 365,731 results