Multi‐label learning based target detecting from multi‐frame data
2021
IET Image Processing
Owing to the progress of multi-frame time-series data, or video satellites, target detection from space-borne satellite videos has become available. ...
This paper considers target detection from time-series data as a multi-label problem, since there are several different kinds of background objects and targets of interest. ...
... Ltd. for acquiring and providing the data used in this study, and the IEEE GRSS Image Analysis and Data Fusion Technical Committee.
ORCID Fazhi He https://orcid.org/0000-0001-7016-3698 ...
doi:10.1049/ipr2.12271
fatcat:rjzwkbcdtzhkxkb4arocv7djqi
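The entry above frames target detection over multiple frames as a multi-label problem. As a rough illustration of that framing only, not the paper's model, the following PyTorch-style sketch stacks frames as channels and predicts independent sigmoid labels with a binary cross-entropy loss; the network shape, class count, and frame count are all assumptions made for the example.

```python
# Minimal multi-label sketch (assumed architecture, not the paper's model):
# a small CNN takes a stack of T frames as channels and emits one logit per
# class, so several background/target labels can be active at once.
import torch
import torch.nn as nn

class MultiLabelFrameNet(nn.Module):
    def __init__(self, num_frames: int = 5, num_labels: int = 6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(num_frames, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, num_labels)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, num_frames, H, W) -> one logit per label
        return self.head(self.features(clip).flatten(1))

model = MultiLabelFrameNet()
clip = torch.randn(2, 5, 64, 64)               # two clips of five frames each
targets = torch.randint(0, 2, (2, 6)).float()  # several labels may be active at once
loss = nn.BCEWithLogitsLoss()(model(clip), targets)  # independent sigmoid per label
loss.backward()
```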
Learning From Multi-Frame Data
[article]
2020
Multi-frame data-driven methods bear the promise that aggregating multiple observations leads to better estimates of target quantities than a single (still) observation. ...
This thesis examines how data-driven approaches such as deep neural networks should be constructed to improve over single-frame-based counterparts. ...
The primary objective of this thesis is to design deep learning based models to handle multi-frame data more naturally. ...
doi:10.15496/publikation-51349
fatcat:gpnsx7adbrgizh2bwtrd4pexgq
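The premise stated above, that aggregating multiple observations yields better estimates than a single still, can be illustrated with a toy Python snippet: averaging T noisy copies of the same signal lowers the mean squared error roughly in proportion to 1/T. The noise level and array sizes below are arbitrary assumptions.

```python
# Toy illustration of the multi-frame premise: averaging T noisy observations
# of one signal reduces the estimation error roughly by a factor of T.
import torch

signal = torch.randn(64, 64)
for t in (1, 4, 16):
    frames = signal + 0.5 * torch.randn(t, 64, 64)   # T noisy frames of the same scene
    estimate = frames.mean(dim=0)                    # multi-frame aggregate
    print(t, torch.mean((estimate - signal) ** 2).item())
```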
FML: Face Model Learning from Videos
[article]
2019
arXiv
pre-print
Our face model is learned using only corpora of in-the-wild video clips collected from the Internet. This virtually endless source of training data enables learning of a highly general 3D face model. ...
Most existing methods rely on data-driven priors that are built from limited 3D face scans. ...
Multi-frame Consistent Face Model Learning: We propose a novel network for consistent multi-frame face model learning. ...
arXiv:1812.07603v2
fatcat:mdnemyu7xjf5lbhszg5i3e53fu
FML: Face Model Learning From Videos
2019
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
We propose multi-frame self-supervised training of a deep network based on in-the-wild video data for jointly learning a face model and 3D face reconstruction. ...
Our face model is learned using only corpora of in-the-wild video clips collected from the Internet. This virtually endless source of training data enables learning of a highly general 3D face model. ...
Multi-frame Consistent Face Model Learning: We propose a novel network for consistent multi-frame face model learning. ...
doi:10.1109/cvpr.2019.01107
dblp:conf/cvpr/TewariB0BESPZT19
fatcat:6gf5b75bkzbldhzbyqnun4okzm
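The two FML entries above describe multi-frame consistent face model learning from in-the-wild videos. A minimal sketch of the general idea, assuming rather than reproducing the paper's design: frames from one clip share a single identity code while expression codes are predicted per frame, with a consistency penalty tying the per-frame identity estimates together. The toy backbone and all layer sizes are made up for illustration.

```python
# Sketch of the multi-frame consistency idea (assumed simplification, not the
# FML architecture): one shared identity code per clip, per-frame expression codes.
import torch
import torch.nn as nn

class MultiFrameFaceEncoder(nn.Module):
    def __init__(self, id_dim=64, expr_dim=16):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128), nn.ReLU())
        self.id_head = nn.Linear(128, id_dim)
        self.expr_head = nn.Linear(128, expr_dim)

    def forward(self, frames):               # frames: (T, 3, 32, 32) from one clip
        feats = self.backbone(frames)
        per_frame_id = self.id_head(feats)
        identity = per_frame_id.mean(dim=0, keepdim=True)  # one shared identity code
        expression = self.expr_head(feats)                 # one code per frame
        # consistency term pulls each per-frame identity estimate toward the shared one
        consistency = ((per_frame_id - identity) ** 2).mean()
        return identity, expression, consistency

identity, expression, consistency = MultiFrameFaceEncoder()(torch.randn(4, 3, 32, 32))
```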
Detecting and Localizing 3D Object Classes using Viewpoint Invariant Reference Frames
2007
2007 IEEE 11th International Conference on Computer Vision
We present a new, iterative learning algorithm to determine an optimal viewpoint invariant reference frame from training images in a data-driven manner. ...
We compare multi-view and viewpoint invariant representations trained and tested on the same data, where the viewpoint invariant approach results in fewer false positive detections and higher average precision ...
Ideally, fully unsupervised learning could derive an optimal invariant reference frame from data; however, unsupervised learning is a challenging task even in the single-viewpoint case. ...
doi:10.1109/iccv.2007.4408832
dblp:conf/iccv/ToewsA07
fatcat:si4cwyi24bfhpjoltdr45bnuwu
MultiNet++: Multi-Stream Feature Aggregation and Geometric Loss Strategy for Multi-Task Learning
[article]
2019
arXiv
pre-print
In this work, we propose a multi-stream multi-task network that takes advantage of feature representations from preceding frames in a video sequence for joint learning of segmentation, depth, and motion ...
Current work on multi-task learning networks focuses on processing a single input image, and there is no known implementation of multi-task learning handling a sequence of images. ...
Conclusion: We introduced an efficient way of constructing MultiNet++, a multi-task learning network that operates on multiple streams of input data. ...
arXiv:1904.08492v2
fatcat:ob6bo36oardubot35ejs7inabi
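The MultiNet++ entry above describes multi-stream feature aggregation and a geometric loss strategy. The following is a hedged sketch, not the authors' code: features from two consecutive frames are fused and shared by segmentation, depth, and motion heads, and the per-task losses are combined by a geometric mean, which is one plausible reading of "geometric loss". Layer widths and class counts are assumptions.

```python
# Sketch of a two-stream, three-task head sharing one encoder, with task losses
# combined by a geometric mean (log-domain for numerical stability).
import torch
import torch.nn as nn

class TwoStreamMultiTask(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.seg_head = nn.Conv2d(32, 5, 1)     # segmentation over 5 classes
        self.depth_head = nn.Conv2d(32, 1, 1)   # dense depth
        self.motion_head = nn.Conv2d(32, 2, 1)  # 2-channel flow-like motion

    def forward(self, prev_frame, curr_frame):
        fused = torch.cat([self.encoder(prev_frame), self.encoder(curr_frame)], dim=1)
        return self.seg_head(fused), self.depth_head(fused), self.motion_head(fused)

def geometric_loss(losses):
    # n-th root of the product of per-task losses
    stacked = torch.stack(losses)
    return torch.exp(torch.log(stacked + 1e-8).mean())

prev, curr = torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64)
seg, depth, motion = TwoStreamMultiTask()(prev, curr)
```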
Multi-modal Affect Analysis using standardized data within subjects in the Wild
[article]
2021
arXiv
pre-print
Furthermore, the above features were learned using multi-modal data such as image features, AU, head pose, and gaze. In the validation set, our model achieved a facial expression score of 0.546. ...
Therefore, after learning the common features for each frame, we constructed a facial expression estimation model and a valence-arousal model using time-series data after combining the common features and ...
Then, image features and audio features are combined, and a multi-frame model was generated by learning multi-frame data that combines the intermediate features and the intermediate features standardized ...
arXiv:2107.03009v3
fatcat:4o5w47c6yrfajp6fuubleufswi
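The entry above hinges on standardizing data within subjects before multi-modal, multi-frame modelling. One way that could look, purely as an assumption about what the phrase means, is per-subject z-scoring of features before they are combined:

```python
# Sketch of within-subject feature standardization: each subject's features are
# z-scored with that subject's own statistics before multi-frame modelling.
import torch

def standardize_within_subject(features: torch.Tensor, subject_ids: torch.Tensor):
    out = features.clone()
    for sid in subject_ids.unique():
        mask = subject_ids == sid
        mu = features[mask].mean(dim=0)
        sigma = features[mask].std(dim=0, unbiased=False).clamp_min(1e-6)
        out[mask] = (features[mask] - mu) / sigma
    return out

feats = torch.randn(6, 10)                      # six frames of 10-d features
subjects = torch.tensor([0, 0, 0, 1, 1, 1])     # two subjects, three frames each
normed = standardize_within_subject(feats, subjects)
```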
Multi-Task Learning of Generalizable Representations for Video Action Recognition
[article]
2020
arXiv
pre-print
Based on these findings, we present a multi-task learning paradigm for video classification. ...
... the discrepancy of the multi-task features in a self-supervised manner. ...
Specifically, we present the Reversed Two-Stream Networks (Rev2Net), which is trained in a multi-task learning framework with self-supervision from the multi-modality data. ...
arXiv:1811.08362v2
fatcat:o35bbhtrcncyvdulm7htm7we2m
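The Rev2Net entry above combines supervised classification with self-supervision derived from multi-modality data. The code below is a simplified stand-in, not the paper's architecture: an action classifier trained jointly with a decoder whose self-supervised target is a crude frame-difference motion proxy; the auxiliary weight and layer shapes are assumptions.

```python
# Sketch of multi-task training with a self-supervised auxiliary decoding objective.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(8))
classifier = nn.Sequential(nn.Flatten(), nn.Linear(16 * 8 * 8, 10))
decoder = nn.Sequential(nn.Conv2d(16, 3, 3, padding=1), nn.Upsample(size=(32, 32)))

frames = torch.randn(4, 2, 3, 32, 32)                 # pairs of consecutive frames
feats = encoder(frames[:, 0])
cls_loss = nn.functional.cross_entropy(classifier(feats), torch.randint(0, 10, (4,)))
motion_proxy = frames[:, 1] - frames[:, 0]            # self-supervised target from the data itself
aux_loss = nn.functional.mse_loss(decoder(feats), motion_proxy)
loss = cls_loss + 0.5 * aux_loss                      # assumed auxiliary weight
loss.backward()
```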
Issue Framing in Online Discussion Fora
[article]
2019
arXiv
pre-print
... assuming only unlabeled training data in the target domain. ...
We explore to what extent models trained to detect issue frames in newswire and social media can be transferred to the domain of discussion fora, using a combination of multi-task and adversarial training ...
Adversarial Learning: Ganin and Lempitsky (2015) proposed adversarial learning for domain adaptation that can exploit unlabeled data from the target domain. ...
arXiv:1904.03969v2
fatcat:rhwsnvlp6fcm5gi7z3b6t367lq
Issue Framing in Online Discussion Fora
2019
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)
... assuming only unlabeled training data in the target domain. ...
We explore to what extent models trained to detect issue frames in newswire and social media can be transferred to the domain of discussion fora, using a combination of multi-task and adversarial training ...
Adversarial Learning: Ganin and Lempitsky (2015) proposed adversarial learning for domain adaptation that can exploit unlabeled data from the target domain. ...
doi:10.18653/v1/n19-1142
dblp:conf/naacl/HartmannJAS19
fatcat:vvwhfzjqsngrlkpal6bk7fcgwi
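Both issue-framing entries above cite Ganin and Lempitsky (2015) for adversarial domain adaptation that exploits unlabeled target-domain data. A minimal gradient-reversal sketch of that cited idea, with toy dimensions and label counts assumed, looks like this in PyTorch:

```python
# Gradient-reversal sketch: the domain classifier's gradient is flipped before
# it reaches the shared encoder, pushing the encoder toward domain-invariant
# features usable on unlabeled target-domain text.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

encoder = nn.Linear(300, 64)        # shared text encoder (toy)
frame_clf = nn.Linear(64, 10)       # issue-frame labels (source domain only)
domain_clf = nn.Linear(64, 2)       # source vs. target domain

feats = encoder(torch.randn(8, 300))                    # 4 source + 4 target examples
frame_loss = nn.functional.cross_entropy(frame_clf(feats[:4]), torch.randint(0, 10, (4,)))
domain_logits = domain_clf(GradReverse.apply(feats, 1.0))
domain_loss = nn.functional.cross_entropy(domain_logits, torch.tensor([0, 0, 0, 0, 1, 1, 1, 1]))
(frame_loss + domain_loss).backward()
```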
Collaborative Attention Mechanism for Multi-View Action Recognition
[article]
2020
arXiv
pre-print
Multi-view action recognition (MVAR) leverages complementary temporal information from different views to improve the learning performance. ...
It paves a novel way to leverage attention information and enhances the multi-view representation learning. ...
... multi-view learning based on temporal data. ...
arXiv:2009.06599v2
fatcat:gzwmxgsoebfnnlr3mnrqfgne2a
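The collaborative attention entry above leverages complementary temporal information across views. As one simple, assumed reading of cross-view attention, not the paper's module, features from one camera can attend over another camera's frame features:

```python
# Cross-view attention sketch: view A's temporal features act as queries over
# view B's features, so complementary frames from the other view are emphasized.
import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
view_a = torch.randn(2, 16, 64)    # (batch, frames, feature) for camera A
view_b = torch.randn(2, 16, 64)    # the same clips seen from camera B
fused_a, weights = attn(query=view_a, key=view_b, value=view_b)
enhanced_a = view_a + fused_a      # residual fusion of the attended view
```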
Watching the World Go By: Representation Learning from Unlabeled Videos
[article]
2020
arXiv
pre-print
Prior work uses artificial data augmentation techniques such as cropping and color jitter, which can only affect the image in superficial ways and are not aligned with how objects actually change, e.g. ...
Networks learn to ignore the augmentation noise and extract semantically meaningful representations. ...
... mask flips the booleans of each point in the mask. ... method using Multi-Pair on video data and the Multi-Frame learning procedure as Video Noise Contrastive Estimation (VINCE). ...
arXiv:2003.07990v2
fatcat:qzohn3hyargr5gutsi2jpopkwi
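The entry above names the Multi-Pair, multi-frame procedure Video Noise Contrastive Estimation (VINCE). A simplified contrastive sketch, not the released implementation: two frames drawn from the same video form a positive pair, other videos in the batch serve as negatives, and the objective is an InfoNCE-style cross-entropy; the temperature and embedding size are assumptions.

```python
# Frame-level noise contrastive estimation sketch: matching indices in the
# similarity matrix are positives (same video), everything else is a negative.
import torch
import torch.nn.functional as F

def video_nce_loss(frame_a: torch.Tensor, frame_b: torch.Tensor, tau: float = 0.07):
    # frame_a[i] and frame_b[i] are embeddings of two frames from video i
    a = F.normalize(frame_a, dim=1)
    b = F.normalize(frame_b, dim=1)
    logits = a @ b.t() / tau                   # (N, N) similarity matrix
    targets = torch.arange(a.size(0))          # the diagonal entries are the positives
    return F.cross_entropy(logits, targets)

loss = video_nce_loss(torch.randn(8, 128), torch.randn(8, 128))
```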
Multi Modal RGB D Action Recognition with CNN LSTM Ensemble Deep Network
2020
International Journal of Advanced Computer Science and Applications
Human action recognition has transformed from a video processing problem into a multi-modal machine learning problem. ...
This proposed framework can learn both temporal and spatial dynamics in both the RGB and depth modalities of action data. ...
In this paper, we propose to develop a hybrid recurrent CNN-based deep learning framework for multi-modal action recognition from RGB and depth data. ...
doi:10.14569/ijacsa.2020.0111284
fatcat:h63esrv6pfhljkzt7xdy6ygypa
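The entry above proposes a hybrid recurrent-CNN framework over RGB and depth streams. A minimal sketch of that general recipe, with all sizes assumed: a per-frame CNN feeds an LSTM for each modality, and the two class-score streams are averaged as a simple ensemble.

```python
# CNN-LSTM per modality with late score fusion (assumed simplification).
import torch
import torch.nn as nn

class StreamCNNLSTM(nn.Module):
    def __init__(self, in_ch, num_classes=10):
        super().__init__()
        self.cnn = nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.lstm = nn.LSTM(16, 32, batch_first=True)
        self.fc = nn.Linear(32, num_classes)

    def forward(self, clip):                        # clip: (B, T, C, H, W)
        b, t = clip.shape[:2]
        feats = self.cnn(clip.flatten(0, 1)).view(b, t, -1)  # per-frame CNN features
        out, _ = self.lstm(feats)                   # temporal dynamics over frames
        return self.fc(out[:, -1])                  # class scores from the last step

rgb_net, depth_net = StreamCNNLSTM(3), StreamCNNLSTM(1)
rgb_clip, depth_clip = torch.randn(2, 8, 3, 32, 32), torch.randn(2, 8, 1, 32, 32)
scores = (rgb_net(rgb_clip) + depth_net(depth_clip)) / 2   # ensemble fusion of the streams
```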
MultiNet++: Multi-Stream Feature Aggregation and Geometric Loss Strategy for Multi-Task Learning
2019
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
In this work, we propose a multi-stream multi-task network that takes advantage of feature representations from preceding frames in a video sequence for joint learning of segmentation, depth, and motion ...
Current work on multi-task learning networks focuses on processing a single input image, and there is no known implementation of multi-task learning handling a sequence of images. ...
Conclusion: We introduced an efficient way of constructing MultiNet++, a multi-task learning network that operates on multiple streams of input data. ...
doi:10.1109/cvprw.2019.00159
dblp:conf/cvpr/ChennupatiSYR19
fatcat:bpnqthy5wzh4xanjrukvbobiqi
Improved Accented Speech Recognition Using Accent Embeddings and Multi-task Learning
2018
Interspeech 2018
We propose a multi-task architecture that jointly learns an accent classifier and a multi-accent acoustic model. ...
In this work, we explore how to use accent embeddings and multi-task learning to improve speech recognition for accented speech. ...
... models where the interpolation coefficients are learned from data [10].
doi:10.21437/interspeech.2018-1864
dblp:conf/interspeech/JainUJ18
fatcat:mwaeo4e7vjdufoxbfpojnos6km
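The entry above jointly learns an accent classifier and a multi-accent acoustic model with accent embeddings. A hedged sketch of such a setup follows; the feature sizes, loss weighting, and GRU encoder are assumptions, not the paper's system.

```python
# Joint acoustic model + auxiliary accent classifier sketch: an accent embedding
# is concatenated with the acoustic features, and both losses are optimized together.
import torch
import torch.nn as nn

acoustic_dim, accent_dim, n_accents, n_units = 40, 8, 4, 30
accent_emb = nn.Embedding(n_accents, accent_dim)
encoder = nn.GRU(acoustic_dim + accent_dim, 64, batch_first=True)
am_head = nn.Linear(64, n_units)        # per-frame acoustic-model targets
accent_head = nn.Linear(64, n_accents)  # auxiliary accent classifier

feats = torch.randn(2, 50, acoustic_dim)                   # (batch, frames, features)
accents = torch.tensor([0, 2])                             # accent label per utterance
emb = accent_emb(accents).unsqueeze(1).expand(-1, 50, -1)  # broadcast embedding over frames
hidden, _ = encoder(torch.cat([feats, emb], dim=-1))
am_loss = nn.functional.cross_entropy(am_head(hidden).flatten(0, 1),
                                      torch.randint(0, n_units, (2 * 50,)))
accent_loss = nn.functional.cross_entropy(accent_head(hidden.mean(dim=1)), accents)
loss = am_loss + 0.1 * accent_loss      # assumed weighting of the auxiliary task
loss.backward()
```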