Video Interpolation and Prediction with Unsupervised Landmarks
[article]
2019
arXiv
pre-print
This work poses video prediction and interpolation as unsupervised latent structure inference followed by a temporal prediction in this latent space. ...
Prediction and interpolation for long-range video data involves the complex task of modeling motion trajectories for each visible object, occlusions and dis-occlusions, as well as appearance changes due ...
Acknowledgements: We would like to thank Alex Lee and Emily Denton for releasing the source code and pre-trained weights for their respective models. ...
arXiv:1909.02749v1
fatcat:2xwaxtjc7ndvnhka2ry4qigjue
Unsupervised Discovery of Object Landmarks as Structural Representations
[article]
2018
arXiv
pre-print
Our discovered landmarks are semantically meaningful and more predictive of manually annotated landmarks than those discovered by previous methods. ...
In addition, the proposed method naturally creates an unsupervised, perceptible interface to manipulate object shapes and decode images with controllable structures. ...
Acknowledgements This work was supported in part by ONR N00014-13-1-0762, NSF CAREER IIS-1453651, and Sloan Research Fellowship. ...
arXiv:1804.04412v1
fatcat:cabwrmgygfb2ti6hm3cw6wepam
Unsupervised Learning of Object Landmarks through Conditional Image Generation
[article]
2018
arXiv
pre-print
We demonstrate that our approach can learn object landmarks from synthetic image deformations or videos, all without manual supervision, while outperforming state-of-the-art unsupervised landmark detectors ...
We propose a method for learning landmark detectors for visual objects (such as the eyes and the nose in a face) without any manual supervision. ...
We would like to thank James Thewlis for suggestions and support with code and data, and David Novotný and Triantafyllos Afouras for helpful advice. ...
arXiv:1806.07823v2
fatcat:wiypxze42vbbfm6pib6rgtcqwq
Deforming Autoencoders: Unsupervised Disentangling of Shape and Appearance
[article]
2018
arXiv
pre-print
We show experiments with expression morphing in humans, hands, and digits, face manipulation, such as shape and appearance interpolation, as well as unsupervised landmark localization. ...
A more powerful form of unsupervised disentangling becomes possible in template coordinates, allowing us to successfully decompose face images into shading and albedo, and further manipulate face images ...
Our experiments with expression morphing in humans, image manipulation, such as shape and appearance interpolation, as well as unsupervised landmark localization, show the generality of our approach. ...
arXiv:1806.06503v1
fatcat:2y3w7ofn6fhzrkac27gsabrg74
Supervision-by-Registration: An Unsupervised Approach to Improve the Precision of Facial Landmark Detectors
[article]
2018
arXiv
pre-print
In this paper, we present supervision-by-registration, an unsupervised approach to improve the precision of facial landmark detectors on both images and video. ...
With supervision-by-registration, we demonstrate (1) improvements in facial landmark detection on both images (300W, ALFW) and video (300VW, Youtube-Celebrities), and (2) significant reduction of jittering ...
We filter videos with low resolution, and use the remaining videos to train SBR in an unsupervised way. 300-VW [6, 31, 35]. This video dataset contains 50 training videos with 95192 frames. ...
arXiv:1807.00966v2
fatcat:tqp6x5qxxjcbxaacpzozn5455u
Supervision-by-Registration: An Unsupervised Approach to Improve the Precision of Facial Landmark Detectors
2018
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
In this paper, we present supervision-by-registration, an unsupervised approach to improve the precision of facial landmark detectors on both images and video. ...
With supervision-by-registration, we demonstrate (1) improvements in facial landmark detection on both images (300W, ALFW) and video (300VW, Youtube-Celebrities), and (2) significant reduction of jittering ...
We filter videos with low resolution, and use the remaining videos to train SBR in an unsupervised way. 300-VW [6, 31, 35]. This video dataset contains 50 training videos with 95192 frames. ...
doi:10.1109/cvpr.2018.00045
dblp:conf/cvpr/DongYWW0S18
fatcat:wdtpkpvojva2nm4pqlcisqwxmu
Neural Head Reenactment with Latent Pose Descriptors
[article]
2020
arXiv
pre-print
We show that despite its simplicity, with a large and diverse enough training dataset, such learning successfully decomposes pose from identity. ...
Additionally, we show that the learned descriptors are useful for other pose-related tasks, such as keypoint prediction and pose-based retrieval. ...
We employ an off-the-shelf 2D facial landmarks prediction algorithm [2] $L$ to obtain landmarks in both the driver $I^k_j$ and the reenactment result $T^k(I^k_j)$. ...
arXiv:2004.12000v1
fatcat:y3s3xkidvvhelc4gla3r5oos4a
Unsupervised Learning of Monocular Depth Estimation with Bundle Adjustment, Super-Resolution and Clip Loss
[article]
2018
arXiv
pre-print
We present a novel unsupervised learning framework for single view depth estimation using monocular videos. ...
Additionally, we introduce the clip loss to deal with moving objects and occlusion. ...
This paper focuses on unsupervised learning using monocular videos and seeks to reduce this gap. ...
arXiv:1812.03368v1
fatcat:462ptqvitjcmlmhig6im2hmmke
Audio-Driven Emotional Video Portraits
[article]
2021
arXiv
pre-print
., a duration-independent emotion space and a duration-dependent content space. With the disentangled features, dynamic 2D emotional facial landmarks can be deduced. ...
Then we propose the Target-Adaptive Face Synthesis technique to generate the final high-quality video portraits, by bridging the gap between the deduced landmarks and the natural head poses of target videos ...
the Beijing Natural Science Foundation (JQ19015), the NSFC (No.61822111, 61727808, 61627804), the NSFJS (BK20192003), partly by Leading Technology of Jiangsu Basic Research Plan under Grant BK2019200, and ...
arXiv:2104.07452v2
fatcat:mp6zh2bxtnc7hilkldxzzokaxi
Deformable Generator Network: Unsupervised Disentanglement of Appearance and Geometry
[article]
2020
arXiv
pre-print
We present a deformable generator model to disentangle the appearance and geometric information for both image and video data in a purely unsupervised manner. ...
Two generators take independent latent vectors as input to disentangle the appearance and geometric information from image or video sequences. ...
landmark prediction on the MAFL test set. ...
arXiv:1806.06298v3
fatcat:cwx4l5crqjhsfagzctvoylvivq
Supervision by Registration and Triangulation for Landmark Detection
2020
IEEE Transactions on Pattern Analysis and Machine Intelligence
We present Supervision by Registration and Triangulation (SRT), an unsupervised approach that utilizes unlabeled multi-view video to improve the accuracy and precision of landmark detectors. ...
Experiments with 11 datasets and a newly proposed metric to measure precision demonstrate accuracy and precision improvements in landmark detection on both images and video. ...
and precision in both images and videos, more stable predictions in videos, and more consistent predictions in different views. ...
doi:10.1109/tpami.2020.2983935
pmid:32248096
fatcat:qo7zjzlxarf47iyvobzyvjwtqm
"Look Ma, No Landmarks!" – Unsupervised, Model-Based Dense Face Alignment
[chapter]
2020
Lecture Notes in Computer Science
In this paper, we show how to train an image-to-image network to predict dense correspondence between a face image and a 3D morphable model using only the model for supervision. ...
The least squares residuals provide an unsupervised training signal that allows us to avoid artefacts common in the literature such as shrinking and conservative underfitting. ...
[39] and compare our result with supervised facial landmark detection methods. We evaluate landmarks obtained from both direct correspondence and fitted model. ...
doi:10.1007/978-3-030-58536-5_41
fatcat:wq3ymqnkpbd73i5ido3krtzoku
Teacher-Student Asynchronous Learning with Multi-Source Consistency for Facial Landmark Detection
[article]
2020
arXiv
pre-print
Due to the high annotation cost of large-scale facial landmark detection tasks in videos, a semi-supervised paradigm that uses self-training for mining high-quality pseudo-labels to participate in training ...
And extensive experiments on 300W, AFLW, and 300VW benchmarks show that the TSAL framework achieves state-of-the-art performance. ...
Video: The video dataset used was the 300VW dataset (Shen et al. 2015) that contains 50 training videos with 95192 frames. ...
arXiv:2012.06711v1
fatcat:xe4elcr5nfh5ninytwo5fau2yi
Physics Driven Domain Specific Transporter Framework with Attention Mechanism for Ultrasound Imaging
[article]
2021
arXiv
pre-print
The proposed framework has been trained on 130 Lung ultrasound (LUS) videos and 113 Wrist ultrasound (WUS) videos and validated on 100 Lung ultrasound (LUS) videos and 58 Wrist ultrasound (WUS) videos acquired ...
In this paper, we propose an unsupervised, physics driven domain specific transporter framework with an attention mechanism to identify relevant key points with applications in ultrasound imaging. ...
Semantic segmentation in natural images and videos has been approached using unsupervised techniques [20]. ...
arXiv:2109.06346v1
fatcat:n2w2ykhkbfhvfmv3gifqqttq2q
The Sparse Manifold Transform
[article]
2018
arXiv
pre-print
We provide a theoretical description of the transform and demonstrate properties of the learned representation on both synthetic data and natural videos. ...
The sparse manifold transform is an unsupervised and generative framework that explicitly and simultaneously models the sparse discreteness and low-dimensional manifold structure found in natural scenes ...
At each time $t$, we use a nearest neighbor (KNN) solver to find a local linear interpolation of the point's location from the landmarks, that is, $x_t = \Phi_{LM}\alpha_t$, with $\alpha_t \in \mathbb{R}^{300}$ and $\alpha_t \succeq 0$ (the choice of ...
arXiv:1806.08887v2
fatcat:zmurlcmff5f7pokdnpochlhlaq