Differentiable Surface Rendering via Non-Differentiable Sampling [article]

Forrester Cole, Kyle Genova, Avneesh Sud, Daniel Vlasic, Zhoutong Zhang
2021 arXiv   pre-print
We present a method for differentiable rendering of 3D surfaces that supports both explicit and implicit representations, provides derivatives at occlusion boundaries, and is fast and simple to implement. The method first samples the surface using non-differentiable rasterization, then applies differentiable, depth-aware point splatting to produce the final image. Our approach requires no differentiable meshing or rasterization steps, making it efficient for large 3D models and applicable to surfaces extracted from implicit surface definitions. We demonstrate the effectiveness of our method for implicit-, mesh-, and parametric-surface-based inverse rendering and neural-network training applications. In particular, we show for the first time efficient, differentiable rendering of an isosurface extracted from a neural radiance field (NeRF), and demonstrate surface-based, rather than volume-based, rendering of a NeRF.
arXiv:2108.04886v1 fatcat:ko5d6jnwyjahzlq3sjdn2ljtgi
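
The following is a minimal, forward-pass-only sketch of the depth-aware point splatting step described in the abstract, written in NumPy for illustration. The function name, the Gaussian spatial kernel, and the exponential depth weighting are my own assumptions, not the authors' implementation; in practice this would be written in an autodiff framework so gradients reach point positions and colors.

```python
# Hypothetical sketch: splat rasterization-sampled surface points into an image.
import numpy as np

def splat_points(xy, depth, color, height, width, sigma=1.0, beta=10.0):
    """Splat 2D-projected points with Gaussian weights, down-weighting points that
    lie far behind the nearest depth in the set (a crude soft-occlusion term)."""
    ys, xs = np.mgrid[0:height, 0:width]                      # pixel grid
    image = np.zeros((height, width, 3))
    weight_sum = np.zeros((height, width)) + 1e-8
    near = depth.min()
    for (px, py), d, c in zip(xy, depth, color):
        w = np.exp(-((xs - px) ** 2 + (ys - py) ** 2) / (2 * sigma ** 2))  # spatial kernel
        w *= np.exp(-beta * (d - near))                        # nearer points dominate
        image += w[..., None] * c
        weight_sum += w
    return image / weight_sum[..., None]

# Toy usage: three points with different depths and colors on a 32x32 canvas.
pts = np.array([[10.0, 10.0], [20.0, 16.0], [12.0, 24.0]])
img = splat_points(pts, np.array([1.0, 2.0, 1.5]),
                   np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float), 32, 32)
```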

Semantic deformation transfer

Ilya Baran, Daniel Vlasic, Eitan Grinspun, Jovan Popović
2009 ACM Transactions on Graphics  
Results: We applied our method to publicly available mesh animations from performance capture [Vlasic et al. 2008] and deformation transfer [Sumner and Popović 2004]. ...
doi:10.1145/1531326.1531342 fatcat:hzmdejtxk5ebrosgflcglu36sm

Unsupervised Training for 3D Morphable Model Regression [article]

Kyle Genova, Forrester Cole, Aaron Maschinot, Aaron Sarna, Daniel Vlasic, William T. Freeman
2018 arXiv   pre-print
We present a method for training a regression network from image pixels to 3D morphable model coordinates using only unlabeled photographs. The training loss is based on features from a facial recognition network, computed on-the-fly by rendering the predicted faces with a differentiable renderer. To make training from features feasible and avoid network fooling effects, we introduce three objectives: a batch distribution loss that encourages the output distribution to match the distribution of the morphable model, a loopback loss that ensures the network can correctly reinterpret its own output, and a multi-view identity loss that compares the features of the predicted 3D face and the input photograph from multiple viewing angles. We train a regression network using these objectives, a set of unlabeled photographs, and the morphable model itself, and demonstrate state-of-the-art results.
arXiv:1806.06098v1 fatcat:gxq5jkm35faoxg3ylucuuvw6ta
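
As a rough illustration of the three objectives named in the abstract, here is a schematic NumPy sketch. The loss forms, function names, and the stand-in callables for the renderer and regressor are assumptions for illustration, not the paper's exact formulations.

```python
# Hypothetical sketch of the three unsupervised training objectives.
import numpy as np

def batch_distribution_loss(pred_coeffs):
    # Encourage a batch of predicted morphable-model coefficients to match a
    # standard normal distribution (zero mean, unit variance per dimension).
    mean, var = pred_coeffs.mean(axis=0), pred_coeffs.var(axis=0)
    return np.sum(mean ** 2) + np.sum((var - 1.0) ** 2)

def loopback_loss(regress, render, coeffs):
    # The network should recover its own output when fed a rendering of it.
    return np.sum((regress(render(coeffs)) - coeffs) ** 2)

def multiview_identity_loss(photo_features, rendered_features_per_view):
    # Compare recognition features of the input photo and the predicted 3D face
    # rendered from several viewpoints (cosine distance).
    def cos_dist(a, b):
        return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)
    return np.mean([cos_dist(photo_features, f) for f in rendered_features_per_view])

# Toy usage of the batch term with an 8-sample batch of 80-dim coefficients.
print(batch_distribution_loss(np.random.randn(8, 80)))
```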

Learning Shape Templates with Structured Implicit Functions [article]

Kyle Genova, Forrester Cole, Daniel Vlasic, Aaron Sarna, William T. Freeman, Thomas Funkhouser
2019 arXiv   pre-print
Template 3D shapes are useful for many tasks in graphics and vision, including fitting observation data, analyzing shape collections, and transferring shape attributes. Because of the variety of geometry and topology of real-world shapes, previous methods generally use a library of hand-made templates. In this paper, we investigate learning a general shape template from data. To allow for widely varying geometry and topology, we choose an implicit surface representation based on composition of local shape elements. While long known to computer graphics, this representation has not yet been explored in the context of machine learning for vision. We show that structured implicit functions are suitable for learning and allow a network to smoothly and simultaneously fit multiple classes of shapes. The learned shape template supports applications such as shape exploration, correspondence, abstraction, interpolation, and semantic segmentation from an RGB image.
arXiv:1904.06447v1 fatcat:lpr6pcbgf5hormxx6mjnimxmf4
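
A minimal sketch of a structured implicit function, assuming the local shape elements are scaled anisotropic Gaussians and the surface is an isocontour of their sum; the exact parameterization and threshold here are illustrative assumptions.

```python
# Hypothetical sketch: evaluate a shape as a sum of local Gaussian elements.
import numpy as np

def structured_implicit(points, centers, radii, weights):
    """Evaluate sum_i w_i * exp(-||(p - c_i) / r_i||^2) at each query point."""
    diff = (points[:, None, :] - centers[None, :, :]) / radii[None, :, :]
    return (weights[None, :] * np.exp(-np.sum(diff ** 2, axis=-1))).sum(axis=1)

# Two blobs; query points with field value above the threshold count as "inside".
centers = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
radii   = np.array([[0.5, 0.5, 0.5], [0.3, 0.3, 0.3]])
weights = np.array([1.0, 1.0])
queries = np.array([[0.0, 0.0, 0.0], [2.0, 2.0, 2.0]])
inside = structured_implicit(queries, centers, radii, weights) > 0.1
```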

Face transfer with multilinear models

Daniel Vlasic, Matthew Brand, Hanspeter Pfister, Jovan Popović
2005 ACM Transactions on Graphics  
Face Transfer is a method for mapping videorecorded performances of one individual to facial animations of another. It extracts visemes (speech-related mouth articulations), expressions, and three-dimensional (3D) pose from monocular video or film footage. These parameters are then used to generate and drive a detailed 3D textured face mesh for a target identity, which can be seamlessly rendered back into target footage. The underlying face model automatically adjusts for how the target performs facial expressions and visemes. The performance data can be easily edited to change the visemes, expressions, pose, or even the identity of the target; the attributes are separably controllable. This supports a wide variety of video rewrite and puppetry applications. Face Transfer is based on a multilinear model of 3D face meshes that separably parameterizes the space of geometric variations due to different attributes (e.g., identity, expression, and viseme). Separability means that each of these attributes can be independently varied. A multilinear model can be estimated from a Cartesian product of examples (identities × expressions × visemes) with techniques from statistical analysis, but only after careful preprocessing of the geometric data set to secure one-to-one correspondence, to minimize cross-coupling artifacts, and to fill in any missing examples. Face Transfer offers new solutions to these problems and links the estimated model with a face-tracking algorithm to extract pose, expression, and viseme parameters.
Figure 1: Face Transfer with multilinear models gives animators decoupled control over facial attributes such as identity, expression, and viseme. In this example, we combine pose and identity from the first video, expressions from the second, and visemes from the third to get a composite result blended back into the original video.
doi:10.1145/1073204.1073209 fatcat:b4iidkieezff3gkmbdu6ozyvea
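
To make the "multilinear model" concrete, here is a small sketch of how such a model turns separate attribute vectors into a face mesh: a core tensor is contracted with one coefficient vector per attribute (identity, expression, viseme). The tensor shapes and the einsum contraction are illustrative; the paper estimates the core from a Cartesian product of example scans via tensor (N-mode) decomposition.

```python
# Hypothetical sketch: reconstruct vertices from a multilinear face model.
import numpy as np

def multilinear_face(core, id_vec, expr_vec, viseme_vec):
    """core: (V, I, E, S) tensor; returns a length-V vector of stacked vertex coords."""
    return np.einsum('vies,i,e,s->v', core, id_vec, expr_vec, viseme_vec)

# Toy sizes: 100 vertices (x, y, z stacked), 4 identities, 3 expressions, 5 visemes.
V, I, E, S = 3 * 100, 4, 3, 5
core = np.random.randn(V, I, E, S)
face = multilinear_face(core, np.ones(I) / I, np.eye(E)[0], np.eye(S)[2])
```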

Semantic deformation transfer

Ilya Baran, Daniel Vlasic, Eitan Grinspun, Jovan Popović
2009 ACM SIGGRAPH 2009 papers on - SIGGRAPH '09  
Results: We applied our method to publicly available mesh animations from performance capture [Vlasic et al. 2008] and deformation transfer [Sumner and Popović 2004]. ...
doi:10.1145/1576246.1531342 fatcat:ik4pkqqm7jc4hnzt5gda4g45ou

Video face replacement

Kevin Dale, Kalyan Sunkavalli, Micah K. Johnson, Daniel Vlasic, Wojciech Matusik, Hanspeter Pfister
2011 ACM Transactions on Graphics  
We refer the reader to Vlasic et al. [2005] for more details. ... Comparison with Vlasic et al. [2005]: We reprocessed the original scan data [Vlasic et al. 2005] to place it into correspondence with a face mesh that covers the full face, including the jaw. ...
doi:10.1145/2070781.2024164 fatcat:2xhzidcfrrajjazpmfsrwez7cu

Articulated mesh animation from multi-view silhouettes

Daniel Vlasic, Ilya Baran, Wojciech Matusik, Jovan Popović
2008 ACM Transactions on Graphics  
Details in mesh animations are difficult to generate, but they have great impact on visual quality. In this work, we demonstrate a practical software system for capturing such details from multi-view video recordings. Given a stream of synchronized video images that record a human performance from multiple viewpoints and an articulated template of the performer, our system captures the motion of both the skeleton and the shape. The output mesh animation is enhanced with the details observed in the image silhouettes. For example, a performance in casual loose-fitting clothes will generate mesh animations with flowing garment motions. We accomplish this with a fast pose tracking method followed by nonrigid deformation of the template to fit the silhouettes. The entire process takes less than sixteen seconds per frame and requires no markers or texture cues. Captured meshes are in full correspondence, making them readily usable for editing operations including texturing, deformation transfer, and deformation model learning.
doi:10.1145/1360612.1360696 fatcat:eu6s2gfncrdcpnj7dhvq7s7vj4
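
As a toy illustration of the silhouette-fitting step, the sketch below scores how well rendered template silhouettes agree with the observed multi-view silhouettes using intersection-over-union; the actual system also applies a regularized nonrigid deformation of the template, which is not reproduced here.

```python
# Hypothetical sketch: multi-view silhouette agreement between binary masks.
import numpy as np

def silhouette_agreement(rendered_masks, observed_masks):
    """Mean intersection-over-union across camera views (1.0 = perfect overlap)."""
    scores = []
    for r, o in zip(rendered_masks, observed_masks):
        inter = np.logical_and(r, o).sum()
        union = np.logical_or(r, o).sum() + 1e-8
        scores.append(inter / union)
    return float(np.mean(scores))

# Toy usage: identical masks give a perfect score of 1.0.
mask = np.zeros((4, 4), bool); mask[1:3, 1:3] = True
print(silhouette_agreement([mask], [mask]))
```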

Face transfer with multilinear models

Daniel Vlasic, Matthew Brand, Hanspeter Pfister, Jovan Popović
2005 ACM SIGGRAPH 2005 Papers on - SIGGRAPH '05  
Face Transfer is a method for mapping videorecorded performances of one individual to facial animations of another. It extracts visemes (speech-related mouth articulations), expressions, and three-dimensional (3D) pose from monocular video or film footage. These parameters are then used to generate and drive a detailed 3D textured face mesh for a target identity, which can be seamlessly rendered back into target footage. The underlying face model automatically adjusts for how the target performs facial expressions and visemes. The performance data can be easily edited to change the visemes, expressions, pose, or even the identity of the target; the attributes are separably controllable. This supports a wide variety of video rewrite and puppetry applications. Face Transfer is based on a multilinear model of 3D face meshes that separably parameterizes the space of geometric variations due to different attributes (e.g., identity, expression, and viseme). Separability means that each of these attributes can be independently varied. A multilinear model can be estimated from a Cartesian product of examples (identities × expressions × visemes) with techniques from statistical analysis, but only after careful preprocessing of the geometric data set to secure one-to-one correspondence, to minimize cross-coupling artifacts, and to fill in any missing examples. Face Transfer offers new solutions to these problems and links the estimated model with a face-tracking algorithm to extract pose, expression, and viseme parameters.
doi:10.1145/1186822.1073209 fatcat:ycfhzjzojfbzbhnor3mzu26om4

Face transfer with multilinear models

Daniel Vlasic, Matthew Brand, Hanspeter Pfister, Jovan Popovic
2006 ACM SIGGRAPH 2006 Courses on - SIGGRAPH '06  
Face Transfer is a method for mapping videorecorded performances of one individual to facial animations of another. It extracts visemes (speech-related mouth articulations), expressions, and three-dimensional (3D) pose from monocular video or film footage. These parameters are then used to generate and drive a detailed 3D textured face mesh for a target identity, which can be seamlessly rendered back into target footage. The underlying face model automatically adjusts for how the target performs facial expressions and visemes. The performance data can be easily edited to change the visemes, expressions, pose, or even the identity of the target; the attributes are separably controllable. This supports a wide variety of video rewrite and puppetry applications. Face Transfer is based on a multilinear model of 3D face meshes that separably parameterizes the space of geometric variations due to different attributes (e.g., identity, expression, and viseme). Separability means that each of these attributes can be independently varied. A multilinear model can be estimated from a Cartesian product of examples (identities × expressions × visemes) with techniques from statistical analysis, but only after careful preprocessing of the geometric data set to secure one-to-one correspondence, to minimize cross-coupling artifacts, and to fill in any missing examples. Face Transfer offers new solutions to these problems and links the estimated model with a face-tracking algorithm to extract pose, expression, and viseme parameters.
doi:10.1145/1185657.1185864 dblp:conf/siggraph/VlasicBPP06 fatcat:i62ikvmrk5ainlg7c4usbkdhjq

Practical motion capture in everyday surroundings

Daniel Vlasic, Rolf Adelsberger, Giovanni Vannucci, John Barnwell, Markus Gross, Wojciech Matusik, Jovan Popović
2007 ACM Transactions on Graphics  
Figure 1: Traditional motion-capture systems excel at recording motions within lab-like environments but struggle with recording outdoor activities such as skiing, biking, and driving. This limitation led us to design a wearable motion-capture system that records human activity in both indoor and outdoor environments.
Commercial motion-capture systems produce excellent in-studio reconstructions, but offer no comparable solution for acquisition in everyday environments. We present a system for acquiring motions almost anywhere. This wearable system gathers ultrasonic time-of-flight and inertial measurements with a set of inexpensive miniature sensors worn on the garment. After recording, the information is combined using an Extended Kalman Filter to reconstruct joint configurations of a body. Experimental results show that even motions that are traditionally difficult to acquire are recorded with ease within their natural settings. Although our prototype does not reliably recover the global transformation, we show that the resulting motions are visually similar to the original ones, and that the combined acoustic and inertial system reduces the drift commonly observed in purely inertial systems. Our final results suggest that this system could become a versatile input device for a variety of augmented-reality applications.
doi:10.1145/1239451.1239486 fatcat:bnbkaadlnzbl3hytmgz4kp7i6a
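
For reference, here is a generic Extended Kalman Filter predict/update step of the kind the abstract says is used to fuse inertial measurements (driving the prediction) with ultrasonic time-of-flight measurements (driving the update). The state layout, the models f and h, and their Jacobians F and H are placeholders, not the paper's actual body model.

```python
# Hypothetical sketch: one EKF step with user-supplied models and Jacobians.
import numpy as np

def ekf_step(x, P, u, z, f, F, h, H, Q, R):
    # Predict with the (inertial) motion model; F is the Jacobian of f at x.
    x_pred = f(x, u)
    P_pred = F @ P @ F.T + Q
    # Update with the (ultrasonic range) measurement; H is the Jacobian of h at x_pred.
    y = z - h(x_pred)                       # innovation
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)     # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Toy 1-D usage with trivial linear models.
f = lambda x, u: x + u
h = lambda x: x
x1, P1 = ekf_step(np.array([0.0]), np.eye(1), np.array([0.1]), np.array([0.12]),
                  f, np.eye(1), h, np.eye(1), 0.01 * np.eye(1), 0.05 * np.eye(1))
```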

LASR: Learning Articulated Shape Reconstruction from a Monocular Video [article]

Gengshan Yang, Deqing Sun, Varun Jampani, Daniel Vlasic, Forrester Cole, Huiwen Chang, Deva Ramanan, William T. Freeman, Ce Liu
2021 arXiv   pre-print
Remarkable progress has been made in 3D reconstruction of rigid structures from a video or a collection of images. However, it is still challenging to reconstruct nonrigid structures from RGB inputs, due to its under-constrained nature. While template-based approaches, such as parametric shape models, have achieved great success in modeling the "closed world" of known object categories, they cannot well handle the "open world" of novel object categories or outlier shapes. In this work, we introduce a template-free approach to learn 3D shapes from a single video. It adopts an analysis-by-synthesis strategy that forward-renders object silhouettes, optical flow, and pixel values to compare with video observations, which generates gradients to adjust the camera, shape, and motion parameters. Without using a category-specific shape template, our method faithfully reconstructs nonrigid 3D structures from videos of humans, animals, and objects of unknown classes. Code will be available at lasr-google.github.io.
arXiv:2105.02976v1 fatcat:oostakof65ah7epconiy63y24y
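
The analysis-by-synthesis loop described above can be summarized by a loss that compares the rendered silhouette, flow, and pixel values against the observations. The sketch below is schematic: the renderer is a stand-in and the loss weights are arbitrary assumptions, not LASR's actual formulation.

```python
# Hypothetical sketch: combined reprojection error for analysis-by-synthesis.
import numpy as np

def reprojection_loss(rendered, observed, w_sil=1.0, w_flow=0.5, w_rgb=1.0):
    sil  = np.mean((rendered['silhouette'] - observed['silhouette']) ** 2)
    flow = np.mean((rendered['flow'] - observed['flow']) ** 2)
    rgb  = np.mean((rendered['rgb'] - observed['rgb']) ** 2)
    return w_sil * sil + w_flow * flow + w_rgb * rgb

# Toy usage: identical rendering and observation give zero loss.
obs = {'silhouette': np.ones((4, 4)), 'flow': np.zeros((4, 4, 2)), 'rgb': np.zeros((4, 4, 3))}
print(reprojection_loss(obs, obs))
```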

Practical motion capture in everyday surroundings

Daniel Vlasic, Rolf Adelsberger, Giovanni Vannucci, John Barnwell, Markus Gross, Wojciech Matusik, Jovan Popović
2007 ACM Transactions on Graphics  
Figure 1: Traditional motion-capture systems excel at recording motions within lab-like environments but struggle with recording outdoor activities such as skiing, biking, and driving. This limitation led us to design a wearable motion-capture system that records human activity in both indoor and outdoor environments.
Commercial motion-capture systems produce excellent in-studio reconstructions, but offer no comparable solution for acquisition in everyday environments. We present a system for acquiring motions almost anywhere. This wearable system gathers ultrasonic time-of-flight and inertial measurements with a set of inexpensive miniature sensors worn on the garment. After recording, the information is combined using an Extended Kalman Filter to reconstruct joint configurations of a body. Experimental results show that even motions that are traditionally difficult to acquire are recorded with ease within their natural settings. Although our prototype does not reliably recover the global transformation, we show that the resulting motions are visually similar to the original ones, and that the combined acoustic and inertial system reduces the drift commonly observed in purely inertial systems. Our final results suggest that this system could become a versatile input device for a variety of augmented-reality applications.
doi:10.1145/1276377.1276421 fatcat:m5cvuzduabawxljgqjthbz4ao4

Opacity light fields

Daniel Vlasic, Hanspeter Pfister, Sergey Molinov, Radek Grzeszczuk, Wojciech Matusik
2003 Proceedings of the 2003 symposium on Interactive 3D graphics - SI3D '03  
We present new hardware-accelerated techniques for rendering surface light fields with opacity hulls that allow for interactive visualization of objects that have complex reflectance properties and elaborate geometrical details. The opacity hull is a shape enclosing the object with view-dependent opacity parameterized onto that shape. We call the combination of opacity hulls and surface light fields the opacity light field. Opacity light fields are ideally suited for rendering of the visually complex objects and scenes obtained with 3D photography. We show how to implement opacity light fields in the framework of three surface light field rendering methods: view-dependent texture mapping, unstructured lumigraph rendering, and light field mapping. The modified algorithms can be effectively supported on modern graphics hardware. Our results show that all three implementations are able to achieve interactive or real-time frame rates.
doi:10.1145/641480.641496 dblp:conf/si3d/VlasicPMGM03 fatcat:3kgiqwf2dzd4pgrhxpfgjmqruq
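
A small sketch of the view-dependent blending at the heart of this representation: each surface point stores color and opacity samples for a set of reference view directions, and rendering interpolates them for the current view before standard alpha compositing. The cosine-power weighting is an assumption for illustration; the paper maps this onto graphics hardware via texture mapping.

```python
# Hypothetical sketch: interpolate per-view (r, g, b, alpha) samples for a new view.
import numpy as np

def view_dependent_sample(view_dir, ref_dirs, ref_rgba):
    """Blend reference-view RGBA samples by closeness of direction (unit vectors)."""
    cos = ref_dirs @ view_dir
    w = np.maximum(cos, 0.0) ** 8                 # sharper falloff favors nearby views
    w = w / (w.sum() + 1e-8)
    return w @ ref_rgba                           # interpolated (r, g, b, alpha)

# Toy usage: viewing straight down the z axis picks mostly the first sample.
dirs = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])
rgba = np.array([[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 0.5]])
print(view_dependent_sample(np.array([0.0, 0.0, 1.0]), dirs, rgba))
```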

AutoFlow: Learning a Better Training Set for Optical Flow [article]

Deqing Sun, Daniel Vlasic, Charles Herrmann, Varun Jampani, Michael Krainin, Huiwen Chang, Ramin Zabih, William T. Freeman, Ce Liu
2021 arXiv   pre-print
Synthetic datasets play a critical role in pre-training CNN models for optical flow, but they are painstaking to generate and hard to adapt to new applications. To automate the process, we present AutoFlow, a simple and effective method to render training data for optical flow that optimizes the performance of a model on a target dataset. AutoFlow takes a layered approach to render synthetic data, where the motion, shape, and appearance of each layer are controlled by learnable hyperparameters. Experimental results show that AutoFlow achieves state-of-the-art accuracy in pre-training both PWC-Net and RAFT. Our code and data are available at https://autoflow-google.github.io.
arXiv:2104.14544v1 fatcat:wd2r2aa6ozajfe25rawf5s5h5q
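
A toy version of the layered-rendering idea, for intuition only: hyperparameters control the distribution from which each layer's motion is drawn, layers are composited back to front, and the ground-truth flow at each pixel comes from the frontmost layer covering it. Everything here (translation-only motion, random rectangles, a single `motion_scale` hyperparameter) is a simplification I am assuming for illustration, not the AutoFlow renderer.

```python
# Hypothetical sketch: render an image pair plus ground-truth flow from layers.
import numpy as np

def render_pair(hyper, height=64, width=64, layers=2, rng=np.random.default_rng(0)):
    img1 = np.zeros((height, width)); img2 = np.zeros((height, width))
    flow = np.zeros((height, width, 2))
    for _ in range(layers):                                    # back to front
        motion = rng.normal(0.0, hyper['motion_scale'], size=2)  # per-layer translation
        y0, x0 = rng.integers(0, height // 2), rng.integers(0, width // 2)
        h, w = height // 2, width // 2
        value = rng.uniform(0.2, 1.0)
        img1[y0:y0 + h, x0:x0 + w] = value
        y1, x1 = int(y0 + motion[0]), int(x0 + motion[1])
        img2[max(y1, 0):y1 + h, max(x1, 0):x1 + w] = value
        flow[y0:y0 + h, x0:x0 + w] = motion                    # frontmost layer wins
    return img1, img2, flow

frame1, frame2, gt_flow = render_pair({'motion_scale': 3.0})
```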
Showing results 1–15 of 248.