Video Object Segmentation and Tracking: A Survey [article]

Rui Yao, Guosheng Lin, Shixiong Xia, Jiaqi Zhao, Yong Zhou
2019 arXiv   pre-print
Both topics struggle with common challenges such as occlusion, deformation, motion blur, and scale variation.  ...  Object segmentation and object tracking are fundamental research areas in the computer vision community.  ...  In addition, they propose a motion-based bilateral network, then a graph cut model is built to propagate the pixel-wise labels.  ... 
arXiv:1904.09172v3 fatcat:nm3zptbidvgxfkxezqjekwdpdi

Deep learning based fence segmentation and removal from an image using a video sequence [article]

Sankaraganesh Jonna, Krishna K. Nakka, Rajiv R. Sahay
2016 arXiv   pre-print
Specifically, we use a pre-trained convolutional neural network to segment fence pixels from a single image.  ...  In this paper, we propose a de-fencing algorithm for images of dynamic scenes using an occlusion-aware optical flow method.  ...  Schematic of fence mask segmentation. 3.2 Occlusion aware optical flow The image alignment problem becomes more complex when real-world videos contain dynamic objects.  ... 
arXiv:1609.07727v2 fatcat:ppur7ow7tjdwzkabfzj2swbm2m

Accurate and efficient video de-fencing using convolutional neural networks and temporal information [article]

Chen Du, Byeongkeun Kang, Zheng Xu, Ji Dai, Truong Nguyen
2018 arXiv   pre-print
However, the state-of-the-art de-fencing methods have limited performance caused by the difficulty of fence segmentation and also suffer from the motion of the camera or objects.  ...  The segmentation algorithm using convolutional neural network achieves significant improvement in the accuracy of fence segmentation.  ...  In [6], Jonna et al. proposed a robust method for background motion estimation by introducing occlusion-aware optical flow.  ... 
arXiv:1806.10781v1 fatcat:mtjnpmneerd6vk2o2rtx7vbf6e

Learning a Spatio-Temporal Embedding for Video Instance Segmentation [article]

Anthony Hu, Alex Kendall, Roberto Cipolla
2019 arXiv   pre-print
We present a novel embedding approach for video instance segmentation.  ...  We show that our model can accurately track and segment instances, even with occlusions and missed detections, advancing the state-of-the-art on the KITTI Multi-Object and Tracking Dataset.  ...  SELF-SUPERVISED DEPTH ESTIMATION The relative distance of objects is a strong cue for segmenting instances in video.  ... 
arXiv:1912.08969v1 fatcat:wvyylpvju5go5iqe7qwxyu6chu

Occlusion resistant learning of intuitive physics from videos [article]

Ronan Riochet, Josef Sivic, Ivan Laptev, Emmanuel Dupoux
2020 arXiv   pre-print
Finally, we also show results on predicting motion of objects in real videos.  ...  onto pixel space.  ...  First, scene graph proposal gives initial values for object states based on visible objects detected on a frame-by-frame basis.  ... 
arXiv:2005.00069v1 fatcat:43itcbxngbcyjinubcitou7nda

Every Pixel Counts: Unsupervised Geometry Learning with Holistic 3D Motion Understanding [article]

Zhenheng Yang, Peng Wang, Yang Wang, Wei Xu, Ram Nevatia
2018 arXiv   pre-print
Specifically, given two consecutive frames from a video, we adopt a motion network to predict their relative 3D camera pose and a segmentation mask distinguishing moving objects and rigid background.  ...  Current state-of-the-art (SOTA) methods are based on the learning framework of rigid structure-from-motion, where only 3D camera ego motion is modeled for geometry estimation. However, moving objects also  ...  For moving objects M d (p t ), we apply an edge-aware spatial smoothness loss for the motion map similar to that in [4].  ... 
arXiv:1806.10556v2 fatcat:3jnh5gvxwzevjbzz2hkmtyfy6q

Learning Pixel Trajectories with Multiscale Contrastive Random Walks [article]

Zhangxing Bian, Allan Jabri, Alexei A. Efros, Andrew Owens
2022 arXiv   pre-print
This establishes a unified technique for self-supervised learning of optical flow, keypoint tracking, and video object segmentation.  ...  We take a step towards bridging this gap by extending the recent contrastive random walk formulation to much denser, pixel-level space-time graphs.  ...  We thank David Fouhey and Jeff Fessler for the helpful feedback. AO thanks Rick Szeliski for introducing him to multi-frame optical flow.  ... 
arXiv:2201.08379v2 fatcat:uppecqnftjbwhawpk3zxvrblou

PVO: Panoptic Visual Odometry [article]

Weicai Ye, Xinyue Lan, Shuo Chen, Yuhang Ming, Xingyuan Yu, Hujun Bao, Zhaopeng Cui, Guofeng Zhang
2022 arXiv   pre-print
We present a novel panoptic visual odometry framework, termed PVO, to achieve a more comprehensive modeling of the scene's motion, geometry, and panoptic segmentation information.  ...  PVO models visual odometry (VO) and video panoptic segmentation (VPS) in a unified view, enabling the two tasks to facilitate each other.  ...  Conclusion We present a novel panoptic visual odometry method, which models the visual odometry (VO) task for estimating scene motion and the video panoptic segmentation (VPS) task for perceiving the scene  ... 
arXiv:2207.01610v1 fatcat:pqps6zj4tjgttfckeanb3pjvae

High-Quality Video Generation from Static Structural Annotations

Lu Sheng, Junting Pan, Jiaming Guo, Jing Shao, Chen Change Loy
2020 International Journal of Computer Vision  
predicted motions and provide explicit occlusion handling in a principled manner.  ...  We employ a cycle-consistent flow-based conditioned variational autoencoder to capture the long-term motion distributions, by which the learned bi-directional flows ensure the physical reliability of the  ...  The proposed network includes three components: (a) a motion encoder, (b) a bi-directional flow generator, (c) an occlusion-aware synthesis module.  ... 
doi:10.1007/s11263-020-01334-x fatcat:yedge4qmcbd2jpyz6bo3n5fbqe

On Detection, Data Association and Segmentation for Multi-target Tracking

Yicong Tian, Afshin Dehghan, Mubarak Shah
2018 IEEE Transactions on Pattern Analysis and Machine Intelligence  
For segmentation, multi-label Conditional Random Field (CRF) is applied to a superpixel based spatio-temporal graph in a segment of video, in order to assign background or target labels to every superpixel  ...  Our structured learning based tracker learns a model for each target and infers the best locations of all targets simultaneously in a video clip.  ...  Target Identity-aware Network Flow First we need to build our graph G(V, E).  ... 
doi:10.1109/tpami.2018.2849374 pmid:29994110 fatcat:mkwrhrt2d5djdhopercl6yd5ii

Online Video Object Segmentation via Convolutional Trident Network

Won-Dong Jang, Chang-Su Kim
2017 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
A semi-supervised online video object segmentation algorithm, which accepts user annotations about a target object at the first frame, is proposed in this work.  ...  We sequentially carry out these processes from the second to the last frames to extract a segment track of the target object.  ...  Introduction Video object segmentation aims at clustering pixels in videos into objects or background.  ... 
doi:10.1109/cvpr.2017.790 dblp:conf/cvpr/JangK17 fatcat:hjgqlh6f3vglhcijywcqwagk4m

Real-Time Facial Segmentation and Performance Capture from RGB Input [article]

Shunsuke Saito, Tianye Li, Hao Li
2016 arXiv   pre-print
Along with recent breakthroughs in deep learning, we demonstrate that pixel-level facial segmentation is possible in real-time by repurposing convolutional neural networks designed originally for general  ...  We develop an efficient architecture based on a two-stream deconvolution network with complementary characteristics, and introduce carefully designed training samples and data augmentation strategies for  ...  We also thank Rui Saito and Frances Chen for being our capture models.  ... 
arXiv:1604.02647v1 fatcat:i3bdzgot4ffixn4jcm7hqyb7ze

2020 Index IEEE Transactions on Image Processing Vol. 29

2020 IEEE Transactions on Image Processing  
., +, TIP 2020 2328-2343 MATNet: Motion-Attentive Transition Network for Zero-Shot Video Object Segmentation.  ...  ., TIP 2020 4232-4242 Geometry-Aware Graph Transforms for Light Field Compact Representa-Graph-Based Transforms for Video Coding.  ... 
doi:10.1109/tip.2020.3046056 fatcat:24m6k2elprf2nfmucbjzhvzk3m

Video Frame Interpolation via Adaptive Convolution [article]

Simon Niklaus, Long Mai, Feng Liu
2017 arXiv   pre-print
Our method employs a deep fully convolutional neural network to estimate a spatially-adaptive convolution kernel for each pixel.  ...  The convolution kernel captures both the local motion between the input frames and the coefficients for pixel synthesis.  ...  that allow for edge-aware pixel synthesis to produce sharp interpolation results.  ...  As shown in Figure 10, our method can handle motion within 41 pixels well.  ... 
arXiv:1703.07514v1 fatcat:ayhja7ru3jfa7muv5ymareqp6e

Every Frame Counts: Joint Learning of Video Segmentation and Optical Flow

Mingyu Ding, Zhe Wang, Bolei Zhou, Jianping Shi, Zhiwu Lu, Ping Luo
Semantic segmentation brings semantic information to handle occlusion for more robust optical flow estimation, while the non-occluded optical flow provides accurate pixel-level temporal correspondences  ...  A major challenge for video semantic segmentation is the lack of labeled data.  ...  Ping Luo is partially supported by the HKU Seed Funding for Basic Research and SenseTime's Donation for Basic Research.  ... 
doi:10.1609/aaai.v34i07.6699 fatcat:cdpnu5fhazelzdh77rzdvyxhzq