394 Hits in 3.0 sec

Inertial-aided Rolling Shutter Relative Pose Estimation [article]

Chang-Ryeol Lee, Kuk-Jin Yoon
2017 arXiv   pre-print
Relative pose estimation is a fundamental problem in computer vision, and it has been studied for conventional global shutter cameras for decades. Recently, however, rolling shutter cameras have come into wide use owing to their low-cost imaging capability, and, since a rolling shutter camera captures the image line by line, its relative pose estimation is more difficult than that of a global shutter camera. In this paper, we propose to exploit inertial measurements
(gravity and angular velocity) for the rolling shutter relative pose estimation problem. The inertial measurements provide information about the partial relative rotation between the two views (cameras) and about the instantaneous motion that causes the rolling shutter distortion. Based on this information, we simplify the rolling shutter relative pose estimation problem and propose effective methods to solve it. Unlike previous methods, which require 44 (linear) or 17 (nonlinear) points with the uniform rolling shutter camera model, the proposed methods require at most 9 or 11 points to estimate the relative pose between rolling shutter cameras. Experimental results on synthetic data and the public PennCOSYVIO dataset show that the proposed methods outperform existing methods.
arXiv:1712.00184v1 fatcat:gmkiwjdn4bcdvphj5tp44gtasq

Gyroscope-aided Relative Pose Estimation for Rolling Shutter Cameras [article]

Chang-Ryeol Lee, Ju Hong Yoon, Min-Gyu Park, Kuk-Jin Yoon
2019 arXiv   pre-print
The rolling shutter camera has received great attention due to its low-cost imaging capability; however, estimating the relative pose between rolling shutter cameras remains difficult owing to their line-by-line image capture. To alleviate this problem, we exploit gyroscope measurements (angular velocity) along with image measurements to compute the relative pose between rolling shutter cameras. The gyroscope measurements provide information about
the instantaneous motion that causes the rolling shutter distortion. With the gyroscope measurements in hand, we simplify the relative pose estimation problem and find a minimal solution based on a Gröbner basis polynomial solver. The proposed method requires only five points to compute the relative pose between rolling shutter cameras, whereas previous methods require 20 or 44 corresponding points for the linear and uniform rolling shutter geometry models, respectively. Experimental results on synthetic and real data verify the superiority of the proposed method over existing relative pose estimation methods.
arXiv:1904.06770v1 fatcat:h74epvyuuvcxfodnutyvtt7r3a
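The paper's minimal solver relies on a Gröbner basis; the underlying reason gyroscope data cuts the point requirement is that a known relative rotation makes the epipolar constraint linear in the translation. Below is a minimal global-shutter sketch of that idea (not the authors' rolling shutter solver; all function names are my own):

```python
import numpy as np

def translation_from_known_rotation(R, x1, x2):
    """Recover the translation direction t from the epipolar constraint
    x2^T [t]x R x1 = 0 when the relative rotation R is already known
    (e.g. integrated from gyroscope readings). Each correspondence gives
    one equation linear in t via the triple-product identity
    x2 . (t x (R x1)) = t . ((R x1) x x2)."""
    A = np.stack([np.cross(R @ a, b) for a, b in zip(x1, x2)])
    _, _, Vt = np.linalg.svd(A)       # null vector of A minimizes |A t|
    t = Vt[-1]
    return t / np.linalg.norm(t)      # direction only; scale is unobservable
```

With rotation known, two correspondences already suffice in the noise-free global-shutter case; the rolling shutter model in the paper adds per-row motion terms, which is why its count rises to five.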

Monocular Visual Odometry with a Rolling Shutter Camera [article]

Chang-Ryeol Lee, Kuk-Jin Yoon
2017 arXiv   pre-print
Rolling Shutter (RS) cameras have become popular because of their low-cost imaging capability. However, RS cameras suffer from undesirable artifacts when the camera or the subject is moving or the illumination changes. For that reason, Monocular Visual Odometry (MVO) with RS cameras produces inaccurate ego-motion estimates. Previous works solve this RS distortion problem with motion prediction from images and/or inertial sensors. However, MVO still has trouble handling the RS
distortion when the camera motion changes abruptly (e.g., vibration of mobile cameras instantaneously causes extremely fast motion). To address this problem, we propose a novel MVO algorithm that takes the geometric characteristics of RS cameras into account. The key idea of the proposed algorithm is a new RS essential matrix that incorporates the instantaneous angular and linear velocities at each frame. Our algorithm produces accurate and robust ego-motion estimates in an online manner and is applicable to various mobile applications with RS cameras. The superiority of the proposed algorithm is validated through quantitative and qualitative comparisons on both synthetic and real datasets.
arXiv:1704.07163v1 fatcat:c57gma3o5vejbi55jneme7gsiy

Exploring Pixel-level Self-supervision for Weakly Supervised Semantic Segmentation [article]

Sung-Hoon Yoon, Hyeokjun Kweon, Jaeseok Jeong, Hyeonseong Kim, Shinjeong Kim, Kuk-Jin Yoon
2021 arXiv   pre-print
Existing studies in weakly supervised semantic segmentation (WSSS) have utilized class activation maps (CAMs) to localize the class objects. However, since a classification loss is insufficient for providing precise object regions, CAMs tend to be biased towards discriminative patterns (i.e., sparseness) and do not provide precise object boundary information (i.e., impreciseness). To resolve these limitations, we propose a novel framework (composed of a MainNet and a SupportNet) that derives
pixel-level self-supervision from the given image-level supervision. In our framework, with the help of the proposed Regional Contrastive Module (RCM) and Multi-scale Attentive Module (MAM), the MainNet is trained by self-supervision from the SupportNet. The RCM extracts two forms of self-supervision from the SupportNet: (1) class region masks generated from the CAMs and (2) class-wise prototypes obtained from the features according to the class region masks. Then, every pixel-wise feature of the MainNet is trained by the prototypes in a contrastive manner, sharpening the resulting CAMs. The MAM utilizes CAMs inferred at multiple scales from the SupportNet as self-supervision to guide the MainNet. Based on the dissimilarity between the multi-scale CAMs from the MainNet and the SupportNet, CAMs from the MainNet are trained to expand to the less-discriminative regions. The proposed method shows state-of-the-art WSSS performance on both the train and validation sets of the PASCAL VOC 2012 dataset. For reproducibility, the code will be made publicly available soon.
arXiv:2112.05351v1 fatcat:a6eyh5ofx5db5ieeydbofhlofi
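The class-wise prototypes in (2) can be pictured as masked average pooling of the feature map under the class region masks. A generic sketch of that idea (not the actual RCM code; names and shapes are my assumptions):

```python
import numpy as np

def class_prototypes(features, class_masks):
    """Compute one prototype vector per class by masked average pooling.
    features: (C, H, W) feature map; class_masks: (K, H, W) binary class
    region masks (e.g. thresholded CAMs). Returns (K, C) prototypes."""
    protos = []
    for mask in class_masks:
        denom = max(mask.sum(), 1.0)  # guard against an empty class mask
        protos.append((features * mask).sum(axis=(1, 2)) / denom)
    return np.stack(protos)
```

Each pixel feature can then be pulled toward the prototype of its class and pushed from the others, which is the contrastive step the abstract describes.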

Interacting Multiview Tracker

Ju Hong Yoon, Ming-Hsuan Yang, Kuk-Jin Yoon
2016 IEEE Transactions on Pattern Analysis and Machine Intelligence  
A robust algorithm is proposed for tracking a target object in dynamic conditions including motion blur, illumination changes, pose variations, and occlusions. To cope with these challenging factors, multiple trackers based on different feature representations are integrated within a probabilistic framework. Each view of the proposed multiview (multi-channel) feature learning algorithm is concerned with one particular feature representation of a target object, from which a tracker is developed
with a different level of reliability. With the multiple trackers, the proposed algorithm exploits tracker interaction and selection for robust tracking performance. In the tracker interaction, a transition probability matrix is used to estimate dependencies between trackers. Multiple trackers communicate with each other by sharing information about their sample distributions. The tracker selection process determines the most reliable tracker, i.e., the one with the highest probability. To account for object appearance changes, the transition probability matrix and tracker probabilities are updated in a recursive Bayesian framework according to the tracker reliability, measured by a robust tracker likelihood function that learns to account for both transient and stable appearance changes. Experimental results on benchmark datasets demonstrate that the proposed interacting multiview algorithm performs robustly and favorably against state-of-the-art methods in terms of several quantitative metrics.
doi:10.1109/tpami.2015.2473862 pmid:26336117 fatcat:zgiqxfl3lrgpdgk5dla676yv4m
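The interaction-selection loop can be sketched as one recursive Bayesian step, under simplifying assumptions (a fixed transition probability matrix and one scalar likelihood per tracker; the paper's likelihood and update rules are richer):

```python
import numpy as np

def update_tracker_probabilities(p, T, likelihoods):
    """One recursive Bayesian step for an interacting tracker ensemble.
    p: current tracker probabilities (N,); T: transition probability
    matrix (N, N), rows summing to 1; likelihoods: per-tracker
    reliability (N,). Returns (posterior, index of selected tracker)."""
    p_pred = T.T @ p                 # interaction: mix probability mass
    p_post = p_pred * likelihoods    # weight by tracker reliability
    p_post = p_post / p_post.sum()   # renormalize
    return p_post, int(np.argmax(p_post))  # selection: most reliable tracker
```

The selected index picks which tracker's output is reported for the frame, while all trackers keep running and exchanging mass through T.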

Learning Monocular Depth Estimation via Selective Distillation of Stereo Knowledge [article]

Kyeongseob Song, Kuk-Jin Yoon
2022 arXiv   pre-print
Monocular depth estimation has been extensively explored with deep learning, yet its accuracy and generalization ability still lag far behind those of stereo-based methods. To tackle this, a few recent studies have proposed to supervise the monocular depth estimation network by distilling disparity maps as proxy ground truths. However, these studies naively distill the stereo knowledge without considering the comparative advantages of stereo-based and monocular depth estimation methods. In this
paper, we propose to selectively distill the disparity maps for more reliable proxy supervision. Specifically, we first design a decoder (MaskDecoder) that learns two binary masks trained to choose optimally between the proxy disparity maps and the estimated depth maps for each pixel. The learned masks are then fed to another decoder (DepthDecoder) to enforce that the estimated depths learn only from the masked area of the proxy disparity maps. Additionally, a Teacher-Student module is designed to transfer the geometric knowledge of the StereoNet to the MonoNet. Extensive experiments validate that our method achieves state-of-the-art performance for self- and proxy-supervised monocular depth estimation on the KITTI dataset, even surpassing some of the semi-supervised methods.
arXiv:2205.08668v1 fatcat:c77e5lghlrgehisagy6dqwxg2q
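The "selective" part reduces, at its core, to a proxy loss counted only where a learned binary mask trusts the stereo teacher. A hedged sketch of that core (the paper's full losses, mask learning, and disparity-to-depth conversion are more involved; the function name is mine):

```python
import numpy as np

def masked_proxy_loss(pred_depth, proxy_depth, mask):
    """Mean L1 error between the predicted depth and the proxy (stereo)
    depth, counted only where the binary mask selects the proxy as the
    more reliable supervision signal."""
    err = np.abs(pred_depth - proxy_depth) * mask
    return err.sum() / max(mask.sum(), 1.0)
```

Pixels where the mask is zero fall back to the network's own (e.g. self-supervised) objective, so unreliable proxy disparities never pull the student.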

Distance-based Camera Network Topology Inference for Person Re-identification [article]

Yeong-Jun Cho, Kuk-Jin Yoon
2017 arXiv   pre-print
In this paper, we propose a novel distance-based camera network topology inference method for efficient person re-identification. To this end, we first calibrate each camera and estimate the relative scales between cameras. Using the calibration results of the multiple cameras, we calculate the speed of each person and infer the distance between cameras to generate a distance-based camera network topology. The proposed distance-based topology can be applied adaptively to each person according to its
speed and can handle the diverse transition times of people between non-overlapping cameras. To validate the proposed method, we tested it on an open person re-identification dataset and compared it to state-of-the-art methods. The experimental results show that the proposed method is effective for person re-identification in a large-scale camera network with various people transition times.
arXiv:1712.00158v1 fatcat:e7u42esth5ejhkd3lf3zueqczi
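The core relation is simply distance = speed × transition time, averaged over observed people, after which each person gets a transition-time window scaled by their own speed. A toy sketch of that arithmetic (a simplification of the paper's estimator; names are mine):

```python
def infer_camera_distance(speeds, transition_times):
    """Estimate the inter-camera distance as the average of per-person
    speed x observed transition time between two cameras."""
    products = [s * t for s, t in zip(speeds, transition_times)]
    return sum(products) / len(products)

def expected_transition_time(distance, speed):
    """Per-person adaptive transition time implied by the inferred
    distance: a fast walker gets a shorter re-identification window."""
    return distance / speed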

Exploiting Multi-layer Graph Factorization for Multi-attributed Graph Matching [article]

Han-Mu Park, Kuk-Jin Yoon
2017 arXiv   pre-print
Recently, Park and Yoon [24] tried to solve the problems of integration approaches by adopting a multi-layer structure.  ... 
arXiv:1704.07077v1 fatcat:24dordzbjnaxnp6ydn7mfx5vkm

SphereSR: 360 Image Super-Resolution with Arbitrary Projection via Continuous Spherical Image Representation [article]

Youngho Yoon, Inchul Chung, Lin Wang, Kuk-Jin Yoon
2021 arXiv   pre-print
Visual Intelligence Lab., KAIST, Korea. Table 1: Derivation of ∆x, ∆y for various projections.  ... 
arXiv:2112.06536v2 fatcat:j7cawsacdvgj7aj47b76sphdby

Visual Tracking via Adaptive Tracker Selection with Multiple Features [chapter]

Ju Hong Yoon, Du Yong Kim, Kuk-Jin Yoon
2012 Lecture Notes in Computer Science  
In this paper, a robust visual tracking method is proposed to track an object under dynamic conditions that include motion blur, illumination changes, pose variations, and occlusions. To cope with these challenges, multiple trackers with different feature descriptors are utilized, each of which shows a different level of robustness to certain changes in an object's appearance. To fuse these independent trackers, we propose two mechanisms, tracker selection and interaction. The tracker
interaction is achieved based on a transition probability matrix (TPM) in a probabilistic manner. The tracker selection extracts one tracking result from among the multiple tracker outputs by choosing the tracker with the highest tracker probability. According to the various changes in an object's appearance, the TPM and tracker probabilities are updated in a recursive Bayesian form by evaluating each tracker's reliability, which is measured by a robust tracker likelihood function (TLF). When the tracking in each frame is completed, the estimated object state is obtained and fed into the reference update via the proposed learning strategy, which retains the robustness and adaptability of the TLF and the multiple trackers. The experimental results demonstrate that our proposed method is robust in various benchmark scenarios.
doi:10.1007/978-3-642-33765-9_3 fatcat:7igidipzz5c4xltungtalii4ve

Evaluating COPY-BLEND Augmentation for Low Level Vision Tasks [article]

Pranjay Shyam, Sandeep Singh Sengar, Kuk-Jin Yoon, Kyung-Soo Kim
2021 arXiv   pre-print
Region modification-based data augmentation techniques have been shown to improve performance on high-level vision tasks (object detection, semantic segmentation, image classification, etc.) by encouraging the underlying algorithms to focus on multiple discriminative features. However, as these techniques destroy spatial relationships with neighboring regions, performance can deteriorate when they are used to train algorithms designed for low-level vision tasks (low-light image enhancement, image
dehazing, deblurring, etc.), where textural consistency between a recovered region and its neighbors is important for effective performance. In this paper, we examine the efficacy of a simple copy-blend data augmentation technique that copies patches from noisy images and blends them onto a clean image, and vice versa, to ensure that the underlying algorithm localizes and recovers the affected regions, resulting in increased perceptual quality of the recovered image. To assess the performance improvement, we perform extensive experiments alongside different region modification-based augmentation techniques and report observations such as improved performance, a reduced training-dataset requirement, and earlier convergence across tasks such as low-light image enhancement, image dehazing, and image deblurring, without any modification to the baseline algorithms.
arXiv:2103.05889v1 fatcat:msyracva35dgnd2fsn426viqbi
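The augmentation itself can be sketched in a few lines of NumPy, assuming a fixed patch location and blend weight (the paper presumably samples these; the reverse clean-onto-degraded direction is omitted here):

```python
import numpy as np

def copy_blend(clean, degraded, top, left, size, alpha=0.5):
    """Alpha-blend a (size x size) patch of the degraded image onto the
    clean image at the same location, leaving the rest of the clean
    image untouched. A minimal sketch of copy-blend augmentation."""
    out = clean.astype(np.float32).copy()
    region = (slice(top, top + size), slice(left, left + size))
    out[region] = (alpha * degraded[region].astype(np.float32)
                   + (1.0 - alpha) * out[region])
    return out
```

Unlike cut-and-paste style augmentations, blending keeps the patch boundary soft, which preserves some textural continuity with the surrounding region.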

Generic Scene Recovery Using Multiple Images [chapter]

Kuk-Jin Yoon, Emmanuel Prados, Peter Sturm
2009 Lecture Notes in Computer Science  
In this paper, a generative-model-based method for recovering both the shape and the reflectance of the surface(s) of a scene from multiple images is presented, assuming that the illumination conditions are known in advance. Based on a variational framework and via gradient descent, the algorithm simultaneously and consistently minimizes a global cost functional with respect to both shape and reflectance. Contrary to previous works, which consider specific individual scenarios, our method applies
to a number of scenarios: multiview stereovision, multiview photometric stereo, and multiview shape from shading. In addition, our approach naturally combines stereo, silhouette, and shading cues in a single framework and, unlike most previous methods, which deal only with Lambertian surfaces, considers general dichromatic surfaces.
doi:10.1007/978-3-642-02256-2_62 fatcat:nwmvktjxufh2bkvgwhbv337xvy

Improving stereo matching with symmetric cost functions

Kuk-Jin Yoon, Sung-Kee Park
2011 IEICE Electronics Express  
In this paper, we propose new symmetric cost functions for global stereo methods. We first present a symmetric data cost function for the likelihood and then propose a symmetric discontinuity cost function for the prior in the MRF model for stereo. In defining the cost functions, both the reference image and the target image are taken into account to improve performance without explicitly modeling half-occluded pixels. The performance improvement of stereo matching due to the proposed symmetric cost
functions is verified by applying them to a belief propagation (BP) based stereo method.
doi:10.1587/elex.8.57 fatcat:xbgmftlrnra5rllsid4k57s3ia

Calibration and Noise Identification of a Rolling Shutter Camera and a Low-Cost Inertial Measurement Unit

Chang-Ryeol Lee, Ju Yoon, Kuk-Jin Yoon
2018 Sensors  
A low-cost inertial measurement unit (IMU) and a rolling shutter camera form a conventional device configuration for localizing a mobile platform, owing to their complementary properties and low cost. This paper proposes a new calibration method that jointly estimates the calibration and noise parameters of a low-cost IMU and a rolling shutter camera for effective sensor fusion, in which accurate sensor calibration is critical. Based on gray-box system identification, the proposed
method estimates the unknown noise density so that the calibration error and its covariance can be minimized using an unscented Kalman filter. Then, we refine the estimated calibration parameters with the estimated noise density in a batch manner. Experimental results on synthetic and real data demonstrate the accuracy and stability of the proposed method and show that it provides consistent results even when the noise density of the IMU is unknown. Furthermore, a real experiment using a commercial smartphone validates the performance of the proposed calibration method on off-the-shelf devices.
doi:10.3390/s18072345 pmid:30029509 pmcid:PMC6069048 fatcat:5mxrn4ga6bfuve7232ideyj5hm

Spatiotemporal Stereo Matching with 3D Disparity Profiles

Yongho Shin, Kuk-Jin Yoon
2015 Procedings of the British Machine Vision Conference 2015  
Adaptive support weights and over-parameterized disparity estimation markedly improve the accuracy of stereo matching by enabling window-based similarity measures to handle depth discontinuities and non-fronto-parallel surfaces more effectively. Nevertheless, a disparity map sequence obtained in a frame-by-frame manner still tends to be inconsistent, even with state-of-the-art stereo matching methods. To solve this inconsistency problem, we propose a window-based spatiotemporal stereo matching method that exploits 3D disparity profiles.
doi:10.5244/c.29.152 dblp:conf/bmvc/ShinY15 fatcat:rx4p6klohrgqvjysm27m73twuu
Showing results 1-15 of 394