Learned Spectral Super-Resolution
[article]
2017
arXiv
pre-print
We describe a novel method for blind, single-image spectral super-resolution. While conventional super-resolution aims to increase the spatial resolution of an input image, our goal is to spectrally enhance the input, i.e., generate an image with the same spatial resolution, but a greatly increased number of narrow (hyper-spectral) wavelength bands. Just as the spatial statistics of natural images have rich structure, which one can exploit as a prior to predict high-frequency content from a low
arXiv:1703.09470v1
fatcat:yf4kpi5pknesxdjlfuq7bw6ca4
... resolution image, the same is also true in the spectral domain: the materials and lighting conditions of the observed world induce structure in the spectrum of wavelengths observed at a given pixel. Surprisingly, very little work exists that attempts to use this diagnosis and achieve blind spectral super-resolution from single images. We start from the conjecture that, just like in the spatial domain, we can learn the statistics of natural image spectra, and with its help generate finely resolved hyper-spectral images from RGB input. Technically, we follow the current best practice and implement a convolutional neural network (CNN), which is trained to carry out the end-to-end mapping from an entire RGB image to the corresponding hyperspectral image of equal size. We demonstrate spectral super-resolution both for conventional RGB images and for multi-spectral satellite data, outperforming the state-of-the-art.
PatchmatchNet: Learned Multi-View Patchmatch Stereo
[article]
2020
arXiv
pre-print
Galliani et al. [16] present Gipuma, a massively parallel multi-view extension of Patchmatch stereo. It uses a red-black checkerboard pattern to parallelize message-passing during propagation. ...
arXiv:2012.01411v1
fatcat:kzbuchbjw5hkvmd5slzpwvqt4a
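The red-black checkerboard scheme mentioned in the snippet above can be illustrated with a toy NumPy sketch. It is not Gipuma's implementation: it propagates a single scalar hypothesis per pixel over 4-neighbours only, and the names `checkerboard_masks`, `shift`, `propagate_half_sweep` and `cost_fn` are hypothetical. The point it tries to show is that pixels of one colour read only from pixels of the other colour, so a whole colour can be updated at once.

```python
import numpy as np

def checkerboard_masks(height, width):
    """Return boolean (red, black) masks; no two 4-neighbours share a colour."""
    rows, cols = np.indices((height, width))
    red = (rows + cols) % 2 == 0
    return red, ~red

def shift(arr, dy, dx, fill):
    """Return out with out[y, x] = arr[y + dy, x + dx], and `fill` outside the image."""
    padded = np.pad(arr, 1, mode="constant", constant_values=fill)
    h, w = arr.shape
    return padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]

def propagate_half_sweep(hypothesis, cost_fn, active):
    """Update all pixels in `active` at once from their 4-neighbours.

    `cost_fn` (an assumed interface) maps an array of candidate hypotheses
    to the per-pixel cost of adopting each candidate at that pixel.
    """
    current = hypothesis.copy()
    current_cost = cost_fn(current)
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        # Candidates come from the unmodified input, i.e. from the other colour.
        candidate = shift(hypothesis, dy, dx, fill=np.nan)
        candidate_cost = cost_fn(candidate)
        # NaN costs (outside the image) never compare as better.
        better = active & (candidate_cost < current_cost)
        current = np.where(better, candidate, current)
        current_cost = np.where(better, candidate_cost, current_cost)
    return current
```

A full iteration would call `propagate_half_sweep` once with the red mask and once with the black mask. Because the four neighbours of any active pixel all belong to the inactive colour, the per-pixel updates inside one half-sweep are independent of each other.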
Learned Multi-Patch Similarity
[article]
2017
arXiv
pre-print
Estimating a depth map from multiple views of a scene is a fundamental task in computer vision. As soon as more than two viewpoints are available, one faces the very basic question of how to measure similarity across >2 image patches. Surprisingly, no direct solution exists; instead, it is common to fall back to more or less robust averaging of two-view similarities. Encouraged by the success of machine learning, and in particular convolutional neural networks, we propose to learn a matching
arXiv:1703.08836v2
fatcat:b5nfngmdarhibfnw3eb72qagmq
... n which directly maps multiple image patches to a scalar similarity score. Experiments on several multi-view datasets demonstrate that this approach has advantages over methods based on pairwise patch similarity.
IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo
[article]
2021
arXiv
pre-print
In
[10] Silvano Galliani, Katrin Lasinger, and Konrad Schindler. ...
In CVPR, 2017. 2
[39] Fangjinhua Wang, Silvano Galliani, Christoph Vogel, Pablo Speciale, and Marc Pollefeys. Patchmatchnet: Learned multi-view patchmatch stereo. ...
arXiv:2112.05126v1
fatcat:hwlfy6g5xvbzbkww5e2zihxthm
The ball in the hole
2007
Proceedings of the 15th international conference on Multimedia - MULTIMEDIA '07
-all the equipment belonging to Bek was marked with a little tag. Thanks for your care and attention :) With Regards, Eleonora Oreggia & Silvano Galliani, Bergen, October 2006. In case of problems, you can ...
contact us by email: silvano
kysucix@dyne.org
eleonora
eleonora@dyne.org
or by mobile:
eleonora
0031 618 473 162
silvano
0039 349 4141582 ...
doi:10.1145/1291233.1291317
dblp:conf/mm/OreggiaG07
fatcat:l42n374uabd25kmxzi5ym4plie
Generalised Perspective Shape from Shading in Spherical Coordinates
[chapter]
2013
Lecture Notes in Computer Science
Moreover, Silvano Galliani gratefully acknowledges funding by the Fraunhofer Institute for Industrial Mathematics (ITWM). ...
doi:10.1007/978-3-642-38267-3_19
fatcat:5iqmv2h6tbfrha7rtl2xvmqb6u
Massively Parallel Multiview Stereopsis by Surface Normal Diffusion
2015
2015 IEEE International Conference on Computer Vision (ICCV)
We present a new, massively parallel method for high-quality multiview matching. Our work builds on the Patchmatch idea: starting from randomly generated 3D planes in scene space, the best-fitting planes are iteratively propagated and refined to obtain a 3D depth and normal field per view, such that a robust photo-consistency measure over all images is maximized. Our main novelties are on the one hand to formulate Patchmatch in scene space, which makes it possible to aggregate image similarity
doi:10.1109/iccv.2015.106
dblp:conf/iccv/GallianiLS15
fatcat:ut5e7jomorervnewnjguuwqoqe
... ross multiple views and obtain more accurate depth maps, and on the other hand a modified, diffusion-like propagation scheme that can be massively parallelized and delivers dense multiview correspondence over ten 1.9-Megapixel images in 3 seconds on a consumer-grade GPU. Our method uses a slanted support window and thus has no fronto-parallel bias; it is completely local and parallel, such that computation time scales linearly with image size and inversely with the number of parallel threads. Furthermore, it has a low memory footprint (four values per pixel, independent of the depth range). It therefore scales exceptionally well and can handle multiple large images at high depth resolution. Experiments on the DTU and Middlebury multiview datasets as well as oblique aerial images show that our method achieves very competitive results with high accuracy and completeness, across a range of different scenarios.
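As a rough illustration of the slanted support window and the four-values-per-pixel footprint described above, the sketch below stores one plane per pixel as a unit normal plus a depth and evaluates the depth that plane induces at a nearby pixel. This is standard pinhole-camera geometry rather than code from the paper; the intrinsics matrix `K`, the example values and the function name `plane_induced_depth` are assumptions made for the example.

```python
import numpy as np

def plane_induced_depth(K, normal, depth0, pixel0, pixel):
    """Depth at `pixel` of the plane with `normal` that passes through `pixel0` at `depth0`.

    Pixels are (u, v) coordinates; depth is measured along the camera z-axis.
    """
    K_inv = np.linalg.inv(K)
    ray0 = K_inv @ np.array([pixel0[0], pixel0[1], 1.0])  # viewing ray with z = 1
    ray = K_inv @ np.array([pixel[0], pixel[1], 1.0])
    point0 = depth0 * ray0                                 # 3D point on the plane
    # Plane equation n . X = n . X0 with X = depth * ray, solved for depth.
    return float(normal @ point0) / float(normal @ ray)

# Hypothetical intrinsics: focal length 100 px, principal point (50, 50).
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
fronto = np.array([0.0, 0.0, 1.0])
slanted = np.array([1.0, 0.0, 1.0]) / np.sqrt(2.0)

depth_fronto = plane_induced_depth(K, fronto, 2.0, (50, 50), (60, 50))
depth_slanted = plane_induced_depth(K, slanted, 2.0, (50, 50), (60, 50))
```

With the fronto-parallel normal the depth stays constant across the window, whereas the slanted normal lets it vary from pixel to pixel, which is what removes the fronto-parallel bias when patches are compared.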
Fast and Robust Surface Normal Integration by a Discrete Eikonal Equation
2012
Proceedings of the British Machine Vision Conference 2012
The integration of surface normals is a classic and fundamental task in computer vision. In this paper we deal with a highly efficient fast marching (FM) method to perform the integration. In doing this we build upon a previous work of Ho and his coauthors. Their FM scheme is based on an analytic model that incorporates the eikonal equation. Our method is also built upon this equation, but it makes use of a complete discrete formulation for constructing the FM integrator (DEFM). We not only
doi:10.5244/c.26.106
dblp:conf/bmvc/GallianiBJ12
fatcat:6ettptpo4zcrzp3n3ygczo22tu
... ide a theoretical justification of the proposed method, but also illustrate by means of a simple example that our approach is much better suited to the task. Several more sophisticated tests confirm the robustness and higher accuracy of the DEFM model. Moreover, we present an extension of DEFM that allows surface normals to be integrated over non-trivial domains, e.g. domains featuring holes. Numerical results confirm desirable qualities of this method.
Classification With an Edge: Improving Semantic Image Segmentation with Boundary Detection
[article]
2017
arXiv
pre-print
We present an end-to-end trainable deep convolutional neural network (DCNN) for semantic segmentation with built-in awareness of semantically meaningful boundaries. Semantic segmentation is a fundamental remote sensing task, and most state-of-the-art methods rely on DCNNs as their workhorse. A major reason for their success is that deep networks learn to accumulate contextual information over very large windows (receptive fields). However, this success comes at a cost, since the associated loss
arXiv:1612.01337v2
fatcat:5s6rdkquszdplbgas3z75evmh4
... of effective spatial resolution washes out high-frequency details and leads to blurry object boundaries. Here, we propose to counter this effect by combining semantic segmentation with semantically informed edge detection, thus making class boundaries explicit in the model. First, we construct a comparatively simple, memory-efficient model by adding boundary detection to the Segnet encoder-decoder architecture. Second, we also include boundary detection in FCN-type models and set up a high-end classifier ensemble. We show that boundary detection significantly improves semantic segmentation with CNNs. Our high-end ensemble achieves > 90% overall accuracy on the ISPRS Vaihingen benchmark.
HoloLens 2 Research Mode as a Tool for Computer Vision Research
[article]
2020
arXiv
pre-print
Mixed reality headsets, such as the Microsoft HoloLens 2, are powerful sensing devices with integrated compute capabilities, which makes them an ideal platform for computer vision research. In this technical report, we present HoloLens 2 Research Mode, an API and a set of tools enabling access to the raw sensor streams. We provide an overview of the API and explain how it can be used to build mixed reality applications based on processing sensor data. We also show how to combine the Research Mode
arXiv:2008.11239v1
fatcat:vh6f3pitovbd7phiyqvnoqs7uu
... sensor data with the built-in eye and hand tracking capabilities provided by HoloLens 2. By releasing the Research Mode API and a set of open-source tools, we aim to foster further research in the fields of computer vision as well as robotics and encourage contributions from the research community.
Shape from Shading for Rough Surfaces: Analysis of the Oren-Nayar Model
2012
Proceedings of the British Machine Vision Conference 2012
Moreover, Silvano Galliani gratefully acknowledges funding by the Fraunhofer Institute for Industrial Mathematics (ITWM). ...
doi:10.5244/c.26.104
dblp:conf/bmvc/JuBBG12
fatcat:fflhkaac3nfxlcj2qsv5e3xuvm
Just Look at the Image: Viewpoint-Specific Surface Normal Prediction for Improved Multi-View Reconstruction
2016
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
We present a multi-view reconstruction method that combines conventional multi-view stereo (MVS) with appearance-based normal prediction, to obtain dense and accurate 3D surface models. Reliable surface normals reconstructed from multi-view correspondence serve as training data for a convolutional neural network (CNN), which predicts continuous normal vectors from raw image patches. By training from known points in the same image, the prediction is specifically tailored to the materials and
doi:10.1109/cvpr.2016.591
dblp:conf/cvpr/GallianiS16
fatcat:6ba7locv45gx5mz6wrhlpgmo3e
... ting conditions of the particular scene, as well as to the precise camera viewpoint. It is therefore a lot easier to learn than generic single-view normal estimation. The estimated normal maps, together with the known depth values from MVS, are integrated to dense depth maps, which in turn are fused into a 3D model. Experiments on the DTU dataset show that our method delivers 3D reconstructions with the same accuracy as MVS, but with significantly higher completeness.
A Multi-view Stereo Benchmark with High-Resolution Images and Multi-camera Videos
2017
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Motivated by the limitations of existing multi-view stereo benchmarks, we present a novel dataset for this task. Towards this goal, we recorded a variety of indoor and outdoor scenes using a high-precision laser scanner and captured both high-resolution DSLR imagery as well as synchronized low-resolution stereo videos with varying fields-of-view. To align the images with the laser scans, we propose a robust technique which minimizes photometric errors conditioned on the geometry. In contrast to
doi:10.1109/cvpr.2017.272
dblp:conf/cvpr/SchopsSGSSPG17
fatcat:nbc7ximws5gpxf3re2nkcolfla
... revious datasets, our benchmark provides novel challenges and covers a diverse set of viewpoints and scene types, ranging from natural scenes to man-made indoor and outdoor environments. Furthermore, we provide data at significantly higher temporal and spatial resolution. Our benchmark is the first to cover the important use case of hand-held mobile devices while also providing high-resolution DSLR camera images. We make our datasets and an online evaluation server available at http://www.eth3d.net.
DeepVideoMVS: Multi-View Stereo on Video with Recurrent Spatio-Temporal Fusion
[article]
2021
arXiv
pre-print
We propose an online multi-view depth prediction approach on posed video streams, where the scene geometry information computed in the previous time steps is propagated to the current time step in an efficient and geometrically plausible way. The backbone of our approach is a real-time capable, lightweight encoder-decoder that relies on cost volumes computed from pairs of images. We extend it by placing a ConvLSTM cell at the bottleneck layer, which compresses an arbitrary amount of past
arXiv:2012.02177v3
fatcat:pezh4aalffd3fj3wxxkswj752u
... tion in its states. The novelty lies in propagating the hidden state of the cell by accounting for the viewpoint changes between time steps. At a given time step, we warp the previous hidden state into the current camera plane using the previous depth prediction. Our extension brings only a small overhead in computation time and memory consumption, while improving the depth predictions significantly. As a result, we outperform the existing state-of-the-art multi-view stereo methods on most of the evaluated metrics in hundreds of indoor scenes while maintaining real-time performance. Code is available at https://github.com/ardaduz/deep-video-mvs
Inference, Learning and Attention Mechanisms that Exploit and Preserve Sparsity in Convolutional Networks
[article]
2018
arXiv
pre-print
While CNNs naturally lend themselves to densely sampled data, and sophisticated implementations are available, they lack the ability to efficiently process sparse data. In this work we introduce a suite of tools that exploit sparsity in both the feature maps and the filter weights, and thereby allow for significantly lower memory footprints and computation times than the conventional dense framework when processing data with a high degree of sparsity. Our scheme provides (i) an efficient GPU
arXiv:1801.10585v2
fatcat:2q6ahn53bjbmxk7vxynqn4k65e
... lementation of a convolution layer based on direct, sparse convolution; (ii) a filter step within the convolution layer, which we call attention, that prevents fill-in, i.e., the tendency of convolution to rapidly decrease sparsity, and guarantees an upper bound on the computational resources; and (iii) an adaptation of the back-propagation algorithm, which makes it possible to combine our approach with standard learning frameworks, while still exploiting sparsity in the data and the model.