22,474 Hits in 6.3 sec

Volume-based Semantic Labeling with Signed Distance Functions [article]

Tommaso Cavallari, Luigi Di Stefano
2015 arXiv   pre-print
Here, we link them quite tightly by delineating a category label fusion technique that allows for embedding semantic information into the dense map created by a volume-based SLAM algorithm such as KinectFusion  ...  We validate our proposal using a publicly available semantically annotated RGB-D dataset and a) employing ground truth labels, b) corrupting such annotations with synthetic noise, c) deploying a state  ...  Similarly to KinectFusion [11], the map is represented by a Signed Distance Function [6], but, peculiarly, we also provide each voxel with a label that specifies the type of object appearing in that  ... 
arXiv:1511.04242v1 fatcat:lbdnsyhsr5fr3feanab4etvm5u

Neural 3D Scene Reconstruction with the Manhattan-world Assumption [article]

Haoyu Guo, Sida Peng, Haotong Lin, Qianqian Wang, Guofeng Zhang, Hujun Bao, Xiaowei Zhou
2022 arXiv   pre-print
Specifically, we use an MLP network to represent the signed distance function as the scene geometry.  ...  To resolve the inaccurate segmentation, we encode the semantics of 3D points with another MLP and design a novel loss that jointly optimizes the scene geometry and semantics in 3D space.  ...  Other volume rendering based methods (UNISURF, NeuS and VolSDF) perform better than NeRF as occupancy and signed distance function have better surface constraints.  ... 
arXiv:2205.02836v2 fatcat:u7fekb2kpvflhovl6vjbz3ehea

Deep semantic cross modal hashing based on graph similarity of modal-specific

Junzheng Li
2021 IEEE Access  
For image graph, we build the intra-modal similarity with Euclidean distance function. For text graph, we build the intra-modal similarity with cosine distance function.  ...  Paying attention to the specifics of each modality, we build the images' similarity with Euclidean distance function and the texts' similarity with cosine distance function.  ... 
doi:10.1109/access.2021.3093357 fatcat:uyouxawgzbhzhlrsufj4iauiuy

SemanticFusion: Joint Labeling, Tracking and Mapping [chapter]

Tommaso Cavallari, Luigi Di Stefano
2016 Lecture Notes in Computer Science  
Kick-started by deployment of the well-known KinectFusion, recent research on the task of RGBD-based dense volume reconstruction has focused on improving different shortcomings of the original algorithm  ...  Accordingly, we present an extended KinectFusion pipeline which takes into account per-pixel semantic labels gathered from the input frames.  ...  We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Tesla K40 GPU used for this research.  ... 
doi:10.1007/978-3-319-49409-8_55 fatcat:s4e6ervplzce7akyzw4t7kkx6u

Two Stream 3D Semantic Scene Completion [article]

Martin Garbade, Yueh-Tung Chen, Johann Sawatzky, Juergen Gall
2019 arXiv   pre-print
The approach voxelizes the scene and predicts for each voxel if it is occupied and, if it is occupied, the semantic class label.  ...  The approach constructs an incomplete 3D semantic tensor, which uses a compact three-channel encoding for the inferred semantic information, and uses a 3D CNN to infer the complete 3D semantic tensor.  ...  To provide a more meaningful input signal, the signed distance function is transformed into a flipped TSDF [36], where every signed distance value $d$ is converted into a distance $d_f$ which is 1 or -1  ... 
arXiv:1804.03550v4 fatcat:6bxtk7pcbfflhnirvsr5jrfhjq

Deep Multi-level Semantic Hashing for Cross-modal Retrieval

Zhenyan Ji, Weina Yao, Wei Wei, Houbing Song, Huaiyu Pi
2019 IEEE Access  
In this paper, the multi-level semantic supervision generating approach is proposed by exploring the label relevance.  ...  Most existing hashing methods are designed based on binary supervision, which transforms complex relationships of multi-label data into simple similar or dissimilar.  ...  $C^{(g)}, C^{(x)} \in \{-1, +1\}^{c \times n}$, $F^{(g)} = f(G; \phi_g)$, $F^{(x)} = f(X; \phi_x)$ (8), where $C^{(g)} = \mathrm{sign}(F^{(g)})$, $C^{(x)} = \mathrm{sign}(F^{(x)})$, and $\mathrm{sign}(\cdot)$ is a sign function defined as: $\mathrm{sign}(x) = 1$ if $x \ge 0$, and $-1$ if $x < 0$ (9) • 2  ... 
doi:10.1109/access.2019.2899536 fatcat:xynopqlgyfhe3ef6su55zqczim

A Real-Time Online Learning Framework for Joint 3D Reconstruction and Semantic Segmentation of Indoor Scenes [article]

Davide Menini, Suryansh Kumar, Martin R. Oswald, Erik Sandstrom, Cristian Sminchisescu, Luc Van Gool
2021 arXiv   pre-print
Given noisy depth maps, a camera trajectory, and 2D semantic labels at train time, the proposed deep neural network based approach learns to fuse the depth over frames with suitable semantic labels in  ...  This paper presents a real-time online vision framework to jointly recover an indoor scene's 3D structure and semantic label.  ...  TSDF fusion [16] is an incremental method to integrate the depth maps over frames for each location $x \in \mathbb{R}^3$ into a volume by averaging truncated signed distance functions (TSDF).  ... 
arXiv:2108.05246v2 fatcat:ux3hrwh75faynkmopirgoo7hdu

ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans

Angela Dai, Daniel Ritchie, Martin Bokeloh, Scott Reed, Jürgen Sturm, Matthias Nießner
2018 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition  
We propose a novel data-driven approach based on fully-convolutional neural networks that transforms incomplete signed distance functions (SDFs) into complete meshes at unprecedented spatial extents (middle  ...  We introduce ScanComplete, a novel data-driven approach for taking an incomplete 3D scan of a scene as input and predicting a complete 3D model along with per-voxel semantic labels.  ...  We would also like to thank Shuran Song for helping with the SSCNet comparison.  ... 
doi:10.1109/cvpr.2018.00481 dblp:conf/cvpr/DaiRBRSN18 fatcat:is4fexgq55gl3nyuqchg5o2m3a

Deep Attention-Guided Hashing

Zhan Yang, Osolo Ian Raymond, Wuqing Sun, Jun Long
2019 IEEE Access  
The loss function we propose contains two components: the semantic loss and the attention loss.  ...  With the rapid growth of multimedia data (e.g., image, audio and video etc.) on the web, learning-based hashing techniques such as Deep Supervised Hashing (DSH) have proven to be very efficient for large-scale  ...  [55] proposed a justifiable approach based on the continuation of the tanh function, which approaches the sign function with the scale parameter β in its limit: $\lim_{\beta \to \infty} \tanh(\beta x) = \mathrm{sign}(x)$, they prove  ... 
doi:10.1109/access.2019.2891894 fatcat:edzv4jcp5zarzkv4qmltalcgje

SEMANTIC LABELLING OF ROAD FURNITURE IN MOBILE LASER SCANNING DATA

F. Li, S. Oude Elberink, G. Vosselman
2017 The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences  
Road furniture semantic labelling is vital for large scale mapping and autonomous driving systems.  ...  In this paper, a novel method is proposed to interpret road furniture based on their logical relations and functionalities.  ...  However, there is little attention on interpreting road furniture at a functional component level, namely semantically labelling of road furniture based on their functionalities.  ... 
doi:10.5194/isprs-archives-xlii-2-w7-247-2017 fatcat:ldlkgfvu45eebf3zwwsjxussg4

imGHUM: Implicit Generative Models of 3D Human Shape and Articulated Pose [article]

Thiemo Alldieck, Hongyi Xu, Cristian Sminchisescu
2021 arXiv   pre-print
We present imGHUM, the first holistic generative model of 3D human shape and articulated pose, represented as a signed distance function.  ...  We propose a novel network architecture and a learning paradigm, which make it possible to learn a detailed implicit generative model of human pose, shape, and semantics, on par with state-of-the-art mesh-based  ...  equation $\|\nabla_p S(p, \alpha; \omega)\| = 1$ (1), where S is a signed distance function that vanishes at the surface Y with gradients equal to surface normals.  ... 
arXiv:2108.10842v1 fatcat:4aoggdkt5ba75fbzlg7ji7h7au

Urban 3D semantic modelling using stereo vision

Sunando Sengupta, Eric Greveson, Ali Shahrokni, Philip H. S. Torr
2013 2013 IEEE International Conference on Robotics and Automation  
In this paper we propose a robust algorithm that generates an efficient and accurate dense 3D reconstruction with associated semantic labellings.  ...  The street level images are automatically labelled using a Conditional Random Field (CRF) framework exploiting stereo images, and label estimates are aggregated to annotate the 3D volume.  ...  A signed distance function corresponds to the distance to the closest surface interface (zero crossing), with positive values corresponding to free space, and negative values corresponding to points behind  ... 
doi:10.1109/icra.2013.6630632 dblp:conf/icra/SenguptaGST13 fatcat:fnkmujibxbhkfa2qavaoddmble

BrainGazer - Visual Queries for Neurobiology Research

S. Bruckner, V. Solteszova, M.E. Groller, J. Hladuvka, K. Buhler, J.Y. Yu, B.J. Dickson
2009 IEEE Transactions on Visualization and Computer Graphics  
We focus on the ability to visually query the data based on semantic as well as spatial relationships.  ...  We have designed and implemented BrainGazer, a system which integrates visualization techniques for volume data acquired through confocal microscopy as well as annotated anatomical structures with an intuitive  ...  We use signed distance volumes generated for all objects in the database.  ... 
doi:10.1109/tvcg.2009.121 pmid:19834226 fatcat:pusgaju775ftjjmtbto2duzsui

ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans [article]

Angela Dai, Daniel Ritchie, Martin Bokeloh, Scott Reed, Jürgen Sturm, Matthias Nießner
2018 arXiv   pre-print
We introduce ScanComplete, a novel data-driven approach for taking an incomplete 3D scan of a scene as input and predicting a complete 3D model along with per-voxel semantic labels.  ...  Our results show that we outperform other methods not only in the size of the environments handled and processing efficiency, but also with regard to completion quality and semantic segmentation performance  ...  We would also like to thank Shuran Song for helping with the SSCNet comparison.  ... 
arXiv:1712.10215v2 fatcat:6mfauuwj5rathpmobktp2htubi

PyTorch Connectomics: A Scalable and Flexible Segmentation Framework for EM Connectomics [article]

Zudi Lin, Donglai Wei, Jeff Lichtman, Hanspeter Pfister
2021 arXiv   pre-print
We present PyTorch Connectomics (PyTC), an open-source deep-learning framework for the semantic and instance segmentation of volumetric microscopy images, built upon PyTorch.  ...  Those functionalities can be easily realized in PyTC by changing the configuration options without coding and adapted to other 2D and 3D segmentation tasks for different tissues and imaging modalities.  ...  Thus the loader samples fewer data points from volumes with sparse labels by expectation.  ... 
arXiv:2112.05754v1 fatcat:jjdatdvbr5eypmxmqbvujmarc4
Showing results 1-15 out of 22,474 results