Multiscale matters for part segmentation of instruments in robotic surgery

Wenhao He, Haitao Song, Yue Guo, Guibin Bian, Yuejie Sun, Xiaowei Zhou, Xiaonan Wang
2020 IET Image Processing  
A challenging aspect of instrument segmentation in robotic surgery is distinguishing different parts of the same instrument. Parts with similar textures are common in practical instruments and are difficult to tell apart. In this work, the authors introduce an end-to-end recurrent model that comprises a multiscale semantic segmentation network and a refinement model. Specifically, the semantic segmentation network uniformly transforms the input images at multiple scales into a semantic mask, and the refinement model is a single-scale net that recurrently optimises this semantic mask. Through extensive experiments, the authors validate that models with multiscale inputs perform better than those that fuse encoded feature maps and those that use spatial attention. Furthermore, the authors verify the effectiveness of the proposed model with state-of-the-art performance on several robotic instrument datasets derived from the MICCAI Endoscopic Vision Challenges.

Fig. 1 Number of parameters versus mean Dice coefficient: every circle represents the performance of a method, and our model outperforms others by a large margin with the fewest parameters. '15', '17', and '18' denote the datasets from Endovis15, Endovis17, and Endovis18, respectively.

IET Image Process., 2020, Vol. 14 Iss. 13, pp. 3215-3222 © The Institution of Engineering and Technology 2020

Fig. 4 Segmentation masks of baselines and our method: the three columns correspond to results on Endovis15, Endovis17, and Endovis18, respectively, and every colour corresponds to a specific semantic class. Most of the baselines can segment instruments accurately, but in some surgical contexts our method outperforms the baselines by a large margin.
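
Fig. 1 reports results as a mean Dice coefficient. As a minimal sketch of that metric for part segmentation, the Python below computes the standard Dice score, 2 x |P and T| / (|P| + |T|), for each part label and averages over labels. The function names, the toy masks, and the choice to average over classes and to skip labels absent from both masks are assumptions made for illustration; the record above does not state the exact averaging protocol the authors used.

```python
def dice_coefficient(pred, target, label):
    """Dice score for one class label: 2 * |P and T| / (|P| + |T|).

    pred and target are equally sized 2-D lists of integer class labels.
    Returns None when the label appears in neither mask (assumed convention).
    """
    intersection = pred_count = target_count = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            pred_count += (p == label)
            target_count += (t == label)
            intersection += (p == label and t == label)
    if pred_count + target_count == 0:
        return None
    return 2.0 * intersection / (pred_count + target_count)


def mean_dice(pred, target, labels):
    """Average the per-label Dice scores, skipping labels absent from both masks."""
    scores = [dice_coefficient(pred, target, c) for c in labels]
    scores = [s for s in scores if s is not None]
    return sum(scores) / len(scores) if scores else None


# Toy 2x3 masks: 0 = background, 1 and 2 = two hypothetical instrument parts.
pred = [[0, 1, 1],
        [0, 2, 2]]
target = [[0, 1, 2],
          [0, 2, 2]]

print(dice_coefficient(pred, target, 1))  # part 1: 2*1 / (2+1)
print(dice_coefficient(pred, target, 2))  # part 2: 2*2 / (2+3)
print(mean_dice(pred, target, [1, 2]))
```

Averaging per part rather than over all foreground pixels means a confusion between two similarly textured parts lowers the score for both, which is the failure mode the abstract highlights.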
doi:10.1049/iet-ipr.2020.0320 fatcat:oir6g37vvvawnbeelto2tamcla