Strategies for Training Deep Learning Models in Medical Domains with Small Reference Datasets

Gerald A. Zwettler, David R. Holmes III, Werner Backfrieder
2020 Journal of WSCG  
With the steady progress of Deep Learning (DL), powerful tools are now available for sophisticated segmentation tasks. Nevertheless, the generally very high demand for training data and precise reference segmentations often cannot be met in medical domains when processing small, individual studies or acquisition protocols. Common strategies such as reinforcement learning or transfer learning are applicable, but they involve immense effort due to domain-specific adjustment. In this work the ... ility of a U-net cascade for training on a very small number of abdominal MRI datasets of the parenchyma is evaluated, and strategies to compensate for the lack of training data are discussed. Although the model accuracy when training on 13 MRI volumes is rather low, with an achievable Jaccard index of JI=89.41, the results are still good enough for manual post-processing with a Graph cut (GC) approach that requires a medium amount of user interaction. In this way, the DL models can be retrained when additional test datasets become available, subsequently improving the classification accuracy. With only 2 additional GC post-processed datasets, the accuracy after model retraining increases to JI=89.87. In addition, the applicability of Generative Adversarial Networks (GANs) in the medical domain is evaluated for synthesizing axial CT slices together with perfect ground-truth reference segmentations. For abdominal CT slices of the parenchyma, it is shown that synthesized slices, which can be generated in arbitrary numbers, significantly improve the DL training process when only an insufficient amount of real data is available. While training on 2,200 real images alone yields an accuracy of only JI=88.75, enrichment with 2,200 additional images synthesized by a GAN trained on only 5,000 datasets increases the accuracy to JI=92.02. Even if the DL model is trained exclusively on 4,400 computer-generated images, the classification accuracy on real-world data is notable, with JI=90.81.
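All accuracy figures in the abstract are reported as JI, the Jaccard index, i.e. the overlap |A ∩ B| / |A ∪ B| between a predicted segmentation mask A and its reference mask B. The following is a minimal sketch of that metric, not the authors' evaluation code. It assumes binary 2D masks given as nested lists and a 0 to 100 scale, which the reported values suggest; the function name `jaccard_index` and the toy masks are hypothetical.

```python
from itertools import chain


def jaccard_index(pred, ref):
    """Jaccard index |A ∩ B| / |A ∪ B| of two binary 2D masks, in percent.

    Assumption: masks are nested lists (rows of 0/1 values) of equal size.
    """
    # Flatten both masks and interpret every pixel as foreground/background.
    p = [bool(v) for v in chain.from_iterable(pred)]
    r = [bool(v) for v in chain.from_iterable(ref)]
    if len(p) != len(r):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(a and b for a, b in zip(p, r))
    union = sum(a or b for a, b in zip(p, r))

    # Assumption: two empty masks are treated as perfect agreement.
    if union == 0:
        return 100.0
    return 100.0 * intersection / union


# Toy example (hypothetical data): 4 shared foreground pixels, 6 in the union.
pred_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
ref_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]

score = jaccard_index(pred_mask, ref_mask)
print(f"JI = {score:.2f}")
```

On volumetric data the same ratio would be accumulated over all voxels of a study; whether the paper averages per slice or per volume is not stated in the abstract.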
doi:10.24132/jwscg.2020.28.5 fatcat:jle43tluxzflri5wnlgt54yska