Image Features Based on Characteristic Curves and Local Binary Patterns for Automated HER2 Scoring

Ramakrishnan Mukundan
2018 Journal of Imaging  
This paper presents novel feature descriptors and classification algorithms for automated scoring of HER2 in Whole Slide Images (WSI) of breast cancer histology slides. Since a large amount of processing is involved in analyzing WSI images, the primary design goal has been to keep the computational complexity to the minimum possible level and to use simple, yet robust feature descriptors that can provide accurate classification of the slides. We propose two types of feature descriptors that encode important information about staining patterns and the percentage of staining present in ImmunoHistoChemistry (IHC) stained slides. The first descriptor is called a characteristic curve which is a smooth non-increasing curve that represents the variation of percentage of staining with saturation levels. The second new descriptor introduced in this paper is an LBP feature curve which is also a non-increasing smooth curve that represents the local texture of the staining patterns. Both descriptors show excellent interclass variance and intraclass correlation, and are suitable for the design of automatic HER2 classification algorithms. This paper gives the detailed theoretical aspects of the feature descriptors and also provides experimental results and comparative analysis.

Keywords: medical image classification; local binary patterns; characteristic curves; whole slide image processing; automated HER2 scoring

IHC stained slides are normally observed under a microscope by pathologists to determine the level of over-expression of Human Epidermal Growth factor Receptor 2 (HER2) protein in cancer cells. The tissue sample is then assigned a HER2 score of 0 to 3+ representing the grade of cancer present in the sample [2]. Manual grading and annotations of breast cancer slides are time consuming, and there are huge maintenance costs associated with collecting, archiving, and transporting tissue specimens. It is also well documented that manual grading can have significant variability in pathologist assessments due to the subjective process of determining the intensity and uniformity of staining in the presence of variable staining patterns and heterogeneity of tumor grade [3]. Automated methods can also suffer from errors due to inaccuracies in the training algorithm and its inability to segment faint and complex tissue structures [4].
In the rapidly growing field of digital pathology, several Whole Slide Image (WSI) processing algorithms are currently being developed as diagnostic tools to help pathologists in the assessment of disease patterns [5]. WSIs have a pyramidal structure to enable optimized viewing across multiple magnification levels, and they provide a high resolution overview of the entire slide [5,6]. Typically at 40x magnification, the images have a resolution of approximately 0.25 microns per pixel. At this resolution, a slide region of size 15 mm x 15 mm could correspond to 60,000 x 60,000 pixels. WSIs were originally used as a computer aided digital microscopy tool, where pathologists could view different parts of a sample at different magnifications to improve the accuracy of their scores [3].

Recently, an online contest was organized by the University of Warwick in conjunction with the UK/Ireland Pathology Society annual meeting 2016, with the aim of advancing research in the field of automated HER2 scoring algorithms [9]. This contest was the primary motivation for our research work presented in this paper. Our algorithm (registered with team name UC-CSSE-CGIP) performed exceedingly well in the contest, obtaining the second best points score of 390 out of 420 and the overall seventh position in the combined leader board [10]. The teams that were on the top of the leader board, including our team, were invited to submit a very brief (one paragraph) summary of the algorithms used for inclusion in a journal paper prepared by the contest organizers [11].

WSIs contain voluminous amounts of data. One of the primary design goals has been to keep the computational complexity to the minimum possible level and to develop an efficient method that can process relevant tiles of an input WSI image quickly and classify the image into one of the four classes corresponding to the four HER2 scores.
The second design goal was to have a feature set whose correlation to the percentage of membrane staining in the given sample could be easily visualized and interpreted by pathologists. The third design goal was to reduce the amount of information redundancy in the feature set by extracting a minimal set of characteristic features that [...]

[...] specifying allowable variations in the hue value, and similarly [v1, v2] denote value thresholds. Since we specify only the lower bound for saturation, progressively increasing s_low, typically from 0.1 to 0.5, produces a non-increasing characteristic curve (Figure 3).

The base components of the stain colour [h, s, v] are computed using the training set where the given percentage of staining is above 80%. While computing the percentage of staining for the test (or cross-validation) sets, it is important to eliminate not only the background region but also other segments that are not part of the membrane region such as connective tissues, lobules and nuclei.

[...] 30% mark showing a strong and complete membrane staining. As seen in Figure 4, the curve passes through a much wider range of values of percentage staining when the score is 2+.

Figure 4. Variations in the shapes of the characteristic curves with different levels of staining.
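
The relationship between the saturation lower bound and the percentage of staining can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `characteristic_curve`, the parameters `hue_range`, `value_range` and `tissue_mask`, and the assumption that the tile is already an HSV array scaled to [0, 1] are all hypothetical choices made for illustration. The sketch counts, for each lower saturation bound between 0.1 and 0.5, the pixels whose hue and value fall inside fixed threshold intervals, and expresses that count as a percentage of the pixels retained by the mask.

```python
import numpy as np

def characteristic_curve(hsv, hue_range, value_range, s_lows=None, tissue_mask=None):
    """Percentage of stained pixels for each lower saturation bound.

    hsv         : float array of shape (H, W, 3), channels h, s, v in [0, 1] (assumed layout)
    hue_range   : (h1, h2) hue thresholds (hypothetical values; no hue wrap-around handled)
    value_range : (v1, v2) value thresholds (hypothetical values)
    s_lows      : increasing lower bounds for saturation; defaults to 0.1 .. 0.5
    tissue_mask : optional boolean mask of pixels to keep (assumed to exclude
                  background and non-membrane segments)
    """
    if s_lows is None:
        s_lows = np.linspace(0.1, 0.5, 9)
    s_lows = np.asarray(s_lows, dtype=float)

    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    if tissue_mask is None:
        tissue_mask = np.ones(h.shape, dtype=bool)

    total = np.count_nonzero(tissue_mask)
    if total == 0:
        return s_lows, np.zeros(len(s_lows))

    # Pixels that satisfy the fixed hue and value thresholds.
    candidate = (
        tissue_mask
        & (h >= hue_range[0]) & (h <= hue_range[1])
        & (v >= value_range[0]) & (v <= value_range[1])
    )

    # Only the lower saturation bound varies; raising it can only remove pixels,
    # so the resulting sequence of percentages is non-increasing.
    percentages = np.array(
        [100.0 * np.count_nonzero(candidate & (s >= s_low)) / total for s_low in s_lows]
    )
    return s_lows, percentages


# Usage on a synthetic tile (random data, not a real IHC image).
rng = np.random.default_rng(0)
tile = rng.random((64, 64, 3))
s_lows, curve = characteristic_curve(tile, hue_range=(0.0, 0.2), value_range=(0.2, 0.9))
```

Because every pixel that passes a higher saturation bound also passes every lower one, the curve cannot increase as s_low grows, which matches the non-increasing behaviour described above. The threshold values used in the usage lines are placeholders, not the stain colour components estimated in the paper.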
doi:10.3390/jimaging4020035 fatcat:a3coihowy5gcbgdfxyfh5t3q2i