Detection of Breast Region of Interest via Breast MR Scan on an Axial Slice

Breast cancer is one of the most common cancer types, especially among women, and the number of breast cancer patients increases every year. Detecting breast cancer at an early stage is therefore increasingly important. Breast region detection is the first step of breast cancer diagnosis research performed with image processing techniques, and the performance of computer-aided breast cancer diagnosis systems can be improved by accurately determining the breast region of interest. In this study, the goal is to determine a region of interest in breast MR images in which one or more lesions can appear. The resulting region includes both breasts and the lymph nodes. The proposed region of interest detection system is fully automatic and utilizes several image processing techniques. First, the local adaptive thresholding technique is applied to noise-filtered grey-level breast magnetic resonance images obtained with ethical permission from Sakarya Education and Research Hospital. After adaptive thresholding, connected component analysis is performed to exclude extra structures around the breast region, such as the thorax area. This analysis selects the largest area in the binary image, which corresponds to a gyrate region including the breast area and the lymph nodes over the backbone. Then, the integral of the horizontal projection is calculated to determine an optimum horizontal line that sets the region of interest apart. In the following step, the sternum midpoint is detected to separate the right breast from the left one. Finally, a masking operation is applied to obtain the corresponding right and left breast regions in the original MR image. To evaluate the performance of the proposed study, the results of the automatic region of interest detection system are compared with the manual region of interest selection performed by an expert radiologist. The Dice similarity coefficient and the Jaccard coefficient are used as performance criteria. According to the results, the proposed system can accurately detect the region of interest for computer-aided breast cancer diagnosis research.
This is an open access article under the CC BY-SA 4.0 license. (https://creativecommons.org/licenses/by-sa/4.0/)


Introduction
Today, breast cancer is the most common form of cancer among women and affects 2.1 million women every year. According to statistical data from the World Health Organization, approximately 627,000 women die from breast cancer, which constitutes 15% of cancer deaths [1]. In Turkey, one in four cancers diagnosed among women is breast cancer [2]. As with many types of cancer, early diagnosis of breast cancer saves lives.
In recent decades, computer-aided diagnosis (CAD) systems have been developed to help detect cancerous lesions and to increase the accuracy of diagnosis. These systems analyse the data (e.g. MR images and ultrasound scans) through improved deterministic algorithms. Studies on breast cancer diagnosis can roughly be divided into two groups. In the first group, which can be called the segmentation group, researchers aim to determine the region of the lesions in the breast. The goal of the second group, called the lesion analysis group, is to classify the lesions as benign or malignant. For both groups, the first step is to obtain the breast region that may include breast lesions. This process can be regarded as the breast region of interest detection step.
Determining the region of interest that may include the lesion area is a crucial step in breast cancer diagnosis with imaging modalities such as ultrasound (US), mammography (MG), and magnetic resonance imaging (MRI) [3]. In recent years, researchers working in biomedical image processing, radiology, and cancer diagnosis have taken an interest in region of interest (ROI) detection to enhance the success of medical treatments. Several studies aim to extract the ROI from medical images. In [4], Renukalatha and Suresh proposed an automatic ROI extraction algorithm to determine the important regions of noisy medical images acquired with different imaging modalities. They applied image denoising techniques and then used statistical moments calculated from a histogram decomposition technique to estimate an optimal threshold value. The ROI is determined after thresholding the image. Fooladivanda et al. utilize the local adaptive thresholding (LAT) technique to overcome the intensity inhomogeneity caused by the bias field and the low contrast at the boundary between the breast and the pectoral muscle. The goal of their study is to segment breast and fibroglandular tissue in breast MR images. The dataset consists of 2520 bilateral axial breast MR images from 45 women. Manual and automatic segmentation results are compared using five quantitative metrics: Dice Similarity Coefficient (DSC), Jaccard Coefficient (JC), total overlap, False Negative (FN), and False Positive (FP). The obtained results are 0.90, 0.82, 0.89, 0.1, and 0.09, respectively [5]. In [6], the breast lesion automatic detection and diagnosis system (BLADes) is introduced to support the radiologist during breast cancer diagnosis. The performance of the system is evaluated on histopathologically proven lesions, and promising results are obtained.
In [7], the authors propose an automatic and fast segmentation of the breast ROI and density for breast MR images. They validate their study by evaluating 1350 breast images from 15 female volunteers. The pixel-based analysis shows that the performance of the method is high when compared with manually drawn ROIs. In another study, the authors evaluate the differences in peritumoral apparent diffusion coefficient values obtained with four different ROI selection methods. Twenty-two breast cancer patients are included in the study, which shows no statistically significant difference among the four ROI selection methods [8].
In [9], an automatic three-dimensional segmentation method of the whole breast for breast MRI is proposed. Ninety-nine breast MRI scans with varying imaging protocols are used for evaluation. According to the obtained results, the proposed method provides a Dice similarity coefficient, sensitivity, and specificity of 96.04%, 97.27%, and 98.77%, respectively. In [10], the authors suggest a fully automatic, geometry-based breast mask extraction method for DCE-MRI. In this study, the 2D fuzzy c-means (FCM) method is combined with a geometrical characterization of the breast anatomy. The achieved median segmentation accuracy is 97.86 (±0.49) %. In another study, cancerous areas in breast MR images are segmented by utilizing region growing and watershed segmentation techniques. The authors demonstrated the agreement of automatic and manual segmentation results over 20 breast images [11]. In [12], a k-means based segmentation algorithm is proposed to overcome the drawbacks of classical segmentation methods. The researchers carry out clustering and breast cancer segmentation in parallel. Two additional features, brightness and circularity, are taken into account during the segmentation process. The proposed system outperformed standard k-means by approximately 51.5%, 12%, and 3.5% for samples one, two, and three, respectively. However, only three breast MR samples are examined in that study. A fully automated lesion detection system that works in three dimensions is introduced in [13]. The study targets lesion detection only. Its database includes 2064 contrast-enhanced MR mammogram images from only 19 women. Breast regions in the images are extracted by a segmentation scheme based on convolutional neural networks. The performance of the automated breast segmentation is quantified using relative overlap and misclassification rate. Ninety-seven percent of the breasts are segmented properly and all the lesions are detected correctly. However, 31% of the lesions are misclassified. In [14], the authors analyse a set of 68 DCE-MRI breast tumours from 50 patients. Clustering algorithms are applied to each tumour to determine the most suspect region, and the features used to train the classifiers are then derived. The number of correctly classified instances (CCI) is used as the performance metric, and the maximum CCI percentage is 78%. The authors state that simple rules of thumb are not adequate to distinguish benign lesions from malignant lesions on their dataset. Finally, in [15], an image segmentation approach based on improved K-means and an ROI saliency map is proposed. In that study, brain and breast MR images are segmented, and the segmentation performance of the traditional K-means algorithm is improved by utilizing the ROI saliency map.
In this study, a fully automatic ROI extraction system is proposed. The proposed system includes local adaptive thresholding, connected component labeling, integral of horizontal projection, and sternum midpoint detection techniques. The two most commonly used metrics, DSC and JC, are used to evaluate the performance of the system. The remainder of the paper is organized as follows. In Section 2, the main steps of the study and the techniques used are explained. In Section 3, the obtained simulation results are discussed, and Section 4 concludes the paper.

The Proposed System and Methods
In this section, the block diagram of the designed system is introduced and the techniques used are described.

The Proposed System
The main steps of the proposed system are shown in Fig. 1. As can be seen from the figure, a system database that includes 10 benign and 40 malignant histopathologically proven breast lesions is constructed first. The breast MR images were obtained with legal permission from Sakarya Education and Research Hospital, working together with an expert radiologist. The images were acquired with a 1.5 Tesla MR device as T1-weighted dynamic gadolinium contrast and T2-weighted axial images. The slice thickness does not exceed 4 mm. The 50 subjects are female, and their ages range from 30 to 72.
Most of the malignant lesions included in the study are invasive ductal carcinomas. Lobular, papillary, apocrine, and tubular carcinomas are also present. The benign lesion types are cyst, fibrocystic, fibroadenoma, and ductal gland. This variety makes the proposed system more reliable, because several lesion types are considered.
Before the ROI detection techniques are performed, several filtering techniques are applied to the MR images to remove the motion and breathing artefacts that occur during image acquisition.

Local Adaptive Thresholding
Thresholding is the simplest method commonly used to binarize images. Thresholding methods can roughly be categorized as global methods, in which a single threshold is calculated for the whole image by statistical approaches, and local methods, which compute the thresholds locally on a window [16]. Global thresholding methods provide acceptable results for images with homogeneous intensity. For inhomogeneous images, however, these methods can lose image details.
Several formulas exist for calculating local thresholds, such as those of Niblack, Sauvola, and Bernsen [4,5]. The effectiveness of each formula depends on the images in the database. In the proposed study, local adaptive thresholding is applied with Niblack's formula, which is based on the local mean and standard deviation. The threshold value is calculated as

$$T(x, y) = m(x, y) + k \, s(x, y) \tag{1}$$

where $m(x, y)$ and $s(x, y)$ are the local mean and standard deviation in a particular window around any pixel $(x, y)$, respectively. The weight coefficient $k$ determines the effect of the standard deviation. After several experiments, the local window size and the weight coefficient were set to $w = 43$ and $k = -0.2$.
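As an illustration of Niblack's rule (local mean plus k times local standard deviation), the following is a minimal sketch rather than the implementation used in the study. The function name `niblack_threshold`, the clipping of windows at the image border, and the choice to mark pixels brighter than the threshold as foreground are assumptions that the text does not specify.

```python
from statistics import fmean, pstdev


def niblack_threshold(image, w=43, k=-0.2):
    """Binarize a 2-D grey-level image (list of rows) with Niblack's rule.

    The threshold at each pixel is mean + k * std over a w-by-w window
    centred on the pixel. Windows are clipped at the image border and
    pixels above the threshold become 1 (both choices are assumptions).
    """
    rows, cols = len(image), len(image[0])
    half = w // 2
    binary = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            # Collect every pixel of the (clipped) window around (x, y).
            window = [
                image[j][i]
                for j in range(max(0, y - half), min(rows, y + half + 1))
                for i in range(max(0, x - half), min(cols, x + half + 1))
            ]
            threshold = fmean(window) + k * pstdev(window)
            binary[y][x] = 1 if image[y][x] > threshold else 0
    return binary


# A single bright pixel on a dark background survives the thresholding.
tiny = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]
result = niblack_threshold(tiny, w=3, k=-0.2)
```

This direct version revisits every window pixel for every output pixel, so its cost grows with the square of the window size.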
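As can be seen from Equation (1), the local mean and standard deviation must be calculated for each window in order to threshold the image locally. Computing them directly, however, increases the computational complexity of the method. Therefore, an integral image is used to compute the local mean and standard deviation [17]. For an input image $g$, the integral image $F$ is defined as

$$F(x, y) = \sum_{i=0}^{x} \sum_{j=0}^{y} g(i, j) \tag{2}$$

After the integral image is calculated, the local mean $m(x, y)$ for any window size $w$ can be obtained from four values of $F$ instead of summing all the pixels in each window:

$$m(x, y) = \frac{F(x + w/2, y + w/2) + F(x - w/2, y - w/2) - F(x + w/2, y - w/2) - F(x - w/2, y + w/2)}{w^2}$$

The local standard deviation $s(x, y)$ follows from the windowed sum of squared intensities, which can likewise be read from an integral image of $g^2$:

$$s^2(x, y) = \frac{1}{w^2} \sum_{i = x - w/2}^{x + w/2} \; \sum_{j = y - w/2}^{y + w/2} g^2(i, j) - m^2(x, y)$$

This approach decreases the computational complexity of the local adaptive thresholding method. Moreover, the computational load does not depend on the window size [5].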
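The integral-image idea can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code used in the study: the helper names are hypothetical, the integral image is padded with a leading zero row and column to simplify indexing, and windows are clipped at the image border.

```python
from math import sqrt


def integral_image(g):
    """Padded integral image: F[y][x] sums g over rows < y and columns < x."""
    rows, cols = len(g), len(g[0])
    F = [[0] * (cols + 1) for _ in range(rows + 1)]
    for y in range(rows):
        row_sum = 0
        for x in range(cols):
            row_sum += g[y][x]
            F[y + 1][x + 1] = F[y][x + 1] + row_sum
    return F


def window_sum(F, top, left, bottom, right):
    """Sum over rows top..bottom-1 and columns left..right-1 from 4 lookups."""
    return F[bottom][right] - F[top][right] - F[bottom][left] + F[top][left]


def local_mean_std(g, w):
    """Local mean and standard deviation maps for a w-by-w window."""
    rows, cols = len(g), len(g[0])
    half = w // 2
    F = integral_image(g)
    F2 = integral_image([[v * v for v in row] for row in g])  # sums of squares
    mean = [[0.0] * cols for _ in range(rows)]
    std = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            top, bottom = max(0, y - half), min(rows, y + half + 1)
            left, right = max(0, x - half), min(cols, x + half + 1)
            n = (bottom - top) * (right - left)  # pixels in the clipped window
            m = window_sum(F, top, left, bottom, right) / n
            var = window_sum(F2, top, left, bottom, right) / n - m * m
            mean[y][x] = m
            std[y][x] = sqrt(max(var, 0.0))  # guard against rounding below 0
    return mean, std


g = [[1, 2], [3, 4]]
F = integral_image(g)
mean, std = local_mean_std(g, 3)
```

Each window statistic costs a fixed number of lookups, which is why the cost per pixel no longer depends on the window size.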

Connected Component Analysis
After local adaptive thresholding, connected component analysis is performed on the thresholded image. Connected component labeling is crucial for distinguishing different objects in a binary image and is a prerequisite for image analysis.
As can be seen from Fig. 4 (a), the binarized image contains some extra structures around the breast region, such as the thorax area and the patient's arm. The connected component labeling process is used to remove these extra structures from the binary image and to determine the largest area in the binary image. In this study, two-dimensional connectivity with 8 neighbours is used. For a given pixel $(x, y)$, the four pixels $(x - 1, y)$, $(x, y - 1)$, $(x + 1, y)$ and $(x, y + 1)$ are called the 4 neighbours of the pixel; these four neighbours together with the four pixels $(x - 1, y - 1)$, $(x + 1, y - 1)$, $(x - 1, y + 1)$ and $(x + 1, y + 1)$ are called the 8 neighbours of the pixel [18,19].
Connected component labeling assigns the same label to pixels that touch each other at their edges or corners. Fig. 3 gives a simple illustration of connected component labeling with 8-neighbour connectivity. In the presented study, 8-neighbour connected component labeling is performed and then the two largest connected components are selected. The resulting breast MR images are shown in Fig. 4: (a) shows the original T1-weighted post-contrast breast MR image, (b) shows the noise-filtered and thresholded binary image, and (c) shows the image after connected component labeling.
According to Fig. 4 (c), the output of the labeling is a gyrate region that includes the breasts and the fat tissue over the backbone. Breast lesions can occur in the left or right breast and around the lymph nodes. Thus, the extra region, which does not include the breast area and lymph nodes, must be separated from the image.
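The 8-neighbour labeling and the selection of the largest components can be sketched as below. This is a minimal breadth-first flood-fill sketch, assuming a binary image stored as a list of rows; the function names and the parameter `n` (how many of the largest components to keep) are hypothetical and not taken from the study.

```python
from collections import deque

# Offsets (dx, dy) of the 8 neighbours of a pixel: edges and corners.
NEIGHBOURS_8 = [(-1, -1), (0, -1), (1, -1), (-1, 0),
                (1, 0), (-1, 1), (0, 1), (1, 1)]


def label_components(binary):
    """Label 8-connected foreground regions; return (labels, sizes)."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    sizes = {}
    next_label = 0
    for y in range(rows):
        for x in range(cols):
            if binary[y][x] == 0 or labels[y][x] != 0:
                continue  # background, or already part of a component
            next_label += 1
            labels[y][x] = next_label
            queue = deque([(x, y)])
            size = 0
            while queue:
                cx, cy = queue.popleft()
                size += 1
                for dx, dy in NEIGHBOURS_8:
                    nx, ny = cx + dx, cy + dy
                    inside = 0 <= nx < cols and 0 <= ny < rows
                    if inside and binary[ny][nx] and labels[ny][nx] == 0:
                        labels[ny][nx] = next_label
                        queue.append((nx, ny))
            sizes[next_label] = size
    return labels, sizes


def keep_largest(binary, n=1):
    """Return a binary mask containing only the n largest components."""
    labels, sizes = label_components(binary)
    keep = set(sorted(sizes, key=sizes.get, reverse=True)[:n])
    return [[1 if lab in keep else 0 for lab in row] for row in labels]


binary = [[1, 0, 0, 0],
          [0, 1, 0, 1],
          [0, 0, 0, 1],
          [1, 0, 0, 1]]
labels, sizes = label_components(binary)
largest = keep_largest(binary)
```

In this toy image the two pixels on the diagonal share a label because corner contact counts under 8-connectivity; with 4-connectivity they would form two separate components.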
Here, the two bounds used are the first and last non-zero elements of the IHP vector [5]. The breast region starts at the nipple points, and the resulting target region is shown in Fig. 5 (c). After the target region including the breasts and lymph nodes is demarcated, the final separation is performed, in which the left and right breasts are disconnected by locating the sternum midpoint. This separation is shown in Fig. 6. Hence, the ROI is detected at the end of this step.
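The projection, separation, and masking steps can be sketched as follows. This sketch makes several assumptions: the horizontal projection is taken as the per-row count of foreground pixels, its integral as the running sum of that vector, and the rule that derives the optimum horizontal line and the sternum midpoint is not reproduced here, so `cut_row` and `mid_col` are hypothetical inputs supplied by the caller. All function names are illustrative.

```python
from itertools import accumulate


def horizontal_projection(mask):
    """Number of foreground pixels in each row of a binary mask."""
    return [sum(row) for row in mask]


def integral_of_projection(projection):
    """Running (cumulative) sum of the projection, a discrete integral."""
    return list(accumulate(projection))


def nonzero_bounds(vector):
    """Indices of the first and last non-zero elements, or None if all zero."""
    idx = [i for i, v in enumerate(vector) if v != 0]
    return (idx[0], idx[-1]) if idx else None


def split_and_mask(image, mask, cut_row, mid_col):
    """Keep masked pixels above cut_row and split them at column mid_col.

    cut_row and mid_col stand in for the optimum horizontal line and the
    sternum midpoint; both are taken as given. "left" and "right" refer
    to image halves, not to anatomical sides.
    """
    left, right = [], []
    for y, (img_row, mask_row) in enumerate(zip(image, mask)):
        kept = [v if m and y < cut_row else 0
                for v, m in zip(img_row, mask_row)]
        left.append([v if x < mid_col else 0 for x, v in enumerate(kept)])
        right.append([v if x >= mid_col else 0 for x, v in enumerate(kept)])
    return left, right


mask = [[0, 0, 0, 0], [1, 0, 0, 1], [1, 1, 1, 1], [1, 1, 1, 1]]
image = [[5, 5, 5, 5], [6, 6, 6, 6], [7, 7, 7, 7], [8, 8, 8, 8]]
projection = horizontal_projection(mask)
ihp = integral_of_projection(projection)
left, right = split_and_mask(image, mask, cut_row=3, mid_col=2)
```

Multiplying the original image by the binary mask, as done per pixel above, is what carries the grey-level information of the MR image into the two breast regions.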

Results and Discussions
The presented study aims to detect the ROI in breast MR images within a fully automatic framework. To achieve this goal, noise filtering, LAT, CCA, and IHP methods are applied to the images, in that order. The processed images are T1-weighted dynamic gadolinium contrast and T2-weighted axial images acquired with a 1.5 Tesla MR device. The slice thickness does not exceed 4 mm. The database contains breast MR images of 10 benign and 40 malignant lesions obtained from 50 female subjects.
The performance of the proposed approach is evaluated by comparing the automatically determined ROI with the ROI selected manually by an expert radiologist. The DSC and JC metrics are calculated to measure the mean overlap and the union overlap between the automatic and manual segmentation results [5]. These metrics can be expressed as

$$\mathrm{DSC} = \frac{2\,|A \cap M|}{|A| + |M|}, \qquad \mathrm{JC} = \frac{|A \cap M|}{|A \cup M|}$$

where $A$ and $M$ denote the total volume of breast obtained automatically and manually, respectively. The closer these metrics are to 1, the closer the automatic segmentation results are to the manual segmentation results. The DSC and JC values provided by the proposed study are 91±0.06% and 85±0.08%, respectively.
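For two binary masks, both overlap metrics reduce to pixel counts. The following is a minimal sketch with a hypothetical function name; returning 1.0 when both masks are empty is an assumed convention, not something stated in the text.

```python
def overlap_metrics(auto_mask, manual_mask):
    """Return (DSC, JC) for two binary masks of equal size."""
    a = m = both = 0
    for row_a, row_m in zip(auto_mask, manual_mask):
        for pa, pm in zip(row_a, row_m):
            a += bool(pa)              # pixels in the automatic mask
            m += bool(pm)              # pixels in the manual mask
            both += bool(pa and pm)    # pixels in the intersection
    union = a + m - both
    # Assumed convention: two empty masks agree perfectly.
    dsc = 2 * both / (a + m) if a + m else 1.0
    jc = both / union if union else 1.0
    return dsc, jc


auto = [[1, 1, 0], [1, 1, 0]]
manual = [[0, 1, 1], [0, 1, 1]]
dsc, jc = overlap_metrics(auto, manual)
```

For a single pair of masks the two metrics are tied by JC = DSC / (2 - DSC), so the JC is never larger than the DSC.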
According to the calculated metrics and the acquired breast ROI, the designed system provides an accurate region in which breast lesions can occur. Detecting the ROI decreases the computational load of breast lesion segmentation techniques, which otherwise scan the whole MR image to segment a breast lesion. The success of the proposed system can be improved by acquiring the images more carefully, because motion and breathing artefacts decrease the quality and standardization of the images.
In future studies, an automatic breast lesion classification system that supports the radiologists will be designed by exploiting ROI detection, lesion segmentation, feature extraction, and classification techniques.

Conclusion
Cancer research gains importance every year because of the increasing number of cancer-related deaths. Breast cancer is one of the most common cancer types among women. In this study, we aim to detect the left and right breast regions as accurately as possible. With this initial step, a comprehensive and automatic breast cancer diagnosis system can be designed to support specialists.
Thus, we aim to determine a region of interest in breast MR images in which one or more lesions can appear. The resulting region includes both breasts and the lymph nodes. The proposed region of interest detection system is fully automatic and utilizes several image processing techniques. Noise filtering, local adaptive thresholding, connected component labeling, integral of horizontal projection, sternum midpoint detection, and masking operations are applied to the breast MR images consecutively.
To evaluate the performance of the proposed study, the results of the automatic region of interest detection system are compared with the manual region of interest selection performed by an expert radiologist. For this comparison, the DSC and JC metrics are calculated. The achieved results show that the proposed system successfully determines the boundaries of the left and right breast regions. Thus, by using the proposed ROI detection system as an initial phase, a fully automatic breast cancer diagnosis system can be designed. In our future work, the target is to design this decision support system for radiologists.