CALCULATION OF COVERAGE AND FLAKE SIZE OF MONOLAYERS GROWN BY CHEMICAL VAPOR DEPOSITION TECHNIQUE

Two-dimensional (2D) materials such as transition metal dichalcogenides (TMDs) are prominent candidates for use in integrated circuits. However, growing uniform, large-area 2D materials, specifically monolayers, that are suitable for electronic component production remains one of the main challenges to incorporating them in integrated circuits or other active device applications. The aim of this study is to demonstrate a practical and reliable MATLAB computational method that calculates the ratio of the chemical vapor deposited monolayers to the whole substrate surface and the maximum area of the deposited flakes. We used the K-means clustering method to calculate surface coverage and obtained an accuracy of ~96% for the simple test images (single star and hexagonal shapes). For the example with multiple distributed shapes, we achieved a higher accuracy of ~98%. We also calculated the area of each flake with ~99% accuracy and identified the flake with the maximum area. The practical calculation of the surface coverage ratio and flake size will allow easy identification of the effects of the process parameters during novel material growth, which will pave the way for future optoelectronic and electronic devices.


INTRODUCTION
The ever-increasing performance of transistors shapes our lives. In integrated-circuit electronics, there have been profound advancements thanks to continuous scaling-down. However, scaling-down in silicon-based electronics is reaching its fundamental physical limits, and the exploration of a new material system is one of the major challenges in the field of electronics. Recently, there has been a growing interest in two-dimensional (2D) materials (Keyes, 2005; Manzeli et al., 2017; Zhang et al., 2018). The rise of 2D materials started with the realization of graphene, which is the first single-layer 2D material shown experimentally (Novoselov et al., 2004). Graphene is a semimetal, and it has remarkable features such as high electron mobility (>200000 cm² V⁻¹ s⁻¹). However, it lacks a band gap, which limits its use in switching devices (Reddy et al., 2011). For this reason, transition metal dichalcogenides (TMDs) have emerged as promising materials for future devices such as transistors, photodetectors and sensors. One of the most extensively studied TMDs is MoS2, and transistors based on this material exhibit high mobilities (200 cm² V⁻¹ s⁻¹) and high current on/off ratios (10⁸) (Radisavljevic et al., 2011; Şar et al., 2018). For fabricating novel 2D devices, the first challenge is to obtain large-area deposition with a uniform thickness; in many cases, single atomic layer thickness is a requirement for efficient light-matter interaction (Schwierz et al., 2015). Our group has extensive know-how and experience in growing different 2D materials and fabricating devices from them (Öper et al.; Yorulmaz et al., 2016). In these studies, we have used chemical vapor deposition (CVD), which is one of the most favorable methods to obtain large-area growth. The total area and coverage ratio of the grown sample determine the total number and complexity of the devices that can be fabricated from the grown structures by photolithography.
For this reason, we prepared a user-friendly and accurate image processing program in the MATLAB environment to calculate the surface coverage of the grown samples and the size of the grown MoS2 flakes.
Image processing is a powerful tool to analyze the acquired images. It covers image enhancement, image compression and image analysis operations. To analyze the images taken by optical microscopy, we can use segmentation algorithms, which divide images into regions. For instance, in the amplitude thresholding method, each region is defined by the amplitudes of its pixels. We can also use clustering algorithms to define regions as sets of pixels that have similar characteristics. There are several types of clustering methods, including hierarchical clustering, partitioning clustering, fuzzy clustering, density-based clustering, and distribution model-based clustering (Jain et al., 1999). Since clustering approaches require numerous parameters, work in high-dimensional spaces, and have to manage noisy, fragmented, and sampled data, their performance can differ significantly depending on the application area (Reynolds et al., 2006; Rodriguez et al., 2019; Scheunders, 1997). Therefore, selecting the optimal clustering algorithm can be difficult. For this study, to obtain high accuracy, we use the K-means clustering method, which is one of the most commonly used clustering methods. To our knowledge, although an earlier study calculates the surface coverage of 2D materials (Jessen et al., 2018), no study has aimed to calculate the flake sizes of 2D materials. For this reason, in this research, we demonstrate a high-accuracy image processing tool to evaluate the morphological properties of CVD grown 2D MoS2 flakes.

2D Material Growth and Characterization
The CVD growth of 2D TMD structures is carried out in a home-built, multi-zone furnace with a horizontal quartz tube at atmospheric pressure. Monolayer flakes are synthesized by using the configuration defined by Özden et al. (2017). The grown layers are analyzed elsewhere at room temperature by using a Witec Alpha 300 R μ-Raman and photoluminescence (PL) spectroscopy system with a Zeiss 50X microscope objective having a numerical aperture (NA) of 0.8. A 532 nm CW laser with a 300 nm spot size, 0.5 mW laser power and 1 s integration time is used for the Raman and PL measurements.

Calculation of the Coverage Percentage
Color models are mathematical models that are used to describe colors and are designed in three dimensions to represent all colors. There are many color models, and each has advantages and disadvantages for a given application. RGB, CMYK, L*a*b*, HSB (Hue, Saturation, Brightness), HSV (Hue, Saturation, Value) and HSI (Hue, Saturation, Intensity) are some of the color models (Ibraheem et al., 2012). In this study, the L*a*b* space is used as the color model. It includes a luminosity layer "L*", a chromaticity layer "a*" that specifies the position of the color along the red-green axis, and a chromaticity layer "b*" that specifies the position of the color along the blue-yellow axis (Kuehni, 2001).
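As an illustration of the L*a*b* representation described above, the following Python sketch converts a single 8-bit sRGB pixel to L*a*b*. It is not the MATLAB code used in this study; the function name `srgb_to_lab`, the sRGB input encoding and the D65 reference white are assumptions made for the example.

```python
def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to CIE L*a*b* (assumes a D65 white point)."""

    def linearize(c):
        # Undo the sRGB gamma encoding to obtain linear light in [0, 1]
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear sRGB -> CIE XYZ (D65)
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    # D65 reference white used for normalization
    xn, yn, zn = 0.95047, 1.0, 1.08883

    def f(t):
        # Cube root with a linear segment near zero, as in the CIE definition
        delta = 6.0 / 29.0
        if t > delta ** 3:
            return t ** (1.0 / 3.0)
        return t / (3.0 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    l_star = 116.0 * fy - 16.0      # lightness
    a_star = 500.0 * (fx - fy)      # red-green axis
    b_star = 200.0 * (fy - fz)      # blue-yellow axis
    return l_star, a_star, b_star
```

In this sketch, neutral grays of any brightness map to a* and b* values close to zero, so only L* changes with brightness. Working with the a* and b* layers alone therefore makes a segmentation less sensitive to brightness variations.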
Herein, we use the K-means clustering method in the MATLAB software program to calculate the surface coverage of the samples. K-means clustering is utilized as a segmentation procedure. The aim is to ensure that, at the end of the segmentation process, similarity is maximal within clusters and minimal between clusters (Wagstaff et al., 2001). K-means treats each observation in the data as an object with a position in space. Figure 1a shows an exemplary image of the produced MoS2 monolayers on a SiO2/Si substrate, obtained with the optical microscope. Since the images used are composed of two different colors, we have two-color images in which the darker parts show the deposited monolayers and the lighter part is the substrate, in other words, the background.
To use K-means clustering for color segmentation, the image must first be converted to the L*a*b* color space. This allows us to distinguish colors in the image and ignore changes in brightness (Figure 1b). The K-means clustering algorithm finds a partition in which objects within each cluster are as close to each other as possible and as far from objects in other clusters as possible (Dhanachandra et al., 2015). Since the color information in this study is in the "a*b*" space, the objects are pixels with "a*" and "b*" values. We obtain the image in Figure 2 by using the L*a*b* color space and K-means clustering. With the clustering method applied, the program divides the image into two clusters. Cluster 1 indicates the deposited area on the substrate, as shown in Figure 2a, and Cluster 2 indicates the surface of the substrate, as shown in Figure 2b.
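The clustering step described above can be sketched in Python as follows. This is a minimal illustration of K-means (Lloyd's algorithm) on (a*, b*) pairs, not the MATLAB routine used in this study; the function name `kmeans_ab` and the deterministic initialization are assumptions chosen to keep the example reproducible.

```python
def kmeans_ab(points, k=2, max_iter=100):
    """Cluster (a*, b*) pairs with Lloyd's algorithm.

    Returns (labels, centroids), where labels[i] is the cluster index of points[i].
    """
    ordered = sorted(set(points))
    if k < 2 or len(ordered) < k:
        raise ValueError("need k >= 2 and at least k distinct points")

    # Assumed initialization: k points spread evenly over the sorted distinct values
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = [0] * len(points)

    for _ in range(max_iter):
        # Assignment step: each pixel joins the nearest centroid (squared Euclidean distance)
        new_labels = [
            min(range(k),
                key=lambda j: (p[0] - centroids[j][0]) ** 2 + (p[1] - centroids[j][1]) ** 2)
            for p in points
        ]

        # Update step: each centroid moves to the mean of its members
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                new_centroids.append((sum(m[0] for m in members) / len(members),
                                      sum(m[1] for m in members) / len(members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster in place

        converged = new_labels == labels and new_centroids == centroids
        labels, centroids = new_labels, new_centroids
        if converged:
            break

    return labels, centroids
```

For a two-color optical image, `k=2` separates the pixels into a flake cluster and a substrate cluster. Which index corresponds to the flakes depends on the initialization, so the darker cluster has to be identified afterwards, for example from its mean L* value.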

Detection of the Largest Area in the Image
To find the largest flake area in the image, the process is separated into two main steps: first, morphological operations are applied to the input image; then, the area of each shape is calculated (Pozzo et al., 1999). The input image is presented in Figure 3a. The MATLAB grayscale function is used to convert the image to gray levels. Grayscale images are images in which the only colors are shades of gray. Such an image is processed faster than a color image because it carries less information per pixel (Kanan et al., 2012). In RGB space, red, green, and blue components with the same intensity represent a gray color.

Figure 2: a. The deposited area (non-black) in Cluster 1. b. The background area (non-black) in Cluster 2.
Therefore, instead of specifying the three intensities required for each pixel in a color image, a single intensity value can be used for each pixel.
Generally, the grayscale intensity is stored as an 8-bit integer, giving 256 different shades of gray from black to white. Grayscale images are sufficient for most tasks, and current display hardware generally supports 8-bit images. In this study, we first convert the image to grayscale and then apply adaptive thresholding to obtain a binary image (Figure 3b). The connected regions are then labeled, and the label matrix is returned as a non-negative integer matrix of the same size as the binary image. Background pixels are labeled "0", the pixels of the first object are labeled "1", the pixels of a second object are labeled "2", and so on. The steps of the area detection process are presented in Figure 3.
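The grayscale conversion, thresholding and labeling steps described above can be sketched in Python as follows. This is an illustration rather than the MATLAB implementation used in this study: the function names are hypothetical, a single global threshold stands in for the adaptive threshold, hole filling is omitted, and 8-connectivity is an assumption.

```python
from collections import deque


def to_gray(rgb_image):
    """Luminance-weighted grayscale (0-255) from rows of (R, G, B) tuples."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb_image]


def threshold_dark(gray, level):
    """Binary mask: 1 where a pixel is darker than `level` (flake), else 0.

    A global threshold is used here as a simplification of adaptive thresholding.
    """
    return [[1 if v < level else 0 for v in row] for row in gray]


def label_regions(mask):
    """Label 8-connected foreground regions; background pixels stay 0.

    Returns (labels, count), where labels has the same size as mask.
    """
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or labels[r][c] != 0:
                continue
            # New unlabeled object found: flood-fill it with the next label
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count
```

The sketch scans the image row by row, so the numbering of the objects follows that order. A different scan order would assign different numbers to the same objects without changing their pixel counts.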

Figure 3: a. Optical image of the CVD grown MoS2(1-x)Se2x monolayers. Darker regions are the deposited materials and the background (lighter region) is the SiO2/Si substrate (Bay, 2019). b. The image after hole filling and applying adaptive threshold. c. The numbered flakes after the process.
To obtain the error rate, we first calculate the covered area in the image mathematically (actual area). We then use our program to calculate the surface coverage (computational area). The difference between the two results, relative to the actual area, gives the error rate of the program (Eq. 1). From the error rate, we also calculate the accuracy of the program, that is, the proximity of a measured or calculated value to a standard or known value (Eq. 2) (Avcı et al., 2015).

$$\text{Error rate (\%)} = \frac{|A_{\text{computational}} - A_{\text{actual}}|}{A_{\text{actual}}} \times 100 \qquad (1)$$

$$\text{Accuracy (\%)} = 100 - \text{Error rate (\%)} \qquad (2)$$
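The error rate and accuracy described above can be sketched in Python as follows. The sketch assumes that the error rate is the absolute difference between the computational and actual values divided by the actual value, and that the accuracy is 100% minus the error rate. This assumption reproduces the percentages reported for the test shapes later in this paper; the function names are hypothetical.

```python
def error_rate(actual, computed):
    """Relative error in percent (assumed form of Eq. 1)."""
    if actual == 0:
        raise ValueError("actual value must be non-zero")
    return abs(computed - actual) / actual * 100.0


def accuracy(actual, computed):
    """Accuracy in percent (assumed form of Eq. 2)."""
    return 100.0 - error_rate(actual, computed)


# Coverage ratios (%) reported in this paper: (actual, computed)
hexagon = (50.3, 51.7)
star = (30.6, 31.7)
distributed = (24.3, 24.7)

print(round(error_rate(*hexagon), 1))    # error rate of the hexagon example
print(round(accuracy(*star), 1))         # accuracy of the star example
print(round(accuracy(*distributed), 1))  # accuracy of the distributed example
```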

Measurement of the Total Percentage of Colors in an Image (Flake Coverage)
To calculate the ratio of the darker regions to the whole surface, our program counts every non-black pixel in the image and divides this number by the number of pixels in the entire area. When we run the program to calculate the flake coverage of the sample in Figure 2, we find a coverage ratio of 54.2%. To check the accuracy of the program, we use different geometrical shapes (Figures 4 and 5). We first perform the calculations with simple mathematical equations, which give a coverage ratio of 50.3% for the hexagonal shape (Figure 4 a-c). Our program calculates this ratio as 51.7%, so the error rate for this example is 2.8%. We carry out the same procedure for the star shape. The surface coverage ratio of the star is 30.6%, and our program calculates this ratio as 31.7%. The difference between the two values is 1.1 percentage points, indicating 96.4% accuracy (Figure 4 d-f). These examples confirm that our program can be used as a practical way to find the coverage of the grown samples in our growth experiments.
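The pixel-counting step described above can be sketched in Python as follows. The sketch assumes that the cluster image is available as rows of (R, G, B) values in which every pixel outside the cluster has been set to black; the function name `coverage_percent` is hypothetical.

```python
def coverage_percent(cluster_image):
    """Percentage of non-black pixels in an image given as rows of (R, G, B) values."""
    total = sum(len(row) for row in cluster_image)
    if total == 0:
        raise ValueError("empty image")
    # A pixel counts as covered if any of its channels is non-zero
    non_black = sum(1 for row in cluster_image for px in row if any(px))
    return 100.0 * non_black / total


# Tiny example: 3 of 8 pixels belong to the cluster
black = (0, 0, 0)
flake = (90, 60, 120)
image = [
    [flake, flake, black, black],
    [flake, black, black, black],
]
print(coverage_percent(image))
```

One consequence of this counting rule is that a flake pixel whose own color is exactly black would be counted as background, so the rule relies on the flakes not being pure black in the input image.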

Figure 4: a-c. Hexagonal test shape. d-f. Star test shape.

We check the accuracy of our program by using another test image in which the flakes are spread over the sample, as shown in Figure 5a. The actual coverage ratio of the image is calculated as 24.3%. When we process this image with our program, we find a coverage ratio of 24.7%. The difference between the two values is 0.4 percentage points, and the accuracy is 98.4%. For Figure 5b, the accuracy is calculated as 98.9%. Hence, our program provides higher accuracies with multiple shapes spread over the sample.

Figure 5: a., b. Test images with triangles that are spread over the sample.

Calculation of the Flake Areas
Detection of the areas of the flakes is another important step for the identification of the grown samples. For this reason, we perform another process to analyze the acquired images. After converting the image to grayscale and labelling it, we obtain numbered figures. By finding and comparing the area of each shape, we can find the flake with the largest area.
As shown in Figure 6, our program counts the triangles in the image and labels them. It also calculates the area of each triangle and finds the triangle with the largest area. For this test image, shape 17 is found to be the largest triangle. Our program calculates its area as 2494 µm², the real area of the triangle is 2412 µm², and the total area of the image is 50397 µm². Therefore, the accuracy of the area calculation program is found to be ~99%.
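The area calculation described above can be sketched in Python as follows, starting from a label matrix in which background pixels are 0 and each flake carries its own positive integer. The function names and the `um2_per_pixel` calibration factor are hypothetical; the real factor depends on the microscope magnification and is not given here.

```python
from collections import Counter


def flake_areas(labels, um2_per_pixel=1.0):
    """Area of each labeled flake: pixel count times an assumed area per pixel."""
    counts = Counter(v for row in labels for v in row if v != 0)
    return {label: n * um2_per_pixel for label, n in counts.items()}


def largest_flake(areas):
    """Return (label, area) of the flake with the largest area."""
    if not areas:
        raise ValueError("no flakes found")
    label = max(areas, key=areas.get)
    return label, areas[label]


# Tiny example: flake 1 covers 2 pixels, flake 2 covers 4 pixels
label_matrix = [
    [1, 1, 0, 0],
    [0, 0, 0, 2],
    [0, 2, 2, 2],
]
areas = flake_areas(label_matrix, um2_per_pixel=0.5)
print(largest_flake(areas))
```

For the reported triangle, the difference between the calculated and real areas is 2494 − 2412 = 82 µm². This is about 3.4% of the real triangle area and about 0.16% of the 50397 µm² image area. The text does not state which normalization underlies the ~99% figure.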

Figure 6: a. Original test image. b. Labelled test image.

As another example, Figure 7 shows each of the labelled flakes and Table 1 lists the areas of the chosen labelled flakes. For this sample, we find that the flake labeled 1 has the largest area, 5103.2 µm². To examine the accuracy of our study, we take images whose areas have already been calculated manually by our group and compare our results with those earlier calculations, which demonstrates the practical operation of our program (Bay, 2019). Manual calculation of the areas of the grown structures consumes a huge amount of time, and because the flakes do not have regular shapes, it cannot achieve high precision. Hence, our technique is highly beneficial for practical area detection.

CONCLUSIONS
Practical and accurate assessment of CVD grown 2D materials is important for the future of novel optoelectronic applications, considering their high performance as active devices. In this study, we use the K-means clustering method to calculate the surface coverage and flake size of CVD grown 2D monolayers. When the accuracies of the single-shape and multi-shape examples are compared, there is an improvement in the multi-shape example, suggesting that the selected method works well with CVD grown 2D monolayers. For the distributed multi-shape example, accuracies of ~98% and ~99% are obtained for the surface coverage and flake area calculations, respectively. Our method can be applied to other 2D material families, which will facilitate practical identification of the grown samples and simplify the design parameters of the devices that will be fabricated using these monolayers.

AUTHOR CONTRIBUTIONS
Assoc. Prof. Dr. Nihan Kosku Perkgöz designed the research subject, the main conceptual ideas and proof outline. Merve Öper provided optical images of the samples. Fırat Aslancı and Fatma Can designed computer programs, worked on the implementation of the computer code and supporting algorithms. Fırat Aslancı, Fatma Can, Merve Öper performed data analysis and interpretation of results. Assoc. Prof. Dr. Nihan Kosku Perkgöz supported Fırat Aslancı, Fatma Can and Merve Öper to investigate and analyze the findings of this work. All authors discussed the results and contributed to the final manuscript.