Employing Barcode High-Resolution Melting Technique for Authentication of Apricot Cultivars

Fast, accurate and affordable identification of food products is important to ensure authenticity and safety. Various apricot (Prunus armeniaca L.) cultivars are produced in Turkey, and each differs in quality and intended use. In this paper, we aimed to develop an easy and reliable method, Barcode High-Resolution Melting (Bar-HRM), to distinguish apricot cultivars. We designed and tested novel Bar-HRM primer sets, HRM-ITS1 and HRM-ITS2, which are specific to apricot cultivars and target the most popular barcoding regions, ITS1 and ITS2. According to the results, HRM analysis distinguished 31 of the 35 cultivars with ITS1 and all 35 with ITS2. We recommend the ITS2 barcode region, amplified with the HRM-ITS2 primer set, for Bar-HRM analysis of apricot cultivars.


Introduction
Apricot (Prunus armeniaca L.) is an important drupe fruit with a rich gene pool that allows the plant to adapt to regions from Siberia to North Africa and from Greece to China (Mehlenbacher et al. 1991). Turkey leads world apricot production with 677,000 tons on average (FAOSTAT 2020). Apricot production in Turkey is concentrated in the Malatya, Erzincan and Iğdır regions (Ercisli 2004). Because apricot has been cultivated in Inner Anatolia since ancient times, some cultivars have adapted to different climatic areas of the country, e.g. Hacıhaliloğlu in Malatya, Hasanbey in Erzincan and Şalak in Iğdır (Güleryüz et al. 1997). According to the Turkish Apricot Research Institute, there are 28 registered cultivars and numerous genotypes in Turkey, including the Protected Designation of Origin (PDO) cultivars Hacıhaliloğlu, Hasanbey, Kabaaşı, Soğancı, Çataloğlu, Çöloğlu and Şalak (Anonymous 2021).
PDO products have a higher economic value than common products. Both producers and consumers need assurance that PDO products are authentic and unadulterated, which makes tracing the original food product crucial.
DNA barcoding, a method based on comparing nucleotide sequences of a specific DNA fragment, has been widely used since the early 2000s for identifying species, reconstructing phylogenies and assessing biodiversity (Cheng et al. 2016). Different DNA regions have been used as DNA barcodes for various plant groups, such as rbcL, matK, ycf1, LEAFY and the most widely used Internal Transcribed Spacer (ITS) region (Won & Renner 2005; Chase et al. 2007; Kress & Erickson 2007; Hollingsworth 2011; Li et al. 2011; Dong et al. 2015). It has also been confirmed that the ITS barcode region distinguishes apricot cultivars successfully (Hürkan 2020). Although DNA barcoding is very useful for species identification, it prolongs the workflow, increases expenses because of post-PCR sequencing, and requires a bioinformatics background. Recently, DNA barcoding has been combined with other techniques to overcome these disadvantages. In this paper, we focused on the High Resolution Melting (HRM) technique coupled with DNA barcoding regions to develop a cost-effective method to discriminate closely related apricot cultivars.
High Resolution Melting analysis is a genotyping technique that discriminates DNA sequence differences, such as Single Nucleotide Polymorphisms (SNPs) and sequence length polymorphisms, in PCR products (Zhou et al. 2005). In this technique, the shape of the melting transition of the PCR products is recorded continuously. HRM is more powerful, cheaper and easier than other approaches because it requires neither post-PCR processing nor bioinformatics skills. Moreover, it is faster and more economical because it is a sequencing-free method. Although the HRM technique has advantages over conventional methods, it had some disadvantages that could directly influence the results. During the early stages of HRM, SYBR® Green was used as the dsDNA-binding dye, which inhibits the DNA polymerase at high concentrations. Therefore, SYBR® Green does not allow discrimination of closely related genotypes that have small sequence variations (Reed et al. 2007). However, saturation dyes, e.g. LC Green™, SYTO9™ or Eva Green™, the last of which was selected for this study, do not affect DNA polymerase performance even at high concentrations (Vossen et al. 2009). The discriminating power of HRM is directly related to the markers used in the analysis. Recent studies use two marker types for HRM analysis: microsatellites and DNA barcoding regions. Both have advantages and disadvantages. While microsatellites are usually specific to an organism group and require more experience to mine and to design primers for, DNA barcoding regions are universal.
Combining the discrimination success of DNA barcoding with the ease of HRM has resulted in successful identification of food products and plants such as PDO cheese (Ganopoulos et al. 2013), codfish species (Shi et al. 2020), poisonous plants (Thongkhao et al. 2020), medicinal plants (Sun et al. 2017; Mishra et al. 2018) and bean crops (Madesis et al. 2012). In a recent study, 16 SSR markers from the literature were used to characterise wild apricot genotypes from the Nevşehir region (Turkey) (Bakır et al. 2019).
Because of the advantages above, we developed and tested a Bar-HRM based method, using specific primer sets targeting the most widely used barcoding regions, ITS1 and ITS2, for the rapid, affordable, reliable and quantitative identification of apricot cultivars and apricot products.

Sample collection and DNA extraction
The Republic of Turkey, Ministry of Agriculture and Forestry Apricot Research Institute (Malatya, Turkey) kindly provided fresh leaf samples of the 35 apricot cultivars available in April 2020 (Table 1). To extract DNA, we used ~100 mg of fresh leaf tissue and followed the modified CTAB protocol described in the literature (Aydın et al. 2018). The DNA concentration was measured with a NanoDrop (Maestrogen), and integrity was confirmed by agarose gel electrophoresis. We normalised the concentration of all DNA samples to 10 ng µL⁻¹. The DNA samples were stored at −20 °C for further analyses.
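The normalisation to 10 ng µL⁻¹ is a standard C1·V1 = C2·V2 dilution. The sketch below illustrates the arithmetic only; the 50 µL final volume and the function name are assumptions made for the example, since the working volume is not stated in the text. The two stock concentrations are the lowest and highest values reported in the results.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul=10.0, final_volume_ul=50.0):
    """Return (stock_volume_ul, diluent_volume_ul) from C1*V1 = C2*V2.

    final_volume_ul is a hypothetical working volume, not taken from the paper.
    """
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is more dilute than the target concentration")
    stock_volume = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_volume, final_volume_ul - stock_volume


# Lowest and highest stock concentrations reported for the extracts.
for conc in (29.69, 187.29):
    stock, diluent = dilution_volumes(conc)
    print(f"{conc:7.2f} ng/uL stock: {stock:5.2f} uL DNA + {diluent:5.2f} uL diluent")
```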

Primer design
Since no nucleotide sequences of the studied cultivars were available in GenBank, we designed the primer sets for the ITS1 and ITS2 barcode regions from the GenBank apricot nucleotide sequence records with the accession numbers MT072696, EF211085, EF211084, EF211083 and MG735482. We downloaded the GenBank-formatted files with annotations, imported them into the Geneious R8 software (Kearse et al. 2012) and evaluated them for quality and variable characters. Sequence characteristics were also analysed using MEGA Version X (Kumar et al. 2018). Then, the sequences were aligned (Geneious Alignment Tool) and the novel primer sets were designed in the same software by considering the variable positions, GC content, expected amplicon size and melting temperature (Tm) (Table 2). We confirmed the specificity of the primers in silico with the Primer-BLAST tool of the National Center for Biotechnology Information (NCBI).
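Two of the design criteria named above, GC content and Tm, can be approximated by hand for a candidate primer. The sketch below is an illustration only: it uses the simple Wallace rule (2 °C per A/T, 4 °C per G/C), which is a rough estimate for short oligonucleotides and is not necessarily the calculation that Geneious applies. The primer sequence is a hypothetical 20-mer, not one of the primers in Table 2.

```python
def gc_content(seq):
    """Percentage of G and C bases in a primer sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough Tm estimate (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical primer used only to exercise the functions.
primer = "ACGTGCCTTGAACGCAAGTT"
print(f"length={len(primer)} nt, GC={gc_content(primer):.1f}%, Tm~{wallace_tm(primer)} C")
```

For real primer design, a nearest-neighbour Tm model and a specificity check such as Primer-BLAST remain necessary; this estimate is only a quick sanity check.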

HRM analysis and sequencing of PCR products
We performed the HRM amplifications using a Rotor-Gene Q 5plex thermal cycler (Qiagen, USA) with a 72-well carousel. The HRM reaction mix consisted of 5 µL Luminaris Colour HRM Master Mix (Thermo Scientific, USA), 0.5 µL of each 10 mM primer (Sentebiolab, Turkey), 10 ng template and nuclease-free water to a total volume of 10 µL. The cycling protocol was an initial denaturation at 95 °C for 10 min, followed by 40 cycles of denaturation at 95 °C for 10 s, annealing at 60 °C for 30 s and extension at 72 °C for 30 s. Data were acquired after each extension step. After the final cycle, we added steps of 95 °C for 30 s and 50 °C for 30 s for heteroduplex formation. We performed HRM immediately after the amplification, from 75 °C to 95 °C in increments of 0.1 °C s⁻¹, with continuous data acquisition. All reactions were performed in triplicate, and a no-template control (NTC) was included.
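The reaction composition above fixes the water volume once the template volume is known. The sketch below works through that arithmetic, assuming the template is pipetted from the 10 ng µL⁻¹ normalised stock (so 10 ng corresponds to 1 µL) and that two primers, forward and reverse, are added at 0.5 µL each. The function name, the number of reactions and the 10% pipetting overage in the usage lines are hypothetical.

```python
def hrm_reaction_volumes(n_reactions=1, overage=0.0, total_ul=10.0,
                         master_mix_ul=5.0, primer_ul_each=0.5,
                         template_ng=10.0, template_ng_per_ul=10.0):
    """Volumes (uL) of each component for a shared mix covering n reactions."""
    template_ul = template_ng / template_ng_per_ul
    water_ul = total_ul - master_mix_ul - 2 * primer_ul_each - template_ul
    if water_ul < 0:
        raise ValueError("components exceed the total reaction volume")
    scale = n_reactions * (1.0 + overage)
    return {
        "master_mix": master_mix_ul * scale,
        "forward_primer": primer_ul_each * scale,
        "reverse_primer": primer_ul_each * scale,
        "water": water_ul * scale,
        # Template differs per tube, so it is reported per reaction, unscaled.
        "template_per_reaction": template_ul,
    }


single = hrm_reaction_volumes()
print(single)

# Hypothetical batch: 72 reactions with 10% overage for pipetting loss.
batch = hrm_reaction_volumes(n_reactions=72, overage=0.10)
print({name: round(vol, 1) for name, vol in batch.items()})
```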
For HRM data analysis, we used the Rotor-Gene Q 2.3.5 (Qiagen, Germany) software. We calculated the threshold cycle (Cq) values with the comparative quantification method of the software, determined the melting temperatures (Tm), and normalised the HRM curves by removing the background fluorescence. The difference plots were generated with AS as the reference. Then, the software calculated the Genotype Confidence Percentage (GCP) for each cultivar. We set the confidence threshold to 95%.
To validate the HRM results, we selected four ITS1 PCR products, from the cultivars AS, HB, HR and KA, which HRM could not separate (confidence percentage above the 95% threshold), and five ITS2 PCR products, from the cultivars AS, AZ, PV, SM and TF, which HRM separated (confidence percentage >85%). The PCR products were sent for direct sequencing (Macrogen Europe) and were sequenced in both directions on an ABI 3730xl System with the same primers used for the HRM amplifications. We used the Geneious R8 software (Kearse et al. 2012) to assemble the sequences and generate the consensus sequences. The sequences were aligned with the Geneious Aligner algorithm in the same software.
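The normalisation and difference-plot steps can be illustrated with a simplified sketch. It is not the Rotor-Gene Q implementation: it assumes a common two-window scaling in which the pre-melt (leading) plateau is set to 100 and the post-melt (trailing) plateau to 0, and it does not attempt to reproduce the GCP calculation. The melt curves are synthetic sigmoids, and the Tm values, window limits and function names are hypothetical.

```python
import math


def synthetic_melt(temps, tm, width=0.4, high=80.0, low=5.0):
    """Synthetic raw melt curve: fluorescence drops sigmoidally around tm."""
    return [low + (high - low) / (1.0 + math.exp((t - tm) / width)) for t in temps]


def normalise(temps, fluor, leading, trailing):
    """Scale a curve so the leading window averages 100 and the trailing window 0."""
    def window_mean(window):
        vals = [f for t, f in zip(temps, fluor) if window[0] <= t <= window[1]]
        if not vals:
            raise ValueError("no data points inside the normalisation window")
        return sum(vals) / len(vals)

    top, bottom = window_mean(leading), window_mean(trailing)
    if top == bottom:
        raise ValueError("leading and trailing plateaus are identical")
    return [100.0 * (f - bottom) / (top - bottom) for f in fluor]


def difference_curve(sample_norm, reference_norm):
    """Point-by-point difference between a sample and the reference curve."""
    return [s - r for s, r in zip(sample_norm, reference_norm)]


temps = [round(83.0 + 0.1 * i, 1) for i in range(81)]   # 83.0 to 91.0 deg C
leading, trailing = (84.0, 85.0), (90.0, 91.0)          # hypothetical windows

reference = normalise(temps, synthetic_melt(temps, tm=87.0), leading, trailing)
sample = normalise(temps, synthetic_melt(temps, tm=87.3), leading, trailing)
diff = difference_curve(sample, reference)
print(f"largest deviation from the reference: {max(diff, key=abs):.1f}")
```

A sample whose difference curve stays close to zero across the melt region has the same profile as the reference; a clear peak or trough indicates a different melting behaviour.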

DNA extraction and data mining
We obtained a sufficient amount of DNA from the samples with the extraction protocol used. The DNA concentrations ranged from 29.69 to 187.29 ng µL⁻¹, and the A260/230 ratio ranged from 1.540 to 2.000.
We first analysed the ITS1 and ITS2 sequences retrieved from GenBank to generate an initial comparison of their characteristics (Table 3). The length of the ITS1 region was identical among the cultivars (215 bp). Ten variable sites (4.65%), 205 conserved sites (95.35%) and one parsimony-informative site were observed for the ITS1 region. The ITS2 region was longer (277.8 ± 0.4 bp) and about three times as variable as ITS1 (37 variable sites, 13.32%). Its proportion of conserved sites (241 sites, 86.75%) was lower than that of ITS1, and there was no parsimony-informative site in ITS2. The average GC contents were almost the same for both regions (Figure 1). We included the coding 5.8S region in the comparison table to demonstrate why it is not suitable for genotyping apricot: the entire sequence was conserved, with no variable sites.
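The site categories reported above can be counted directly from an alignment. The sketch below is a minimal illustration, assuming the usual definitions (a variable site has at least two different bases; a parsimony-informative site has at least two bases that each occur in at least two sequences) and ignoring gaps and ambiguity codes, which MEGA can treat differently depending on its settings. The four-sequence alignment is a toy example, not the GenBank data.

```python
from collections import Counter


def site_summary(aligned_seqs):
    """Count variable, conserved and parsimony-informative alignment columns."""
    lengths = {len(s) for s in aligned_seqs}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    n_sites = lengths.pop()
    variable = informative = 0
    for column in zip(*(s.upper() for s in aligned_seqs)):
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) > 1:
            variable += 1
            # Informative: at least two states, each seen in two or more sequences.
            if sum(1 for c in counts.values() if c >= 2) >= 2:
                informative += 1
    return {
        "sites": n_sites,
        "variable": variable,
        "conserved": n_sites - variable,
        "parsimony_informative": informative,
    }


toy_alignment = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGAACGTCC",
    "ACGAACGTAC",
]
print(site_summary(toy_alignment))

# The reported percentages follow from the counts and lengths in the text.
print(round(100 * 10 / 215, 2), round(100 * 37 / 277.8, 2))
```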

HRM and sequencing results
The ITS1 primer set amplified products of the expected size, approximately 160 bp long. The software calculated the threshold cycles (Cq) and melting temperatures (Tm) of the amplicons (Table 1). Normalisation ranges were adjusted to 84 °C (leading) and 90 °C (trailing).
The ITS2 marker yielded better results than ITS1. The ITS2 primer set amplified PCR products approximately 170 bp long, as expected. The Cq values ranged from 10.62 to 14.70, and each value was unique (Table 1). The Tm values ranged from 90.04 to 91.30 °C, and the amplicons had 18 unique Tm values. HRM analysis of ITS2 distinguished all the cultivars by melting curve shape (Figure 3), as we aimed. In contrast to conventional Tm difference analysis, normalisation of the fluorescence by the software allowed all the cultivars to be distinguished by melting curve shape for ITS2. Normalisation ranges were adjusted to 86.6 °C (leading) and 92.2 °C (trailing).
We analysed the amplicon sequences to validate the results of the HRM analysis. After trimming the primer binding sites, we obtained a 110 bp sequence for ITS1 and a 136 bp sequence for ITS2 (Table 2). Sequencing showed that, while the ITS1 sequences of AS, HB, HR and KA were identical, six variable sites were present in the ITS2 sequences of AS, AZ, PV, SM and TF (Figure 4, Figure 5 and Table 3).

Discussion
The ITS region is a suitable DNA barcoding region for plants (Kress et al. 2005), but it has some problems, such as duplications, paralogous copies and pseudogenes, in some plant groups (Chase et al. 2007). The success of ITS as a barcoding region and an HRM marker has been reported for apricot cultivars (Hürkan 2020), Fabaceae (Gao & Chen 2009), Artemisia spp., and Medicago lupulina and Trifolium pratense (Ganopoulos et al. 2012). The ITS region consists of two non-coding parts, ITS1 and ITS2, and one coding part, 5.8S (Cheng et al. 2016). Systematic studies have proposed that ITS2 could be the core DNA barcode because of its high interspecific divergence (Xin et al. 2013). In this study, the basic comparison of the ITS parts agreed with the literature: ITS2 was the most variable region (13.32%), followed by ITS1 (4.65%), for the studied apricot cultivars. Although the 5.8S coding region is well characterised at the interspecific level, it has very limited nucleotide variation at deeper levels because it is a conserved region (Hershkovitz & Lewis 1996). In agreement with the literature, the 5.8S region showed no variable sites for the studied apricot cultivars.
The optimal amplicon length for HRM analysis is shorter than 300 bp (Reed & Wittwer 2004). Shorter amplicons emphasise Single Nucleotide Polymorphisms (SNPs) in HRM analysis. Both the ITS1 and ITS2 primer sets worked for every studied cultivar and yielded amplicons of approximately 160 and 170 bp, respectively, which are in the "ideal" range for HRM analysis. The melting temperature of a PCR product depends on both sequence length and nucleotide content. However, Tm values alone are not reliable for discriminating organisms, as seen in the results: the ITS1 primer set yielded 20 unique Tm values and the ITS2 primer set only 18. During the HRM reaction, following the PCR amplification, the thermal cycler raises the temperature of the PCR products to denature the double-stranded DNA while a detector continuously tracks the change in fluorescence. Thus, the software considers not only the Tm values but also the melting shapes of the amplicons, which gives it better discrimination ability (Reja et al. 2010; Reed & Wittwer 2004). After normalisation of the melting shapes in the software, HRM analysis distinguished 32 of the 35 cultivars with ITS1 and all 35 with ITS2. This result supports the comparison of the variable sites (Table 3 and Figure 1) and the literature (Ganopoulos et al. 2012; Song et al. 2016; Pereira et al. 2018; Mishra et al. 2018). The ITS2 region had three times more variable sites than ITS1 and yielded a better discrimination result.
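The point that Tm alone can fail while curve shape still discriminates can be shown with synthetic data. In the sketch below, two idealised sigmoid melt curves share the same Tm, estimated as the peak of the negative derivative −dF/dT, but differ in transition width, so their normalised curves diverge. The curves, Tm, widths and function names are all hypothetical and serve only to illustrate the principle; real amplicon melt curves are not perfect sigmoids.

```python
import math


def melt_curve(temps, tm, width):
    """Idealised normalised melt curve (100 = fully double-stranded)."""
    return [100.0 / (1.0 + math.exp((t - tm) / width)) for t in temps]


def tm_from_derivative(temps, fluor):
    """Estimate Tm as the temperature where -dF/dT (central difference) peaks."""
    best_t, best_rate = None, float("-inf")
    for i in range(1, len(temps) - 1):
        rate = -(fluor[i + 1] - fluor[i - 1]) / (temps[i + 1] - temps[i - 1])
        if rate > best_rate:
            best_t, best_rate = temps[i], rate
    return best_t


temps = [round(75.0 + 0.1 * i, 1) for i in range(201)]   # 75.0 to 95.0 deg C
curve_a = melt_curve(temps, tm=88.0, width=0.3)          # sharp transition
curve_b = melt_curve(temps, tm=88.0, width=0.6)          # broader transition

print("Tm A:", tm_from_derivative(temps, curve_a))
print("Tm B:", tm_from_derivative(temps, curve_b))
max_gap = max(abs(a - b) for a, b in zip(curve_a, curve_b))
print(f"largest shape difference: {max_gap:.1f} fluorescence units")
```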
The sequencing results agreed with both the comparison of the sequences retrieved from GenBank (Table 3) and the HRM analysis results. Although the obtained amplicon lengths were shorter than expected, we found the variable sites in ITS2 that were necessary to distinguish the cultivars. Sequencing showed that ITS2 (six variable sites) was more variable than ITS1 (no variable sites). The lack of variable sites in the ITS1 sequences of the cultivars with a confidence percentage higher than 95% clearly explains why HRM analysis could not distinguish them: identical sequences resulted in similar melting profiles. In contrast, ITS2 had six variable sites. All the variations we detected were transversion mutations, and there were no insertions or deletions. The ITS2 samples selected for sequencing were those with a confidence percentage above 85%. The ITS2 sequencing results validated the sensitivity and reliability of the Bar-HRM analysis by distinguishing genetically closely related cultivars.
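Classifying a substitution as a transition or a transversion only requires checking whether both bases belong to the same chemical class (purines A and G, pyrimidines C and T). The sketch below shows this check on two hypothetical, equal-length toy sequences; it assumes no insertions or deletions, as reported above, and the sequences are not the apricot ITS2 data.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Return 'identical', 'transition' or 'transversion' for two bases."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in "ACGT" or alt not in "ACGT" or len(ref) != 1 or len(alt) != 1:
        raise ValueError("bases must be single A, C, G or T characters")
    if ref == alt:
        return "identical"
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def list_substitutions(seq_a, seq_b):
    """List (1-based position, base_a, base_b, class) for each mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length (no indels handled)")
    return [
        (pos, a, b, classify_substitution(a, b))
        for pos, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper()), start=1)
        if a != b
    ]


# Hypothetical toy sequences with two mismatches.
for record in list_substitutions("ACGTTGCA", "ACCTTGAA"):
    print(record)
```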
Standard DNA barcoding based on Sanger sequencing is relatively expensive. In this study, we used a commercial HRM kit, and a single HRM reaction cost 0.31 USD, while single-direction Sanger sequencing cost 5.5 USD. Next-generation sequencing (NGS) based genotyping is comparable in cost. However, it needs a sophisticated workflow, e.g. pre-sequencing library preparation and high-level computing for post-sequencing assembly of the reads. It also requires researchers experienced in bioinformatics. Therefore, we believe a well-designed Bar-HRM assay would be a fast, robust and cost-effective way to genotype organism groups.
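To put the two unit costs side by side, the sketch below scales them to the 35 cultivars studied. It is a rough illustration under stated assumptions: triplicate HRM reactions per sample, as in the methods, and two sequencing reads per sample (both directions). It ignores DNA extraction, the PCR preceding sequencing, controls, instruments and labour, so the totals are not a full cost analysis.

```python
HRM_REACTION_USD = 0.31   # per HRM reaction, as reported in the text
SANGER_READ_USD = 5.5     # per single-direction Sanger read, as reported


def assay_costs(n_samples, hrm_replicates=3, sanger_directions=2):
    """Return (hrm_total_usd, sanger_total_usd) for n_samples.

    Replicate and read counts are assumptions for this illustration.
    """
    hrm = n_samples * hrm_replicates * HRM_REACTION_USD
    sanger = n_samples * sanger_directions * SANGER_READ_USD
    return round(hrm, 2), round(sanger, 2)


hrm_usd, sanger_usd = assay_costs(35)
print(f"HRM: {hrm_usd} USD, Sanger: {sanger_usd} USD")
```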

Conclusions
Bar-HRM is a cost-effective, fast and robust identification method. Moreover, no specialisation is needed for handling the data. In this study, ITS-based Bar-HRM proved effective for the identification of the 35 apricot cultivars. The ITS2 primer set had the highest discrimination rate and can be used for the identification of various apricot cultivars. The ITS2 marker can also be used for the identification of Prunus species and the authentication of apricot products.