Genomic Variability and Recombination Analysis of Grapevine leafroll-associated virus-1 Isolates from Turkey

Grapevine leafroll disease (GLRD) is one of the most important viral diseases of grapevine worldwide, and Grapevine leafroll-associated virus-1 (GLRaV-1) is one of its causal agents. In this study, the prevalence of GLRaV-1 and the genetic variation and recombination events among GLRaV-1 isolates in Turkey were investigated. Initially, 197 grapevine samples from different provinces of the country were serologically tested. Of the total samples, 109 (55.32%) were identified as GLRaV-1 infected. Subsequently, 9 samples representing different geographic origins were selected for sequence analysis of the heat-shock protein 70 homolog (HSP70h), open reading frame 9 (p24), coat protein (CP) and coat protein duplicate 2 (CPd2) genes. Among the four gene regions, CPd2 was found to be the most divergent, while the HSP70h gene exhibited the lowest genetic diversity. Phylogenetic analysis of the four genomic regions, including GenBank records, clustered all variants into two major groups and grouped the Turkish isolates mostly together. However, the isolate clusters were not correlated with their geographic origin. Furthermore, several putative recombination events were detected with trace to moderate support from the algorithms implemented in the Recombination Detection Program (RDP). Taken together, the results provide a better understanding of the genetic variation of GLRaV-1 isolates in Turkey and worldwide and can help to improve sanitation programs for the propagation material used by grape growers.


Introduction
Grapevine (Vitis sp.) is one of the most widely grown woody crops worldwide. Turkey produces 4,175,356 tons of grapes and ranks 6th in production after China, USA, Italy, Spain and France (FAOSTAT 2014). Many viruses can damage grapevine by causing yield losses, a shortened productive life and reduced quality. Among the grapevine virus diseases, the most widespread and economically destructive is grapevine leafroll disease (GLRD), which was first described in the mid-19th century. The main symptoms of GLRD include leaf rolling, interveinal reddening of the leaves and reduced pigmentation in red-fruited cultivars, and leaf chlorosis in white-fruited cultivars. Its effects on fruit are poor and uneven maturation of berries, lower Brix content and yield, and reduced wine pigmentation. The phloem-limited, filamentous viruses associated with the disease are named Grapevine leafroll-associated viruses (GLRaVs). They are represented by the species GLRaV-1, -2, -3 and -4; recently, GLRaV-5, -6, -9, GLRaV-Pr, GLRaV-De and GLRaV-Car were recognized as strains of GLRaV-4 (Martelli et al 2012). All these viruses belong to the family Closteroviridae, with GLRaV-2 belonging to the genus Closterovirus, GLRaV-7 to the newly proposed genus Velarivirus and the other GLRaVs to the genus Ampelovirus (Martelli 2014).
GLRaV-1 has a positive-sense (+) single-stranded (ss) RNA genome, 19.5 kb in size, organized into ten major open reading frames (ORFs) that encode proteins with different functions. A putative RNA helicase is encoded by ORF-1a and an RNA-dependent RNA polymerase by ORF-1b. The other ORFs encode a hydrophobic protein, the coat protein (CP), a homologue of the HSP70 family of heat shock proteins (HSP70h), an HSP90-like protein, two diverged copies of the CP gene (CPd1 and CPd2) and two more proteins. GLRaV-1 is the only GLRaV that includes two diverged CP gene copies (Fazeli & Rezaian 2000).
Understanding the genome of the species, the variants of the virus and the genetic diversity of viral populations is important for epidemiological studies and for certification testing, both to prevent spread of the virus and to improve control strategies. RNA viruses such as GLRaVs have a high degree of genetic diversity because of error-prone replication with high mutation rates; therefore, the evolutionary processes that allow them to spread need to be clarified (Fazeli & Rezaian 2000). Although GLRaV-1 is a common virus infecting grapevine plants worldwide, limited information is available in the literature about its genomic variability and molecular evolution (Little et al 2001; Kominek et al 2005; Alabi et al 2011; Predajna et al 2013; Cseh et al 2013; Fan et al 2015).
GLRaV-1 has been detected in Turkey since 1997 (Çağlayan 1997; Köklü et al 1998; Çığsar et al 2002; Akbaş et al 2007; 2009; Değer et al 2015; Önder 2016); however, the nucleotide sequence variation between isolates was unknown. Therefore, the genomic variability and recombination events of GLRaV-1 isolates in Turkey were evaluated through sequence analysis of four different genomic regions of the virus. For this purpose, the HSP70h, CP, ORF9 (p24) and CPd2 genes of GLRaV-1 were amplified and sequenced, and possible recombination events were investigated.

Materials and Methods
Grapevine leaves showing suspected GLRD symptoms were collected from Hatay, Gaziantep and Tekirdağ provinces of Turkey during autumn 2015. In total, 197 grapevine samples were collected from the local cultivars Antep karası, Pafu and Kalecik and from cv. Syrah, and the leaves were stored at -20 °C until use.
The collected samples were analyzed by double antibody sandwich enzyme-linked immunosorbent assay (DAS-ELISA) using a commercial kit (Bioreba, Switzerland) based on the method of Clark & Adams (1977). A BIOTEK-EL800 spectrophotometer (BioTek, USA) was used to measure absorbance values at 405 nm. Assays were done following the manufacturer's instructions.
The RNAs were isolated and purified from leaf tissues by a commercial RNA isolation kit following the manufacturer's instructions (RNeasy Plant Mini Kit, Qiagen Sci., Germany) and their yield and quality were estimated by a NanoDrop spectrophotometer (NanoDrop2000c, Thermo Sci., USA).
The cDNA was synthesized following a two-step protocol and used as a template for PCR analysis. For reverse transcription of total RNAs, random hexamer primers and a cDNA synthesis kit were used (SuperScript™ Choice System for cDNA Synthesis, Thermo Sci., USA). The PCR was conducted with 2 µL of template, 0.5 µL of 10 mM dNTPs, 1.5 µL of 25 mM MgCl2, 2.5 µL of 5x green reaction buffer, 0.5 µL of each 10 µM specific primer for the HSP70h, p24, CP and CPd2 genes of GLRaV-1 (Alabi et al 2011) and 0.2 µL of 5 units µL⁻¹ Taq DNA polymerase (GoTaq® DNA Polymerase, Promega Corp., USA). PCR amplifications were performed as follows: initial denaturation at 94 °C for 5 min; 40 cycles of denaturation at 94 °C for 30 s, annealing at 55 °C for 45 s and extension at 72 °C for 1 min; and a final extension at 72 °C for 10 min.
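The reaction recipe can be checked with a short calculation. The sketch below is illustrative only: it assumes a final reaction volume of 12.5 µL (not stated in the text, inferred from 2.5 µL of 5x buffer giving a 1x final concentration), one forward and one reverse primer per reaction, and water making up the remaining volume. All variable and function names are hypothetical.

```python
# Sketch: water volume and final concentrations for the PCR mix described in the text.
# ASSUMPTION: final volume of 12.5 µL, inferred from 2.5 µL of 5x buffer -> 1x.
FINAL_VOLUME_UL = 12.5

# component -> (volume in µL, stock concentration, unit); None means not tracked
components = {
    "cDNA template": (2.0, None, None),
    "dNTPs": (0.5, 10.0, "mM"),
    "MgCl2": (1.5, 25.0, "mM"),
    "5x green buffer": (2.5, 5.0, "x"),
    "forward primer": (0.5, 10.0, "µM"),   # ASSUMPTION: one primer pair per reaction
    "reverse primer": (0.5, 10.0, "µM"),
    "Taq polymerase": (0.2, 5.0, "U/µL"),
}


def water_volume(parts, final_volume):
    """Volume of water needed to reach the final reaction volume."""
    used = sum(volume for volume, _, _ in parts.values())
    if used > final_volume:
        raise ValueError("components exceed the final reaction volume")
    return final_volume - used


def final_concentrations(parts, final_volume):
    """Final concentration of each component (C1 * V1 / V2), in its stock unit."""
    return {
        name: stock * volume / final_volume
        for name, (volume, stock, _) in parts.items()
        if stock is not None
    }


if __name__ == "__main__":
    print(f"water: {water_volume(components, FINAL_VOLUME_UL):.1f} µL")
    for name, conc in final_concentrations(components, FINAL_VOLUME_UL).items():
        print(f"{name}: {conc:.2f} {components[name][2]}")
```

If the true final volume differs, only `FINAL_VOLUME_UL` needs to change; the listed component volumes are taken directly from the protocol.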
The PCR amplicons were sequenced in both directions (forward and reverse) on an Automated Genetic Analyzer (ABI3730, MedSanTek Company, Turkey). The nucleotide sequences obtained were analyzed with the Molecular Evolutionary Genetics Analysis software (MEGA 6.06, Tamura et al 2013). The BLASTn and BLASTx modules were used to determine the nucleotide and amino acid identities of the Turkish GLRaV-1 isolates to the reference isolate (Nucleotide ID: Acc. No. NC016509, Protein ID: ...).
Phylogenetic analysis was conducted on the 4 partial gene regions of GLRaV-1. Nucleotide sequences of different GLRaV-1 isolates from GenBank (NCBI) were included in the analysis. Multiple nucleotide and amino acid alignments were performed with CLUSTAL W (Larkin et al 2007) and the phylogenetic trees were drawn with the neighbor-joining method implemented in MEGA 6.06 (Tamura et al 2013). Bootstrap analysis was performed with 1000 replications. Little cherry virus-2 (LCV-2, Acc. No. NC005065) was used as the out-group.
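To make the tree-building step concrete, the following is a minimal sketch of the neighbor-joining algorithm applied to a pairwise distance matrix. It is not MEGA's implementation and omits bootstrapping and rooting on the out-group; the taxon labels and distances are a hypothetical toy example, not data from this study.

```python
def neighbor_joining(names, dist):
    """Build an unrooted neighbor-joining tree and return it as a Newick string.

    names: list of taxon labels.
    dist:  symmetric distance matrix (list of lists) in the same order as names.
    """
    # Distances are kept in a nested dict keyed by node label.
    d = {a: {b: dist[i][j] for j, b in enumerate(names)} for i, a in enumerate(names)}
    nodes = list(names)
    while len(nodes) > 2:
        n = len(nodes)
        # Sum of distances from each node to all other active nodes.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Pick the pair that minimizes the Q criterion.
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        # Branch lengths from the joined pair to their new parent node.
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        new = f"({a}:{len_a:.3f},{b}:{len_b:.3f})"
        # Distances from the new node to every remaining node.
        d[new] = {}
        for c in nodes:
            if c not in (a, b):
                d[new][c] = d[c][new] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c not in (a, b)] + [new]
    a, b = nodes
    return f"({a},{b}:{d[a][b]:.3f});"


# Hypothetical additive distances for four taxa (illustration only).
toy_names = ["A", "B", "C", "D"]
toy_dist = [
    [0, 3, 9, 10],
    [3, 0, 10, 11],
    [9, 10, 0, 7],
    [10, 11, 7, 0],
]

if __name__ == "__main__":
    print(neighbor_joining(toy_names, toy_dist))
```

In practice the distance matrix would come from the aligned sequences, and each bootstrap replicate would rebuild the tree from resampled alignment columns.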
The aligned nucleotide sequences of the isolates were analyzed for recombination events using seven recombination detection algorithms in the Recombination Detection Program (RDP) version 4.16 (Martin et al 2015). Before the analysis, sequences were masked to ensure optimal recombination detection. The "Sequences are linear" option and a Bonferroni-corrected P value cut-off of 0.05 were selected.
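For readers unfamiliar with the Bonferroni correction, the sketch below shows the general idea: each raw P value is multiplied by the number of tests before it is compared with the cut-off. This is a generic illustration with hypothetical P values; how RDP counts tests internally is not described here and is not reproduced.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (raw P, adjusted P, significant?) for each test.

    The adjusted P value is the raw value multiplied by the number of
    tests, capped at 1.0; a test is significant if adjusted P <= alpha.
    """
    m = len(p_values)
    results = []
    for p in p_values:
        adjusted = min(1.0, p * m)
        results.append((p, adjusted, adjusted <= alpha))
    return results


# Hypothetical raw P values from four detection signals (illustration only).
example_p = [0.0004, 0.012, 0.03, 0.2]

if __name__ == "__main__":
    for raw, adj, significant in bonferroni(example_p):
        print(f"raw={raw:.4f} adjusted={adj:.4f} significant={significant}")
```

With four tests, a raw P value of 0.03 no longer passes the 0.05 cut-off after correction, which is why weakly supported events drop out.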

Results and Discussion
According to the DAS-ELISA results, 109 of the 197 grapevine samples collected from Hatay (61 of 107 samples), Gaziantep (23 of 46 samples) and Tekirdağ (25 of 44 samples) were infected by GLRaV-1. Hatay and Gaziantep are in the southeast of the country, far from the third region, Tekirdağ. Although these regions are distant from each other, they showed approximately the same level of virus prevalence. Several studies have addressed GLRaV-1 incidence in the grapevine-growing areas of Turkey. Özaslan & Yılmaz (1985) reported GLRaV-1 as a common virus in some provinces, and Çağlayan (1997) always found GLRaV-1 in mixed infection with GVA in Hatay province. Moreover, GLRaV-1 was found to be the most common virus in Central Anatolia, with an infection rate of 8.36% by DAS-ELISA (Akbaş et al 2007). More recently, a survey in the Eastern Mediterranean Region of Turkey found GLRaV-1 to be the most common virus by RT-PCR analysis, with an infection rate of 55.56%, followed by GLRaV-4 (43.14%), GLRaV-2 (15.69%) and GLRaV-3 (12.42%) (Değer et al 2015). Önder (2016) studied the prevalence of GLRaVs (GLRaV-1, GLRaV-2, GLRaV-3, GLRaV-4, GLRaV-5, GLRaV-6, GLRaV-7, GLRaV-9, GLRaV-Pr and GLRaV-De) in Manisa, Denizli, İzmir, Aydın and Uşak provinces and reported that 133 of 424 samples were infected by at least one GLRaV. Regardless of mixed infections, GLRaV-Pr was the most widespread, with a 12% infection rate, followed by GLRaV-De (12%), GLRaV-3 (8.5%), GLRaV-2 (2.8%), GLRaV-4 (2.4%), GLRaV-9 (0.9%), GLRaV-1 (0.5%), GLRaV-5 (0.2%) and GLRaV-7 (0.2%). Our results are in accordance with previously reported RT-PCR results for grapevine plants affected by GLRD. Together, these findings indicate a high incidence of this virus and poor sanitary conditions for grapevines in Turkey.
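The per-province and overall infection rates follow directly from the counts reported above. A minimal sketch of the calculation is given below; the counts are those stated in the text, while the function and variable names are illustrative.

```python
# DAS-ELISA counts reported in the text: province -> (positive, tested)
elisa_counts = {
    "Hatay": (61, 107),
    "Gaziantep": (23, 46),
    "Tekirdağ": (25, 44),
}


def prevalence(positive, tested):
    """Infection rate as a percentage of tested samples."""
    if tested <= 0:
        raise ValueError("number of tested samples must be positive")
    return 100.0 * positive / tested


total_positive = sum(pos for pos, _ in elisa_counts.values())
total_tested = sum(tested for _, tested in elisa_counts.values())

if __name__ == "__main__":
    for province, (pos, tested) in elisa_counts.items():
        print(f"{province}: {pos}/{tested} = {prevalence(pos, tested):.1f}%")
    overall = prevalence(total_positive, total_tested)
    print(f"Overall: {total_positive}/{total_tested} = {overall:.1f}%")
```

The three provincial rates fall between 50% and about 57%, which is the basis for describing the prevalence as approximately the same across regions.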
The high incidence of this virus may be due to vector transmission, since most samples were infested by mealybugs, which are vectors of this virus (Sforza et ...).
Among the DAS-ELISA-positive samples for GLRaV-1, nine representative isolates were chosen according to their geographical region and cultivar for genetic analysis. All four genomic regions of GLRaV-1 were successfully amplified from these isolates by RT-PCR using specific primers for the CP, CPd2, p24 and HSP70h genes, with amplification product sizes of 734 bp, 398 bp, 634 bp and 540 bp, respectively. All PCR amplicons were purified and sequenced. After nucleotide assembly, the sequences of each genomic region were deposited in GenBank with accession numbers KU362237-KU362263 and KU362270-KU362278. Based on the BLAST analysis, the nucleotide and amino acid sequence identities of the nine Turkish GLRaV-1 isolates to the reference isolate ranged from 79 to 94% and from 75 to 96%, respectively. The CP sequences obtained in this study showed nucleotide identities of 85 to 87% and amino acid identities of 93 to 96%. For the CPd2 sequences, the identities ranged from 78 to 80% and from 75 to 79%, respectively. The nucleotide and amino acid sequence identities for the p24 gene ranged between 79-82% and 80-84%, respectively; only isolate 141 showed a higher identity (94% nt identity and 93% a.a. identity). For the HSP70h gene, nucleotide identities ranged from 82 to 84% and amino acid identities between 90-91% (Table 1). The overall mean distance values of the CP, CPd2, p24 and HSP70h nucleotide sequences were 0.044, 0.065, 0.110 and 0.049, respectively. Moreover, the overall mean values of nucleotide diversity for the 4 gene regions were 0.067, 0.110, 0.081 and 0.142. The most divergent region observed was CPd2, while the least divergent was HSP70h.
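As an illustration of how an overall mean distance can be obtained from an alignment, the sketch below averages the pairwise p-distance (proportion of differing sites) over all sequence pairs. This assumes the simple p-distance model and pairwise deletion of gapped sites; the substitution model actually used in MEGA is not stated in the text, and the toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites with a gap ('-') in either sequence are skipped (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def overall_mean_distance(sequences):
    """Mean p-distance over all pairs of aligned sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def percent_identity(seq_a, seq_b):
    """Pairwise identity in percent, the complement of the p-distance."""
    return 100.0 * (1.0 - p_distance(seq_a, seq_b))


# Hypothetical aligned fragments (illustration only, not study data).
toy_alignment = ["ATGCATGCAT", "ATGCATGCAA", "ATGTATGCAT"]

if __name__ == "__main__":
    print(f"overall mean distance: {overall_mean_distance(toy_alignment):.3f}")
    print(f"identity of first pair: {percent_identity(*toy_alignment[:2]):.1f}%")
```

A larger mean distance for one gene region than another, computed over the same set of isolates, is what marks that region as more divergent.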
These findings agree with two previous studies reporting that GLRaV-1 genes have high genetic variation and that CPd2 is the most variable gene (Little et al 2001; Alabi et al 2011). Phylogenetic relationships among the Turkish GLRaV-1 isolates were determined for the four gene regions and compared with isolates deposited in GenBank from different countries: the USA (California, Washington and New York), Portugal, Slovenia, Czech Republic, Iran, Hungary, China, Poland, India, Chile, Canada and Italy. There are few studies on the genetic diversity of GLRaV-1 from Europe. Kominek et al (2005) reported that GLRaV-1 can be grouped into two sequence variants based on sequences derived from the HSP70h gene of eight isolates from Slovakia and the Czech Republic. From partial nucleotide sequences of this gene, the authors also reported that GLRaV-1 isolates consisted of two variant groups, tentatively designated as groups A (North America and Australia) and E (Europe). However, more recent data from the USA supported the grouping of a wider range of GLRaV-1 isolates into three main variant groups based on the p24 and HSP70h gene sequences (Alabi et al 2011).
In this study, 54 partial CP gene sequences (15 from China, 13 from USA, 13 from Portugal, 9 from Turkey, 1 from Iran, 1 from Poland, 1 from Canada and 1 reference isolate (RefSeq)) were analyzed and segregated into two major groups. Group 1 includes most of the isolates from Portugal, China, USA, Canada, Poland and Iran in addition to the Turkish isolates. The Turkish GLRaV-1 CP isolates clustered closely in the same subgroup of Group 1. Group 2 includes only some of the isolates from China and USA (Figure 1a). Based on the dendrogram, there is no distinct separation within and between the isolates collected from the same geographical regions. It can be concluded that the CP gene of the analyzed Turkish GLRaV-1 isolates does not show a high degree of variation. In accordance with other studies (Alabi et al 2011; Esteves et al 2013), the phylogenetic analysis did not show a clear correlation between phylogeny and geographical origin. The most common GLRaV-1 variants obtained from CP gene sequences belonged to Group I (Esteves et al 2013), whereas the majority of Chinese GLRaV-1 variants belong to Group II (Fan et al 2015). Fan et al (2015) reported that natural selection rather than a random process has led to the evolution of the CP gene sequence variants in Group II.
For the CPd2 analysis, 23 isolates from USA, 9 from Turkey, 1 from Poland, 1 from Iran, 1 from Chile and 1 RefSeq were used. Two main groups were obtained and the Turkish isolates clustered into Group 2. Some of the Californian isolates (CA3, CA6, CA10, CA11, CA16, CA18, CA20) were highly similar to the Turkish isolates, with high bootstrap values. The GenBank sequences from Poland and Iran and most of the American isolates clustered into Group 1 with the reference sequence (NC016509), while none of the Turkish isolates clustered there (Figure 1b).
Phylogenetic analysis of 45 p24 sequences (USA: 32, India: 2, Iran: 1, Turkey: 9, RefSeq: 1) resolved two main groups with three subgroups. The Turkish isolates mostly clustered together in Group 1 and showed the highest similarity with three Californian isolates (CA21, CA2, CA6) and one Iranian isolate (IR-S7). The isolate 141p24, which was obtained from the local cultivar Kalecik, was highly similar to the RefSeq (NC016509) isolate. The out-group control LCV-2 was distinctly separated from all the isolates, as expected (Figure 1c).
Global HSP70h-specific sequences of GLRaV-1 (total 51; USA: 19, Slovenia: 11, Portugal: 6, Czech Republic: 2, Hungary: 1, Iran: 1, Italy: 1, Turkey: 9, reference isolate (RefSeq): 1) segregated into two major phylogroups with two subgroups. Group 1 consists of American, Portuguese, Slovenian, Czech, Iranian and Turkish isolates, while Group 2 consists of Slovenian, American and Hungarian isolates. The Turkish isolates were clearly separated and clustered together in the same group (Figure 1d). According to Kominek et al (2005), this cluster can be separated into two groups, designated as groups A and E; however, Alabi et al (2011) reported three distinct variant groups and found no evidence for precisely defined geographical structuring of GLRaV-1 isolates among the three groups. Our phylogenetic analysis of the GLRaV-1 HSP70h gene is consistent with the results of Kominek et al (2005) and Cseh et al (2013) regarding the number of phylogroups, but not with those of Alabi et al (2011). The HSP70h gene product, a cell-to-cell movement protein, has a highly conserved sequence among members of the family Closteroviridae (Dolja et al 1994). In the phylogenetic analysis of this study, the HSP70h region was also the least divergent of the analyzed GLRaV-1 gene sequences.
Phylogenetic trees derived from the CPd2 sequences of American GLRaV-1 isolates indicated possible recombination events in this gene. Therefore, the use of both the p24 and HSP70h genes has been suggested for reliable analysis of the phylogeny of GLRaV-1 variants (Alabi et al 2011).
Recombination analysis of the four gene sequences (HSP70h, CP, CPd2 and p24) was performed, and several putative recombination events with moderate support were detected in the CP, CPd2 and p24 regions (Table 2). Recombination events in RNA viruses are reported to be a powerful driver for generating new variants (Simon-Loriere & Holmes 2011). No significant recombination events in the HSP70h gene (9 sequences from this study and 22 from GenBank) were detected by any of the algorithms implemented in RDP. Recombination analysis of the p24 region (9 sequences from this study and 22 from GenBank) detected 3 putative recombination events, and events 2 and 3 involved the Turkish isolates 141 (GenBank accession no: KU362253), 102 (KU362249) and 129 (KU362251). Putative recombination event 2 involved isolate 102 (KU362249) as a potential parent, and the GenBank isolate CA3 (JF811776) was used to infer the unknown parent. This recombination event was detected in 8 isolates, including 141, NC016509, JF811768, JF811767, JF811766, JF811765 and JF811764. Trace evidence of the same recombination event was also detected in the GenBank records NC016509, JF811763 and JF811756. The recombination position of event 2 is shown in Figure 2. Notably, based on the recombinant sequence of p24, the Turkish isolate 141 was placed close to the sequences of the NYR isolates rather than grouping with the remaining Turkish isolates. Recombination event 3 was found in 8 isolates and included the Turkish isolate 129 as major parent, but this event was supported by only one program (SiScan). One putative recombination event was detected in the CP sequences, generating three putative recombinant sequences (JF811846-CA2, JF811848-CA2 and JF811860-WAC). The event involved KC567911-CdG as the minor parental sequence, and no major parent was identified among the sequences examined. In the CPd2 gene, 3 recombination events were detected and none of them involved Turkish isolates.
Recombination is an important evolutionary trait of Closteroviridae family members (Karasev 2000). RNA recombination therefore plays a significant role in the evolution and variation of this virus. Putative recombination events were also previously found in Table 2

Conclusions
In conclusion, a high frequency of GLRaV-1 in grapevines was detected. Taken together, the nucleotide comparisons and the phylogenetic and recombination analyses indicate that there is no distinct grouping according to geographic source, although samples were taken from distantly located regions. This finding suggests that the virus is disseminated through the transfer of propagation material and emphasizes the use of virus-free plant material to prevent its spread.