Mitochondrial genome sequencing and phylogenetic analysis of Cynodon dactylon × Cynodon transvaalensis

Abstract: Cynodon dactylon × Cynodon transvaalensis is one of the most important turfgrasses. Sequencing its mitochondrial genome can clarify its genomic composition and support further study of the population genetics, taxonomy, and evolutionary biology of Poaceae plants and related species. Here, the C. dactylon × C. transvaalensis mitochondrial genome was sequenced using Illumina HiSeq combined with PacBio sequencing technology, and a map of the mitochondrial genome was constructed after de novo assembly and annotation. The C. dactylon × C. transvaalensis mitochondrial genome is 366,612 bp long and contains 53 genes, including 31 protein-coding genes, 3 rRNA genes, and 16 tRNA genes. Relative synonymous codon usage (RSCU) analysis showed a preference for A or U at the third codon position. Thirty-three chloroplast DNA fragments, ranging from 72 to 3003 bp, were found in the C. dactylon × C. transvaalensis mitochondrial DNA. Phylogenetic trees built from the chloroplast genome were congruent with the plant taxonomy and the NCBI taxonomy common tree, while the phylogenetic trees built from 9 mitochondrial genes showed some differences from the common tree.

Key words: Cynodon dactylon × Cynodon transvaalensis mitochondrial genome, phylogenetics, repeat sequence, gene transfer


Introduction
Mitochondria are genetically semiautonomous organelles that encode some of the genes associated with their own functions. Plant mitochondrial genomes are characterized by polymorphism, heterogeneity, complexity, and variability. Angiosperms have the largest and most complex mitochondrial genomes. Although the size of mitochondrial genomes ranges from approximately 220 kb (Brassica napus) (Handa, 2003) to 11.3 Mb (Silene conica), the number of basic functional genes varies little, and the complexity of the genome is relatively conserved (Kubo and Newton, 2008). With the exception of the maize S-type cytoplasmic sterile line (Allen et al., 2007) and japonica rice (Notsu et al., 2002), which have linear molecular structures, the mitochondrial genomes of all other sequenced plants have circular structures.
The size differences among plant mitochondrial genomes are mainly caused by repetitive sequences. However, there is no positive correlation between the enlargement of mitochondrial genomes and increased gene number: species with larger mitochondrial genomes do not necessarily contain more genes. Plant mitochondrial genomes mainly encode genes related to respiratory metabolism and oxidative phosphorylation, such as cytochrome complex I-V subunit genes, ribosomal protein genes, cytochrome c synthesis-related genes, rRNAs, tRNAs, and a large number of unknown open reading frames (Millar et al., 2011). Plant mitochondrial genomes contain a large number of repetitive sequences of different sizes and are highly diverse in configuration. Within the same species, size differences in the mitochondrial genome are mainly caused by repeat sequences, especially in the noncoding intergenic regions.
Horizontal gene transfer (HGT) is a process by which recipient cells acquire genetic material from donors; HGT is a driving force of eukaryotic genome evolution (Bergthorsson et al., 2003; Xiong et al., 2008). In the mitochondrial genomes of higher plants, gene fragments derived from chloroplasts are ubiquitous and occupy a high proportion of the mitochondrial genome. For example, there are 17 chloroplast-derived DNA fragments in the rice mitochondrial genome, accounting for 6.3% of the mitochondrial genome (Notsu et al., 2002). However, there are no chloroplast-derived sequences in the mitochondrial genome of Marchantia polymorpha L. This suggests that the transfer of chloroplast gene sequences to mitochondria is likely to be specific to flowering plants (Oda et al., 1992).
Cynodon dactylon × Cynodon transvaalensis is a hybrid whose paternal parent is Cynodon dactylon and whose maternal parent is Cynodon transvaalensis. Because of its strong vegetative growth, its tolerance of trampling, heat, and drought stress, and its good texture and fast vegetative establishment, it is widely used in sports fields, lawns, parks, and golf courses (Huang et al., 2018). It is therefore a promising resource for mining stress-resistance genes. However, little genetic information is available for triploid bermudagrass. In this study, we de novo assembled and annotated the complete mitochondrial genome of Cynodon dactylon × Cynodon transvaalensis and examined its content, structure, and organization. We also analyzed the chloroplast-derived genes and fragments in the mitochondrial genome and explored the phylogenetic relationships among various plant mitochondrial genomes. Sequencing the mitochondrial genome of Cynodon dactylon × C. transvaalensis will allow further study of the population genetics, taxonomy, and evolutionary biology of Poaceae plants and related species.

DNA extraction, mitochondrial genome sequencing, and assembly
Mitochondrial DNA was extracted following the method of Richardson et al. (2013). The mitochondrial DNA was sequenced using Illumina HiSeq (Illumina, Inc., San Diego, CA, USA) combined with PacBio sequencing technology (Pacific Biosciences, Menlo Park, CA, USA). The Illumina sequencing data were preliminarily assembled with SOAPdenovo (v2.04) (Luo et al., 2012); the resulting scaffolds were then compared with the PacBio data to correct the single-molecule sequencing reads. Canu v1.5 (Koren et al., 2017) was used for the subsequent assembly. For the detailed methods, we referred to Koren et al. (2012).

Phylogenetic analysis
The BLAST alignment of the mitochondrial genome against the chloroplast genome was performed using Circoletto (http://tools.bat.infspire.org/circoletto/) (Nikos, 2010) with E-value thresholds ranging from 1e-10 to 1e-20. The alignment of 26 mitochondrial genomes and 24 chloroplast genomes was performed using HomBlocks (Bi et al., 2017), and the sequences were trimmed using the Gblocks method. ModelFinder (Kalyaanamoorthy et al., 2017) was used for model selection. Bayesian inference (BI) was performed in MrBayes 3.2.5 (Ronquist et al., 2012) with 2 cold chains and 2 hot chains. The Markov chain Monte Carlo (MCMC) analysis was run for 10,000,000 generations, sampling once every 1000 generations, with a relative burn-in of 25%. The convergence of independent runs was evaluated by the average standard deviation of split frequencies (<0.01). The maximum likelihood (ML) analysis was performed in IQ-TREE version 1.6.6 (Nguyen et al., 2015) under the substitution model GTR + F + R3. Node support values were estimated with 1000 replicates of the ultrafast bootstrap and the SH-aLRT test.
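As a quick check on the MCMC settings above, the following minimal Python sketch works out how many posterior samples one run would record and how many remain after the relative burn-in. The function name is hypothetical and used only for illustration, and the arithmetic ignores any sample that the software may record for the starting state at generation 0.

```python
def mcmc_sample_counts(generations, sample_every, burnin_fraction):
    """Return (total, discarded, retained) sample counts for one MCMC run.

    Assumption: one sample is recorded every `sample_every` generations and
    the first `burnin_fraction` of the samples is discarded as burn-in.
    """
    total = generations // sample_every
    discarded = int(total * burnin_fraction)
    return total, discarded, total - discarded


# Settings reported in the text: 10,000,000 generations, one sample per
# 1000 generations, relative burn-in of 25%.
total, discarded, retained = mcmc_sample_counts(10_000_000, 1_000, 0.25)
print(total, discarded, retained)  # prints: 10000 2500 7500
```

Under these assumptions, each run contributes 7500 sampled trees to the posterior summary.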
Fifty-three genes were identified in the mitochondrial genome of Cynodon dactylon × C. transvaalensis, including 31 of the 41 protein-coding genes present in the mitochondrial genome of ancestral flowering plants (Mower et al., 2012), 3 rRNA genes, 6 complete native mitochondrial tRNA genes, and 10 chloroplast-derived tRNA genes (Table 1). To our knowledge, among all sequenced angiosperm mitochondrial genomes, only Amborella trichopoda Baill. and Liriodendron tulipifera L. possess all 41 complete protein-coding genes. The loss and transfer of a large number of mitochondrial genes in other mitochondrial genomes, especially ribosomal protein-coding genes and succinate dehydrogenase (sdh) genes, has led to changes in gene content among angiosperms (Adams et al., 2002). All of the sdh genes have been lost from the mitochondrial genome of Cynodon dactylon × C. transvaalensis.
Duplicated genes are prevalent in vascular plants (Goremykin et al., 2009); e.g., Nelumbo nucifera Gaertn. and maize (CMS-C) contain 6 and 10 duplicated protein genes, respectively (Allen et al., 2007; Gui et al., 2016). In the mitochondrial genome of Cynodon dactylon × C. transvaalensis, psaB (a chloroplast-encoded gene) and 2 rRNA genes (rrn18 and rrn5) are each present in 2 copies. Large repeats are very active and recombine frequently, which may lead to recombination between genes and cause mutations in gene sequences and transcripts (Carlson and Kemble, 1985). In addition, 46 open reading frames (ORFs) of unknown function were predicted in this study, accounting for 6.65% (24,369 bp) of the total genome length (Table 1). Numerous studies have shown that such ORFs are not completely nonfunctional (Siqueira et al., 2001). Some ORFs that are conserved between species may be similar to functional genes lost from some mitochondrial genomes (such as atp4, atp8, sdh3, or sdh4) and may exert important functions during respiratory metabolism (Heazlewood et al., 2003). For example, a conserved ORF identified in the mitochondrial genomes of some angiosperms is functionally similar to the rpl10 gene (Kubo and Arimura, 2010).
Codon usage bias is a common phenomenon in nature and is mainly determined by the dynamic equilibrium between gene mutation and natural selection (Bulmer, 1991; Wong et al., 2002). Natural selection often favors the use of optimal codons, whereas mutation can maintain some nonoptimal codons. Through long-term evolution, different genes of different species, or even of the same species, have acquired different codon preferences (Murray et al., 1989). The relative synonymous codon usage (RSCU) analysis showed that, in the Cynodon dactylon × C. transvaalensis mitochondrial genome, the frequency of A or U at the third codon position is higher than that of G or C, reflecting a high A/T content at the third position of the codons (Table 4). Furthermore, the codons AGA, CUU, GAU, GCU, UAU, and UCU showed a higher usage frequency, with RSCU values greater than 1.3.
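The RSCU statistic discussed above can be illustrated with a minimal Python sketch. It uses the conventional definition, in which a codon's observed count is divided by the count expected if all synonymous codons for that amino acid were used equally, so values above 1 indicate preferred codons. The function name and the toy input sequence are hypothetical, the standard genetic code is assumed, and this is not the software used in the study.

```python
from collections import Counter
from itertools import product

# Standard genetic code (an assumption of this sketch), in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_sequences):
    """Compute RSCU for every codon from in-frame coding sequences.

    RSCU = observed codon count / (total count of its synonymous family
    / number of codons in that family). Families never observed are skipped.
    """
    counts = Counter()
    for seq in cds_sequences:
        seq = seq.upper().replace("U", "T")
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skip codons with ambiguous bases
                counts[codon] += 1

    # Group codons into synonymous families (stop codons form their own).
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for codons in families.values():
        family_total = sum(counts[c] for c in codons)
        if family_total == 0:
            continue
        expected = family_total / len(codons)
        for c in codons:
            values[c] = counts[c] / expected
    return values


# Toy example (hypothetical sequence): alanine encoded 3 times by GCT, once by GCA.
toy = rscu(["GCTGCTGCTGCA"])
print(toy["GCT"], toy["GCA"], toy["GCC"])  # prints: 3.0 1.0 0.0
```

In the toy example, GCT is used three times as often as expected under equal usage, which is the same kind of signal as the RSCU values above 1.3 reported for codons such as GCU in Table 4.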

Chloroplast DNA insertions in Cynodon dactylon × C. transvaalensis mitochondrial DNA
Plant mitochondrial genomes typically contain DNA from plastid and nuclear genomes and, in some cases, from other species including bacteria, viruses, and plants (Timmis et al., 2004; Goremykin et al., 2009; Alverson et al., 2010; Rodríguez-Moreno et al., 2011; Rice et al., 2013). Plastid-like sequences were first identified by Nakazono and Hirai (1993). The proportion of plastid-derived sequences in the mitochondrial genomes of seed plants is variable, ranging from 1% to 12% (Mower et al., 2012). In this study, 33 chloroplast DNA fragments were identified in the Cynodon dactylon × C. transvaalensis mitochondrial genome, ranging from 72 to 3003 bp. The total length of   (rrn4.5, rrn5), and numerous partial genes and intergenic spacer regions were identified. All plastid fragments are located in intergenic regions. The protein-coding genes transferred from the plastid seem to be nonfunctional. However, tRNAs of plastid origin are likely to be functional (Gui et al., 2016; Nakazono and Hirai, 1993).

Phylogenetic analyses
Because organelles are maternally inherited and have unique characteristics in evolution, they are important for reconstructing phylogenetic relationships between organisms (Jansen et al., 2005). Nine genes common to 26 mitochondrial genomes, namely atp6, atp9, ccmC, cob, cox1, cox3, nad3, nad6, and rps3, were selected for phylogenetic analysis. The trees were constructed with both maximum likelihood (ML; Figure 3A) and neighbor joining (NJ; Figure 3B) methods. The 24 complete chloroplast genome sequences were also used for constructing trees with the ML (Figure 4A) and NJ (Figure 4B) methods. In the phylogenetic  (Byng et al., 2016). The phylogenetic trees built on the chloroplast genome or on the 9 mitochondrial genes by using different methods showed the same result. The phylogenetic trees built on the chloroplast genome were congruent with the plant taxonomy and the NCBI taxonomy common tree (Liu et al., 2013). However, the phylogenetic trees built on the 9 mitochondrial genes showed some differences from the common tree. This may be due to different evolutionary rates of mitochondrial genes in different plants (Ma et al., 2012). In contrast to the chloroplast genome, the mitochondrial genome evolves at a slower rate (less than 1/3 the rate of the chloroplast genome), with less sequence variation than chloroplast and nuclear DNA, and therefore provides limited evolutionary information (Norman and Gray, 2001; Perrotta et al., 2002).

Figure 2. Schematic representation of chloroplast DNA transferred into the C. dactylon × C. transvaalensis mitochondrial genome. The connected regions represent highly similar regions on both genomes. The connecting ribbons are colored according to the length of the match, shown in red, yellow, green, and blue.

Contribution of authors
Shilian HUANG and Yancai SHI contributed equally to this work.