Research Article

Clustering of Mitochondrial D-loop Sequences Using Similarity Matrix, PCA and K-means Algorithm

Volume: 4 Number: Special Issue-1 December 26, 2016

Abstract

In this study, mitochondrial displacement-loop (D-loop) sequences isolated from different hominid species are clustered using a similarity matrix, Principal Component Analysis (PCA) and the K-means algorithm. First, the mitochondrial D-loop sequence data are retrieved from the GenBank database and imported into MATLAB. Pairwise distances are computed using the p-distance and Jukes-Cantor methods. A phylogenetic tree is constructed, and a similarity matrix is generated from the pairwise distances. The sequences are then clustered in two ways: first using the K-means algorithm alone, and then using PCA together with K-means.
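The distance and clustering steps named in the abstract can be illustrated with a minimal Python sketch. It is not the author's MATLAB code. It computes the p-distance and the Jukes-Cantor distance d = -(3/4) ln(1 - (4/3) p) for pre-aligned sequences, converts the distances to similarities, and runs a plain K-means on the rows of the similarity matrix. The toy sequences, the conversion s = 1 - d / max(d), the farthest-point seeding and all function names are assumptions made for illustration; the PCA step is omitted so that the example needs only the standard library.

```python
import math

def p_distance(a, b):
    """Proportion of aligned sites at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)

def jukes_cantor(a, b):
    """Jukes-Cantor distance: d = -(3/4) * ln(1 - (4/3) * p)."""
    p = p_distance(a, b)
    if p >= 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def distance_matrix(seqs, dist):
    """Symmetric matrix of pairwise distances under the given distance function."""
    return [[dist(a, b) for b in seqs] for a in seqs]

def similarity_matrix(dmat):
    """Assumed conversion (not taken from the paper): s = 1 - d / max(d)."""
    dmax = max(max(row) for row in dmat)
    if dmax == 0:
        return [[1.0] * len(dmat) for _ in dmat]
    return [[1.0 - d / dmax for d in row] for row in dmat]

def sqdist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))

def kmeans(points, k, iters=100):
    """Lloyd's algorithm with deterministic farthest-point seeding (an assumption)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        far = max(points, key=lambda p: min(sqdist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sqdist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy, pre-aligned sequences (hypothetical, not real D-loop data).
seqs = ["ACGTACGTACGT", "ACGTACGTACGA", "TTGTACCTAGGT", "TTGTACCTAGGA"]
dmat = distance_matrix(seqs, jukes_cantor)
smat = similarity_matrix(dmat)
labels = kmeans(smat, k=2)
print(labels)
```

Each sequence is represented here by its row of the similarity matrix, so sequences that are similar to the same set of neighbours end up close together in that feature space. In a fuller pipeline, PCA would be applied to these rows before K-means.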

Details

Primary Language

English

Subjects

Engineering

Journal Section

Research Article

Authors

Can Eyüpoğlu
İSTANBUL TİCARET ÜNİVERSİTESİ
Türkiye

Publication Date

December 26, 2016

Submission Date

December 27, 2016

Acceptance Date

December 1, 2016

Published in Issue

Year 2016 Volume: 4 Number: Special Issue-1

APA
Eyüpoğlu, C. (2016). Clustering of Mitochondrial D-loop Sequences Using Similarity Matrix, PCA and K-means Algorithm. International Journal of Intelligent Systems and Applications in Engineering, 4(Special Issue-1), 244-248. https://doi.org/10.18201/ijisae.281814
AMA
1. Eyüpoğlu C. Clustering of Mitochondrial D-loop Sequences Using Similarity Matrix, PCA and K-means Algorithm. International Journal of Intelligent Systems and Applications in Engineering. 2016;4(Special Issue-1):244-248. doi:10.18201/ijisae.281814
Chicago
Eyüpoğlu, Can. 2016. “Clustering of Mitochondrial D-Loop Sequences Using Similarity Matrix, PCA and K-Means Algorithm”. International Journal of Intelligent Systems and Applications in Engineering 4 (Special Issue-1): 244-48. https://doi.org/10.18201/ijisae.281814.
EndNote
Eyüpoğlu C (December 1, 2016) Clustering of Mitochondrial D-loop Sequences Using Similarity Matrix, PCA and K-means Algorithm. International Journal of Intelligent Systems and Applications in Engineering 4 Special Issue-1 244–248.
IEEE
[1] C. Eyüpoğlu, “Clustering of Mitochondrial D-loop Sequences Using Similarity Matrix, PCA and K-means Algorithm”, International Journal of Intelligent Systems and Applications in Engineering, vol. 4, no. Special Issue-1, pp. 244–248, Dec. 2016, doi: 10.18201/ijisae.281814.
ISNAD
Eyüpoğlu, Can. “Clustering of Mitochondrial D-Loop Sequences Using Similarity Matrix, PCA and K-Means Algorithm”. International Journal of Intelligent Systems and Applications in Engineering 4/Special Issue-1 (December 1, 2016): 244-248. https://doi.org/10.18201/ijisae.281814.
JAMA
1. Eyüpoğlu C. Clustering of Mitochondrial D-loop Sequences Using Similarity Matrix, PCA and K-means Algorithm. International Journal of Intelligent Systems and Applications in Engineering. 2016;4:244–248.
MLA
Eyüpoğlu, Can. “Clustering of Mitochondrial D-Loop Sequences Using Similarity Matrix, PCA and K-Means Algorithm”. International Journal of Intelligent Systems and Applications in Engineering, vol. 4, no. Special Issue-1, Dec. 2016, pp. 244-8, doi:10.18201/ijisae.281814.
Vancouver
1. Can Eyüpoğlu. Clustering of Mitochondrial D-loop Sequences Using Similarity Matrix, PCA and K-means Algorithm. International Journal of Intelligent Systems and Applications in Engineering. 2016 Dec. 1;4(Special Issue-1):244-8. doi:10.18201/ijisae.281814