An alignment-free method for bulk comparison of protein sequences from different species
Abstract
The number of available protein sequences has increased rapidly with the
development of new sequencing techniques. This in turn has created an urgent
need for new computational methods that use these data to solve different
biological problems. One of these problems is the comparison of protein
sequences from different species to reveal their evolutionary relationships.
Recently, several alignment-free methods have been proposed for this purpose.
In this study, we also propose an alignment-free method for the same purpose.
Unlike the existing methods, the proposed method allows not only a pairwise
comparison of two protein sequences but also a bulk comparison of multiple
protein sequences simultaneously. Computational experiments on gold-standard
datasets showed that the bulk comparison of multiple sequences is much faster
than its pairwise counterpart, and that the proposed method achieves a
performance quite competitive with the state-of-the-art alignment-based
method, ClustalW.
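The abstract does not say how sequences are represented, so the following is only a minimal sketch of the pairwise-versus-bulk distinction, not the proposed method. It assumes a common alignment-free representation (normalized k-mer frequency vectors with Euclidean distance); the function names `kmer_profile`, `pairwise_distance`, and `bulk_distances` are hypothetical.

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def kmer_profile(seq, k=2):
    """Normalized k-mer frequency vector (assumed representation)."""
    index = {"".join(p): i
             for i, p in enumerate(product(AMINO_ACIDS, repeat=k))}
    counts = [0.0] * len(index)
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip k-mers containing non-standard residues
            counts[j] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def euclidean(p, q):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def pairwise_distance(seq_a, seq_b, k=2):
    """Compare two sequences; both profiles are rebuilt on every call."""
    return euclidean(kmer_profile(seq_a, k), kmer_profile(seq_b, k))


def bulk_distances(seqs, k=2):
    """Compare all sequences at once; each profile is built only once."""
    profiles = [kmer_profile(s, k) for s in seqs]
    n = len(profiles)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = euclidean(profiles[i], profiles[j])
    return dist
```

Under these assumptions, `bulk_distances(["MKVLAAGIVG", "MKVLAAGIVA", "GGSWPTTRRQ"])` returns a symmetric 3x3 matrix whose entries match what `pairwise_distance` gives for each pair. The saving in the bulk version comes from building each profile once instead of once per pair.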
Keywords
Details
Primary Language
English
Subjects
Electrical Engineering
Section
Research Article
Authors
Berat Dogan
*
0000-0003-4810-1970
Türkiye
Publication Date
October 30, 2019
Submission Date
March 16, 2019
Acceptance Date
September 23, 2019
Published in Issue
Year 2019 Volume: 7 Issue: 4