A Framework for Query Optimization Algorithms for Biological Data

Khalid Mohammad Jaber; Nesreen A. Hamad; Fatima M. Quıam

doi:10.22399/ijcesen.508889

Research Article

Year 2019, Volume: 5 Issue: 2, 76 - 79, 31.07.2019

https://izlik.org/JA44SL23TN

Abstract

The size of biological databases has grown significantly in recent
years, along with the number of users and the rate of queries, and some
databases have reached terabyte scale. At the same time, the need to
access these databases as quickly as possible is increasing. Computer
scientists can help by organizing the data and the queries in a way
that lets biologists search existing information quickly. This paper
proposes a query model for DNA and protein sequence datasets that can
effectively and rapidly retrieve all similar protein or DNA sequences
from a large database. A theoretical and conceptual framework is
derived using query techniques from different applications. The
results show that the query optimization algorithms reduce query
processing time compared with the normal query searching algorithm.
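
The contrast the abstract draws between an optimized query and a normal search can be illustrated with a small sketch. This is not the authors' framework, which the abstract does not specify. It assumes a toy in-memory database, exact substring matching as a stand-in for similarity search, and a k-mer inverted index as the optimization. All names (`build_index`, `indexed_query`, `scan_query`) and the sample sequences are hypothetical.

```python
from collections import defaultdict

def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def build_index(db, k):
    """Map each k-mer to the set of record ids whose sequence contains it."""
    index = defaultdict(set)
    for rec_id, seq in db.items():
        for km in kmers(seq, k):
            index[km].add(rec_id)
    return index

def scan_query(db, query):
    """Baseline search: test every record for the query substring."""
    return sorted(rid for rid, seq in db.items() if query in seq)

def indexed_query(db, index, query, k):
    """Shortlist candidates through the index, then verify each one."""
    qk = kmers(query, k)
    if not qk:  # query shorter than k: fall back to the full scan
        return scan_query(db, query)
    # A record can only match if it contains every k-mer of the query.
    candidates = set.intersection(*(index.get(km, set()) for km in qk))
    return sorted(rid for rid in candidates if query in db[rid])

# Hypothetical toy data, invented for illustration only.
db = {"s1": "ACGTACGTGA", "s2": "TTGACCGTAA", "s3": "GGACGTGACC"}
index = build_index(db, k=3)
hits = indexed_query(db, index, "CGTGA", k=3)
print(hits)
```

Both functions return the same records. The indexed version only verifies the shortlisted candidates instead of scanning every sequence, which is where a saving in query processing time would come from on a large database.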

Keywords

Query optimization, searching algorithms, bioinformatics, parallel computing

Details

Primary Language	English
Subjects	Engineering
Section	Research Article
Authors	Khalid Mohammad Jaber (ORCID 0000-0002-8458-401X), Nesreen A. Hamad, Fatima M. Quıam
Submission Date	6 January 2019
Acceptance Date	10 June 2019
Publication Date	31 July 2019
DOI	https://doi.org/10.22399/ijcesen.508889
IZ	https://izlik.org/JA44SL23TN
Published in Issue	Year 2019 Volume: 5 Issue: 2

Cite

APA	Jaber, K. M., Hamad, N. A., & Quıam, F. M. (2019). A Framework for Query Optimization Algorithms for Biological Data. International Journal of Computational and Experimental Science and Engineering, 5(2), 76-79. https://doi.org/10.22399/ijcesen.508889