Research Article

A Framework for Query Optimization Algorithms for Biological Data

Year 2019, 76 - 79, 31.07.2019
https://doi.org/10.22399/ijcesen.508889

Abstract

Recently, the size of biological databases has increased
significantly, along with the number of users and the rate of
queries. As a result, some databases have reached terabyte size. At
the same time, the need to access these databases as quickly as
possible is growing. Computer scientists can help by organizing the
data and queries in a way that allows biologists to search existing
information quickly. In this paper, a query model for DNA and
protein sequence datasets is proposed. This method of handling
queries can effectively and rapidly retrieve all similar
proteins/DNA from a large database. A theoretical and conceptual
framework is derived using query techniques from different
applications. The results show that the query optimization algorithms
reduce query processing time in comparison with the normal query
searching algorithm.

References

  • [1] D. A. Benson, I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, and E. W. Sayers (2010). Genbank, Nucleic acids research. 38(1), D46-D51, DOI: 10.1093/nar/gkx1094.
  • [2] P. Rice, I. Longden, and A. Bleasby (2000). Emboss: the european molecular biology open software suite. Trends in genetics, 16 (6), 276-277. DOI: 10.1016/S0168-9525(00)02024-2
  • [3] A. Bairoch and R. Apweiler (2000). The swiss-prot protein sequence database and its supplement trembl in 2000. Nucleic acids research, 28 (1), 45-48. DOI:10.1093/nar/28.1.45
  • [4] K. D. Pruitt, T. Tatusova, and D. R. Maglott (2007). NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic acids research, 35(1), D61-D65. DOI: 10.1093/nar/gkl842
  • [5] P. Librado and J. Rozas (2009). DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics, 25(11), 1451-1452. DOI: 10.1093/bioinformatics/btp187
  • [6] C. Plot (2000). The sequence manipulation suite: Javascript programs for analyzing and formatting protein and dna sequences, Biotechniques. 28(6), 1102-1104. DOI:10.2144/00286ir01
  • [7] Jaber, K. M., Abdullah, R., and Rashid, N. A. (2014). Fast decision tree-based method to index large DNA-protein sequence databases using hybrid distributed-shared memory programming model. International Journal of Bioinformatics Research and Applications, 10(3), 321-340. DOI: 10.1504/IJBRA.2014.060765
  • [8] R. J. Block, D. Bolling et al. (1945). The amino acid composition of proteins and foods. Analytical methods and results. 17(4).
  • [9] R. Leinonen, R. Akhtar, E. Birney, L. Bower, A. Cerdeno-Tarraga, Y. Cheng, I. Cleland, N. Faruque, N. Goodgame, R. Gibson et al. (2011). The european nucleotide archive, Nucleic acids research. 39, D28-D31.
  • [10] Ian Korf, M.Y., Joseph Bedell (2003). BLAST.
  • [11] Rieffel, M. A., Gill, T. G. and White, W. R. (2004). Bioinformatics clusters in action. Cluster World.
  • [12] Prasan Roy (2000). Rule-Based Query Optimization using the Volcano Framework. PhD thesis, IIT Bombay.
  • [13] NCBI Website, URL: http://blast.ncbi.nlm.nih.gov, 2018.
  • [14] Whitford, D. (2005). Proteins: Structure and Function. 1st Edition, Wiley.
  • [15] DDBJ Database Available at: http://www.ddbj.nig.ac.jp/breakdown_stats/dbgrowth-old-e.html. [Accessed 12 April 2017].
There are 15 references in total.

Details

Primary Language: English
Subjects: Engineering
Section: Research Articles
Authors

Khalid Mohammad Jaber 0000-0002-8458-401X

Nesreen A. Hamad

Fatima M. Quıam

Publication Date: July 31, 2019
Submission Date: January 6, 2019
Acceptance Date: June 10, 2019
Published in Issue: Year 2019

Cite

APA Jaber, K. M., Hamad, N. A., & Quıam, F. M. (2019). A Framework for Query Optimization Algorithms for Biological Data. International Journal of Computational and Experimental Science and Engineering, 5(2), 76-79. https://doi.org/10.22399/ijcesen.508889