Research Article

LSTM-GRU Based Deep Learning Model with Word2Vec for Transcription Factors in Primates

Volume: 11 Number: 1 January 30, 2023

Abstract

The study of protein structures and of the relationships among amino acids remains a challenging problem in biology. Although some bioinformatics-based studies provide partial solutions, major problems remain. Chief among them are the logic underlying amino acid sequences and the diversity of proteins. Although these variations can be detected biologically, the experiments involved are costly and time-consuming. Given the large number of sequences that remain unclassified, a faster solution is needed. For this reason, we propose a deep learning model to classify the transcription factor proteins of primates. The model has a hybrid structure that combines two Recurrent Neural Network (RNN) variants, Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks, with a Word2Vec preprocessing step. It achieves 97.96% test accuracy, 97.55% precision, 95.26% recall, and a 96.22% F1-score. The model was also evaluated with 5-fold cross-validation, where it reached 97.42%. In the model, LSTM is used in the layers with fewer units and GRU in the layers with more units, with the aim of making the model as fast as possible to train and run. Dropout layers were added to prevent overfitting.
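The evaluation figures quoted in the abstract can be illustrated with a minimal sketch in plain Python. It shows how accuracy, precision, recall, and F1 are computed for a multi-class classifier, and how a dataset is partitioned for 5-fold cross-validation. The abstract does not state which averaging scheme was used, so macro averaging (the unweighted mean of per-class scores) is an assumption here. It would be consistent with the reported F1 of 96.22% differing from the harmonic mean of the reported precision and recall (about 96.39%), but that is only an inference. The function names `macro_scores` and `k_fold_indices` and the toy labels are hypothetical and do not come from the article.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging is an assumption; the article does not name its scheme.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    Assumes the samples were shuffled beforehand.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Toy labels standing in for transcription factor classes (hypothetical).
y_true = ["A", "A", "B", "B", "B", "C"]
y_pred = ["A", "B", "B", "B", "C", "C"]
scores = macro_scores(y_true, y_pred)
folds = list(k_fold_indices(12, k=5))
```

On the toy labels, the macro F1 is not the harmonic mean of the macro precision and macro recall, because each class's F1 is computed before averaging. In cross-validation, the model would be retrained on each `train_idx` and scored on the matching `test_idx`, with the fold scores then averaged.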

Details

Primary Language

English

Subjects

Artificial Intelligence

Journal Section

Research Article

Publication Date

January 30, 2023

Submission Date

October 18, 2022

Acceptance Date

November 13, 2022

Published in Issue

Year 2023 Volume: 11 Number: 1

APA
Öncül, A. B. (2023). LSTM-GRU Based Deep Learning Model with Word2Vec for Transcription Factors in Primates. Balkan Journal of Electrical and Computer Engineering, 11(1), 42-49. https://doi.org/10.17694/bajece.1191009

All articles published by BAJECE are licensed under the Creative Commons Attribution 4.0 International License. This permits anyone to copy, redistribute, remix, transmit, and adapt the work, provided the original work and source are appropriately cited.