LSTM-GRU Based Deep Learning Model with Word2Vec for Transcription Factors in Primates

Ali Burak Öncül

doi:10.17694/bajece.1191009

Research Article

Year 2023, Volume 11, Issue 1, 42 - 49, 30.01.2023

Cited By: 3

Abstract

Understanding protein structure and the relationships among amino acids remains a challenging problem in biology. Bioinformatics studies offer partial solutions, but major problems remain, chief among them the logic underlying the order of amino acids in a sequence and the diversity of proteins. Although this variation can be characterized experimentally, the experiments are costly and time-consuming. Given the large number of sequences that are still unclassified, a faster approach is needed. We therefore propose a deep learning model to classify the transcription factor proteins of primates. The model is a hybrid that combines two Recurrent Neural Network (RNN) architectures, Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks, with a Word2Vec preprocessing step. It achieves 97.96% test accuracy, 97.55% precision, 95.26% recall, and a 96.22% F1-score. The model was also evaluated with 5-fold cross-validation, where it reached 97.42%. LSTM is used in the layers with fewer units and GRU in the layers with more units, with the aim of making the model as fast as possible to train and run. Dropout layers were added to counter overfitting.
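The reported metrics and the 5-fold protocol can be illustrated with a minimal sketch in plain Python. The abstract does not say how precision, recall, and F1 were averaged across classes; macro-averaging is assumed here purely for illustration. Under that assumption the F1-score is the mean of per-class F1 values, so it need not equal the harmonic mean of the averaged precision and recall (for the reported 97.55% and 95.26% that harmonic mean would be about 96.39%, not 96.22%). The function names `macro_metrics` and `k_fold_indices` and the class labels `A`, `B`, `C` are hypothetical and not taken from the paper.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F1.

    Assumption: macro-averaging (unweighted mean over classes); the
    abstract does not state which averaging scheme was used.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        p_den = tp[c] + fp[c]
        r_den = tp[c] + fn[c]
        p = tp[c] / p_den if p_den else 0.0
        r = tp[c] / r_den if r_den else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    No shuffling or stratification is done here; a real experiment
    would normally shuffle (and possibly stratify) first.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Toy labels for three hypothetical classes (not data from the study).
y_true = ["A", "A", "A", "B", "B", "C"]
y_pred = ["A", "A", "B", "B", "B", "C"]
metrics = macro_metrics(y_true, y_pred)
folds = list(k_fold_indices(12, k=5))
```

In the toy example the macro precision and recall are both 8/9, while the macro F1 is lower because it is averaged per class. A cross-validation score such as the reported 97.42% would be obtained by training and scoring once per fold and averaging the per-fold results.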

Keywords

Protein classification, Hybrid deep learning, Word2Vec, LSTM, GRU

There are 57 citations in total.

Details

Primary Language	English
Subjects	Artificial Intelligence
Journal Section	Research Article
Authors	Ali Burak Öncül 0000-0001-9612-1787
Publication Date	January 30, 2023
Published in Issue	Year 2023, Volume 11, Issue 1

Cite

APA	Öncül, A. B. (2023). LSTM-GRU Based Deep Learning Model with Word2Vec for Transcription Factors in Primates. Balkan Journal of Electrical and Computer Engineering, 11(1), 42-49. https://doi.org/10.17694/bajece.1191009

Cited By

An Efficient Deep Learning Approach for DNA-Binding Proteins Classification from Primary Sequences

International Journal of Computational Intelligence Systems

https://doi.org/10.1007/s44196-024-00462-3

GMean—a semi-supervised GRU and K-mean model for predicting the TF binding site

Scientific Reports

https://doi.org/10.1038/s41598-024-52933-4

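Müşteri Duyarlılığını Keşfetmek İçin Yapay Zeka Destekli Analiz ile Çevrimiçi Ürün İncelemelerinden Anlamlı Bilgiler Elde Etme [Obtaining Meaningful Information from Online Product Reviews with AI-Assisted Analysis to Explore Customer Sentiment]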

Fırat Üniversitesi Mühendislik Bilimleri Dergisi

https://doi.org/10.35234/fumbd.1305932

All articles published by BAJECE are licensed under the Creative Commons Attribution 4.0 International License. This permits anyone to copy, redistribute, remix, transmit and adapt the work, provided the original work and source are appropriately cited.