Research Article

Derin Öğrenme Teknikleri Kullanarak İkili ve Çok Etiketli Sınıflandırma İle Enzimatik Fonksiyon Tahmini

Year 2021, Issue: 32, 262 - 267, 31.12.2021
https://doi.org/10.31590/ejosat.1041643

Abstract

Enzymes, which act as biological catalysts, are classified according to the type and mechanism of the reaction they catalyze, and subclasses are defined within each class according to substrate selectivity. Structural, chemical, and connectivity features are also important in enzyme classification. Predicting enzyme function is important for helping to design new enzymes and for diagnosing enzyme-related diseases. While the large majority of enzymes carry out specific reactions, a limited number can carry out different reactions and can therefore be directly associated with more than one enzymatic function.
This study aimed to predict enzymatic function using binary and multi-label classification. More successful results were obtained when chemical features were used to classify enzymes; however, classification performance increased further when all features were used. When the models used for enzymatic function prediction were compared, the deep learning models achieved higher performance in both binary and multi-label classification. In conclusion, the proposed models were shown to be an important tool for classifying enzymatic functions.

References

  • Amidi, S., Amidi, A., Vlachakis, D., Paragios, N., & Zacharaki, E. I. (2017). Automatic single-and multi-label enzymatic function prediction by machine learning. PeerJ, 5, e3095.
  • Angelova, A., Krizhevsky, A., & Vanhoucke, V. (2015, May). Pedestrian detection with a large-field-of-view deep network. In 2015 IEEE international conference on robotics and automation (ICRA) (pp. 704-711). IEEE.
  • Baran, M., Öztürk, M., & Latifoğlu, F. (2021). Gaita mikrobiyotasının hastalıklarla ilişkisinde öğrenme modellerinin karşılaştırılması. MAS International European Conference on Mathematics-Engineering-Natural & Medical Sciences-XV. September 2021, Adana, 7-8.
  • Breiman, L. (2001). Random forests. Machine learning, 45(1), 5-32.
  • Che, Y., Ju, Y., Xuan, P., Long, R., & Xing, F. (2016). Identification of multi-functional enzyme with multi-label classifier. PloS one, 11(4), e0153503.
  • Chen, T., & Guestrin, C. (2016, August). Xgboost: A scalable tree boosting system. In Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining (pp. 785-794).
  • Dalkiran, A., Rifaioglu, A. S., Martin, M. J., Cetin-Atalay, R., Atalay, V., & Doğan, T. (2018). ECPred: a tool for the prediction of the enzymatic functions of protein sequences based on the EC nomenclature. BMC bioinformatics, 19(1), 1-13.
  • De Ferrari, L., Aitken, S., van Hemert, J., & Goryanin, I. (2012). EnzML: multi-label prediction of enzyme classes using InterPro signatures. BMC bioinformatics, 13(1), 1-12.
  • Feltcher, M. E., & Braunstein, M. (2012). Emerging themes in SecA2-mediated protein export. Nature Reviews Microbiology, 10(11), 779-789.
  • Freund, Y., & Schapire, R. E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of computer and system sciences, 55(1), 119-139.
  • Garcia-Viloca, M., Gao, J., Karplus, M., & Truhlar, D. G. (2004). How enzymes work: analysis by modern rate theory and computer simulations. Science, 303(5655), 186-195.
  • Harris, C. R., Millman, K. J., van der Walt, S. J., Gommers, R., Virtanen, P., Cournapeau, D., ... & Oliphant, T. E. (2020). Array programming with NumPy. Nature, 585(7825), 357-362.
  • Hunter, J. D. (2007). Matplotlib: A 2D graphics environment. Computing in science & engineering, 9(03), 90-95.
  • Jiang, Y., & Zhou, Z. H. (2004, August). Editing training data for kNN classifiers with neural network ensemble. In International symposium on neural networks (pp. 356-361). Springer, Berlin, Heidelberg.
  • Li, Y., Wang, S., Umarov, R., Xie, B., Fan, M., Li, L., & Gao, X. (2018). DEEPre: sequence-based enzyme EC number prediction by deep learning. Bioinformatics, 34(5), 760-769.
  • Lu, L., Qian, Z., Cai, Y. D., & Li, Y. (2007). ECS: an automatic enzyme classifier based on functional domain composition. Computational biology and chemistry, 31(3), 226-232.
  • Matthews, B. W. (1975). Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochimica et Biophysica Acta (BBA)-Protein Structure, 405(2), 442-451.
  • McKinney, W. (2010, June). Data structures for statistical computing in python. In Proceedings of the 9th Python in Science Conference, 445, pp. 51-56.
  • Quester, S., & Schomburg, D. (2011). EnzymeDetector: an integrated enzyme function prediction tool and database. BMC bioinformatics, 12(1), 1-13.
  • Roy, A., Yang, J., & Zhang, Y. (2012). COFACTOR: an accurate comparative algorithm for structure-based protein function annotation. Nucleic acids research, 40(W1), W471-W477.
  • Shen, H. B., & Chou, K. C. (2007). EzyPred: a top–down approach for predicting enzyme functional classes and subclasses. Biochemical and biophysical research communications, 364(1), 53-59.
  • Van Rossum, G., & Drake Jr, F. L. (1995). Python reference manual. Centrum voor Wiskunde en Informatica Amsterdam.
  • Zhou, X. X., Fan, L. Z., Li, P., Shen, K., & Lin, M. Z. (2017). Optical control of cell signaling by single-chain photoswitchable kinases. Science, 355(6327), 836-842.
  • Zou, Q., Chen, W., Huang, Y., Liu, X., & Jiang, Y. (2013). Identifying multi-functional enzyme by hierarchical multi-label classifier. Journal of Computational and Theoretical Nanoscience, 10(4), 1038-1043.
  • Zou, H. L., & Xiao, X. (2016). Classifying multifunctional enzymes by incorporating three different models into Chou’s general pseudo amino acid composition. The Journal of membrane biology, 249(4), 551-557.
  • Zou, Z., Tian, S., Gao, X., & Li, Y. (2019). mlDEEPre: Multi-functional enzyme function prediction with hierarchical multi-label deep learning. Frontiers in genetics, 9, 714.

Enzymatic Function Estimation with Binary and Multilabel Classification Using Deep Learning Techniques

Abstract

Enzymes, which act as biological catalysts, are classified according to the reaction type and mechanism they catalyze, and subclasses are formed under each class according to substrate selectivity. Structural, chemical, and connectivity features are also important in the classification of enzymes. Predicting enzyme function is important for designing new enzymes and for diagnosing enzyme-related diseases. While the large majority of enzymes carry out specific reactions, a limited number of enzymes can perform different reactions and can therefore be directly associated with more than one enzymatic function. This study aimed to predict enzymatic function by binary and multi-label classification. More successful results were observed when chemical features were used in the classification of enzymes; however, classification performance increased further when all features were used. When the models used for enzymatic function prediction were examined, the deep learning models showed higher performance in both binary and multi-label classification. As a result, the proposed models were shown to be an important tool in the classification of enzymatic functions.
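
The difference between the binary and multi-label settings can be illustrated with a minimal Python sketch. It is not the authors' implementation: the use of the six classical top-level EC classes as labels, the toy annotations, and the function names are assumptions made for illustration. The Matthews correlation coefficient is included only because Matthews (1975) appears in the reference list; the abstract does not state which metrics were reported.

```python
from math import sqrt

# Assumption: the six classical top-level EC classes serve as labels.
EC_CLASSES = ["oxidoreductase", "transferase", "hydrolase",
              "lyase", "isomerase", "ligase"]


def multi_hot(ec_numbers):
    """Multi-label target: 0/1 vector over top-level EC numbers 1-6."""
    return [1 if i + 1 in ec_numbers else 0 for i in range(len(EC_CLASSES))]


def binary_target(ec_numbers, ec_class):
    """Binary (one-vs-rest) target: does the enzyme belong to ec_class?"""
    return 1 if ec_class in ec_numbers else 0


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for two 0/1 sequences."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy annotations; the second enzyme is multi-functional.
annotations = [{3}, {2, 3}, {1}, {4}]
Y_multi = [multi_hot(a) for a in annotations]
y_hydrolase = [binary_target(a, 3) for a in annotations]
```

In the multi-label setting each enzyme gets one row of `Y_multi` with possibly several ones, whereas the binary setting reduces the task to a single 0/1 column such as `y_hydrolase`; a per-label score like `mcc` can then be computed column by column.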

There are 26 citations in total.

Details

Primary Language: Turkish
Subjects: Engineering
Journal Section: Articles
Authors:

Münevver Baran (ORCID: 0000-0003-0369-1022)

Mustafa Öztürk (ORCID: 0000-0001-9911-2499)

Fatma Latifoğlu (ORCID: 0000-0003-2018-9616)

Publication Date: December 31, 2021
Published in Issue: Year 2021, Issue: 32

Cite

APA: Baran, M., Öztürk, M., & Latifoğlu, F. (2021). Derin Öğrenme Teknikleri Kullanarak İkili ve Çok Etiketli Sınıflandırma İle Enzimatik Fonksiyon Tahmini. Avrupa Bilim ve Teknoloji Dergisi, (32), 262-267. https://doi.org/10.31590/ejosat.1041643