Research Article

Comparison of Deep Learning and Other Classification Algorithms in Small Dataset Studies: Example of Agonist and Antagonist Ligand

Year 2022, Volume 10, Issue 1, 356 - 371, 10.03.2022
https://doi.org/10.33715/inonusaglik.1022065

Abstract

Machine learning algorithms are now used in almost all branches of science, and classification algorithms are especially popular in the natural and health sciences. Deep learning is one such machine learning technique. It has regained popularity as processor speeds have increased, driven in particular by graphics processor (GPU)-based computation. The aim of this study is to use machine learning algorithms to classify agonist and antagonist molecules that bind to dopamine receptors, which are well known in the literature, using data obtained from chemistry databases. A further aim is to recommend a deep learning algorithm for accurate classification when the number of data points is small. The Python libraries Scikit-learn and TensorFlow-Keras were used to train the algorithms. The classification results were compared with those of popular machine learning algorithms and are presented in a table.
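
The kind of comparison described above can be sketched as follows. This is a minimal illustration, not the study's code: the data are synthetic (generated with `make_classification` as a hypothetical stand-in for molecular descriptors of agonist and antagonist ligands), the set of classifiers and all hyperparameters are assumptions, and scikit-learn's `MLPClassifier` is swapped in for a TensorFlow-Keras network so that the example needs only one library. The function name `compare_classifiers` is likewise hypothetical.

```python
# Illustrative sketch only: synthetic data, assumed models and settings.
# It does not reproduce the dataset, network, or results of the study.
import warnings

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(n_samples=80, n_features=20, seed=0):
    """Return the mean 5-fold cross-validated accuracy of each classifier."""
    # Small two-class dataset standing in for agonist/antagonist descriptors.
    X, y = make_classification(
        n_samples=n_samples,
        n_features=n_features,
        n_informative=8,
        n_redundant=2,
        random_state=seed,
    )

    models = {
        "k-nearest neighbors": KNeighborsClassifier(n_neighbors=5),
        "support vector machine": SVC(kernel="rbf"),
        "decision tree": DecisionTreeClassifier(random_state=seed),
        "random forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "naive Bayes": GaussianNB(),
        # Stand-in for a Keras network: a small multilayer perceptron.
        "neural network (MLP)": MLPClassifier(
            hidden_layer_sizes=(32, 16), max_iter=500, random_state=seed
        ),
    }

    # Stratified folds keep the class ratio stable, which matters when
    # each fold holds only a handful of samples.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)

    scores = {}
    with warnings.catch_warnings():
        # The MLP may stop before full convergence on such a small dataset.
        warnings.simplefilter("ignore", category=ConvergenceWarning)
        for name, model in models.items():
            # Scaling is fitted inside each fold to avoid leaking test data.
            pipeline = make_pipeline(StandardScaler(), model)
            fold_scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
            scores[name] = float(fold_scores.mean())
    return scores


if __name__ == "__main__":
    results = compare_classifiers()
    for name, accuracy in sorted(results.items(), key=lambda kv: -kv[1]):
        print(f"{name:<24s} {accuracy:.3f}")
```

With a dataset this small, the ranking can change with the random seed and the fold assignment, so repeating the cross-validation over several seeds gives a steadier picture than a single run.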

References

  • Aguiar, J. A., Gong, M. L., Tasdizen, T. (2020). Crystallographic prediction from diffraction and chemistry data for higher throughput classification using machine learning. Computational Materials Science, 173, 109409.
  • Altman, N. S. (1992). An Introduction to Kernel and Nearest-Neighbor Nonparametric Regression. The American Statistician, 46(3), 175–185.
  • Cortes, C., Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273–297.
  • De Vito, S., Esposito, E., Salvato, M., Popoola, O., Formisano, F., Jones, R., Di Francia, G. (2018). Calibrating chemical multisensory devices for real world applications: An in-depth comparison of quantitative machine learning approaches. Sensors and Actuators B: Chemical, 255, 1191–1210.
  • Deng, L., Yu, D. (2014). Deep Learning: Methods and Applications. Foundations and Trends in Signal Processing, 7(3–4), 197–387.
  • Ding, W., Tong, Y., Zhang, Q., Yang, D. (2008). Image and video quality assessment using neural network and SVM. Tsinghua Science and Technology, 13(1), 112–116.
  • Drouhard, J.-P., Sabourin, R., Godbout, M. (1996). A neural network approach to off-line signature verification using directional PDF. Pattern Recognition, 29(3), 415–424.
  • Friedl, M. A., Brodley, C. E. (1997). Decision tree classification of land cover from remotely sensed data. Remote Sensing of Environment, 61(3), 399–409.
  • Furey, T. S., Cristianini, N., Duffy, N., Bednarski, D. W., Schummer, M., Haussler, D. (2000). Support vector machine classification and validation of cancer tissue samples using microarray expression data. Bioinformatics, 16(10), 906–914.
  • Goh, G. B., Hodas, N. O., Vishnu, A. (2017). Deep learning for computational chemistry. Journal of Computational Chemistry, 38(16), 1291–1307.
  • Grömping, U. (2009). Variable Importance Assessment in Regression: Linear Regression versus Random Forest. The American Statistician, 63(4), 308–319.
  • Gumus, O., Yasar, E., Gumus, Z. P., Ertas, H. (2020). Comparison of different classification algorithms to identify geographic origins of olive oils. Journal of Food Science and Technology, 57(4), 1535–1543.
  • Judson, R., Elloumi, F., Setzer, R. W., Li, Z., Shah, I. (2008). A comparison of machine learning algorithms for chemical toxicity classification using a simulated multi-scale data model. BMC Bioinformatics, 9(1), 241.
  • Karakaplan, M., Avcu, F. M. (2013). A parallel and non-parallel genetic algorithm for deconvolution of NMR spectra peaks. Chemometrics and Intelligent Laboratory Systems, 125, 147–152.
  • Karakaplan, M., Avcu, F. M. (2021). Classification of some chemical drugs by genetic algorithm and deep neural network hybrid method. Concurrency and Computation: Practice and Experience, 33(13), e6242.
  • Kumar, J., Singh, A. K. (2018). Workload prediction in cloud using artificial neural network and adaptive differential evolution. Future Generation Computer Systems, 81, 41–52.
  • Leen, T. K., Dietterich, T. G., Tresp, V. (2001). Advances in Neural Information Processing Systems 13: Proceedings of the 2000 Conference. MIT Press.
  • Maron, M. E. (1961). Automatic Indexing: An Experimental Inquiry. Journal of the ACM, 8(3), 404–417.
  • Mayr, A., Klambauer, G., Unterthiner, T., Steijaert, M., Wegner, J. K., Ceulemans, H., … Hochreiter, S. (2018). Large-scale comparison of machine learning methods for drug target prediction on ChEMBL. Chemical Science, 9(24), 5441–5451.
  • Minerali, E., Foil, D. H., Zorn, K. M., Lane, T. R., Ekins, S. (2020). Comparing Machine Learning Algorithms for Predicting Drug-Induced Liver Injury (DILI). Molecular Pharmaceutics, 17(7), 2628–2637.
  • Pal, M. (2005). Random forest classifier for remote sensing classification. International Journal of Remote Sensing, 26(1), 217–222.
  • PyChem homepage | PyChem. (n.d.). Retrieved November 7, 2021, from http://pychem.sourceforge.net/
  • Python.org. (n.d.). Retrieved November 7, 2021, from https://www.python.org/
  • Russo, D. P., Zorn, K. M., Clark, A. M., Zhu, H., Ekins, S. (2018). Comparing Multiple Machine Learning Algorithms and Metrics for Estrogen Receptor Binding Prediction. Molecular Pharmaceutics, 15(10), 4361–4370.
  • Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural Networks, 61, 85–117.
  • Scikit-learn: Machine learning in Python. Scikit-learn 1.0.1 documentation. (n.d.). Retrieved November 7, 2021, from https://scikit-learn.org/stable/
  • Sekeroglu, B. (2004). Classification of sonar images using back propagation neural network. IGARSS 2004: 2004 IEEE International Geoscience and Remote Sensing Symposium, 5, 3092–3095.
  • Taddy, M. (2019). Business Data Science: Combining Machine Learning and Economics to Optimize, Automate, and Accelerate Business Decisions. McGraw Hill Professional.
  • TensorFlow. (n.d.). Retrieved November 7, 2021, from https://www.tensorflow.org/
  • Tso, G. K. F., Yau, K. K. W. (2007). Predicting electricity energy consumption: A comparison of regression analysis, decision tree and neural networks. Energy, 32(9), 1761–1768.
  • Valiev, M., Bylaska, E. J., Govind, N., Kowalski, K., Straatsma, T. P., Van Dam, H. J. J., … de Jong, W. A. (2010). NWChem: A comprehensive and scalable open-source solution for large scale molecular simulations. Computer Physics Communications, 181(9), 1477–1489.
  • Xie, Y., Zhang, C., Hu, X., Zhang, C., Kelley, S. P., Atwood, J. L., Lin, J. (2020). Machine Learning Assisted Synthesis of Metal–Organic Nanocapsules. Journal of the American Chemical Society, 142(3), 1475–1481.

Original Turkish title: AZ VERİ SETLİ ÇALIŞMALARINDA DERİN ÖĞRENME VE DİĞER SINIFLANDIRMA ALGORİTMALARININ KARŞILAŞTIRILMASI: AGONİST VE ANTAGONİST LİGAND ÖRNEĞİ

There are 30 citations in total.

Details

Primary Language: Turkish
Subjects: Clinical Sciences
Journal Section: Research Article
Authors:

Fatih Mehmet Avcu (ORCID: 0000-0002-1973-7745)

Publication Date: March 10, 2022
Submission Date: November 11, 2021
Acceptance Date: January 17, 2022
Published in Issue: Year 2022

Cite

APA: Avcu, F. M. (2022). AZ VERİ SETLİ ÇALIŞMALARINDA DERİN ÖĞRENME VE DİĞER SINIFLANDIRMA ALGORİTMALARININ KARŞILAŞTIRILMASI: AGONİST VE ANTAGONİST LİGAND ÖRNEĞİ [Comparison of deep learning and other classification algorithms in small dataset studies: Example of agonist and antagonist ligand]. İnönü Üniversitesi Sağlık Hizmetleri Meslek Yüksek Okulu Dergisi, 10(1), 356-371. https://doi.org/10.33715/inonusaglik.1022065