Research Article

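İkili parçacık sürü optimizasyonu ve destek vektör makinelerinin hibrit kullanımı ile ilaç keşfi için özellik seçimi [Feature selection for drug discovery with hybrid usage of binary particle swarm optimization and support vector machines]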

Year 2021, Volume: 11 Issue: 1, 169 - 178, 15.01.2021
https://doi.org/10.17714/gumusfenbil.776329

Abstract

Discovering a new drug that treats or prevents disease is a critical issue for the pharmaceutical industry because the process is costly, complex, and time-consuming. This study aims to shorten the preclinical stage of drug discovery by using computational methods, also known as in silico methods. To select the features that are effective and relevant for identifying potential drug molecules, support vector machines were hybridized with two heuristic algorithms: continuous and binary particle swarm optimization. Two disjoint datasets consisting of drug molecules and 161 associated features were used as the training and test sets; with suitable parameters chosen, comparative feature selection was carried out in both continuous and binary form using different numbers of particles. Binary particle swarm optimization selected 49 features with 30 particles and achieved an accuracy of 92.54%. Continuous particle swarm optimization, on the other hand, reached an accuracy of 94.03% with 50 particles and 82 features.
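For readers unfamiliar with the two swarm variants the abstract compares, the textbook update rules can be written as follows. This is a sketch of the standard formulation (Kennedy and Eberhart, 1995, 1997, with the commonly used inertia weight), not necessarily the exact variant or parameter set used in the article. All symbols are assumptions introduced here: $x_{id}$ and $v_{id}$ are the position and velocity of particle $i$ in dimension $d$, $p_{id}$ is the particle's own best position, $g_d$ is the swarm's best position, $w$ is the inertia weight, $c_1, c_2$ are acceleration coefficients, and $r_1, r_2, u$ are uniform random numbers in $[0, 1]$.

```latex
\begin{align}
v_{id}^{t+1} &= w\,v_{id}^{t} + c_1 r_1 \left(p_{id} - x_{id}^{t}\right) + c_2 r_2 \left(g_{d} - x_{id}^{t}\right) \\
x_{id}^{t+1} &= x_{id}^{t} + v_{id}^{t+1} && \text{(continuous PSO)} \\
x_{id}^{t+1} &= \begin{cases} 1 & \text{if } u < S\!\left(v_{id}^{t+1}\right) \\ 0 & \text{otherwise} \end{cases},
\qquad S(v) = \frac{1}{1 + e^{-v}} && \text{(binary PSO)}
\end{align}
```

In a feature selection setting, a binary position vector can be read directly as a mask over the 161 features, whereas a continuous position needs an additional rule (for example a threshold) to decide which features are kept.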

Thanks

This study is derived from the master's thesis entitled “Sürekli/İkili Parçacık Sürü Optimizasyonu ve Destek Vektör Makinelerinin Hibrit Kullanımı ile Özellik Seçimi” (Feature selection with hybrid use of continuous/binary particle swarm optimization and support vector machines), completed by Nilay Subaş under the supervision of Assoc. Prof. Dr. Ayça Çakmak Pehlivanlı in the Master's Program of the Department of Statistics, Institute of Science, Mimar Sinan Fine Arts University. We thank the jury members for their contributions during the review and evaluation of the thesis.

References

  • Ajay, Walters, W. P. and Murcko, M. A. (1998). Can we learn to distinguish between “drug-like” and “nondrug-like” molecules? Journal of Medicinal Chemistry, 41, 3314-3324. https://doi.org/10.1021/jm970666c
  • Al-Thanoon, N. A., Qasim, O. S. and Algamal, Z. Y. (2019). A new hybrid firefly algorithm and particle swarm optimization for tuning parameter estimation in penalized support vector machine with application in chemometrics. Chemometrics and Intelligent Laboratory Systems, 184, 142-152. https://doi.org/10.1016/j.chemolab.2018.12.003
  • Arciniegas, F., Bennett, K., Breneman, C. and Embrechts, M. J. (2000). Molecular database mining using self-organizing maps for the design of novel pharmaceuticals. Intelligent Engineering Systems through Artificial Neural Networks: Smart Engineering System Design, 10 (pp. 477-481). St. Louis, MO.
  • Byvatov, E., Fechner, U., Sadowski, J. and Schneider, G. (2003). Comparison of support vector machine and artificial neural networks systems for drug/nondrug classification. Journal of Chemical Information and Computer Sciences, 43(6), 1882-1889. https://doi.org/10.1021/ci0341161
  • Cervante, L., Xue, B. and Zhang, M. (2012). Binary particle swarm optimization for feature selection: a filter-based approach. IEEE Congress on Evolutionary Computation (pp. 1-8). Brisbane, QLD. https://doi.org/10.1109/CEC.2012.6256452
  • Cherkasov, A. (2006). Can bacterial-metabolite-likeness model improve odds of in-silico antibiotic discovery? Journal of Chemical Information and Modeling, 46(3), 1214-1222. https://doi.org/10.1021/ci050480j
  • Dash, M. and Liu, H. (1997). Feature selection for classification. Intelligent Data Analysis, 1(1-4), 131-150. https://doi.org/10.1016/S1088-467X(97)00008-5
  • Der, O., Vural, A. and Yıldırım, T. (2008). Parçacık sürü optimizasyonu tabanlı evirici tasarımı. Elektrik-Elektronik ve Biyomedikal Mühendisliği Konferansı (pp. 1-4). Bursa.
  • Guyon, I. and Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of Machine Learning Research, 3(7-8), 1157-1182. https://doi.org/10.1162/153244303322753616
  • Kennedy, J. and Eberhart, R. C. (1995). Particle swarm optimization. Proceedings of the IEEE International Conference on Neural Networks, 4 (pp. 1942-1948). Piscataway, NJ. https://doi.org/10.1109/ICNN.1995.488968
  • Kennedy, J. and Eberhart, R. C. (1997). A discrete binary version of the particle swarm algorithm. IEEE International Conference on Systems, Man, and Cybernetics. Computational Cybernetics and Simulation, 5 (pp. 4104-4108). Orlando, FL. https://doi.org/10.1109/ICSMC.1997.637339
  • Khanesar, M. A., Tavakoli, H., Teshnehlab, M. and Shoorehdeli, A. M. (2007). A novel binary particle swarm optimization. Mediterranean Conference on Control & Automation (pp. 1-6). Athens. https://doi.org/10.1109/MED.2007.4433821
  • Mafarja, M., Jarrar, R., Ahmad, S. and Abusnaina, A. A. (2018). Feature selection using binary particle swarm optimization with time varying inertia weight strategies. Proceedings of the 2nd International Conference on Future Networks and Distributed Systems, Association for Computing Machinery, 18 (pp. 1-9). New York, NY. https://doi.org/10.1145/3231053.3231071
  • Mirjalili, S. and Lewis, A. (2013). S-shaped versus V-shaped transfer functions for binary particle swarm optimization. Swarm and Evolutionary Computation, 9, 1-14. https://doi.org/10.1016/j.swevo.2012.09.002
  • MOE, Molecular Operating Environment (2006). Chemical Computing Group Inc., Montreal, Canada.
  • Murcia-Soler, M., Pérez-Gimenez, F., Garcia-M., J., Salabert-Salvador, M. T., Diaz-Villanueva, W. and Castro-Bleda, M. J. (2003). Drugs and nondrugs: an effective discrimination with topological methods and artificial neural networks. Journal of Chemical Information and Computer Sciences, 43(5), 1688-1702. https://doi.org/10.1021/ci0302862
  • Ortakcı, Y. and Güloğlu, C. (2012). Parçacık sürü optimizasyonu ile küme sayısının belirlenmesi. Akademik Bilişim Konferansı (pp. 335-342). Uşak.
  • Pehlivanlı, A. Ç. and Gümüştaş, E. (2019). Mutajenisite tahmininde in-silico istatistiksel öğrenme modeli. Mimar Sinan Güzel Sanatlar Üniversitesi. Scientific Research Project, BAP 2018-30.
  • Pehlivanlı, A. Ç. (2008). Consensual classification of drug/nondrug compounds for drug design. Doctoral thesis, Çukurova Üniversitesi Fen Bilimleri Enstitüsü, Adana.
  • Pehlivanlı, A. Ç., Ersoy, O. K. and Ibrikci, T. (2008). Drug/nondrug classification with consensual Self-Organising Map and Self-Organising Global Ranking algorithms. International Journal of Computational Biology and Drug Design, 1(4), 436. https://doi.org/10.1504/ijcbdd.2008.022212
  • Pehlivanlı, A. Ç. (2016). A novel feature selection scheme for high-dimensional data sets: four-Staged Feature Selection. Journal of Applied Statistics, 43(6), 1140-1154. https://doi.org/10.1080/02664763.2015.1092112
  • Qasim, O. S. and Algamal, Z. Y. (2018). Feature selection using particle swarm optimization-based logistic regression model. Chemometrics and Intelligent Laboratory Systems, 182, 41-46. https://doi.org/10.1016/j.chemolab.2018.08.016
  • Rockhold, F. W. (2000). Strategic use of statistical thinking in drug development. Statistics in Medicine, 19, 3211-3217. https://doi.org/10.1002/1097-0258(20001215)19:23<3211::aid-sim622>3.0.co;2-f
  • Sakri, S. B., Abdul Rashid, N. B. and Zain, Z. M. (2018). Particle swarm optimization feature selection for breast cancer recurrence prediction. IEEE Access, 6, 29637-29647. https://doi.org/10.1109/ACCESS.2018.2843443
  • Sokolova, M., Japkowicz, N. and Szpakowicz, S. (2006). Beyond accuracy, F-score and ROC: a family of discriminant measures for performance evaluation. Australasian Joint Conference on Artificial Intelligence, Springer (pp. 1015-1021). Berlin Heidelberg. https://doi.org/10.1007/11941439_114
  • Subaş, N. (2019). Sürekli/İkili parçacık sürü optimizasyonu ve destek vektör makinelerinin hibrit kullanımı ile özellik seçimi. Master's thesis, Mimar Sinan Güzel Sanatlar Üniversitesi Fen Bilimleri Enstitüsü, İstanbul.
  • Trelea, I. C. (2003). The particle swarm optimization algorithm: Convergence analysis and parameter selection. Information Processing Letters, 85, 317-325. https://doi.org/10.1016/S0020-0190(02)00447-7
  • Ünler, A. and Murat, A. (2010). A discrete particle swarm optimization method for feature selection in binary classification problems. European Journal of Operational Research, 206(3), 528-534. https://doi.org/10.1016/j.ejor.2010.02.032
  • Vapnik, V. (1995). The nature of statistical learning theory. New York: Springer-Verlag.
  • Vashishtha, N. and Vashishtha, J. (2016). Particle swarm optimization-based feature selection. International Journal of Computer Applications, 146(6), 11-17. https://doi.org/10.5120/ijca2016910789
  • Wagener, M. and Van Geerestein, V. J. (2000). Potential drug and non-drugs: prediction and identification of important structural features. Journal of Chemical Information and Computer Sciences, 40(2), 280-292. https://doi.org/10.1021/ci990266t

Feature selection for drug discovery with hybrid usage of binary particle swarm optimization and support vector machines

Abstract

The discovery of a new drug that treats or prevents disease is a critical issue in the pharmaceutical industry because it is a costly, complex, and time-consuming process. This study aims to shorten the preclinical stage of drug discovery with computational methods, also called in silico methods. Support vector machines were hybridized with two heuristic algorithms, binary and continuous particle swarm optimization, to select the most relevant and informative features for identifying potential drug molecules. Two distinct datasets of drug molecules with 161 associated features were used as training and test sets, and both continuous and binary particle swarm optimization were run with tuned parameters and different numbers of particles for comparative feature selection. Binary particle swarm optimization selected 49 features with 30 particles and obtained an accuracy of 92.54%. Continuous particle swarm optimization, on the other hand, obtained an accuracy of 94.03% with 50 particles and 82 features.
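To make the wrapper idea in the abstract concrete, the following is a minimal sketch of binary particle swarm optimization over feature masks, written in pure Python. It is not the authors' implementation. The function names (`binary_pso`, `toy_fitness`), the parameter values (`w`, `c1`, `c2`, `v_max`), and the toy fitness function are all assumptions made for illustration. In the study itself, the fitness of a mask would be the classification performance of a support vector machine trained on the selected features; here a simple stand-in score is used so that the block runs without external libraries.

```python
import math
import random


def sigmoid(v):
    """S-shaped transfer function: maps a velocity to a bit probability."""
    return 1.0 / (1.0 + math.exp(-v))


def binary_pso(fitness, n_features, n_particles=30, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Maximize `fitness` over 0/1 masks of length `n_features`.

    All default parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Random initial masks, zero initial velocities.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the velocity so the bit probability never saturates.
                vel[i][d] = max(-v_max, min(v_max, v))
                # Sample the new bit from the transfer function.
                pos[i][d] = 1 if rng.random() < sigmoid(vel[i][d]) else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


# Toy stand-in for "SVM accuracy on the selected features" (an assumption):
# reward three hypothetical relevant features, lightly penalize the rest.
RELEVANT = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(mask[j] for j in RELEVANT)
    extras = sum(mask) - hits
    return hits - 0.1 * extras


best_mask, best_fit = binary_pso(toy_fitness, n_features=10)
print(best_mask, best_fit)
```

To move from this sketch toward the setting described in the abstract, `toy_fitness` would be replaced by a function that trains a classifier on the columns where the mask is 1 and returns its accuracy, and `n_features` would be set to 161.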

There are 31 citations in total.

Details

Primary Language: Turkish
Subjects: Engineering
Journal Section: Articles
Authors

Nilay Subaş 0000-0002-3173-4942

Ayça Çakmak Pehlivanlı 0000-0001-9884-6538

Publication Date: January 15, 2021
Submission Date: August 1, 2020
Acceptance Date: December 10, 2020
Published in Issue: Year 2021, Volume: 11, Issue: 1

Cite

APA Subaş, N., & Çakmak Pehlivanlı, A. (2021). İkili parçacık sürü optimizasyonu ve destek vektör makinelerinin hibrit kullanımı ile ilaç keşfi için özellik seçimi. Gümüşhane Üniversitesi Fen Bilimleri Dergisi, 11(1), 169-178. https://doi.org/10.17714/gumusfenbil.776329