Research Article
Year 2019, Volume: 3 Issue: 1, 32 - 39, 15.04.2019

Determination of highly effective attributes in fold level classification of proteins

Abstract

This paper aims to determine which protein features or attributes are most
significant for classifying proteins according to their folds. Each protein in
the database used in this study is represented by a 125-dimensional feature
vector made up of six feature groups, called attributes. Representing proteins
with vectors of such high dimension increases the computational load of
classification and lengthens processing time. Dimension reduction is proposed
here as a remedy. Two approaches are used to identify the features and
attributes that yield high classification performance. In the first, each of
the six attributes is tested separately to determine which gives the highest
performance. In the second, the most significant of the 125 features are
selected using the Divergence Analysis method. A classical classifier, KNN
(K-nearest neighbor), and two artificial neural network models, GAL (Grow and
Learn) and SOM (Self-Organizing Map), serve as classifiers, and classification
performance is analyzed on the reduced-dimension datasets.
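
The two approaches described above can be illustrated with a minimal sketch. It is not the paper's implementation: the data are synthetic, the split of the 125 features into groups of 20, 21, 21, 21, 21 and 21 is an assumption, the divergence score assumes each feature is Gaussian within a class (the abstract does not give the formula), and only KNN is shown. All function names are hypothetical.

```python
import numpy as np

# Assumed split of the 125 features into six attribute groups
# (illustrative only; the abstract states just the totals).
GROUP_SIZES = [20, 21, 21, 21, 21, 21]

def group_slices(sizes):
    """Turn group sizes into column slices of the feature matrix."""
    bounds = np.cumsum([0] + list(sizes))
    return [slice(int(a), int(b)) for a, b in zip(bounds[:-1], bounds[1:])]

def knn_predict(X_train, y_train, X_test, k=1):
    """Plain k-nearest-neighbor classifier with Euclidean distance."""
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = y_train[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

def accuracy(X_train, y_train, X_test, y_test, cols, k=1):
    """Test accuracy of KNN restricted to the given feature columns."""
    pred = knn_predict(X_train[:, cols], y_train, X_test[:, cols], k)
    return float((pred == y_test).mean())

def divergence_scores(X, y, eps=1e-9):
    """Per-feature divergence summed over all class pairs.

    Assumes a univariate Gaussian per feature and class, so each pair
    contributes the symmetric Kullback-Leibler divergence.
    """
    classes = np.unique(y)
    mu = np.array([X[y == c].mean(axis=0) for c in classes])
    var = np.array([X[y == c].var(axis=0) for c in classes]) + eps
    total = np.zeros(X.shape[1])
    for i in range(len(classes)):
        for j in range(i + 1, len(classes)):
            spread = var[i] / var[j] + var[j] / var[i] - 2.0
            shift = (mu[i] - mu[j]) ** 2 * (1.0 / var[i] + 1.0 / var[j])
            total += 0.5 * (spread + shift)
    return total

# Synthetic stand-in for the protein data: three classes, and only the
# third attribute group (index 2) carries class information.
rng = np.random.default_rng(0)
slices = group_slices(GROUP_SIZES)
informative = slices[2]

def make_split(n_per_class):
    y = np.repeat(np.arange(3), n_per_class)
    X = rng.normal(size=(y.size, 125))
    X[:, informative] += 3.0 * y[:, None]
    return X, y

X_train, y_train = make_split(50)
X_test, y_test = make_split(30)

# Approach 1: test each attribute group on its own.
group_acc = [accuracy(X_train, y_train, X_test, y_test, s) for s in slices]

# Approach 2: rank all 125 features by divergence and keep the top 21.
scores = divergence_scores(X_train, y_train)
top = np.argsort(scores)[::-1][:21]
reduced_acc = accuracy(X_train, y_train, X_test, y_test, np.sort(top))

print("accuracy per attribute group:", group_acc)
print("accuracy with top-21 divergence features:", reduced_acc)
```

On real data the number of retained features would be chosen by comparing classification performance across several cut-offs rather than fixed in advance as it is here.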

Details

Primary Language: English
Journal Section: Research Articles
Authors: Özlem Polat (ORCID 0000-0002-9395-4465)
Publication Date: April 15, 2019
Submission Date: February 28, 2018
Acceptance Date: January 13, 2019
Published in Issue: Year 2019, Volume 3, Issue 1

Cite

APA Polat, Ö. (2019). Determination of highly effective attributes in fold level classification of proteins. International Advanced Researches and Engineering Journal, 3(1), 32-39.
AMA Polat Ö. Determination of highly effective attributes in fold level classification of proteins. Int. Adv. Res. Eng. J. April 2019;3(1):32-39.
Chicago Polat, Özlem. “Determination of Highly Effective Attributes in Fold Level Classification of Proteins”. International Advanced Researches and Engineering Journal 3, no. 1 (April 2019): 32-39.
EndNote Polat Ö (April 1, 2019) Determination of highly effective attributes in fold level classification of proteins. International Advanced Researches and Engineering Journal 3 1 32–39.
IEEE Ö. Polat, “Determination of highly effective attributes in fold level classification of proteins”, Int. Adv. Res. Eng. J., vol. 3, no. 1, pp. 32–39, 2019.
ISNAD Polat, Özlem. “Determination of Highly Effective Attributes in Fold Level Classification of Proteins”. International Advanced Researches and Engineering Journal 3/1 (April 2019), 32-39.
JAMA Polat Ö. Determination of highly effective attributes in fold level classification of proteins. Int. Adv. Res. Eng. J. 2019;3:32–39.
MLA Polat, Özlem. “Determination of Highly Effective Attributes in Fold Level Classification of Proteins”. International Advanced Researches and Engineering Journal, vol. 3, no. 1, 2019, pp. 32-39.
Vancouver Polat Ö. Determination of highly effective attributes in fold level classification of proteins. Int. Adv. Res. Eng. J. 2019;3(1):32-9.



Creative Commons License

Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.