Research Article

Genetic Algorithm Based Variable Selection For Partial Least Squares Regression

Year 2011, Volume: 8 Issue: 3, 75 - 85, 15.12.2011

Abstract

Partial Least Squares (PLS) regression is an alternative to ordinary least squares for handling multicollinearity in several areas of scientific research. At the core of the PLS methodology lies a dimension reduction technique coupled with a regression model. In this paper, we investigate genetic algorithm-partial least squares regression (GAPLSR), a technique that combines genetic algorithms, which are powerful optimization methods, with PLS as a statistical method for variable selection. Variable importance for projection (VIP) is a weighted sum of squares of the PLS weights and thus summarizes the importance of a variable for modeling both X and Y (Wold et al., 2001). We compare the adjusted R² (R2adj) values of the GAPLSR prediction model, the PLSR-NIPALS model, and the PLSR-VIP model, which retains the variables found significant according to the VIP scores of the PLSR model, to see which yields the model with the smallest error.
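To make the two quantities named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single response variable, a NIPALS-style PLS1 fit with mean-centering only, and the commonly used VIP form in which each component's squared, unit-norm weights are weighted by the amount of Y variation that component explains. The function names `pls1_vip` and `adjusted_r2` and the toy data are hypothetical and chosen only for illustration.

```python
import math


def pls1_vip(X, y, n_components):
    """VIP scores from a single-response PLS fit (NIPALS-style deflation).

    X is a list of rows (n samples x p variables); y is a list of n responses.
    Data are mean-centered here; any scaling is left to the caller.
    """
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    E = [[row[j] - col_means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    f = [v - y_mean for v in y]

    weights, explained = [], []
    for _ in range(n_components):
        # Weight vector: direction of covariance between X and y residuals.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left in y that X can explain
        w = [v / norm for v in w]
        # Scores, X loadings, and y loading for this component.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        p_load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * p_load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        weights.append(w)
        explained.append(q * q * tt)  # y sum of squares explained by t

    if not explained:
        raise ValueError("no PLS component could be extracted")
    total = sum(explained)
    return [
        math.sqrt(p * sum(ss * w[j] ** 2 for ss, w in zip(explained, weights)) / total)
        for j in range(p)
    ]


def adjusted_r2(y, y_hat, n_predictors):
    """Adjusted R^2 for n observations and n_predictors retained variables."""
    n = len(y)
    if n - n_predictors - 1 <= 0:
        raise ValueError("too few observations for this many predictors")
    y_mean = sum(y) / n
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_mean) ** 2 for a in y)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


# Toy data (hypothetical): columns 1 and 2 are collinear and drive y,
# column 3 is small noise unrelated to y.
X_demo = [
    [1.0, 2.1, 0.3],
    [2.0, 3.9, -0.2],
    [3.0, 6.2, 0.1],
    [4.0, 7.8, -0.4],
    [5.0, 10.1, 0.2],
    [6.0, 12.0, 0.0],
]
y_demo = [1.1, 2.0, 2.9, 4.2, 5.0, 6.1]
vip_demo = pls1_vip(X_demo, y_demo, n_components=2)
```

Because each weight vector has unit norm, the squared VIP scores in this form sum to the number of variables, which is why a threshold near 1 is a common rule of thumb for retaining a variable. The adjusted R² penalizes the number of retained variables, so it allows models with different variable subsets to be compared on a like-for-like basis.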

References

  • Andersen, C. M., Bro, R., 2010. Variable Selection in Regression - A Tutorial. Journal of Chemometrics, 24, 728-737.
  • Chong, I-G., Jun, C. H., 2005. Performance of Some Variable Selection Methods when Multicollinearity is Present. Chemometrics and Intelligent Laboratory Systems, 78, 103-112.
  • Garthwaite, P. H., 1994. An Interpretation of Partial Least Squares. Journal of the American Statistical Association, 89, 122-127.
  • Gürünlü Alma, Ö., Bulut, E., 2012. Genetic Algorithm Based Variable Selection for Partial Least Squares Regression Using ICOMP Criterion. Asian Journal of Mathematics and Statistics, 5(3), 82-92.
  • Guyon, I., Elisseeff, A., 2003. An Introduction to Variable and Feature Selection. Journal of Machine Learning Research, 3, 1157-1182.
  • Goldberg, D.E., 1989. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, USA.
  • Hollander, M., Wolfe, D. A., 1973. Nonparametric Statistical Methods. John Wiley & Sons: New York, NY.
  • Hörchner, U., Kalivas, J. H., 1995. Further Investigation on a Comparative Study of Simulated Annealing and Genetic Algorithm for Wavelengths Selection. Analytica Chimica Acta, 311, 1-13.
  • Jöreskog, K. G., Wold, H., 1982. Systems Under Indirect Observation, Part I, 263-270. Amsterdam, New York, Oxford: North-Holland.
  • Jun, C. H., Lee, S. H., Park, H. S., Lee, J. H., 2009. Use of Partial Least Squares Regression for Variable Selection and Quality Prediction. Computers & Industrial Engineering. CIE 2009. International Conference on 6-9 July 2009.
  • Kubinyi, H., 1996. Evolutionary Variable Selection in Regression and PLS Analyses. Journal of Chemometrics, 10, 110-133.
  • Leardi, R., Boggia, R., Terrile, M., 1992. Genetic Algorithms as a Strategy for Feature Selection, Journal of Chemometrics, 6, 267-281.
  • Leardi, R., 1996. Genetic Algorithms in Feature Selection. In: J. Devillers (Ed.), Genetic Algorithms in Molecular Modeling, Academic Press, 67.
  • Leardi, R., 2001. Genetic Algorithms in Chemometrics and Chemistry: A Review. Journal of Chemometrics, 15, 559-569.
  • Leardi, R., González, A. L., 1998. Genetic Algorithms Applied to Feature Selection in PLS Regression: How and When to Use Them. Chemometrics and Intelligent Laboratory Systems, 41, 195-207.
  • Li, B., Morris, J., Martin, E. B., 2002. Model Selection for Partial Least Squares Regression. Chemometrics and Intelligent Laboratory Systems, 64, 79-89.
  • Lindgren, F., Rännar, S., 1998. Alternative Partial Least-Squares (PLS) Algorithms. Perspectives in Drug Discovery and Design, 12-14, 105-113.
  • Lucasius, C. B., Beckers, M. L. M., Kateman, G., 1994. Genetic Algorithms in Wavelengths Selection: A Comparative Study. Analytica Chimica Acta, 286, 135-153.
  • Masamoto, A., Yosuke, Y., Kimito, F., 2011. Genetic Algorithm-Based Wavelength Selection Method for Spectral Calibration. Journal of Chemometrics, 25, 10-19.
  • Martens, H., Naes, T., 1989. Multivariate Calibration. John Wiley & Sons.
  • Naes, T., Martens, H., 1985. Comparison of Prediction Methods for Collinear Data. Communications in Statistics - Simulation and Computation, 14, 545-576.
  • Paterlini, S., Minerva, T., 2010. Regression Model Selection using Genetic Algorithms. Recent Advances in Neural Networks, Fuzzy Systems & Evolutionary Computing, WSEAS Press Stevens Point, Wisconsin, 19-28.
  • Shariati-Rad, M., Hasani, M., 2010. Selection of Individual Variables versus Intervals of Variables in PLSR. Journal of Chemometrics, 24, 45-56.
  • Vitor, L., Carla C. P., José C. M., 2000. Evolutionary Programming for Variable Selection in PLSR: Predicting Qualities from a Crude Distillation Unit. Controlo’2000: 4th Portuguese Conference on Automatic Control.
  • Wold, H., 1966. In: David, F. (Ed.), Research Papers in Statistics. Wiley, New York, 411-444.
  • Wold, S., Martens, M., Wold, H., 1983. The Multivariate Calibration Problem in Chemistry Solved by the PLS Method. In: Ruhe, A., Kågström, B. (Eds.), Matrix Pencils, Springer-Verlag, Heidelberg, Germany, 286-293.
  • Wold, S., Ruhe, A., Wold, H., Dunn III, W. J., 1984. The Collinearity Problem in Linear Regression: The Partial Least Squares Approach to Generalized Inverses. SIAM J. Sci. Stat. Comput., 5, 735-743.
  • Wold, S., Johansson, E., Cocchi, M., 1993. 3D QSAR in Drug Design; Theory, Methods and Applications. ESCOM, Leiden, Holland, 523-550.
  • Wold, S., 1994. PLS for Multivariate Linear Modelling, QSAR: Chemometric Methods in Molecular Design. Methods and Principles in Medicinal Chemistry. (Ed. H. Van de Waterbeemd), Weinheim, Germany: Verlag-Chemie.
  • Wold, S., Kettaneh, N., Tjessem, K., 1996. Hierarchical Multiblock PLS and PC Models for Easier Model Interpretation and as an Alternative to Variable Selection. Journal of Chemometrics, 10, 463-482.
  • Wold, S., Sjöström, M., Eriksson, L., 2001. PLS-Regression: A Basic Tool of Chemometrics. Chemometrics and Intelligent Laboratory Systems, 58, 109-130.

Genetik Algoritma Tabanlı Kısmi En Küçük Kareler Regresyonu için Değişken Seçimi


Abstract

Partial Least Squares Regression (PLSR) offers an alternative to ordinary least squares for overcoming the multicollinearity problem in many areas of scientific research. At the core of the PLSR method lies a dimension reduction technique intertwined with a regression model. In this study, genetic algorithm-partial least squares regression (GAPLSR) is examined. This method combines PLS, used for variable selection, with genetic algorithms (GA), which are powerful optimization methods. Variable importance for projection is defined as a weighted sum of squares of the PLS weights and summarizes the importance of a variable in modeling both X and Y (Wold et al., 2001). The performance of the GAPLSR prediction model, the PLSR-NIPALS model, and the PLSR-VIP method in identifying the model with the smaller error is compared using R2adj values.

There are 31 citations in total.

Details

Primary Language English
Subjects Statistics
Journal Section Research Articles
Authors

Özlem Gürünlü Alma

Elif Bulut

Publication Date December 15, 2011
Published in Issue Year 2011 Volume: 8 Issue: 3

Cite

APA Gürünlü Alma, Ö., & Bulut, E. (2011). Genetic Algorithm Based Variable Selection For Partial Least Squares Regression. İstatistik Araştırma Dergisi, 8(3), 75-85.