Research Article

Particle Swarm Optimization Based Stacking Method with an Application to Text Classification

Year 2018, Volume: 6 Issue: 2, 134 - 141, 03.08.2018
https://doi.org/10.21541/apjes.329940

Abstract

Multiple classifier systems aim to integrate the predictions of several learners so that classification models with high predictive performance can be constructed. Multiple classifiers can be employed in many application fields, including text categorization. Stacking is an ensemble algorithm that builds ensembles of heterogeneous classifiers: the predictions of base-level classifiers are combined by a meta-learner. To configure Stacking, an appropriate set of learning algorithms must be selected as base-level classifiers, and the learning algorithm that will perform the meta-learning task must be identified. Hence, finding an appropriate configuration for Stacking can be a challenging problem. In this paper, we introduce an efficient method for stacking-ensemble-based text categorization which utilizes particle swarm optimization to optimize the configuration of the ensemble. In an empirical analysis on the text categorization domain, the particle swarm optimization based Stacking method is compared to the genetic algorithm, ant colony optimization, and the artificial bee colony algorithm.
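The core idea the abstract describes — searching over Stacking configurations with particle swarm optimization — can be illustrated with a minimal, self-contained sketch. This is not the paper's implementation: it uses the binary PSO update of Kennedy and Eberhart to search over bit vectors marking which candidate base classifiers enter the ensemble, and the fitness function is a hypothetical placeholder standing in for the cross-validated ensemble accuracy a real system would compute. The candidate names and the "good" subset are illustrative assumptions only.

```python
import math
import random

# Hypothetical pool of candidate base classifiers (names are illustrative).
CANDIDATES = ["C4.5", "kNN", "NaiveBayes", "KStar", "OneR", "PART", "SVM", "Ridge"]

def fitness(mask):
    """Placeholder for cross-validated Stacking accuracy.

    In the real method this would train a Stacking ensemble with the
    selected base classifiers and return its validation accuracy.
    Here we pretend a fixed subset works well, to keep the demo runnable."""
    good = {0, 2, 5}                       # assumed well-performing trio
    score = sum(1.0 for i in good if mask[i])
    score -= 0.2 * sum(mask)               # small penalty per extra classifier
    return score

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def binary_pso(n_bits, n_particles=20, n_iter=50,
               w=0.7, c1=1.5, c2=1.5, seed=42):
    """Binary PSO: velocities are real-valued, positions are bits sampled
    with probability sigmoid(velocity), per Kennedy & Eberhart (1997)."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_particles)]
    vel = [[0.0] * n_bits for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(n_bits):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Velocity maps to the probability that the bit is set.
                pos[i][d] = 1 if rng.random() < sigmoid(vel[i][d]) else 0
            f = fitness(pos[i])
            if f > pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f > gbest_f:
                    gbest, gbest_f = pos[i][:], f
    return gbest, gbest_f

best, score = binary_pso(len(CANDIDATES))
selected = [c for c, b in zip(CANDIDATES, best) if b]
print(selected, round(score, 2))
```

In the paper's setting, the fitness evaluation would be the expensive step (training and cross-validating a full Stacking ensemble per particle), and an extra integer component of the particle would pick the meta-learner; the bit-vector search above shows only the selection mechanism.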

References

  • L.I. Kuncheva, Combining pattern classifiers, methods and algorithms, New York: Wiley InterScience, 2005.
  • J.Kittler and F. Roli, Multiple classifier systems, Berlin: Springer, 2000.
  • L.Rokach, “Ensemble-based classifiers”, Artificial Intelligence Review, vol.33, pp.1-39, 2010.
  • F.J.Provost and V.Kolluri, “A survey of methods for scaling up inductive learning algorithms”, Data Mining and Knowledge Discovery, vol. 3, pp. 131-169, 1999.
  • A.Onan, S.Korukoğlu and H.Bulut, “A multiobjective weighted voting ensemble classifier based on differential evolution algorithm for text sentiment classification”, Expert Systems with Applications, vol. 62, pp. 1-16, 2016.
  • D.H.Wolpert, “Stacked generalization”, Neural Networks, vol. 5, no 2, pp. 241-259, 1992.
  • M.Sewell, “Ensemble learning”, UCL, Department of Computer Science Technical Report RN-11-02, 2011.
  • R.Polikar, “Ensemble based systems in decision making”, IEEE Circuits and Systems Magazine, vol. 6, no 3, pp. 21-45, 2006.
  • M.P.Sesmero, A.I.Ledezma and A.Sanchis, “Generating ensembles of heterogeneous classifier using stacked generalization”, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 5, no 1, pp. 21-34, 2015.
  • E.G.Talbi, Metaheuristics from design to implementation, New York: Wiley, 2009.
  • A. Gogna and A.Tayal, “Metaheuristics: review and application”, Journal of Experimental & Theoretical Artificial Intelligence, vol. 25, no 4, pp. 503-526, 2013.
  • A.Onan, H.Bulut and S.Korukoğlu, “An improved ant algorithm with LDA-based representation for text document clustering”, Journal of Information Science, vol.43, no 2, pp.275-292, 2017.
  • H.A.Abbass, C.S.Newton and R.Sarkar, Heuristic search-based stacking of classifiers, Berlin: Springer, 2002.
  • Y.Chen and M.L.Wong, “An ant colony optimization approach for stacking ensemble”, in Proceedings of the Second World Congress on Nature and Biologically Inspired Computing, (December 15-17, Kitakyushu, Japan, 2010), 146-151, (2010).
  • P.Shunmugapriya and S.Kanmani, “Optimization of stacking ensemble configurations through artificial bee colony algorithm”, Swarm and Evolutionary Computation, vol.12, pp.24-32, 2013.
  • M.Kantardzic, Data mining: concepts, models, methods and algorithms, New York: Wiley-IEEE Press, 2011.
  • G.Shmueli, N.R.Patel and P.C.Bruce, Data mining for business intelligence: concepts, techniques, and applications in Microsoft Office Excel with XLMiner, New Jersey: John Wiley & Sons, 2010.
  • R.Quinlan, C4.5: programs for machine learning, San Mateo: Morgan Kaufmann, 1993.
  • X.Niuniu and L.Yuxun, “Review of decision trees”, in Proceedings of the third IEEE International Conference on Computer Science and Information Technology, (July, Chengdu, China, 2010), 105-109, (2010).
  • D.W.Aha, D.Kibler and M.K.Albert, “Instance based learning algorithm”, Machine Learning, vol.6, pp.37-66, 1991.
  • J.G.Cleary and L.E.Trigg, “K*: an instance-based learner using an entropic distance measure”, in Proceedings of the twelfth international conference on machine learning, (July 9-12, Tahoe City, California, 1995), 108-114, (1995).
  • W.Iba and P.Langley, “Induction of one-level decision trees”, in Proceedings of the 9th International Workshop on Machine Learning, (Aberdeen, UK, 1992), 233-240, (1992).
  • E.Frank and I.H.Witten, “Generating accurate rule sets without global optimization”, in Proceedings of the 15th International Conference on Machine Learning, (July 24-27, 1998), 144-151, (1998).
  • L.Breiman, “Bagging predictors”, Machine Learning, vol. 24, no 2, pp. 123-140, 1996.
  • Y.Freund and R.E.Schapire, “Experiments with a new boosting algorithm”, in Proceedings of the Thirteenth International Conference on Machine Learning, (July 3-6, 1996), 1-9, (1996).
  • D.H. Wolpert, “Stacked generalization”, Neural Networks, vol.5, no 2, pp. 241-259, 1992.
  • J.Kennedy and R.C.Eberhart, “Particle swarm optimization”, in Proceedings of the International Conference on Neural Networks, 1942-1948, (1995).
  • A.P. Engelbrecht, Computational intelligence: an introduction, New York: Wiley, 2007.
  • M.N.A.Wahab, S.Nefti-Meziani and A.Atyabi, “A comprehensive review of swarm optimization algorithms”, PLoS ONE, doi: 10.1371/journal.pone.0122827
  • L.Y.Chuang, H.W.Chang, C.J.Tu and C.H.Yang, “Improved binary PSO for feature selection using gene expression data”, Computational Biology and Chemistry, vol. 32, pp.29-38, 2008.
  • A.Onan, S.Korukoğlu and H.Bulut, “LDA-based topic modelling in text sentiment classification: an empirical analysis”, International Journal of Computational Linguistics and Applications, vol.7,no 1, pp. 101-119, 2016.
  • A.Onan, “Hybrid supervised clustering based ensemble scheme for text classification”, Kybernetes, vol. 46, no 2, pp. 330-348, 2017.

There are 30 citations in total.

Details

Primary Language English
Subjects Engineering
Journal Section Articles
Authors

Aytuğ Onan

Publication Date August 3, 2018
Submission Date July 20, 2017
Published in Issue Year 2018 Volume: 6 Issue: 2

Cite

IEEE A. Onan, “Particle Swarm Optimization Based Stacking Method with an Application to Text Classification”, APJES, vol. 6, no. 2, pp. 134–141, 2018, doi: 10.21541/apjes.329940.