Gaussian Radial Basis Function Neural Network with Correlation Based Feature Selection Applied to Medical Text Categorization
Abstract
Text categorization is an important task for information processing
systems. Medical text processing in particular is an active research area that
draws on classification algorithms and dimensionality reduction strategies from
the machine learning field. In this study, we propose a three-stage algorithm to
automatically categorize medical text from the OHSUMED corpus. The proposed
algorithm combines Correlation Based Feature Filtering with a Radial Basis
Function Neural Network. On 12 sample datasets, the algorithm achieves a
macro-averaged F-measure of 0.890. These results suggest that both Correlation
Based Feature Filtering as a feature elimination strategy and the Radial Basis
Function Neural Network as a text categorization algorithm are promising methods.
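The combination described in the abstract can be illustrated with a minimal sketch. It is not the implementation used in the study, and it does not attempt to reproduce the three stages, which the abstract does not enumerate. Several choices are assumptions made for illustration only: Pearson correlation as the correlation measure, greedy forward search over the standard correlation-based merit score, binary labels, randomly sampled training points as Gaussian centers, a single shared width, least-squares output weights, and a macro average taken over classes. The names `cfs_select`, `GaussianRBFNet`, and `macro_f1` are hypothetical, and the data is synthetic rather than drawn from OHSUMED.

```python
import numpy as np

def cfs_merit(corr_fc, corr_ff, subset):
    """Merit k*r_cf / sqrt(k + k(k-1)*r_ff) of a feature subset."""
    k = len(subset)
    r_cf = corr_fc[subset].mean()  # mean feature-class correlation
    if k == 1:
        r_ff = 0.0
    else:
        sub = corr_ff[np.ix_(subset, subset)]
        r_ff = sub[~np.eye(k, dtype=bool)].mean()  # mean feature-feature correlation
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)

def cfs_select(X, y, max_features):
    """Greedy forward selection; stops when the merit no longer improves."""
    corr = np.abs(np.corrcoef(np.column_stack([X, y]), rowvar=False))
    corr = np.nan_to_num(corr)  # constant columns yield NaN correlations
    corr_ff, corr_fc = corr[:-1, :-1], corr[:-1, -1]
    selected, best = [], 0.0
    while len(selected) < max_features:
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        merits = [cfs_merit(corr_fc, corr_ff, selected + [j]) for j in candidates]
        i = int(np.argmax(merits))
        if merits[i] <= best + 1e-9:
            break
        best = merits[i]
        selected.append(candidates[i])
    return selected

class GaussianRBFNet:
    """Gaussian hidden layer with a linear output layer fit by least squares."""
    def __init__(self, n_centers=10, seed=0):
        self.n_centers, self.seed = n_centers, seed

    def _design(self, X):
        d2 = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(axis=2)
        phi = np.exp(-d2 / (2.0 * self.sigma ** 2))
        return np.column_stack([phi, np.ones(len(X))])  # append bias column

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        idx = rng.choice(len(X), size=self.n_centers, replace=False)
        self.centers = X[idx]  # assumption: centers are sampled training points
        d = np.sqrt(((self.centers[:, None] - self.centers[None]) ** 2).sum(-1))
        self.sigma = d[d > 0].mean()  # width heuristic: mean center distance
        self.classes = np.unique(y)
        targets = (y[:, None] == self.classes[None, :]).astype(float)
        self.W, *_ = np.linalg.lstsq(self._design(X), targets, rcond=None)
        return self

    def predict(self, X):
        return self.classes[np.argmax(self._design(X) @ self.W, axis=1)]

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F-measures."""
    scores = []
    for c in np.unique(y_true):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))

# Synthetic stand-in for a document-term matrix: only feature 0 is informative.
rng = np.random.default_rng(42)
y = rng.integers(0, 2, size=200)
X = rng.normal(size=(200, 6))
X[:, 0] += 4.0 * y

selected = cfs_select(X, y, max_features=3)
model = GaussianRBFNet(n_centers=10).fit(X[:, selected], y)
score = macro_f1(y, model.predict(X[:, selected]))
print(selected, round(score, 3))
```

The score printed here is measured on the training data of a toy problem, so it says nothing about the 0.890 figure reported above. A realistic evaluation would use held-out documents and term-weighted features.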
Details
Primary Language
English
Subjects
Engineering
Journal Section
Research Article
Authors
Akın Özçift
Türkiye
Publication Date
March 22, 2019
Submission Date
October 3, 2018
Acceptance Date
February 7, 2019
Published in Issue
Year 2019 Volume: 15 Number: 1