Comparative study of feature extraction methods for automated ICD code classification using MIMIC-III medical notes and deep learning models
Abstract
Keywords
deep learning (DL), natural language processing (NLP), feature extraction, international classification of diseases (ICD), MIMIC-III medical notes
Project Number