Research Article

A Weakly Supervised Clustering Method for Cancer Subgroup Identification

Volume: 10 Number: 2 April 30, 2022

Abstract

Identifying subgroups of cancer patients is important because it opens up possibilities for targeted therapeutics. A widely applied approach is to group patients by applying unsupervised clustering techniques to molecular data from tumor samples. The resulting patient clusters are considered of interest if they can be associated with a clinical outcome variable, such as patient survival. However, these clinical variables do not participate in the clustering decisions. We propose WSURFC (Weakly Supervised Random Forest Clustering), an approach in which the clustering process is weakly supervised by a clinical variable of interest. The supervision is achieved by learning a similarity metric from features that are selected to predict this clinical variable. More specifically, WSURFC first trains a random forest classifier to predict the clinical variable, in this case the survival class. The internal nodes of the trained forest are then used to derive a random forest similarity metric between pairs of samples. In this way, the clustering step utilizes the nonlinear subspace of the original features learned in the classification step. We first demonstrate WSURFC on handwritten digit datasets, where it captures salient structural similarities between digit pairs. Next, we apply WSURFC to find breast cancer subtypes using mRNA, protein, and microRNA expression as features. Our results on breast cancer show that WSURFC can identify interesting patient subgroups more effectively than widely adopted methods.
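
The two-stage idea described in the abstract (train a random forest on a clinical label, then cluster with a forest-derived similarity) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes scikit-learn and SciPy, uses the Jaccard overlap of the tree nodes two samples pass through as a stand-in for the paper's random forest similarity, uses average-linkage hierarchical clustering as a stand-in for the clustering step, and substitutes a hypothetical binary label on the scikit-learn digits data for the survival class. The function names are illustrative only.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier


def forest_path_similarity(forest, X):
    """Jaccard overlap of the node sets two samples visit across all trees.

    Assumption: this path-overlap measure is one possible forest similarity,
    not necessarily the metric defined in the paper.
    """
    # indicator[i, j] is nonzero when sample i passes through node j
    # (internal nodes and leaves of every tree, stacked column-wise).
    indicator, _ = forest.decision_path(X)
    indicator = indicator.astype(np.float64)
    shared = (indicator @ indicator.T).toarray()  # nodes visited by both samples
    visited = np.diag(shared)                     # nodes visited by each sample
    union = visited[:, None] + visited[None, :] - shared
    return shared / union


def weakly_supervised_clusters(X, weak_label, n_clusters, seed=0):
    """Fit a forest on the weak label, then cluster on the forest similarity."""
    forest = RandomForestClassifier(n_estimators=50, random_state=seed)
    forest.fit(X, weak_label)
    similarity = forest_path_similarity(forest, X)
    distance = 1.0 - similarity
    np.fill_diagonal(distance, 0.0)
    # Average-linkage clustering on the precomputed distances (an assumption).
    tree = linkage(squareform(distance, checks=False), method="average")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return similarity, labels


# Hypothetical stand-in for a clinical variable: is the digit 5 or larger?
X, y = load_digits(return_X_y=True)
X, y = X[:300], y[:300]
weak_label = (y >= 5).astype(int)
similarity, clusters = weakly_supervised_clusters(X, weak_label, n_clusters=4)
```

Because the forest only splits on features that help predict the label, samples end up close when they agree on those features, which is the sense in which the label weakly supervises the clustering in this sketch.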

Details

Primary Language

English

Subjects

Artificial Intelligence

Journal Section

Research Article

Publication Date

April 30, 2022

Submission Date

December 7, 2021

Acceptance Date

March 5, 2022

Published in Issue

Year 2022 Volume: 10 Number: 2

APA
Ozcelik, D., & Taştan, Ö. (2022). A Weakly Supervised Clustering Method for Cancer Subgroup Identification. Balkan Journal of Electrical and Computer Engineering, 10(2), 178-186. https://doi.org/10.17694/bajece.1033807
AMA
1. Ozcelik D, Taştan Ö. A Weakly Supervised Clustering Method for Cancer Subgroup Identification. Balkan Journal of Electrical and Computer Engineering. 2022;10(2):178-186. doi:10.17694/bajece.1033807
Chicago
Ozcelik, Duygu, and Öznur Taştan. 2022. “A Weakly Supervised Clustering Method for Cancer Subgroup Identification”. Balkan Journal of Electrical and Computer Engineering 10 (2): 178-86. https://doi.org/10.17694/bajece.1033807.
EndNote
Ozcelik D, Taştan Ö (April 1, 2022) A Weakly Supervised Clustering Method for Cancer Subgroup Identification. Balkan Journal of Electrical and Computer Engineering 10 2 178–186.
IEEE
[1] D. Ozcelik and Ö. Taştan, “A Weakly Supervised Clustering Method for Cancer Subgroup Identification”, Balkan Journal of Electrical and Computer Engineering, vol. 10, no. 2, pp. 178–186, Apr. 2022, doi: 10.17694/bajece.1033807.
ISNAD
Ozcelik, Duygu - Taştan, Öznur. “A Weakly Supervised Clustering Method for Cancer Subgroup Identification”. Balkan Journal of Electrical and Computer Engineering 10/2 (April 1, 2022): 178-186. https://doi.org/10.17694/bajece.1033807.
JAMA
1. Ozcelik D, Taştan Ö. A Weakly Supervised Clustering Method for Cancer Subgroup Identification. Balkan Journal of Electrical and Computer Engineering. 2022;10:178–186.
MLA
Ozcelik, Duygu, and Öznur Taştan. “A Weakly Supervised Clustering Method for Cancer Subgroup Identification”. Balkan Journal of Electrical and Computer Engineering, vol. 10, no. 2, Apr. 2022, pp. 178-86, doi:10.17694/bajece.1033807.
Vancouver
1. Duygu Ozcelik, Öznur Taştan. A Weakly Supervised Clustering Method for Cancer Subgroup Identification. Balkan Journal of Electrical and Computer Engineering. 2022 Apr. 1;10(2):178-86. doi:10.17694/bajece.1033807

All articles published by BAJECE are licensed under the Creative Commons Attribution 4.0 International License. This permits anyone to copy, redistribute, remix, transmit, and adapt the work, provided the original work and source are appropriately cited.