TY  - JOUR
T1  - pPromoter-FCGR: Deep Learning on Frequency Chaos Game Representation for Prediction of DNA Promoters
AU  - Şilbir, Gülbahar Merve
PY  - 2025
DA  - May
Y2  - 2025
DO  - 10.21541/apjess.1659716
JF  - Academic Platform Journal of Engineering and Smart Systems
JO  - APJESS
PB  - Akademik Perspektif Derneği
WT  - DergiPark
SN  - 2822-2385
SP  - 61
EP  - 70
VL  - 13
IS  - 2
LA  - en
AB  - A promoter is defined as a DNA sequence that helps to initiate transcription by binding to RNA polymerase. It has a key role in various biological processes, such as gene expression, adaptation and disease development. Promoter identification methods used to be conventional wet-lab approaches, but these can be laborious and costly, so computational methods are now being used instead. In this study, DNA sequences were converted into RGB images using the Frequency Chaos Game Representation method for k-mer values of 5 and 6, and various CNN models were employed to classify promoters and non-promoters. Pretrained models such as ResNet-50, VGG16, and GoogleNet were utilized alongside a custom 17-layer CNN model with optimized hyperparameters. The ResNet-50 model achieved an accuracy of 82% and an AUC of 0.89, while the VGG16 model attained an accuracy of 80% and an AUC of 0.88. The GoogleNet model yielded an accuracy of 74% with an AUC of 0.82. However, the classification performance was observed to be lower compared to existing literature. The proposed 17-layer CNN model demonstrated improved performance, achieving an accuracy of 83% and an AUC of 0.90. The proposed CNN model outperformed pretrained models in promoter prediction.
KW  - Classification
KW  - Deep Learning
KW  - Frequency Chaos Game Representation
KW  - Pre-training CNN Models
KW  - Promoter
CR  - F. Xu et al., “dbDEMC 3.0: Functional Exploration of Differentially Expressed miRNAs in Cancers of Human and Model Organisms,” Genomics, Proteomics & Bioinformatics, vol. 20, no. 3, pp. 446–454, Jun. 2022, doi: 10.1016/j.gpb.2022.04.006.
CR  - D. Castanotto and J. J. Rossi, “The promises and pitfalls of RNA-interference-based therapeutics,” Nature, vol. 457, no. 7228, pp. 426–433, Jan. 2009, doi: 10.1038/nature07758.
CR  - A. L. Roy and D. S. Singer, “Core promoters in transcription: old problem, new insights,” Trends Biochem. Sci., vol. 40, no. 3, pp. 165–171, Mar. 2015, doi: 10.1016/j.tibs.2015.01.007.
CR  - T. I. Lee and R. A. Young, “Transcriptional Regulation and Its Misregulation in Disease,” Cell, vol. 152, no. 6, pp. 1237–1251, Mar. 2013, doi: 10.1016/j.cell.2013.02.014.
CR  - M. De Gobbi et al., “A Regulatory SNP Causes a Human Genetic Disease by Creating a New Transcriptional Promoter,” Science, vol. 312, no. 5777, pp. 1215–1217, May 2006, doi: 10.1126/science.1126431.
CR  - L. E. Montefiori et al., “A promoter interaction map for cardiovascular disease genetics,” eLife, vol. 7, Jul. 2018, doi: 10.7554/eLife.35788.
CR  - R. J. Leeman-Neill et al., “Noncoding mutations cause super-enhancer retargeting resulting in protein synthesis dysregulation during B cell lymphoma progression,” Nat. Genet., vol. 55, no. 12, pp. 2160–2174, Dec. 2023, doi: 10.1038/s41588-023-01561-1.
CR  - W. Suza and D. Lee, Genetics, agriculture, and biotechnology. Iowa State University, 2021.
CR  - P. Gade and D. V. Kalvakolanu, “Chromatin Immunoprecipitation Assay as a Tool for Analyzing Transcription Factor Activity,” in Transcriptional Regulation: Methods and Protocols, 2012, pp. 85–104.
CR  - C. B. Yildiz et al., “EphrinA5 regulates cell motility by modulating Snhg15/DNA triplex-dependent targeting of DNMT1 to the Ncam1 promoter,” Epigenetics Chromatin, vol. 16, no. 1, p. 42, Oct. 2023, doi: 10.1186/s13072-023-00516-4.
CR  - X. Xiao, Z.-C. Xu, W.-R. Qiu, P. Wang, H.-T. Ge, and K.-C. Chou, “iPSW(2L)-PseKNC: A two-layer predictor for identifying promoters and their strength by hybrid features via pseudo-K-tuple nucleotide composition,” Genomics, vol. 111, no. 6, pp. 1785–1793, Dec. 2019, doi: 10.1016/j.ygeno.2018.12.001.
CR  - M. Oubounyt, Z. Louadi, H. Tayara, and K. T. Chong, “DeePromoter: Robust Promoter Predictor Using Deep Learning,” Front. Genet., vol. 10, Apr. 2019, doi: 10.3389/fgene.2019.00286.
CR  - N. Q. K. Le, Q.-T. Ho, V.-N. Nguyen, and J.-S. Chang, “BERT-Promoter: An improved sequence-based predictor of DNA promoter using BERT pre-trained model and SHAP feature selection,” Comput. Biol. Chem., vol. 99, p. 107732, Aug. 2022, doi: 10.1016/j.compbiolchem.2022.107732.
CR  - Y. Li et al., “msBERT-Promoter: a multi-scale ensemble predictor based on BERT pre-trained model for the two-stage prediction of DNA promoters and their strengths,” BMC Biol., vol. 22, no. 1, p. 126, May 2024, doi: 10.1186/s12915-024-01923-z.
CR  - B. Peng, G. Sun, and Y. Fan, “iProL: identifying DNA promoters from sequence information based on Longformer pre-trained model,” BMC Bioinformatics, vol. 25, no. 1, p. 224, Jun. 2024, doi: 10.1186/s12859-024-05849-9.
CR  - F. Ben Nasr Barber and A. Elloumi Oueslati, “Human exons and introns classification using pre-trained Resnet-50 and GoogleNet models and 13-layers CNN model,” J. Genet. Eng. Biotechnol., vol. 22, no. 1, p. 100359, Mar. 2024, doi: 10.1016/j.jgeb.2024.100359.
CR  - S. T. Sara, M. M. Hasan, A. Ahmad, and S. Shatabda, “Convolutional neural networks with image representation of amino acid sequences for protein function prediction,” Comput. Biol. Chem., vol. 92, p. 107494, Jun. 2021, doi: 10.1016/j.compbiolchem.2021.107494.
CR  - J. Shang, C. Peng, X. Tang, and Y. Sun, “PhaVIP: Phage VIrion Protein classification based on chaos game representation and Vision Transformer,” Bioinformatics, vol. 39, no. Supplement_1, pp. i30–i39, Jun. 2023, doi: 10.1093/bioinformatics/btad229.
CR  - S. Gama-Castro et al., “RegulonDB version 9.0: high-level integration of gene regulation, coexpression, motif clustering and beyond,” Nucleic Acids Res., vol. 44, no. D1, pp. D133–D143, Jan. 2016, doi: 10.1093/nar/gkv1156.
CR  - M. F. Barnsley, Fractals Everywhere, 2nd ed. Academic Press, 2014.
CR  - H. J. Jeffrey, “Chaos game representation of gene structure,” Nucleic Acids Res., vol. 18, no. 8, pp. 2163–2170, 1990, doi: 10.1093/nar/18.8.2163.
CR  - A. Halder, Piyush, B. Mathew, and D. Sengupta, “Improved Python Package for DNA Sequence Encoding using Frequency Chaos Game Representation.” Apr. 18, 2024, doi: 10.1101/2024.04.14.589394.
CR  - A. Shabbir et al., “Satellite and Scene Image Classification Based on Transfer Learning and Fine Tuning of ResNet50,” Math. Probl. Eng., vol. 2021, pp. 1–18, Jul. 2021, doi: 10.1155/2021/5843816.
CR  - S. Tammina, “Transfer learning using VGG-16 with Deep Convolutional Neural Network for Classifying Images,” Int. J. Sci. Res. Publ., vol. 9, no. 10, p. p9420, Oct. 2019, doi: 10.29322/IJSRP.9.10.2019.p9420.
CR  - Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, May 2015, doi: 10.1038/nature14539.
CR  - N. Q. K. Le, E. K. Y. Yapp, N. Nagasundaram, and H.-Y. Yeh, “Classifying Promoters by Interpreting the Hidden Information of DNA Sequences via Deep Learning and Combination of Continuous FastText N-Grams,” Front. Bioeng. Biotechnol., vol. 7, Nov. 2019, doi: 10.3389/fbioe.2019.00305.
CR  - H. Tayara, M. Tahir, and K. T. Chong, “Identification of prokaryotic promoters and their strength by integrating heterogeneous features,” Genomics, vol. 112, no. 2, pp. 1396–1403, Mar. 2020, doi: 10.1016/j.ygeno.2019.08.009.
CR  - H. Li et al., “dPromoter-XGBoost: Detecting promoters and strength by combining multiple descriptors and feature selection using XGBoost,” Methods, vol. 204, pp. 215–222, Aug. 2022, doi: 10.1016/j.ymeth.2022.01.001.
CR  - Z. Zhang, J. Zhao, P.-J. Wei, and C.-H. Zheng, “iPromoter-CLA: Identifying promoters and their strength by deep capsule networks with bidirectional long short-term memory,” Comput. Methods Programs Biomed., vol. 226, p. 107087, Nov. 2022, doi: 10.1016/j.cmpb.2022.107087.
UR  - https://doi.org/10.21541/apjess.1659716
L1  - https://dergipark.org.tr/en/download/article-file/4698826
ER  - 