Research Article

Year 2025, Volume: 13 Issue: 2, 61 - 70, 31.05.2025
https://doi.org/10.21541/apjess.1659716

pPromoter-FCGR: Deep Learning on Frequency Chaos Game Representation for Prediction of DNA Promoters

Abstract

A promoter is a DNA sequence that helps initiate transcription by binding RNA polymerase. Promoters play a key role in biological processes such as gene expression, adaptation, and disease development. They have traditionally been identified with wet-lab approaches, which are laborious and costly, so computational methods are increasingly used instead. In this study, DNA sequences were converted into RGB images using the Frequency Chaos Game Representation (FCGR) method with k-mer lengths of 5 and 6, and several CNN models were used to classify them as promoters or non-promoters. The pretrained models ResNet-50, VGG16, and GoogleNet were evaluated alongside a custom 17-layer CNN model with optimized hyperparameters. The ResNet-50 model achieved an accuracy of 82% and an AUC of 0.89, the VGG16 model an accuracy of 80% and an AUC of 0.88, and the GoogleNet model an accuracy of 74% and an AUC of 0.82. However, this classification performance was lower than that reported in the existing literature. The proposed 17-layer CNN model improved on these results, achieving an accuracy of 83% and an AUC of 0.90, and thus outperformed the pretrained models in promoter prediction.
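
As a rough illustration of the Frequency Chaos Game Representation step, the sketch below counts the k-mers of a sequence into a 2^k x 2^k grid (32 x 32 for k = 5, 64 x 64 for k = 6). It is a minimal sketch under stated assumptions, not the study's implementation: the corner assignment of A, C, G, and T, the skipping of k-mers with ambiguous bases, the 0-255 scaling, and the function names `kmer_cell`, `fcgr`, and `to_uint8` are all hypothetical choices, and the way the article builds RGB images from the grid is not reproduced here.

```python
# Assumed corner layout (x, y) for the four bases; FCGR tools differ on this.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def kmer_cell(kmer):
    """Return the (row, col) cell of a k-mer in a 2**k x 2**k grid.

    As in the chaos game, the last base selects the coarsest quadrant,
    so bases are read from last to first (most to least significant bit).
    """
    row = col = 0
    for base in reversed(kmer):
        x, y = CORNERS[base]
        row = (row << 1) | y
        col = (col << 1) | x
    return row, col


def fcgr(sequence, k):
    """Count every overlapping k-mer of `sequence` into its grid cell."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if any(base not in CORNERS for base in kmer):
            continue  # assumption: k-mers with ambiguous bases (e.g. N) are skipped
        row, col = kmer_cell(kmer)
        grid[row][col] += 1
    return grid


def to_uint8(grid):
    """Scale counts to 0-255 so the grid can be stored as one image channel."""
    peak = max(max(row) for row in grid)
    if peak == 0:
        return [[0] * len(row) for row in grid]
    return [[round(255 * value / peak) for value in row] for row in grid]


if __name__ == "__main__":
    matrix = fcgr("ACGTACGTAC", k=5)
    print(len(matrix), "x", len(matrix[0]), "grid,",
          sum(map(sum, matrix)), "k-mers counted")
```

Because the grid size depends only on k, sequences of any length map to images of the same shape, which is what lets image classifiers such as CNNs take them as input.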

References

  • F. Xu et al., “dbDEMC 3.0: Functional Exploration of Differentially Expressed miRNAs in Cancers of Human and Model Organisms,” Genomics Proteomics Bioinformatics, vol. 20, no. 3, pp. 446–454, Jun. 2022, doi: 10.1016/j.gpb.2022.04.006.
  • D. Castanotto and J. J. Rossi, “The promises and pitfalls of RNA-interference-based therapeutics,” Nature, vol. 457, no. 7228, pp. 426–433, Jan. 2009, doi: 10.1038/nature07758.
  • A. L. Roy and D. S. Singer, “Core promoters in transcription: old problem, new insights,” Trends Biochem. Sci., vol. 40, no. 3, pp. 165–171, Mar. 2015, doi: 10.1016/j.tibs.2015.01.007.
  • T. I. Lee and R. A. Young, “Transcriptional Regulation and Its Misregulation in Disease,” Cell, vol. 152, no. 6, pp. 1237–1251, Mar. 2013, doi: 10.1016/j.cell.2013.02.014.
  • M. De Gobbi et al., “A Regulatory SNP Causes a Human Genetic Disease by Creating a New Transcriptional Promoter,” Science, vol. 312, no. 5777, pp. 1215–1217, May 2006, doi: 10.1126/science.1126431.
  • L. E. Montefiori et al., “A promoter interaction map for cardiovascular disease genetics,” eLife, vol. 7, Jul. 2018, doi: 10.7554/eLife.35788.
  • R. J. Leeman-Neill et al., “Noncoding mutations cause super-enhancer retargeting resulting in protein synthesis dysregulation during B cell lymphoma progression,” Nat. Genet., vol. 55, no. 12, pp. 2160–2174, Dec. 2023, doi: 10.1038/s41588-023-01561-1.
  • W. Suza and D. Lee, Genetics, agriculture, and biotechnology. Iowa State University, 2021.
  • P. Gade and D. V. Kalvakolanu, “Chromatin Immunoprecipitation Assay as a Tool for Analyzing Transcription Factor Activity,” in Transcriptional Regulation: Methods and Protocols, 2012, pp. 85–104.
  • C. B. Yildiz et al., “EphrinA5 regulates cell motility by modulating Snhg15/DNA triplex-dependent targeting of DNMT1 to the Ncam1 promoter,” Epigenetics Chromatin, vol. 16, no. 1, p. 42, Oct. 2023, doi: 10.1186/s13072-023-00516-4.
  • X. Xiao, Z.-C. Xu, W.-R. Qiu, P. Wang, H.-T. Ge, and K.-C. Chou, “iPSW(2L)-PseKNC: A two-layer predictor for identifying promoters and their strength by hybrid features via pseudo-K-tuple nucleotide composition,” Genomics, vol. 111, no. 6, pp. 1785–1793, Dec. 2019, doi: 10.1016/j.ygeno.2018.12.001.
  • M. Oubounyt, Z. Louadi, H. Tayara, and K. T. Chong, “DeePromoter: Robust Promoter Predictor Using Deep Learning,” Front. Genet., vol. 10, Apr. 2019, doi: 10.3389/fgene.2019.00286.
  • N. Q. K. Le, Q.-T. Ho, V.-N. Nguyen, and J.-S. Chang, “BERT-Promoter: An improved sequence-based predictor of DNA promoter using BERT pre-trained model and SHAP feature selection,” Comput. Biol. Chem., vol. 99, p. 107732, Aug. 2022, doi: 10.1016/j.compbiolchem.2022.107732.
  • Y. Li et al., “msBERT-Promoter: a multi-scale ensemble predictor based on BERT pre-trained model for the two-stage prediction of DNA promoters and their strengths,” BMC Biol., vol. 22, no. 1, p. 126, May 2024, doi: 10.1186/s12915-024-01923-z.
  • B. Peng, G. Sun, and Y. Fan, “iProL: identifying DNA promoters from sequence information based on Longformer pre-trained model,” BMC Bioinformatics, vol. 25, no. 1, p. 224, Jun. 2024, doi: 10.1186/s12859-024-05849-9.
  • F. Ben Nasr Barber and A. Elloumi Oueslati, “Human exons and introns classification using pre-trained Resnet-50 and GoogleNet models and 13-layers CNN model,” J. Genet. Eng. Biotechnol., vol. 22, no. 1, p. 100359, Mar. 2024, doi: 10.1016/j.jgeb.2024.100359.
  • S. T. Sara, M. M. Hasan, A. Ahmad, and S. Shatabda, “Convolutional neural networks with image representation of amino acid sequences for protein function prediction,” Comput. Biol. Chem., vol. 92, p. 107494, Jun. 2021, doi: 10.1016/j.compbiolchem.2021.107494.
  • J. Shang, C. Peng, X. Tang, and Y. Sun, “PhaVIP: Phage VIrion Protein classification based on chaos game representation and Vision Transformer,” Bioinformatics, vol. 39, no. Supplement_1, pp. i30–i39, Jun. 2023, doi: 10.1093/bioinformatics/btad229.
  • S. Gama-Castro et al., “RegulonDB version 9.0: high-level integration of gene regulation, coexpression, motif clustering and beyond,” Nucleic Acids Res., vol. 44, no. D1, pp. D133–D143, Jan. 2016, doi: 10.1093/nar/gkv1156.
  • M. F. Barnsley, Fractals Everywhere, 2nd ed. Academic Press, 2014.
  • H. J. Jeffrey, “Chaos game representation of gene structure,” Nucleic Acids Res., vol. 18, no. 8, pp. 2163–2170, 1990, doi: 10.1093/nar/18.8.2163.
  • A. Halder, Piyush, B. Mathew, and D. Sengupta, “Improved Python Package for DNA Sequence Encoding using Frequency Chaos Game Representation.” Apr. 18, 2024, doi: 10.1101/2024.04.14.589394.
  • A. Shabbir et al., “Satellite and Scene Image Classification Based on Transfer Learning and Fine Tuning of ResNet50,” Math. Probl. Eng., vol. 2021, pp. 1–18, Jul. 2021, doi: 10.1155/2021/5843816.
  • S. Tammina, “Transfer learning using VGG-16 with Deep Convolutional Neural Network for Classifying Images,” Int. J. Sci. Res. Publ., vol. 9, no. 10, p. p9420, Oct. 2019, doi: 10.29322/IJSRP.9.10.2019.p9420.
  • Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, May 2015, doi: 10.1038/nature14539.
  • N. Q. K. Le, E. K. Y. Yapp, N. Nagasundaram, and H.-Y. Yeh, “Classifying Promoters by Interpreting the Hidden Information of DNA Sequences via Deep Learning and Combination of Continuous FastText N-Grams,” Front. Bioeng. Biotechnol., vol. 7, Nov. 2019, doi: 10.3389/fbioe.2019.00305.
  • H. Tayara, M. Tahir, and K. T. Chong, “Identification of prokaryotic promoters and their strength by integrating heterogeneous features,” Genomics, vol. 112, no. 2, pp. 1396–1403, Mar. 2020, doi: 10.1016/j.ygeno.2019.08.009.
  • H. Li et al., “dPromoter-XGBoost: Detecting promoters and strength by combining multiple descriptors and feature selection using XGBoost,” Methods, vol. 204, pp. 215–222, Aug. 2022, doi: 10.1016/j.ymeth.2022.01.001.
  • Z. Zhang, J. Zhao, P.-J. Wei, and C.-H. Zheng, “iPromoter-CLA: Identifying promoters and their strength by deep capsule networks with bidirectional long short-term memory,” Comput. Methods Programs Biomed., vol. 226, p. 107087, Nov. 2022, doi: 10.1016/j.cmpb.2022.107087.
There are 29 citations in total.

Details

Primary Language English
Subjects Deep Learning, Classification Algorithms, Bioinformatics, Machine Learning (Other)
Journal Section Research Articles
Authors

Gülbahar Merve Şilbir 0000-0003-0321-7259

Early Pub Date May 30, 2025
Publication Date May 31, 2025
Submission Date March 17, 2025
Acceptance Date May 20, 2025
Published in Issue Year 2025 Volume: 13 Issue: 2

Cite

IEEE G. M. Şilbir, “pPromoter-FCGR: Deep Learning on Frequency Chaos Game Representation for Prediction of DNA Promoters,” APJESS, vol. 13, no. 2, pp. 61–70, 2025, doi: 10.21541/apjess.1659716.

Academic Platform Journal of Engineering and Smart Systems