A promoter is a DNA sequence that helps initiate transcription by binding RNA polymerase. Promoters play a key role in biological processes such as gene expression, adaptation, and disease development. They have traditionally been identified with wet-lab methods, which are laborious and costly, so computational methods are increasingly used instead. In this study, DNA sequences were converted into RGB images using the Frequency Chaos Game Representation (FCGR) method with k-mer lengths of 5 and 6, and several CNN models were used to classify the sequences as promoters or non-promoters. The pretrained models ResNet-50, VGG16, and GoogleNet were evaluated alongside a custom 17-layer CNN with optimized hyperparameters. ResNet-50 achieved an accuracy of 82% and an AUC of 0.89, VGG16 an accuracy of 80% and an AUC of 0.88, and GoogleNet an accuracy of 74% and an AUC of 0.82. This classification performance, however, was lower than that reported in the existing literature. The proposed 17-layer CNN performed better, achieving an accuracy of 83% and an AUC of 0.90, and thus outperformed the pretrained models in promoter prediction.
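The FCGR step mentioned in the abstract can be illustrated with a minimal sketch. It counts every k-mer in a sequence and places the count in a 2^k × 2^k grid by following the chaos game corner rule. The corner assignment (A, C, G, T), the skipping of k-mers that contain ambiguous bases, and the function names `fcgr` and `to_grayscale` are assumptions made for illustration, not details taken from the study. The abstract does not say how counts were mapped to RGB channels, so the sketch stops at a single-channel 8-bit image.

```python
import numpy as np

# Assumed corner convention (x_bit, y_bit); other layouts are equally valid.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def fcgr(sequence, k):
    """Return a 2^k x 2^k matrix of k-mer counts laid out by the chaos game rule."""
    size = 2 ** k
    matrix = np.zeros((size, size), dtype=np.float64)
    seq = sequence.upper()
    for start in range(len(seq) - k + 1):
        kmer = seq[start:start + k]
        # Assumption: k-mers containing bases other than A, C, G, T are skipped.
        if any(base not in CORNERS for base in kmer):
            continue
        row = col = 0
        # The last base of the k-mer selects the largest quadrant,
        # so it contributes the most significant bit.
        for i, base in enumerate(kmer):
            x_bit, y_bit = CORNERS[base]
            col |= x_bit << i
            row |= y_bit << i
        matrix[row, col] += 1
    return matrix


def to_grayscale(matrix):
    """Scale counts to 0..255 so the matrix can be saved as an 8-bit image."""
    peak = matrix.max()
    if peak == 0:
        return np.zeros(matrix.shape, dtype=np.uint8)
    return np.round(255 * matrix / peak).astype(np.uint8)


if __name__ == "__main__":
    example = "ACGTTGCAACGTAGCTAGCTAGGATCCA"  # made-up sequence, not study data
    for k in (5, 6):
        image = to_grayscale(fcgr(example, k))
        print(k, image.shape)
```

With k = 5 the grid is 32 × 32 and with k = 6 it is 64 × 64, so images of this kind would normally be resized before being passed to pretrained networks that expect larger inputs.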
Keywords: Classification, Deep Learning, Frequency Chaos Game Representation, Pre-training CNN Models, Promoter
Primary Language | English |
---|---|
Subjects | Deep Learning, Classification Algorithms, Bioinformatics, Machine Learning (Other) |
Journal Section | Research Articles |
Authors | |
Early Pub Date | May 30, 2025 |
Publication Date | May 31, 2025 |
Submission Date | March 17, 2025 |
Acceptance Date | May 20, 2025 |
Published in Issue | Year 2025 Volume: 13 Issue: 2 |
Academic Platform Journal of Engineering and Smart Systems