Research Article

Signal Processing and Dimensionality Reduction Algorithms for High-Dimensional Time-Series Data Applied to Biomolecular Simulations

Volume: 9 Number: 3 June 30, 2026

Abstract

High-dimensional time-series datasets with stochastic noise present fundamental challenges in signal processing and data analytics across diverse computational domains. This study develops and validates a systematic framework that integrates digital signal processing techniques with dimensionality reduction algorithms to extract meaningful trends from noisy, high-dimensional temporal data. We implement and compare five filtering approaches (Savitzky-Golay polynomial regression, Moving Average, Gaussian, Butterworth, and Wavelet denoising) combined with Principal Component Analysis (PCA) on GPU-accelerated molecular dynamics (MD) simulation infrastructure (NVIDIA GPU based on the Blackwell architecture, CUDA 12.9), benchmarking performance on two contrasting biomolecular systems: Frataxin (119 residues, rigid) and Carbonic Anhydrase IX (CAIX) (257 residues, dynamic). The five-filter comparison reveals consistent performance rankings across both systems: Wavelet achieves the highest signal-to-noise ratio (SNR) for both Frataxin (23.47 dB) and CAIX (13.80 dB), while Savitzky-Golay provides the best balance between noise reduction and low-frequency preservation (>99%). Critically, Savitzky-Golay's improvement over the Moving Average is substantially greater for dynamic CAIX (16.3%) than for rigid Frataxin (3.1%), showing that the gain is largest precisely where distinguishing conformational transitions from thermal noise is most challenging. All filters preserve low-frequency conformational dynamics while reducing high-frequency noise, and the extracted noise components are validated by Gaussian distribution analysis (σ ≈ 0.008 nm) across both systems and all filtering methods. PCA-based dimensionality reduction achieves 11.9:1 compression (357 dimensions → 30 principal components) for Frataxin while retaining 80% of conformational variance, whereas CAIX requires only 11 components to capture the same variance because its dynamics are dominated by collective motions.
The complete analysis pipeline processes 50,001 frames in less than 0.2 seconds, a negligible overhead (<0.001%) relative to simulation time. Cross-system validation with consistent filter rankings confirms that the methodology generalizes to proteins spanning diverse dynamical regimes. All analysis pipelines are implemented in Python 3.13 with open-source libraries (NumPy, SciPy, Matplotlib) to ensure reproducibility and extensibility.
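As an illustration of the kind of pipeline the abstract describes, the following is a minimal sketch, not the authors' implementation. It applies four of the five filters using NumPy and SciPy only, estimates an SNR in dB, and finds how many principal components reach 80% cumulative variance. Wavelet denoising is omitted because it would need a package outside the libraries listed above. The data are synthetic, and the window length, polynomial order, Gaussian sigma, Butterworth cutoff, the SNR definition (filtered-signal power over residual power), and all function names are assumptions chosen for the example.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d, uniform_filter1d
from scipy.signal import butter, filtfilt, savgol_filter


def snr_db(filtered, noise):
    """Assumed SNR definition: 10*log10(filtered-signal power / residual power)."""
    return 10.0 * np.log10(np.mean(filtered**2) / np.mean(noise**2))


def apply_filters(x, window=51, polyorder=3, sigma=8.0, cutoff=0.05):
    """Smooth a 1-D series with four filters (all parameters are illustrative)."""
    b, a = butter(4, cutoff)  # 4th-order low-pass; cutoff is a fraction of Nyquist
    return {
        "savgol": savgol_filter(x, window_length=window, polyorder=polyorder),
        "moving_average": uniform_filter1d(x, size=window),
        "gaussian": gaussian_filter1d(x, sigma=sigma),
        "butterworth": filtfilt(b, a, x),  # forward-backward pass, zero phase
    }


def components_for_variance(coords, target=0.80):
    """Smallest number of principal components whose cumulative variance >= target."""
    centered = coords - coords.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)  # singular values, descending
    cum = np.cumsum(s**2) / np.sum(s**2)           # cumulative variance ratio
    k = int(np.searchsorted(cum, target)) + 1
    return k, cum


rng = np.random.default_rng(0)

# Synthetic 1-D observable: slow oscillation plus Gaussian noise (sigma = 0.008).
n_frames = 5000
t = np.arange(n_frames)
x = 0.1 * np.sin(2 * np.pi * t / 2500) + rng.normal(0.0, 0.008, n_frames)

results = {}
for name, smooth in apply_filters(x).items():
    noise = x - smooth  # extracted noise component
    results[name] = (float(snr_db(smooth, noise)), float(noise.std()))
    print(f"{name:15s} SNR = {results[name][0]:6.2f} dB, "
          f"noise sigma = {results[name][1]:.4f}")

# Synthetic coordinates: 357 dimensions driven by 5 collective modes plus noise.
latent = rng.normal(size=(2000, 5)) * np.array([5.0, 4.0, 3.0, 2.0, 1.0])
coords = latent @ rng.normal(size=(5, 357)) + rng.normal(0.0, 0.1, (2000, 357))
k, cum = components_for_variance(coords, target=0.80)
print(f"{k} components reach 80% variance; compression = {coords.shape[1] / k:.1f}:1")
```

The PCA step uses a singular value decomposition of the mean-centered data, which gives the same variance ratios as diagonalizing the covariance matrix. The compression ratio is the original dimension divided by the number of retained components, as in the 357 / 30 = 11.9 figure quoted for Frataxin.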

Keywords

Project Number

4733406.03.2025-A3-01.

Ethical Statement

This study is based on computational simulations using publicly available structural data and does not involve human participants, animal subjects, or patient data. Therefore, ethics committee approval was not required. The authors declare that scientific and ethical principles were followed during the preparation of this study and that all sources used are cited in the references.

Acknowledgments

This research forms part of the Ph.D. thesis of Kevser Kübra Kırboğa at Süleyman Demirel University. This research was supported by the Scientific and Technological Research Council of Turkey (TÜBİTAK) under the BİDEB 2211-A National Doctoral Scholarship Program and by the Health Institutes of Turkey (TÜSEB) under grant number 4733406.03.2025-A3-01.

Details

Primary Language

English

Subjects

Computing Applications in Life Sciences

Journal Section

Research Article

Early Pub Date

June 25, 2026

Publication Date

June 30, 2026

Submission Date

November 28, 2025

Acceptance Date

February 11, 2026

Published in Issue

Year 2026 Volume: 9 Number: 3

APA
Kırboğa, K. K., & Küçüksille, E. U. (2026). Signal Processing and Dimensionality Reduction Algorithms for High-Dimensional Time-Series Data Applied to Biomolecular Simulations. Sakarya University Journal of Computer and Information Sciences, 9(3), 958-979. https://doi.org/10.35377/saucis.1832143
AMA
1. Kırboğa KK, Küçüksille EU. Signal Processing and Dimensionality Reduction Algorithms for High-Dimensional Time-Series Data Applied to Biomolecular Simulations. SAUCIS. 2026;9(3):958-979. doi:10.35377/saucis.1832143
Chicago
Kırboğa, Kevser Kübra, and Ecir Uğur Küçüksille. 2026. “Signal Processing and Dimensionality Reduction Algorithms for High-Dimensional Time-Series Data Applied to Biomolecular Simulations”. Sakarya University Journal of Computer and Information Sciences 9 (3): 958-79. https://doi.org/10.35377/saucis.1832143.
EndNote
Kırboğa KK, Küçüksille EU (June 1, 2026) Signal Processing and Dimensionality Reduction Algorithms for High-Dimensional Time-Series Data Applied to Biomolecular Simulations. Sakarya University Journal of Computer and Information Sciences 9 3 958–979.
IEEE
[1] K. K. Kırboğa and E. U. Küçüksille, “Signal Processing and Dimensionality Reduction Algorithms for High-Dimensional Time-Series Data Applied to Biomolecular Simulations”, SAUCIS, vol. 9, no. 3, pp. 958–979, June 2026, doi: 10.35377/saucis.1832143.
ISNAD
Kırboğa, Kevser Kübra - Küçüksille, Ecir Uğur. “Signal Processing and Dimensionality Reduction Algorithms for High-Dimensional Time-Series Data Applied to Biomolecular Simulations”. Sakarya University Journal of Computer and Information Sciences 9/3 (June 1, 2026): 958-979. https://doi.org/10.35377/saucis.1832143.
JAMA
1. Kırboğa KK, Küçüksille EU. Signal Processing and Dimensionality Reduction Algorithms for High-Dimensional Time-Series Data Applied to Biomolecular Simulations. SAUCIS. 2026;9:958–979.
MLA
Kırboğa, Kevser Kübra, and Ecir Uğur Küçüksille. “Signal Processing and Dimensionality Reduction Algorithms for High-Dimensional Time-Series Data Applied to Biomolecular Simulations”. Sakarya University Journal of Computer and Information Sciences, vol. 9, no. 3, June 2026, pp. 958-79, doi:10.35377/saucis.1832143.
Vancouver
1. Kevser Kübra Kırboğa, Ecir Uğur Küçüksille. Signal Processing and Dimensionality Reduction Algorithms for High-Dimensional Time-Series Data Applied to Biomolecular Simulations. SAUCIS. 2026 Jun. 1;9(3):958-79. doi:10.35377/saucis.1832143

 


The papers in this journal are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.