Signal Processing and Dimensionality Reduction Algorithms for High-Dimensional Time-Series Data Applied to Biomolecular Simulations
Abstract
High-dimensional time-series datasets with stochastic noise present fundamental challenges in signal processing and data analytics across diverse computational domains. This study develops and validates a systematic framework that integrates digital signal processing techniques with dimensionality reduction algorithms to extract meaningful trends from noisy, high-dimensional temporal data. We implement and compare five filtering approaches (Savitzky-Golay polynomial regression, Moving Average, Gaussian, Butterworth, and Wavelet denoising) combined with Principal Component Analysis (PCA) on GPU-accelerated molecular dynamics (MD) simulation infrastructure (an NVIDIA Blackwell-architecture GPU, CUDA 12.9), benchmarking performance on two contrasting biomolecular systems: Frataxin (119 residues, rigid) and Carbonic Anhydrase IX (CAIX; 257 residues, dynamic). The five-filter comparison yields consistent performance rankings across both systems: Wavelet achieves the highest signal-to-noise ratio (SNR) for both Frataxin (23.47 dB) and CAIX (13.80 dB), while Savitzky-Golay provides the best balance between noise reduction and low-frequency preservation (>99%). Critically, Savitzky-Golay’s improvement over the Moving Average is substantially greater for dynamic CAIX (16.3%) than for rigid Frataxin (3.1%), indicating that its advantage is largest precisely where distinguishing conformational transitions from thermal noise is most challenging. All filters preserve low-frequency conformational dynamics while reducing high-frequency noise, and the extracted noise components are validated by Gaussian distribution analysis (σ ≈ 0.008 nm) across both systems and all filtering methods. PCA-based dimensionality reduction achieves 11.9:1 compression (357 dimensions → 30 principal components) for Frataxin while retaining 80% of conformational variance; CAIX requires only 11 components for equivalent variance capture because its dynamics are dominated by collective motions.
The complete analysis pipeline processes 50,001 frames in less than 0.2 seconds, a negligible overhead (<0.001%) relative to simulation time. The consistent filter rankings across both systems indicate that the methodology generalizes to proteins spanning different dynamical regimes. All analysis pipelines are implemented in Python 3.13 with open-source libraries (NumPy, SciPy, Matplotlib) to ensure reproducibility and extensibility.
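The filter-then-reduce workflow summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the synthetic trajectory, the array sizes, the window length of 51, the polynomial order of 3, and the helper names `moving_average`, `snr_db`, and `n_components_for_variance` are all assumptions chosen for illustration. It smooths a noisy multichannel signal with Savitzky-Golay and Moving Average filters, scores each against the known clean signal, and counts the principal components needed to reach 80% of the variance.

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)

# Synthetic stand-in for a trajectory (hypothetical sizes, much smaller than
# the 50,001-frame, 357-dimensional Frataxin data named in the abstract).
n_frames, n_dims, n_modes = 2000, 30, 3
t = np.linspace(0.0, 1.0, n_frames)

# A few slow "collective" modes mixed into every coordinate, plus white noise.
modes = np.stack([np.sin(2 * np.pi * (k + 1) * t) for k in range(n_modes)], axis=1)
mixing = rng.normal(size=(n_modes, n_dims))
clean = modes @ mixing
noisy = clean + rng.normal(scale=0.3, size=clean.shape)


def moving_average(x, window):
    """Centered moving average along the frame axis (zero-padded at the edges)."""
    kernel = np.ones(window) / window
    return np.apply_along_axis(lambda col: np.convolve(col, kernel, mode="same"), 0, x)


def snr_db(reference, estimate):
    """SNR in dB of an estimate measured against a known reference signal."""
    residual = estimate - reference
    return 10.0 * np.log10(np.sum(reference**2) / np.sum(residual**2))


def n_components_for_variance(x, target=0.80):
    """Smallest number of principal components reaching the target variance fraction."""
    centered = x - x.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    cumulative = np.cumsum(singular_values**2) / np.sum(singular_values**2)
    return int(np.searchsorted(cumulative, target) + 1)


# Assumed filter settings: 51-frame window, cubic polynomial.
sg = savgol_filter(noisy, window_length=51, polyorder=3, axis=0)
ma = moving_average(noisy, 51)

snr_raw = snr_db(clean, noisy)
snr_sg = snr_db(clean, sg)
snr_ma = snr_db(clean, ma)

k = n_components_for_variance(sg)
compression = n_dims / k  # same ratio as the abstract's 357 / 30 ≈ 11.9

print(f"SNR raw {snr_raw:.2f} dB, SG {snr_sg:.2f} dB, MA {snr_ma:.2f} dB")
print(f"{k} components for 80% variance, compression {compression:.1f}:1")
```

On real trajectories no clean reference exists, so the SNR would have to be estimated differently than in this toy setup. The relative ranking of the filters also depends on the window length and on how fast the underlying motions are.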
Keywords
- Dimensionality reduction
- Digital signal processing
- Molecular dynamics simulation
- Principal component analysis
- Savitzky-Golay filter
Details
Primary Language: English
Subjects: Computing Applications in Life Sciences
Journal Section: Research Article
Early Pub Date: June 25, 2026
Publication Date: June 30, 2026
Submission Date: November 28, 2025
Acceptance Date: February 11, 2026
Published in Issue: Year 2026, Volume 9, Number 3
