Topology-Preserving Scaling in Data Augmentation
Abstract
We propose an algorithmic framework for dataset normalization in data augmentation pipelines that preserves topological stability under non-uniform scaling transformations. Given a finite metric space \( X \subset \mathbb{R}^n \) with Euclidean distance \( d_X \), we consider scaling transformations defined by scaling factors \( s_1, s_2, \ldots, s_n > 0 \). Specifically, we define a scaling function \( S \) that maps each point \( x = (x_1, x_2, \ldots, x_n) \in X \) to
\[
S(x) = (s_1 x_1, s_2 x_2, \ldots, s_n x_n).
\]
Our main result establishes that the bottleneck distance \( d_B(D, D_S) \) between the persistence diagrams \( D \) of \( X \) and \( D_S \) of \( S(X) \) satisfies:
\[
d_B(D, D_S) \leq (s_{\max} - s_{\min}) \cdot \operatorname{diam}(X),
\]
where \( s_{\min} = \min_{1 \leq i \leq n} s_i \), \( s_{\max} = \max_{1 \leq i \leq n} s_i \), and \( \operatorname{diam}(X) \) is the diameter of \( X \). Based on this theoretical guarantee, we formulate an optimization problem to minimize the scaling variability \( \Delta_s = s_{\max} - s_{\min} \) under the constraint \( d_B(D, D_S) \leq \epsilon \), where \( \epsilon > 0 \) is a user-defined tolerance.
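As an illustration of how the stated bound can be turned into a design rule (a sketch that takes the inequality above as given and assumes \( \operatorname{diam}(X) > 0 \)), note that the constraint \( d_B(D, D_S) \leq \epsilon \) is guaranteed whenever the right-hand side of the bound is at most \( \epsilon \):
\[
(s_{\max} - s_{\min}) \cdot \operatorname{diam}(X) \leq \epsilon
\quad \Longleftrightarrow \quad
\Delta_s \leq \frac{\epsilon}{\operatorname{diam}(X)}.
\]
For instance, with the purely illustrative values \( \operatorname{diam}(X) = 10 \) and \( \epsilon = 0.5 \), any choice of scaling factors with \( \Delta_s \leq 0.05 \) satisfies this sufficient condition.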
We develop an algorithm that solves this problem, so that data augmentation via scaling transformations preserves essential topological features. We further extend the analysis to higher-dimensional homological features, alternative metrics such as the Wasserstein distance, and iterative or probabilistic scaling scenarios. Together, these results provide a rigorous mathematical framework for dataset normalization in data augmentation pipelines.
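The quantities named in the abstract can be computed with a few lines of NumPy. The following is a minimal sketch, not the authors' algorithm: the function names (`scale_points`, `diameter`, `bottleneck_bound`, `max_scaling_variability`) are hypothetical, the point cloud is a toy example, and the code only evaluates the right-hand side of the stated bound; it does not compute persistence diagrams or the bottleneck distance itself.

```python
import numpy as np


def scale_points(X, s):
    """Apply S(x) = (s_1 x_1, ..., s_n x_n) to every row of X."""
    X = np.asarray(X, dtype=float)
    s = np.asarray(s, dtype=float)
    if X.ndim != 2 or s.shape != (X.shape[1],):
        raise ValueError("X must be (m, n) and s must have length n")
    if np.any(s <= 0):
        raise ValueError("all scaling factors must be positive")
    return X * s  # broadcasting scales column i by s[i]


def diameter(X):
    """Largest pairwise Euclidean distance in the finite point set X."""
    X = np.asarray(X, dtype=float)
    diffs = X[:, None, :] - X[None, :, :]
    return float(np.sqrt((diffs ** 2).sum(axis=-1)).max())


def bottleneck_bound(X, s):
    """Right-hand side of the stated bound: (s_max - s_min) * diam(X)."""
    s = np.asarray(s, dtype=float)
    return float((s.max() - s.min()) * diameter(X))


def max_scaling_variability(X, eps):
    """Largest Delta_s for which the stated bound is at most eps."""
    diam = diameter(X)
    if diam == 0:
        raise ValueError("diameter is zero; any Delta_s meets the bound")
    return eps / diam


# Toy example (illustrative values only): a 3-4-5 right triangle.
X = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
s = np.array([1.0, 1.1])

X_scaled = scale_points(X, s)
print("diam(X) =", diameter(X))
print("bound on d_B =", bottleneck_bound(X, s))
print("max Delta_s for eps=0.25:", max_scaling_variability(X, 0.25))
```

Checking candidate scaling factors against `max_scaling_variability` before augmenting is one simple way to use the bound as a filter; confirming the actual bottleneck distance would require a persistent homology library, which this sketch deliberately leaves out.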
Ethical Statement
The authors declare no conflicts of interest.
Details
Primary Language
English
Subjects
Operations Research in Mathematics
Journal Section
Research Article
Publication Date
April 30, 2025
Submission Date
January 7, 2025
Acceptance Date
January 31, 2025
Published in Issue
Year 2025 Volume: 7 Number: 1
APA
Le, V.-A., & Dik, M. (2025). Topology-Preserving Scaling in Data Augmentation. Maltepe Journal of Mathematics, 7(1), 9-26. https://doi.org/10.47087/mjm.1615296
Chicago
Le, Vu-anh, and Mehmet Dik. 2025. “Topology-Preserving Scaling in Data Augmentation”. Maltepe Journal of Mathematics 7 (1): 9-26. https://doi.org/10.47087/mjm.1615296.
