Research Article

Topology-Preserving Scaling in Data Augmentation

Year 2025, Volume: 7, Issue: 1, 9-26, 30.04.2025
https://doi.org/10.47087/mjm.1615296
https://izlik.org/JA23TN62KE

Abstract

We propose an algorithmic framework for dataset normalization in data augmentation pipelines that preserves topological stability under non-uniform scaling transformations. Given a finite metric space \( X \subset \mathbb{R}^n \) with Euclidean distance \( d_X \), we consider scaling transformations defined by scaling factors \( s_1, s_2, \ldots, s_n > 0 \). Specifically, we define a scaling function \( S \) that maps each point \( x = (x_1, x_2, \ldots, x_n) \in X \) to
\[
S(x) = (s_1 x_1, s_2 x_2, \ldots, s_n x_n).
\]
Our main result establishes that the bottleneck distance \( d_B(D, D_S) \) between the persistence diagrams \( D \) of \( X \) and \( D_S \) of \( S(X) \) satisfies:
\[
d_B(D, D_S) \leq (s_{\max} - s_{\min}) \cdot \operatorname{diam}(X),
\]
where \( s_{\min} = \min_{1 \leq i \leq n} s_i \), \( s_{\max} = \max_{1 \leq i \leq n} s_i \), and \( \operatorname{diam}(X) \) is the diameter of \( X \). Based on this theoretical guarantee, we formulate an optimization problem to minimize the scaling variability \( \Delta_s = s_{\max} - s_{\min} \) under the constraint \( d_B(D, D_S) \leq \epsilon \), where \( \epsilon > 0 \) is a user-defined tolerance.
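
As a minimal illustration of the quantities in this bound, the sketch below applies \( S \) to a small point cloud and evaluates the right-hand side \( (s_{\max} - s_{\min}) \cdot \operatorname{diam}(X) \). It uses NumPy only. The function names `scale_points`, `diameter`, and `bottleneck_bound` and the sample data are illustrative assumptions, not taken from the paper, and the sketch does not compute persistence diagrams or the bottleneck distance itself.

```python
import numpy as np


def scale_points(X, s):
    """Apply S(x) = (s_1 x_1, ..., s_n x_n) to every row of X."""
    X = np.asarray(X, dtype=float)
    s = np.asarray(s, dtype=float)
    if s.shape != (X.shape[1],) or np.any(s <= 0):
        raise ValueError("need one positive scaling factor per coordinate")
    return X * s  # broadcasting scales column i by s_i


def diameter(X):
    """Largest pairwise Euclidean distance in the finite set X."""
    X = np.asarray(X, dtype=float)
    diffs = X[:, None, :] - X[None, :, :]
    return float(np.sqrt((diffs ** 2).sum(axis=-1)).max())


def bottleneck_bound(X, s):
    """Right-hand side of the stated bound: (s_max - s_min) * diam(X)."""
    s = np.asarray(s, dtype=float)
    return float((s.max() - s.min()) * diameter(X))


# Hypothetical example data: the corners of a 3-by-4 rectangle.
X = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
s = np.array([1.0, 1.5])

print(scale_points(X, s))      # second coordinate stretched by 1.5
print(diameter(X))             # 5.0 (the rectangle's diagonal)
print(bottleneck_bound(X, s))  # 2.5 = (1.5 - 1.0) * 5.0
```

The pairwise-difference array in `diameter` needs memory quadratic in the number of points, which is acceptable for a small example but would likely need a blockwise computation for large datasets.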

We develop an algorithmic solution to this problem, so that data augmentation via scaling transformations preserves essential topological features up to the tolerance \( \epsilon \). We further extend the analysis to higher-dimensional homological features, alternative metrics such as the Wasserstein distance, and iterative or probabilistic scaling scenarios. Together, these results provide a rigorous mathematical framework for dataset normalization in data augmentation pipelines.
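
One way to read the constraint is through the bound stated above: if \( \Delta_s \cdot \operatorname{diam}(X) \leq \epsilon \), then \( d_B(D, D_S) \leq \epsilon \) follows, so \( \Delta_s \leq \epsilon / \operatorname{diam}(X) \) is a sufficient condition. The sketch below uses that condition to pull a set of candidate scaling factors toward their midpoint until the spread is admissible. This is a hypothetical heuristic for illustration, not the algorithm developed in the paper; the names `max_allowed_spread` and `contract_factors` and the example values are assumptions.

```python
import numpy as np


def max_allowed_spread(diam, eps):
    """Largest Delta_s for which Delta_s * diam(X) stays at or below eps."""
    if diam <= 0 or eps <= 0:
        raise ValueError("diam and eps must be positive")
    return eps / diam


def contract_factors(s, diam, eps):
    """Shrink factors toward their midpoint until the spread is admissible.

    Hypothetical heuristic based on the sufficient condition
    Delta_s <= eps / diam(X); not the paper's algorithm.
    """
    s = np.asarray(s, dtype=float)
    if np.any(s <= 0):
        raise ValueError("scaling factors must be positive")
    spread = s.max() - s.min()
    limit = max_allowed_spread(diam, eps)
    if spread <= limit:
        return s.copy()  # already satisfies the sufficient condition
    mid = 0.5 * (s.max() + s.min())
    # A contraction toward the midpoint keeps every factor positive,
    # because results stay between the original minimum and maximum.
    return mid + (s - mid) * (limit / spread)


# Assumed example values: diam(X) = 5.0 and tolerance eps = 2.5.
candidates = np.array([0.5, 1.0, 1.5])
adjusted = contract_factors(candidates, diam=5.0, eps=2.5)
print(adjusted)                         # [0.75 1.   1.25]
print(adjusted.max() - adjusted.min())  # 0.5, so the bound gives 2.5 <= eps
```

Contracting toward the midpoint preserves the ordering of the factors and changes each one as little as a uniform contraction allows; other choices, such as contracting toward 1, would satisfy the same sufficient condition.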

Ethical Statement

The authors declare no conflicting interests.

References

  • A. Zomorodian and G. Carlsson, Computing persistent homology, Discrete & Comput. Geom. 33 (2005), no. 2, 249–274. doi:10.1007/s00454-004-1146-y. Available at [https://arxiv.org/abs/cs/0306106](https://arxiv.org/abs/cs/0306106).
  • H. Edelsbrunner and J. Harer, Persistent homology—a survey, Contemp. Math. 453 (2008), 257–282. doi:10.1090/conm/453/08802. Available at [https://www.cs.duke.edu/~jeb/CTIC/cti.pdf](https://www.cs.duke.edu/~jeb/CTIC/cti.pdf).
  • D. Cohen-Steiner, H. Edelsbrunner, and J. Harer, Stability of persistence diagrams, Discrete & Comput. Geom. 37 (2007), no. 1, 103–120. doi:10.1007/s00454-006-1276-5. Available at [https://arxiv.org/abs/math/0510337](https://arxiv.org/abs/math/0510337).
  • G. Carlsson and V. de Silva, Zigzag persistence, Found. Comput. Math. 10 (2010), no. 4, 367–405. doi:10.1007/s10208-010-9066-0. Available at [https://arxiv.org/abs/math/0604450](https://arxiv.org/abs/math/0604450).
  • M. Jaderberg, K. Simonyan, A. Zisserman, and K. Kavukcuoglu, Spatial transformer networks, in Advances in Neural Information Processing Systems, vol. 28, 2017–2025 (2015). Available at [https://arxiv.org/abs/1506.02025](https://arxiv.org/abs/1506.02025).
  • S. Ioffe and C. Szegedy, Batch normalization: Accelerating deep network training by reducing internal covariate shift, in Proceedings of the 32nd International Conference on Machine Learning, 448–456 (2015). Available at [https://arxiv.org/abs/1502.03167](https://arxiv.org/abs/1502.03167).
  • U. Bauer and M. Lesnick, Induced matchings of barcodes and the algebraic stability of persistence, J. Comput. Geom. 6 (2015), no. 2, 162–191. doi:10.20382/jocg.v6i2a7. Available at [https://www.jocg.org/index.php/jocg/article/view/285](https://www.jocg.org/index.php/jocg/article/view/285).

There are 7 citations in total.

Details

Primary Language English
Subjects Operations Research in Mathematics
Journal Section Research Article
Authors

Vu-anh Le

Mehmet Dik

Submission Date January 7, 2025
Acceptance Date January 31, 2025
Publication Date April 30, 2025
DOI https://doi.org/10.47087/mjm.1615296
IZ https://izlik.org/JA23TN62KE
Published in Issue Year 2025 Volume: 7 Issue: 1

Cite

APA Le, V.-anh, & Dik, M. (2025). Topology-Preserving Scaling in Data Augmentation. Maltepe Journal of Mathematics, 7(1), 9-26. https://doi.org/10.47087/mjm.1615296
AMA 1. Le V-anh, Dik M. Topology-Preserving Scaling in Data Augmentation. Maltepe Journal of Mathematics. 2025;7(1):9-26. doi:10.47087/mjm.1615296
Chicago Le, Vu-anh, and Mehmet Dik. 2025. “Topology-Preserving Scaling in Data Augmentation”. Maltepe Journal of Mathematics 7 (1): 9-26. https://doi.org/10.47087/mjm.1615296.
EndNote Le V-anh, Dik M (April 1, 2025) Topology-Preserving Scaling in Data Augmentation. Maltepe Journal of Mathematics 7 1 9–26.
IEEE [1] V.-anh Le and M. Dik, “Topology-Preserving Scaling in Data Augmentation”, Maltepe Journal of Mathematics, vol. 7, no. 1, pp. 9–26, Apr. 2025, doi: 10.47087/mjm.1615296.
ISNAD Le, Vu-anh - Dik, Mehmet. “Topology-Preserving Scaling in Data Augmentation”. Maltepe Journal of Mathematics 7/1 (April 1, 2025): 9-26. https://doi.org/10.47087/mjm.1615296.
JAMA 1. Le V-anh, Dik M. Topology-Preserving Scaling in Data Augmentation. Maltepe Journal of Mathematics. 2025;7:9–26.
MLA Le, Vu-anh, and Mehmet Dik. “Topology-Preserving Scaling in Data Augmentation”. Maltepe Journal of Mathematics, vol. 7, no. 1, Apr. 2025, pp. 9-26, doi:10.47087/mjm.1615296.
Vancouver 1. Vu-anh Le, Mehmet Dik. Topology-Preserving Scaling in Data Augmentation. Maltepe Journal of Mathematics. 2025 Apr. 1;7(1):9-26. doi:10.47087/mjm.1615296

Creative Commons License
The published articles in MJM are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

ISSN 2667-7660