Optimizing Large-Scale Data Processing in Smart Manufacturing: A Benchmarking Study on Automotive Industry Data
Abstract
Automotive production lines generate large volumes of data, and analyzing this data is critical for predictive maintenance, energy efficiency, and quality control. As data volumes grow, however, traditional methods reach their limits, which makes it necessary to evaluate the performance of alternative libraries. This paper compares the performance characteristics of the Pandas, Dask, Modin, Vaex, and Polars libraries in the Python ecosystem for processing large datasets obtained from welding machines used in modern automotive production systems. The study used real production data from Matay, an automotive parts supplier: approximately 30 days of exhaust production machine data, 17 GB in size and containing 106,167,826 rows. Subsets of different sizes (10K, 100K, 1M, and 10M rows) were created from this dataset, and 11 experiments were conducted on selected columns. The experiments cover reading data, filtering, sorting, grouping, merging, writing data in different formats (CSV, Parquet), and handling missing data. Each experiment was evaluated on three metrics: total execution time, total memory usage, and CPU execution time. Each experiment was repeated three times, and the average values were recorded. The results indicate that Polars may be more advantageous for performance-oriented applications across all data scales. The strategic selection of data processing tools is therefore a critical enabler of digital transformation in the automotive industry, facilitating the integration of digital twins and AI-driven quality control into high-performance Industry 4.0 ecosystems.
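The measurement protocol described in the abstract (three metrics, three repetitions, averaged) can be sketched roughly as follows. This is an illustrative harness only, not the authors' tooling, which the abstract does not specify. The function name `benchmark`, the stand-in workload `sort_task`, and the use of the standard-library `tracemalloc` module for memory are all assumptions. Note that `tracemalloc` records only allocations made through Python's allocator, so memory held in native buffers by libraries such as Polars may not be counted; a process-level measurement would likely be needed in a real study.

```python
import time
import tracemalloc
from statistics import mean


def benchmark(task, repeats=3):
    """Run `task` `repeats` times and return the mean of each metric.

    Hypothetical helper: measures wall-clock time, CPU time, and peak
    Python-level memory, mirroring the three metrics named in the abstract.
    """
    wall, cpu, peak = [], [], []
    for _ in range(repeats):
        tracemalloc.start()
        wall_start = time.perf_counter()   # total (wall-clock) execution time
        cpu_start = time.process_time()    # CPU time used by this process
        task()
        cpu.append(time.process_time() - cpu_start)
        wall.append(time.perf_counter() - wall_start)
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        peak.append(peak_bytes / 1e6)      # convert bytes to megabytes
    return {"wall_s": mean(wall), "cpu_s": mean(cpu), "peak_mb": mean(peak)}


def sort_task():
    # Stand-in workload (assumption): in a real run this would be one of the
    # experiments, e.g. reading, filtering, or sorting with a given library.
    sorted(range(100_000, 0, -1))


result = benchmark(sort_task)
print(result)
```

In a comparison like the one described, the same harness would be called once per library, per operation, and per subset size, so that every combination is measured under identical conditions.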
Details
Primary Language: English
Subjects: Automotive Engineering (Other)
Journal Section: Research Article
Publication Date: February 11, 2026
Submission Date: April 15, 2025
Acceptance Date: January 20, 2026
Published in Issue: Year 2026, Volume 10, Number 1
APA
Yavuz, Z., & Bilgin, T. T. (2026). Optimizing Large-Scale Data Processing in Smart Manufacturing: A Benchmarking Study on Automotive Industry Data. International Journal of Automotive Science And Technology, 10(1), 26-39. https://doi.org/10.30939/ijastech..1676422
