Research Article

Recording Performances of Some File Types for Pandas Data

Number: 36 May 31, 2022

Abstract

Scientists, researchers, engineers: almost everyone who works with data crosses paths with Pandas at some point. It is a powerful library that allows easy, rapid and efficient manipulation of data, and it can convert the data it represents into various file types. Given the abundance of today's data, determining which of these file types records the same Pandas data in the smallest size on disk is an important issue. In this study, the file types that can save Pandas data with minimum size were experimentally investigated from various perspectives. The CSV, HDF, JSON, Excel and Pickle file types were included in the experiments. The sizes of these files were benchmarked under several conditions, such as whether the data are complete or contain missing values and the types of variables contained in the data. In addition, how file sizes vary as the amount of data increases was also examined.
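
The kind of size comparison described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual experimental code: the helper names (make_frame, file_sizes), the column names, and the random test data are all hypothetical. HDF and Excel output depend on optional packages (PyTables and an Excel writer such as openpyxl), so the sketch skips any format whose dependency is missing.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def make_frame(n_rows, missing_fraction=0.0, seed=0):
    """Build a hypothetical test frame with integer, float and string columns."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "id": np.arange(n_rows),
        "value": rng.random(n_rows),
        "label": rng.choice(["a", "b", "c"], size=n_rows),
    })
    if missing_fraction > 0:
        # Blank out a random share of the float column to mimic incomplete data.
        mask = rng.random(n_rows) < missing_fraction
        df.loc[mask, "value"] = np.nan
    return df


def file_sizes(df, formats=("csv", "json", "pickle", "hdf", "excel")):
    """Write df in each requested format and return the on-disk size in bytes."""
    writers = {
        "csv": (".csv", lambda p: df.to_csv(p, index=False)),
        "json": (".json", lambda p: df.to_json(p, orient="records")),
        "pickle": (".pkl", lambda p: df.to_pickle(p)),
        "hdf": (".h5", lambda p: df.to_hdf(p, key="df", mode="w")),
        "excel": (".xlsx", lambda p: df.to_excel(p, index=False)),
    }
    sizes = {}
    with tempfile.TemporaryDirectory() as tmp:
        for name in formats:
            suffix, write = writers[name]
            path = os.path.join(tmp, "data" + suffix)
            try:
                write(path)
            except ImportError:
                # Optional dependency not installed; skip this format.
                continue
            sizes[name] = os.path.getsize(path)
    return sizes


if __name__ == "__main__":
    # Compare complete and incomplete data at increasing row counts.
    for n_rows in (1_000, 10_000):
        for missing in (0.0, 0.3):
            sizes = file_sizes(make_frame(n_rows, missing))
            print(n_rows, missing, sizes)
```

Running the script prints one dictionary of byte counts per combination of row count and missing-data share. The numbers it produces depend on the pandas version, the writer options and the synthetic data, so they should not be read as the results reported in the article.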

Keywords

Details

Primary Language: English
Subjects: Engineering
Journal Section: Research Article
Publication Date: May 31, 2022
Submission Date: April 14, 2022
Acceptance Date: April 16, 2022
Published in Issue: Year 2022 Number: 36

APA
Temiz, H. (2022). Recording Performances of Some File Types for Pandas Data. Avrupa Bilim Ve Teknoloji Dergisi, 36, 55-60. https://doi.org/10.31590/ejosat.1103499
AMA
1. Temiz H. Recording Performances of Some File Types for Pandas Data. EJOSAT. 2022;(36):55-60. doi:10.31590/ejosat.1103499
Chicago
Temiz, Hakan. 2022. “Recording Performances of Some File Types for Pandas Data”. Avrupa Bilim Ve Teknoloji Dergisi, no. 36: 55-60. https://doi.org/10.31590/ejosat.1103499.
EndNote
Temiz H (May 31, 2022) Recording Performances of Some File Types for Pandas Data. Avrupa Bilim ve Teknoloji Dergisi 36 55–60.
IEEE
[1] H. Temiz, “Recording Performances of Some File Types for Pandas Data”, EJOSAT, no. 36, pp. 55–60, May 2022, doi: 10.31590/ejosat.1103499.
ISNAD
Temiz, Hakan. “Recording Performances of Some File Types for Pandas Data”. Avrupa Bilim ve Teknoloji Dergisi. 36 (May 31, 2022): 55-60. https://doi.org/10.31590/ejosat.1103499.
JAMA
1. Temiz H. Recording Performances of Some File Types for Pandas Data. EJOSAT. 2022;(36):55–60.
MLA
Temiz, Hakan. “Recording Performances of Some File Types for Pandas Data”. Avrupa Bilim Ve Teknoloji Dergisi, no. 36, May 2022, pp. 55-60, doi:10.31590/ejosat.1103499.
Vancouver
1. Hakan Temiz. Recording Performances of Some File Types for Pandas Data. EJOSAT. 2022 May 31;(36):55-60. doi:10.31590/ejosat.1103499
