Research Article

SCALABLE IMPLEMENTATIONS OF DESCRIPTIVE STATISTICS ON HADOOP

Volume: 1, Issue: 1, 30 June 2019

Abstract

Big Data is one of the most prominent technology trends of our time. The volume of data grows daily, driven by rapid data-generating sources such as social media, sensors, and the Internet of Things. This massive increase in the amount of data accumulated around the world requires different approaches to storing, processing, and analyzing it. A set of quantitative data has many features, and descriptive statistics can describe these features in a meaningful and manageable form without listing every value in the dataset. However, standard statistical techniques are poorly suited to big data because of its size, complexity, and velocity. Although many off-the-shelf statistical tools are available for analyzing quantitative data, they are not always compatible with big data file systems. In this paper, we describe our implementations of descriptive statistics algorithms for big data and demonstrate their scalability through experiments on a small Hadoop cluster with 196 threads. The study shows that computing descriptive statistics for large datasets can benefit from the distributed computation features of a Hadoop cluster.
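The abstract does not show how the statistics are distributed, so the following is only a minimal sketch of one common approach, not the paper's implementation. It assumes each data partition can be summarized independently (the "map" side) and that partial summaries can be merged pairwise (the "combine/reduce" side), using the well-known parallel update for mean and variance. The names `Summary`, `summarize`, `merge`, and `describe` are hypothetical, and plain Python lists stand in for HDFS input splits.

```python
from dataclasses import dataclass
from functools import reduce
import math


@dataclass(frozen=True)
class Summary:
    """Mergeable partial statistics for one partition (hypothetical type)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0               # sum of squared deviations from the mean
    minimum: float = math.inf
    maximum: float = -math.inf

    @property
    def variance(self) -> float:
        # Sample variance; defined as 0.0 here when fewer than two values exist.
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0


def summarize(values) -> Summary:
    """Map side: one streaming pass over a partition (Welford's update)."""
    count, mean, m2 = 0, 0.0, 0.0
    lo, hi = math.inf, -math.inf
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
        lo, hi = min(lo, x), max(hi, x)
    return Summary(count, mean, m2, lo, hi)


def merge(a: Summary, b: Summary) -> Summary:
    """Combine/reduce side: merge two partial summaries without raw data."""
    if a.count == 0:
        return b
    if b.count == 0:
        return a
    n = a.count + b.count
    delta = b.mean - a.mean
    mean = a.mean + delta * b.count / n
    m2 = a.m2 + b.m2 + delta * delta * a.count * b.count / n
    return Summary(n, mean, m2,
                   min(a.minimum, b.minimum),
                   max(a.maximum, b.maximum))


def describe(partitions) -> Summary:
    """Summarize every partition independently, then fold the results."""
    return reduce(merge, (summarize(p) for p in partitions), Summary())
```

Because `merge` is associative, the partial summaries can be combined in any grouping, which is the property a combiner or reducer would rely on. Order statistics such as the median do not decompose this way and would need a different (possibly approximate) technique.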

Details

Primary Language

English

Subjects

-

Section

Research Article

Publication Date

30 June 2019

Submission Date

25 March 2019

Acceptance Date

20 June 2019

Published in Issue

Year 2019 Volume: 1 Issue: 1

Cite

APA
Yılmazel, Ö. (2019). SCALABLE IMPLEMENTATIONS OF DESCRIPTIVE STATISTICS ON HADOOP. Nicel Bilimler Dergisi, 1(1), 43-58. https://izlik.org/JA67LP57LB
AMA
1. Yılmazel Ö. SCALABLE IMPLEMENTATIONS OF DESCRIPTIVE STATISTICS ON HADOOP. NBD. 2019;1(1):43-58. https://izlik.org/JA67LP57LB
Chicago
Yılmazel, Özgür. 2019. “SCALABLE IMPLEMENTATIONS OF DESCRIPTIVE STATISTICS ON HADOOP”. Nicel Bilimler Dergisi 1 (1): 43-58. https://izlik.org/JA67LP57LB.
EndNote
Yılmazel Ö (01 June 2019) SCALABLE IMPLEMENTATIONS OF DESCRIPTIVE STATISTICS ON HADOOP. Nicel Bilimler Dergisi 1 1 43–58.
IEEE
[1] Ö. Yılmazel, “SCALABLE IMPLEMENTATIONS OF DESCRIPTIVE STATISTICS ON HADOOP”, NBD, vol. 1, no. 1, pp. 43–58, Jun. 2019, [Online]. Available: https://izlik.org/JA67LP57LB
ISNAD
Yılmazel, Özgür. “SCALABLE IMPLEMENTATIONS OF DESCRIPTIVE STATISTICS ON HADOOP”. Nicel Bilimler Dergisi 1/1 (01 June 2019): 43-58. https://izlik.org/JA67LP57LB.
JAMA
1. Yılmazel Ö. SCALABLE IMPLEMENTATIONS OF DESCRIPTIVE STATISTICS ON HADOOP. NBD. 2019;1:43–58.
MLA
Yılmazel, Özgür. “SCALABLE IMPLEMENTATIONS OF DESCRIPTIVE STATISTICS ON HADOOP”. Nicel Bilimler Dergisi, vol. 1, no. 1, June 2019, pp. 43-58, https://izlik.org/JA67LP57LB.
Vancouver
1. Özgür Yılmazel. SCALABLE IMPLEMENTATIONS OF DESCRIPTIVE STATISTICS ON HADOOP. NBD [Internet]. 01 June 2019;1(1):43-58. Available from: https://izlik.org/JA67LP57LB