Research Article

Study on MapReduce Operations Creating Cloud with Single Board Computer

Year 2015, Volume 8, Issue 3, 179 - 191, 22.07.2015
https://doi.org/10.17671/btd.88292

Abstract

Information systems today generate far more data than in the past, and storing and analyzing this data runs
into serious resource shortages. Storing, processing, and analyzing big data requires systems that run faster and
consume less energy than current ones; otherwise the cost and duration of data analysis grow substantially. In this
study, a cluster of single-board computers was built, and operating-system-level virtualization with process
isolation was run on it successfully in order to experiment with big data algorithms. Within this setup, MapReduce
operations, which form the basis of big data systems, were executed on specially designed ARM-based mini
supercomputer clusters. ARM-based single-board computers are inexpensive, consume little energy, and have low carbon
emissions. They were also observed to be suitable for clustering, cloud computing, multiprocessing, parallel
processing, and big data applications. Container virtualization on single-board computer hardware is a previously
untested approach, and using process isolation for the MapReduce worker node is likewise a new practice.
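
For readers unfamiliar with the MapReduce model named in the abstract, the following is a minimal single-process sketch in Python of its three stages (map, shuffle, reduce), using word counting as the task. It is an illustration only and not the authors' Hadoop setup; the function names `map_phase`, `shuffle`, `reduce_phase`, and `run_job` are hypothetical and chosen for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key, as the framework does between map and reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values of one key into a single result (here, a sum)."""
    return key, sum(values)


def run_job(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence inside one process (no cluster involved)."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in sorted(grouped.items()))


if __name__ == "__main__":
    print(run_job(["big data on arm", "arm cluster big"]))
```

In a real cluster the map and reduce calls would run on different worker nodes and the shuffle would move data over the network; this sketch keeps everything in memory to show only the data flow.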

Appendix 1

TestDFSIO results

  • TestDFSIO -read overall - - - - - 1287 112.863
  • TestDFSIO -read - -nrFiles 10 -fileSize 1 - - - - 35390 6290
  • TestDFSIO -write overall - - - - - 2731 111.845
  • TestDFSIO -write - -nrFiles 10 -fileSize 1 - - - - 45140 8820

Operation Steps

Linaro Installation: The image linaro-trusty-developer-20140623 ... 6tar.gz, downloaded from releases.linaro.org/14.06/ubuntu/trusty-images/developer, was written to SD cards in bootable form to install the Parallella boards.

TeraSort job output

    Spent 16ms computing TeraScheduler splits.
    Computing input splits took 1789ms
    Sampling 2 splits of 2
    Making 1 from 100 sampled records
    Computing parititions took 1380ms
    File System Counters
        FILE: Number of bytes read=10406
        FILE: Number of bytes written=340350
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=10202
        HDFS: Number of bytes written=10000
        HDFS: Number of read operations=9
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=2
        Launched reduce tasks=1
        Data-local map tasks=1
        Rack-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=92783
        Total time spent by all reduces in occupied slots (ms)=20735
        Total time spent by all map tasks (ms)=92783
        Total time spent by all reduce tasks (ms)=20735
        Total vcore-seconds taken by all map tasks=92783
        Total vcore-seconds taken by all reduce tasks=20735
        Total megabyte-seconds taken by all map tasks=95009792
        Total megabyte-seconds taken by all reduce tasks=21232640
    Map-Reduce Framework
        Map input records=100
        Map output records=100
        Map output bytes=10200
        Map output materialized bytes=10412
        Input split bytes=202
        Combine input records=0
        Combine output records=0
        Reduce input groups=100
        Reduce shuffle bytes=10412
        Reduce input records=100
        Reduce output records=100
        Spilled Records=200
        Shuffled Maps =2
        Failed Shuffles=0
        Merged Map outputs=2
        GC time elapsed (ms)=3517
        CPU time spent (ms)=7530
        Physical memory (bytes) snapshot=375734272
        Virtual memory (bytes) snapshot=1086078976
        Total committed heap usage (bytes)=256647168
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=10000
    File Output Format Counters
        Bytes Written=10000
    15/04/01 13:59:59 INFO terasort.TeraSort: done

terasort Ref

    hduser@hadoop3:~$ yarn jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar terasort
    File System Counters
        FILE: Number of bytes read=10406
        FILE: Number of bytes written=340350
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=10202
        HDFS: Number of bytes written=10000
        HDFS: Number of read operations=9
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=2
        Launched reduce tasks=1
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=64639
        Total time spent by all reduces in occupied slots (ms)=32219
        Total time spent by all map tasks (ms)=64639
        Total time spent by all reduce tasks (ms)=32219
        Total vcore-seconds taken by all map tasks=64639
        Total vcore-seconds taken by all reduce tasks=32219
        Total megabyte-seconds taken by all map tasks=66190336
        Total megabyte-seconds taken by all reduce tasks=32992256
    Map-Reduce Framework
        Map input records=100
        Map output records=100
        Map output bytes=10200
        Map output materialized bytes=10412
        Input split bytes=202
        Combine input records=0
        Combine output records=0
        Reduce input groups=100
        Reduce shuffle bytes=10412
        Reduce input records=100
        Reduce output records=100
        Spilled Records=200
        Shuffled Maps =2
        Failed Shuffles=0
        Merged Map outputs=2
        GC time elapsed (ms)=50
        CPU time spent (ms)=1100
        Physical memory (bytes) snapshot=714772480
        Virtual memory (bytes) snapshot=2177273856
        Total committed heap usage (bytes)=603979776
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=10000
    File Output Format Counters
        Bytes Written=10000
    15/03/31 14:38:03 INFO terasort.TeraSort: done
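
The two TeraSort outputs above differ mainly in their timing counters. As a rough aid for comparing them, the sketch below parses `name=value` counter lines and computes the ratio of the first run to the run labelled "terasort Ref" for a few counters copied from those outputs. The helper names `parse_counters` and `ratios` are hypothetical, and the sketch makes no claim about which hardware produced which run.

```python
import re
from typing import Dict

# Counter lines copied from the two TeraSort outputs above (subset only).
RUN_1 = """
Total time spent by all map tasks (ms)=92783
Total time spent by all reduce tasks (ms)=20735
GC time elapsed (ms)=3517
CPU time spent (ms)=7530
"""

RUN_REF = """
Total time spent by all map tasks (ms)=64639
Total time spent by all reduce tasks (ms)=32219
GC time elapsed (ms)=50
CPU time spent (ms)=1100
"""

# One counter per line: free-text name, "=", non-negative integer value.
COUNTER = re.compile(r"^\s*(?P<name>.+?)\s*=\s*(?P<value>\d+)\s*$")


def parse_counters(text: str) -> Dict[str, int]:
    """Return {counter name: value} for every 'name=value' line; other lines are ignored."""
    counters: Dict[str, int] = {}
    for line in text.splitlines():
        match = COUNTER.match(line)
        if match:
            counters[match.group("name")] = int(match.group("value"))
    return counters


def ratios(run: Dict[str, int], reference: Dict[str, int]) -> Dict[str, float]:
    """Ratio run/reference for counters present in both and non-zero in the reference."""
    return {
        name: run[name] / reference[name]
        for name in run
        if name in reference and reference[name] != 0
    }


if __name__ == "__main__":
    for name, ratio in ratios(parse_counters(RUN_1), parse_counters(RUN_REF)).items():
        print(f"{name}: {ratio:.2f}x")
```

Both jobs sort only 100 records, so these ratios mostly reflect job start-up and JVM overhead and should not be read as throughput figures.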

Tek Kart Bilgisayarlar ile Bulut Oluşturarak MapReduce İşlemleri Denemesi

Abstract

Today's information systems produce far larger volumes of data than in the past. Storing and analyzing this data raises serious resource problems. The systems needed to store, process, and analyze big data must run faster and consume less energy than current systems; otherwise very high costs and long data analysis times result. In this study, a cluster was built from single-board mini personal computers, container-based virtualization was set up on it, and big data algorithms were tried out. In this context, the execution of MapReduce operations, which form the basis of big data systems, on specially designed ARM processor clusters was investigated and its effectiveness was tested. Single-board mini computers with ARM processors are cheap, consume little energy, and have low carbon emissions. They were also found to be suitable for clustering, cloud computing, multiprocessing, parallel processing, and big data applications. Using container-based virtualization on single-board computer hardware is an untested approach. Using process isolation for the worker node in a MapReduce deployment is also a new practice.

There are 68 citations in total.

Details

Primary Language Turkish
Subjects Engineering
Journal Section Articles
Authors

Levent Aysan

İzzet Özbilgin

Publication Date July 22, 2015
Submission Date July 22, 2015
Published in Issue Year 2015

Cite

APA Aysan, L., & Özbilgin, İ. (2015). Tek Kart Bilgisayarlar ile Bulut Oluşturarak MapReduce İşlemleri Denemesi. Bilişim Teknolojileri Dergisi, 8(3), 179-191. https://doi.org/10.17671/btd.88292