Tek Kart Bilgisayarlar ile Bulut Oluşturarak MapReduce İşlemleri Denemesi

Levent Aysan; İzzet Özbilgin

doi:10.17671/btd.88292

Research Article

Study on MapReduce Operations Creating Cloud with Single Board Computer

Year 2015, Volume: 8 Issue: 3, 179 - 191, 22.07.2015

Abstract

Information systems today generate far more data than in the past, and storing and analyzing this data runs into serious resource shortages. Storing, processing and analyzing big data requires systems that work faster and consume less energy than current ones; otherwise the cost and duration of data analysis grow considerably. In this study, a cluster of single board computers was built, and operating-system-level (container) virtualization with process isolation was run on it to experiment with big data algorithms. In this context, MapReduce operations, which form the basis of big data systems, were executed on a specifically designed ARM architecture mini supercomputer cluster and their effectiveness was examined. Single board computers with ARM processors are inexpensive, consume little energy and cause low carbon emissions. They were also observed to be suitable for clustering, cloud computing, multiprocessing, parallel processing and big data applications. Container virtualization on single board computer hardware is a previously untested approach, and using process isolation for the MapReduce worker node is likewise a new practice.

Keywords

single board computer cluster, virtualization, big data, MapReduce, Hadoop

Appendix 1

TestDFSIO -read overall - - - - - 1287 112.863
TestDFSIO -read - -nrFiles 10 -fileSize 1 - - - - 35390 6290
TestDFSIO -write overall - - - - - 2731 111.845
TestDFSIO -write - -nrFiles 10 -fileSize 1 - - - - 45140 8820

Operation Steps

Linaro Installation

Parallella [...] was used as [...]. The linaro-trusty-developer-20140623[...]6tar.gz image downloaded from releases.linaro.org/14.06/ubuntu/trusty-images/developer was written to SD cards in bootable form to perform the installation.

Spent 16ms computing TeraScheduler splits.
Computing input splits took 1789ms
Sampling 2 splits of 2
Making 1 from 100 sampled records
Computing parititions took 1380ms
File System Counters
    FILE: Number of bytes read=10406
    FILE: Number of bytes written=340350
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=10202
    HDFS: Number of bytes written=10000
    HDFS: Number of read operations=9
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=2
Job Counters
    Launched map tasks=2
    Launched reduce tasks=1
    Data-local map tasks=1
    Rack-local map tasks=1
    Total time spent by all maps in occupied slots (ms)=92783
    Total time spent by all reduces in occupied slots (ms)=20735
    Total time spent by all map tasks (ms)=92783
    Total time spent by all reduce tasks (ms)=20735
    Total vcore-seconds taken by all map tasks=92783
    Total vcore-seconds taken by all reduce tasks=20735
    Total megabyte-seconds taken by all map tasks=95009792
    Total megabyte-seconds taken by all reduce tasks=21232640
Map-Reduce Framework
    Map input records=100
    Map output records=100
    Map output bytes=10200
    Map output materialized bytes=10412
    Input split bytes=202
    Combine input records=0
    Combine output records=0
    Reduce input groups=100
    Reduce shuffle bytes=10412
    Reduce input records=100
    Reduce output records=100
    Spilled Records=200
    Shuffled Maps =2
    Failed Shuffles=0
    Merged Map outputs=2
    GC time elapsed (ms)=3517
    CPU time spent (ms)=7530
    Physical memory (bytes) snapshot=375734272
    Virtual memory (bytes) snapshot=1086078976
    Total committed heap usage (bytes)=256647168
Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
File Input Format Counters
    Bytes Read=10000
File Output Format Counters
    Bytes Written=10000
15/04/01 13:59:59 INFO terasort.TeraSort: done

terasort Ref
hduser@hadoop3:~$ yarn jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar terasort
File System Counters
    FILE: Number of bytes read=10406
    FILE: Number of bytes written=340350
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=10202
    HDFS: Number of bytes written=10000
    HDFS: Number of read operations=9
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=2
Job Counters
    Launched map tasks=2
    Launched reduce tasks=1
    Data-local map tasks=2
    Total time spent by all maps in occupied slots (ms)=64639
    Total time spent by all reduces in occupied slots (ms)=32219
    Total time spent by all map tasks (ms)=64639
    Total time spent by all reduce tasks (ms)=32219
    Total vcore-seconds taken by all map tasks=64639
    Total vcore-seconds taken by all reduce tasks=32219
    Total megabyte-seconds taken by all map tasks=66190336
    Total megabyte-seconds taken by all reduce tasks=32992256
Map-Reduce Framework
    Map input records=100
    Map output records=100
    Map output bytes=10200
    Map output materialized bytes=10412
    Input split bytes=202
    Combine input records=0
    Combine output records=0
    Reduce input groups=100
    Reduce shuffle bytes=10412
    Reduce input records=100
    Reduce output records=100
    Spilled Records=200
    Shuffled Maps =2
    Failed Shuffles=0
    Merged Map outputs=2
    GC time elapsed (ms)=50
    CPU time spent (ms)=1100
    Physical memory (bytes) snapshot=714772480
    Virtual memory (bytes) snapshot=2177273856
    Total committed heap usage (bytes)=603979776
Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
File Input Format Counters
    Bytes Read=10000
File Output Format Counters
    Bytes Written=10000
15/03/31 14:38:03 INFO terasort.TeraSort: done
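
The two TeraSort counter listings above can be compared with a little arithmetic. The following is a minimal sketch, not part of the original study: the dictionary names, the key names and the helper functions are hypothetical, and the numbers are copied from the listings. Which hardware produced each listing is an assumption; the logs only show that the second one is introduced by "terasort Ref" and was run as hduser@hadoop3.

```python
# Counter values copied from the two TeraSort listings above.
# The key names are illustrative; only the numbers come from the logs.
first_run = {   # listing that ends "15/04/01 13:59:59 INFO terasort.TeraSort: done"
    "map_ms": 92783,
    "reduce_ms": 20735,
    "gc_ms": 3517,
    "cpu_ms": 7530,
    "map_mb_units": 95009792,   # "Total megabyte-seconds taken by all map tasks"
}
ref_run = {     # listing introduced by "terasort Ref", ends "15/03/31 14:38:03 ... done"
    "map_ms": 64639,
    "reduce_ms": 32219,
    "gc_ms": 50,
    "cpu_ms": 1100,
    "map_mb_units": 66190336,
}


def ratios(run, baseline):
    """Return run[key] / baseline[key] for every counter both runs report."""
    return {key: run[key] / baseline[key] for key in run if baseline.get(key)}


def memory_per_map_task(run):
    """Divide the 'megabyte-seconds' counter by the map task time counter."""
    return run["map_mb_units"] / run["map_ms"]


if __name__ == "__main__":
    for name, value in ratios(first_run, ref_run).items():
        print(f"{name}: {value:.2f}x the reference run")
    print("memory factor, first run:", memory_per_map_task(first_run))
    print("memory factor, reference run:", memory_per_map_task(ref_run))
```

With these figures the first run spends about 1.44 times as long in map tasks and about 0.64 times as long in reduce tasks as the run labelled Ref, and its garbage-collection time is roughly 70 times higher. In both listings the megabyte counter divided by the map task time gives exactly 1024, which would be consistent with 1024 MB map containers if that counter is accumulated per millisecond; this is an inference from the numbers, not something the logs state.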

There are 68 references in total.

Details

Primary Language	Turkish
Subjects	Engineering
Section	Articles
Authors	Levent Aysan, İzzet Özbilgin
Publication Date	22 July 2015
Submission Date	22 July 2015
Published in Issue	Year 2015 Volume: 8 Issue: 3

Cite This

APA	Aysan, L., & Özbilgin, İ. (2015). Tek Kart Bilgisayarlar ile Bulut Oluşturarak MapReduce İşlemleri Denemesi. Bilişim Teknolojileri Dergisi, 8(3), 179-191. https://doi.org/10.17671/btd.88292
