Abstract
This paper presents a cluster analysis using the k-means algorithm
implemented in R. The study assesses the similarity of sampling data
derived from a GIS project by the homogeneity of their attribute
parameters, with the aim of grouping the observations into similar
clusters across several parameters: geology (similar location on the
tectonic plates, sediment thickness, igneous volcanic areas), bathymetry
(similar depth ranges) and geomorphology (similar slope steepness and
aspect). The geological case study is the Mariana Trench. Clustering is
an effective statistical method for detecting similar groups in a data
set. The main R libraries used are {cluster}, {factoextra} and {ggplot2};
auxiliary libraries are {wordcloud} and {tm}. Numbers of clusters from
2 to 7 were tested; the optimal number is 5. The findings include the
following computed
and visualized results, illustrated by 8 figures: 1) a correlation matrix
showing cross-correlations among the combinations of factors; 2) a
comparison of factor pairs, which revealed pairwise correlations; 3) a
pairwise comparative analysis, which showed how the paired variables
influence one another: slope angles vary in parallel with decreasing
sediment thickness; 4) the location of the volcanic igneous areas causes
a cyclic repetition in the slope angle curve, and the volcanic zones
correlate with the slope angle and aspect degree. The findings reveal
that four variables affect the geomorphology of the
trench: slope angle, sediment thickness, aspect degree and volcanic
igneous areas. The paper includes 7 listings of R code to support the
reproducibility of the algorithms in similar research.
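
The testing of 2 to 7 clusters described above can be sketched roughly as
follows. This is a minimal illustration rather than one of the paper's
listings: it uses only base R (stats::kmeans), the data are synthetic,
and the column names (slope_angle, sediment_thck, aspect_degree, depth)
are hypothetical placeholders. Comparing the total within-cluster sum of
squares across k is an assumed selection criterion; the abstract does not
state how the optimal number was chosen.

```r
# Minimal sketch: k-means for k = 2..7 on synthetic data (base R only).
# All column names and values are hypothetical placeholders, not the
# Mariana Trench data set used in the paper.
set.seed(42)
n <- 200
profiles <- data.frame(
  slope_angle   = runif(n, 0, 60),          # degrees
  sediment_thck = runif(n, 0, 500),         # metres
  aspect_degree = runif(n, 0, 360),         # degrees
  depth         = runif(n, -11000, -3000)   # metres
)

# Standardize so that no variable dominates the Euclidean distance.
scaled <- scale(profiles)

# Fit k-means once for each candidate number of clusters.
ks <- 2:7
fits <- lapply(ks, function(k) {
  kmeans(scaled, centers = k, nstart = 25, iter.max = 50)
})
names(fits) <- ks

# Total within-cluster sum of squares, one value per k (elbow criterion).
wss <- sapply(fits, function(f) f$tot.withinss)
print(data.frame(k = ks, tot_withinss = wss))

# Cluster sizes for k = 5, the value reported in the abstract.
print(table(fits[["5"]]$cluster))
```

On real data, the wss values would typically be plotted against k and
inspected for an elbow; a silhouette-based comparison is a common
alternative.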