Genetic Variation, Heritability, Principal Component Analysis, Correlation and Path Coefficient Analysis in the Fruit Samples of Sechium edule (Jacq.) Sw. Genotypes

Abstract: Genetic diversity, heritability, and the genetic advance of yield and associated traits provide basic information for the genetic improvement of crops. Fruit characters of Sechium edule (Jacq.) Sw. genotypes were evaluated with a view to their improvement. Genotypes and fruit samples of Sechium were randomly collected from various parts of Kigwema village, Kohima district, Nagaland (mean latitude 25.6069° N, longitude 94.3425° E, altitude 1538 masl) and scored for fruit length (FL), width (FW), circumference/girth (FC), and number of ridges (FR). The trait data were normally distributed according to the histogram plots and normality tests. Analysis of variance (ANOVA) showed significant differences among the fruit traits. The phenotypic coefficient of variation (PCV) was greater than the genotypic coefficient of variation (GCV) for all the traits. PCV and GCV were highest for fruit length, while heritability was highest for fruit circumference. High heritability and high genetic advance estimates for fruit circumference suggest that it could be considered for further improvement through breeding programs. Principal component analysis (PCA) showed that fruit length and the number of fruit ridges account for most of the variation observed in fruit morphology and could be considered for its improvement. Fruit width had the largest direct path coefficient on fruit circumference, which supports direct selection of this trait for improvement.
Article Info: Received: 18.11.2021; Accepted: 02.03.2022; Online published: 15.03.2022; DOI: 10.29133/yyutbd.1025466

Introduction
Nature meets many human needs and supports a wide range of economically important plants. Many of these plants remain unexplored, and information on natural variation is needed to select genotypes with desirable traits for a successful plant breeding program (Dyulgerova and Valcheva, 2014; Ipek and Balta, 2020). Sechium edule (Jacq.) Sw. (Cucurbitaceae) is a nutritious crop whose mature fruits are mostly eaten as a vegetable and in boiled salads. Growing demand for cucurbitaceous vegetables calls for increased production of such crops. To achieve this, information on genetic variability, heritability, and the genetic advance of yield and associated traits is essential.
Regression estimates the linear relationship between a dependent variable and one or more independent variables (Vrbik, 2018). It can also be used to check whether data are approximately normally distributed (Dimitrova et al., 2020). A histogram of the variables of interest indicates the shape of the distribution, and normality is tested and confirmed with Q-Q plots and the Kolmogorov-Smirnov and Shapiro-Wilk W tests in SPSS ver. 16 (Kolmogorov, 1933; Smirnov, 1948; Shapiro and Wilk, 1965).
Analysis of variance (ANOVA) is generally used to test whether the means of two or more groups of samples differ significantly from each other (Turkheimer, 2015). Heritability and agronomic characteristics are important parameters that may be used to estimate and improve crop yield. Falconer and Mackay (1996) defined heritability as the measure of correspondence between breeding values and phenotypic values. Heritability plays a predictive role in breeding because it expresses how reliably the phenotype guides the breeding value (Wray and Visscher, 2008). Genetic advance links heritability directly to the response to selection. High heritability combined with high genetic advance offers the most effective condition for selection (Moore and Shenk, 2017). Heritability is therefore important, and its usefulness increases when genetic advance is also calculated, since this indicates the gain in a character obtained under a particular selection pressure. Genetic advance is thus an important selection parameter in breeding programs.
Correlation analysis measures the strength of the linear relationship between two variables. Knowledge of the relationship between genetic and phenotypic data is valuable when traits are considered for selection.
Path coefficient analysis allows an effective means of partitioning correlation coefficients into the unidirectional pathway and alternative pathways. This analysis permits a critical examination of specific factors that produce a given correlation and can be successfully employed in formulating an effective strategy (Okuyama et al., 2004).
Principal component analysis (PCA) is a mathematical procedure that transforms a possibly large number of correlated variables into a smaller number of uncorrelated variables known as principal components (Chatfield and Collis, 1980). The first principal component (PC1) accounts for as much of the variability in the data as possible, and each succeeding component (PC2, PC3, PC4, etc.) accounts for as much of the remaining variability as possible. The objective of PCA is to discover or reduce the dimensionality of the data set and to identify new meaningful underlying variables (Jolliffe, 2002).
The purpose of cluster analysis is to discover a system of organizing observations where members of the group share specific properties in common. Cluster analysis is a class of techniques that classifies cases into groups that are relatively homogeneous within themselves and relatively heterogeneous between each other (Yim and Ramdeen, 2015).
The purpose of the present study is therefore to generate basic information using standard statistical methods (analysis of variance, correlation, regression, path coefficient analysis, principal component analysis or factor analysis, and cluster analysis) in order to assess the variation, broad-sense heritability, and genetic advance of some important fruit traits, together with their direct and indirect contributions to the total variation among genotypes, for the further genetic improvement of the crop.

Plant materials
Twelve genotypes of Sechium were randomly collected from a randomized complete block design at Kigwema village, Kohima district, Nagaland (mean latitude 25.6069° N, longitude 94.3425° E, altitude 1538 masl). The mean values of 5 fruits from each genotype in each of the 3 blocks were used for the quantitative traits (length, width, circumference/girth, and number of ridges) in further analysis.

Regression and data normality test
Regression analysis is a statistical technique for describing the relationship among interrelated variables; it was carried out using SPSS ver. 16 as suggested by Landau and Everitt (2004). The quantitative fruit traits were regressed against the genotypes, and the histogram plots suggested that the data are normally distributed. Normality was tested in SPSS ver. 16; the Shapiro-Wilk test was preferred because it is more sensitive and more appropriate for small samples than the Kolmogorov-Smirnov test.
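As a rough illustration of the normality check, the Kolmogorov-Smirnov distance between a sample and a normal distribution fitted to it can be computed with the Python standard library. This is a minimal sketch and not the SPSS procedure used in the study: the fruit-length values are hypothetical, the function name `ks_statistic_normal` is an assumption, and only the test statistic D is computed (no p-value and no Shapiro-Wilk W).

```python
from statistics import NormalDist, mean, stdev


def ks_statistic_normal(sample):
    """Kolmogorov-Smirnov distance D between the empirical CDF of `sample`
    and a normal CDF whose mean and SD are estimated from the same sample."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from (i-1)/n to i/n at x; check both sides.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


# Hypothetical fruit lengths (cm) for illustration only; not data from the study.
fruit_length = [9.8, 10.4, 10.9, 11.2, 11.5, 11.6, 11.9, 12.3, 12.8, 13.5]
d_stat = ks_statistic_normal(fruit_length)
print(f"KS distance D = {d_stat:.3f} (smaller values mean a closer fit to normality)")
```

Because the mean and standard deviation are estimated from the sample itself, D would have to be judged against corrected critical values rather than the standard Kolmogorov-Smirnov table, which is why no p-value is attempted here.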

Analysis of variance
Analysis of variance (ANOVA) was carried out using procedures explained for SPSS ver. 16 by Landau and Everitt (2004). Genotypic and phenotypic coefficients of variability were estimated using the formula given by Burton and Devane (1953).
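The link between the ANOVA mean squares and the coefficients of variation can be sketched as follows. This is an illustration under stated assumptions, not the SPSS workflow of the study: the data are hypothetical, the function name `rcbd_variance_components` is invented here, and the variance components are assumed to come from the usual expected mean squares of a randomized complete block design (Vg = (MSg - MSe)/r and Vp = Vg + MSe), with GCV and PCV taken as the square root of the respective variance expressed as a percentage of the grand mean.

```python
from math import sqrt


def rcbd_variance_components(data):
    """data[i][j] = value of genotype i in block j (one value per plot)."""
    g, r = len(data), len(data[0])
    grand_mean = sum(sum(row) for row in data) / (g * r)
    geno_means = [sum(row) / r for row in data]
    block_means = [sum(row[j] for row in data) / g for j in range(r)]

    # Sums of squares for the two-way layout without interaction
    ss_total = sum((y - grand_mean) ** 2 for row in data for y in row)
    ss_geno = r * sum((m - grand_mean) ** 2 for m in geno_means)
    ss_block = g * sum((m - grand_mean) ** 2 for m in block_means)
    ss_error = ss_total - ss_geno - ss_block

    ms_geno = ss_geno / (g - 1)
    ms_error = ss_error / ((g - 1) * (r - 1))

    # Assumed variance components from expected mean squares.
    # vg becomes negative when ms_geno < ms_error; this sketch does not handle that case.
    vg = (ms_geno - ms_error) / r
    vp = vg + ms_error
    return {
        "grand_mean": grand_mean,
        "ms_genotype": ms_geno,
        "ms_error": ms_error,
        "f_genotype": ms_geno / ms_error,
        "vg": vg,
        "vp": vp,
        "gcv_percent": sqrt(vg) / grand_mean * 100,
        "pcv_percent": sqrt(vp) / grand_mean * 100,
    }


# Hypothetical trait values: 3 genotypes (rows) x 3 blocks (columns)
example = [
    [10.0, 12.0, 11.0],
    [14.0, 15.0, 16.0],
    [20.0, 19.0, 21.0],
]
res = rcbd_variance_components(example)
for name, value in res.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions Vp is never smaller than Vg, so PCV is at least as large as GCV, and the gap between them reflects the error (environmental) variance.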

Heritability (broad sense), genetic advance, and genetic advance mean percent
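Heritability in the broad sense ($H^2$) was estimated as per the formula suggested by Allard (1960), as the ratio of the genotypic variance ($V_g$) to the phenotypic variance ($V_p$), and is reported as a percentage:

$$H^2 = \frac{V_g}{V_p}$$

Genetic advance (GA) was calculated as:

$$GA = K \times \sigma_p \times H^2 \quad (4)$$

where $K$ = selection differential at 5% selection intensity, a constant with the value 2.06; $\sigma_p$ = phenotypic standard deviation; and $H^2$ = broad-sense heritability.

Genetic advance over mean (GAM) is expressed as a percentage and calculated using the following formula:

$$GAM = \frac{GA}{\bar{x}} \times 100$$

where $\bar{x}$ is the grand mean of the trait. Genetic advance as percent over mean is categorized as low (<10%), moderate (10-20%), and high (>20%) according to Johnson et al. (1955).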
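The following sketch applies GA = K × σp × H² to the trait means and variances reported in the Results and Discussion of this article. It is a worked check rather than the authors' own computation: heritability is taken as Vg/Vp, GAM is assumed to be GA expressed as a percentage of the trait mean, and the names `reported`, `classify_gam`, and `summary` are introduced here for illustration. Because the inputs are rounded published values, the outputs may differ slightly from the tabulated estimates.

```python
from math import sqrt

K = 2.06  # selection differential at 5% selection intensity, as given in the text

# Trait means and variances as reported in the Results and Discussion
reported = {
    "FC": {"mean": 21.7, "vp": 6.25, "vg": 6.11},
    "FL": {"mean": 11.5, "vp": 5.56, "vg": 5.39},
    "Fwd": {"mean": 8.51, "vp": 1.32, "vg": 1.27},
    "FR": {"mean": 4.73, "vp": 0.63, "vg": 0.59},
}


def classify_gam(gam_percent):
    """Categories of Johnson et al. (1955) as quoted in the text."""
    if gam_percent < 10:
        return "low"
    if gam_percent <= 20:
        return "moderate"
    return "high"


summary = {}
for trait, v in reported.items():
    h2 = v["vg"] / v["vp"]        # broad-sense heritability as a fraction
    ga = K * sqrt(v["vp"]) * h2   # GA = K x sigma_p x H2
    gam = ga / v["mean"] * 100    # GA as percent of the trait mean (assumed definition)
    summary[trait] = {"h2": h2, "ga": ga, "gam": gam, "class": classify_gam(gam)}

for trait, s in summary.items():
    print(f"{trait}: H2={s['h2']:.3f}  GA={s['ga']:.2f}  GAM={s['gam']:.1f}% ({s['class']})")
```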

Correlation and Path coefficient analysis
Pearson correlations among the quantitative traits were calculated using SPSS ver. 16, following Okuyama et al. (2004). Path coefficient analysis for the quantitative traits was performed in an MS-Excel spreadsheet using its built-in statistical tools, following the procedure of Akintunde (2012). The correlation coefficients were partitioned into their direct and indirect components and represented graphically.
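The partitioning step can be sketched in a few lines, assuming the standard formulation in which the direct path coefficients solve the linear system formed by the correlations among the causal traits and their correlations with the dependent trait. The correlation values below are hypothetical and are not those of Table 4; the variable names are also assumptions, and NumPy is used here in place of the spreadsheet procedure.

```python
import numpy as np

traits = ["FL", "Fwd", "FR"]  # causal traits; FC is the dependent trait

# Hypothetical correlations among the causal traits (not the values of Table 4)
R_xx = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])
# Hypothetical correlations of each causal trait with the dependent trait
r_xy = np.array([0.7, 0.8, 0.6])

# Direct effects (path coefficients) p solve R_xx @ p = r_xy
direct = np.linalg.solve(R_xx, r_xy)

# effects[i, j] = r_ij * p_j: contribution of trait j to the correlation of trait i
effects = R_xx * direct
indirect = effects - np.diag(direct)  # off-diagonal entries are the indirect effects
total = effects.sum(axis=1)           # direct + indirect reproduces r_xy

# Residual effect: part of the dependent trait not explained by the causal traits
residual = np.sqrt(1.0 - direct @ r_xy)

for i, name in enumerate(traits):
    print(f"{name}: direct={direct[i]:.3f}, indirect sum={indirect[i].sum():.3f}, total={total[i]:.3f}")
print(f"residual effect = {residual:.3f}")
```

Each row of `effects` adds up to the corresponding correlation with the dependent trait, which is the property that lets a correlation be read as one direct effect plus indirect effects through the other traits.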

Principal component analysis or Factor analysis
Principal component analysis (PCA), or factor analysis, was performed using SPSS ver. 16, following the procedure suggested by Chatfield and Collis (1980). PCA was used to calculate the initial and extracted communalities, and 2 major components were retained to explain the total variation in the traits.
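A minimal sketch of the computation, assuming PCA on the correlation matrix of the four traits with two components retained, is given here. The data are simulated and purely hypothetical, and NumPy stands in for SPSS; the variable names are assumptions.

```python
import numpy as np

traits = ["FL", "Fwd", "FC", "FR"]

# Simulated, hypothetical data: 36 observations of 4 positively correlated traits
rng = np.random.default_rng(0)
n_obs = 36
common = rng.normal(size=n_obs)  # shared "size" factor
X = np.column_stack([common + 0.5 * rng.normal(size=n_obs) for _ in traits])

# PCA on the correlation matrix, i.e. on standardized traits
corr = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)  # eigh returns eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]        # reorder so that PC1 comes first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()      # proportion of total variance per component
cumulative = np.cumsum(explained)

n_keep = 2                               # number of components retained
loadings = eigvecs[:, :n_keep] * np.sqrt(eigvals[:n_keep])
communalities = (loadings ** 2).sum(axis=1)  # variance of each trait kept by the retained PCs

for k in range(n_keep):
    print(f"PC{k + 1}: {explained[k]:.1%} of variance, cumulative {cumulative[k]:.1%}")
for name, c in zip(traits, communalities):
    print(f"{name}: extracted communality = {c:.3f}")
```

With a correlation matrix the eigenvalues sum to the number of traits, so each eigenvalue divided by four gives the share of total variance explained by that component.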

Cluster analysis
Hierarchical cluster analysis was performed to group the fruit sample traits, following the procedure of Everitt et al. (2011). The dendrogram obtained with Ward's method suggested 2 groups of quantitative fruit traits.

Results
The quantitative trait data of Sechium fruit samples are normally distributed, as suggested by the bell-shaped histogram plots and by the tests of normality, in which the Shapiro-Wilk statistics indicate normality more strongly than the Kolmogorov-Smirnov statistics at p≤0.05 (Karney, 2016). The Shapiro-Wilk test is more sensitive to outliers and better suited to small samples (Shore, 2012). Normality can therefore be assumed for the data, so this assumption of parametric tests is satisfied and an appropriate parametric test can be used (Figure 1 and Table 1).
Analysis of variance (ANOVA) was performed for the quantitative fruit traits of the Sechium genotypes. The mean value was highest for FC (21.7±0.32), followed by FL (11.5±0.31), Fwd (8.51±0.15), and FR (4.73±0.11), and the trait means differ significantly from each other at p≤0.05 (Table 2). The estimates of the phenotypic and genotypic variances showed a trend similar to that of the mean values from ANOVA (Table 3). Pearson's correlations among the quantitative fruit traits are significant at p≤0.01 (2-tailed), and all traits are positively associated with each other (Table 4). For the estimation of the partial regression coefficients, or direct path coefficients, FC was treated as the dependent (resultant) variable and FL, Fwd, and FR as the independent (causal) variables (Table 5). The test of normality suggested that the data are normally distributed and meet the assumptions of parametric tests (Table 1). To check whether the data are suitable for principal component analysis/factor analysis, the anti-image correlation, the Kaiser-Meyer-Olkin (KMO) measure, and Bartlett's test were examined. The diagonal values of the anti-image correlation matrix were above 0.600, indicating adequate sampling. The KMO measure (0.718) and Bartlett's test (p<0.001) gave similar results for sampling adequacy (Table 7).
Principal component analysis is an important multivariate technique used to examine the association between characters and to measure genetic diversity (Table 8-9 and Figure 3). A hierarchical cluster dendrogram of the fruit characters was constructed using Ward's method (Figure 4).

Discussion
The normality test shows that the data are normally distributed, so appropriate parametric tests can be applied (Table 1 and Figure 1). The significant differences among and between the traits in ANOVA suggest the presence of genetic variation and provide an opportunity to consider the fruit samples in plant breeding improvement programs (Gelman, 2005; Dag et al., 2018). The phenotypic (Vp) and genotypic (Vg) variances were highest for FC (Vp=6.25; Vg=6.11), followed by FL (Vp=5.56; Vg=5.39), Fwd (Vp=1.32; Vg=1.27), and FR (Vp=0.63; Vg=0.59), the last being the lowest. As expected, the phenotypic coefficient of variation (PCV) was greater than the genotypic coefficient of variation (GCV) for all the traits. PCV and GCV describe the nature and magnitude of variation and indicate whether it is due to genetic or environmental causes; a larger difference between PCV and GCV suggests greater environmental effects on a trait. Both were highest for FL (PCV=20.50%; GCV=20.18%), followed by FR (PCV=16.78%; GCV=16.23%), Fwd (PCV=13.50%; GCV=13.24%), and FC (PCV=11.52%; GCV=11.39%). The differences between PCV and GCV were small for all traits, indicating limited environmental influence, and the lowest PCV and GCV values were estimated for FC. Heritability is the proportion of phenotypic variance that is due to genetic variance and gives information about the inheritance of traits. Traits with high heritability are easier to improve through selection, and heritability was high for all traits in the study: it was highest for FC (97%), followed by FL (96%), Fwd (96%), and FR (93%). Similar results have been reported in previous studies (Turkheimer, 2011; Tester and Langridge, 2010). High heritability does not, however, always indicate high genetic gain.
Therefore, heritability and genetic advance are considered together to predict their combined effect on traits under selection (Luby et al., 2015; Johnson et al., 2011). Fruit circumference (FC) showed high heritability coupled with high genetic advance. High heritability coupled with high genetic advance suggests that additive effects control the trait (Hartung and Schiemann, 2014; Lipi et al., 2020). On the other hand, high heritability with low genetic advance indicates non-additive effects (Heckerman et al., 2016; Ning et al., 2020). High heritability and high genetic advance make FC a good trait to consider for selection and improvement in plant breeding programs (Table 3).
The correlations among the quantitative fruit traits are significant at p≤0.01 (2-tailed), and all traits are positively associated with each other, which points to their possible contribution to trait improvement. The dependent variable and the independent variables in the correlation matrix for the fruit traits are represented as Y and X1, X2, and X3, respectively (Mahdavi, 2013; Székely et al., 2007) (Table 4).
For the estimation of the partial regression coefficients, FC was treated as the dependent (resultant) variable and FL, Fwd, and FR as the independent (causal) variables. FL showed a small, negative (-0.048), and negligible direct effect on FC. Its indirect effects on FC through Fwd and FR are positive, the effect through Fwd being considerably larger than that through FR. The total correlation (0.550) is positive and significant at p≤0.01, which suggests that indirect selection of the trait through Fwd could be useful. Fwd showed the highest positive direct effect (0.927) on FC. Its indirect effects are negative (-0.031) through FL and positive (0.051) through FR, and both are small and negligible in magnitude. The total correlation (0.947) is positive and highly significant at p≤0.01, which suggests that direct selection of the trait could be useful. FR showed a positive (0.087) but small and negligible direct effect on FC. Its indirect effects are positive (0.547) through Fwd and negative (-0.026) through FL, the effect through Fwd being considerably larger than that through FL. The total correlation (0.521) is positive and significant at p≤0.01, which suggests that indirect selection of the trait could be useful (Tarka, 2017; Bentler and Chih-Ping, 2016). The partial regression coefficients (direct path coefficients) and the direct and indirect components are presented in Table 5-6 and Figure 2.
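The partition behind these statements can be written compactly. In path coefficient analysis, the correlation of each causal trait with FC is the sum of its direct effect and its indirect effects through the other causal traits. The symbols below ($P$ for a direct path coefficient, $r$ for a correlation, $Y$ for FC) are notation assumed here rather than taken from the source.

$$r_{iY} = P_{iY} + \sum_{j \neq i} r_{ij} P_{jY}$$

Taking Fwd as the worked case with the values reported above, the direct effect and the two indirect effects (through FL and FR) add up to the reported correlation with FC:

$$r_{Fwd,Y} = 0.927 + (-0.031) + 0.051 = 0.947$$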
Principal component analysis is an important multivariate technique used to examine the association between characters and to measure the genetic diversity of genotypes/varieties (Esposito et al., 2007; Muradoglu et al., 2021). The PCA of the fruit traits of Sechium genotypes showed that the first component alone explained 74.76% of the variation and the first two components together explained 87.66%. The first component (PC1) accounts for 74.76% of the total variation and is positively and highly associated with all the traits in the study; it is called the fruit component. The second component (PC2) explained 12.90% of the total variation and was positively associated with FL and FR but negatively associated with FC and Fwd (Table 7-8). The genetic variation of the quantitative traits based on PCA suggests that FL and FR are the most important traits for explaining variability in the fruits of Sechium genotypes, followed by FC and Fwd. The contribution of FL and FR to the principal component axes was high (Figure 3). This suggests that FL and FR are the major traits explaining the total variation in the fruit morphology of the Sechium genotypes and may be considered for further improvement. Similar results were reported in other crops by various authors (Rosso and Pagano, 2005; Chandran and Padya, 2000).
Hierarchical cluster (dendrogram) analysis of the fruit traits, using Ward's method with squared Euclidean distance as the interval measure, classified the traits into 2 groups, suggesting the contribution of each group of traits relative to the other (Figure 4).

Conclusion
Fruit circumference (FC) showed the highest broad-sense heritability together with high genetic advance and can be recommended for further improvement. Fruit length (FL) and the number of fruit ridges (FR) account for most of the variation in fruit morphology, as indicated by principal component analysis.