Investigating the Performance of the Exploratory Graph Analysis When the Data Are Unidimensional and Polytomous

The question of how observable variables should be associated with latent structures has been at the center of psychometrics. A recently proposed alternative to the traditional factor retention methods is Exploratory Graph Analysis (EGA). This method belongs to the broader family of network psychometrics, which assumes that the associations between observed variables are caused by a system in which variables interact directly and potentially causally. The method approaches psychological data in an exploratory manner and enables the visualization of the relationships between variables and the allocation of variables to dimensions in a deterministic manner. Accordingly, the aim of this study was to compare EGA with traditional factor retention methods when the data are unidimensional and the items have a polytomous response format. For this investigation, simulated datasets were used and three conditions were manipulated: the sample size (250, 500, 1000 and 3000), the number of items (5, 10, 20) and the internal consistency of the scale (α = 0.7 and α = 0.9). The results revealed that EGA is a robust method, especially when used with the graphical least absolute shrinkage and selection operator (GLASSO) algorithm: it retained the true number of dimensions more accurately than Kaiser's rule and yielded results comparable with the other traditional factor retention methods (optimal coordinates, acceleration factor and Horn's parallel analysis) under some conditions. These results are discussed in light of the existing literature, and suggestions are given for future studies.


INTRODUCTION
The question of how observable variables should be associated with latent structures has been at the center of psychometrics (Borsboom & Molenaar, 2015). So far, various models have been developed to specify this association. However, despite their growing number and the great flexibility of the mathematical models used in psychometric studies, these models are surprisingly limited in terms of the paradigms they are based on.
There are two large families of models in the social sciences for describing the relationships between latent variables and observed variables (Edwards & Bagozzi, 2000). In the first category, the latent traits are considered the common cause of the observed scores. Models based on this kind of conceptualization are called reflective. Reflective models assume that latent traits cause observed variables (also known as indicators, test items, or symptoms). In reflective models, the indicators are modeled as a function of a common latent variable plus some amount of item-specific error variance. Confirmatory factor analysis (CFA) is one of the most commonly used methods representing reflective models.
Formative models are another broad category for defining the relationship between latent structures and observed variables. Under this conceptualization, observable variables define the latent structures rather than being caused by them. The classic example is socio-economic status, defined by a set of observed variables (e.g. education, job, salary and district of residency). Principal component analysis (PCA) is a classic example of this kind of model. A recently proposed alternative to the traditional reflective and formative approaches is network modeling. In this approach, the associations between observed variables are assumed to be caused by a system in which variables interact directly and potentially causally with each other (Eaton, 2015). The use of network models has provided considerable benefit for understanding complex systems in many different disciplines (Barabási & Pósfai, 2016). In the social sciences, network analysis was first adopted to investigate social network structures (e.g. Cartwright & Harary, 1956). In the following decades, however, it came to be used as an alternative to latent variable modeling in studies that analyze network models of psychological behaviors in an exploratory manner (Schmittmann et al., 2013). After this shift, the popularity of the network approach increased, it began to be used intensively in psychology, and it led to the emergence of a new branch of psychology aimed at estimating network structures in psychological data: network psychometrics (Epskamp, Maris, Waldorp, & Borsboom, 2015).
As with other network models, a psychometric network model consists of a series of nodes (or vertices), a set of connections or links between the nodes (also known as edges) and information regarding the structure of the nodes and edges (De Nooy, Mrvar & Batagelj, 2011). In this framework, the nodes represent psychological indicator variables (e.g. symptoms, behaviors, or facets of latent variables) and are traditionally drawn as circles in the network structure. The edges, in turn, represent the associations between nodes and are drawn in network models as lines connecting the nodes.
A more recent paper (Golino & Epskamp, 2017) introduced an innovative way to investigate the dimensionality of psychological constructs by network modeling. This new method is called the EGA. As its name implies, this model is not based on prior assumptions when investigating the dimensionality of a construct. Instead, it approaches the psychological data in an exploratory way. A fascinating feature of EGA is that it enables the visualization of the relationships between variables and allocating variables to the dimensions in a deterministic manner (Golino et al., 2020). For this reason, it is an ideal method to test or reevaluate the theoretical structure of psychological constructs.
In an EGA model, green (or blue) lines on the network traditionally represent positive partial correlations, red lines correspond to negative partial correlations, and the thickness of an edge represents the magnitude of the association. Like other statistical methods that use sample data to estimate parameters, correlation and partial correlation estimates are affected by sampling variation. Hence, exact zero values are rarely observed in real data, and estimated networks based on partial correlations become fully connected. The small weights on many edges in such a network may reflect weak and potentially spurious partial correlations. These spurious relationships threaten the clear interpretation of networks and their replicability, so a statistical method is frequently used to remove them and control network complexity. For estimations based on partial correlations, a commonly used procedure is the least absolute shrinkage and selection operator (LASSO) proposed by Friedman, Hastie and Tibshirani (2008). Because the LASSO can control spurious connections, it can provide high-precision estimates when combined with a community detection algorithm such as the walktrap algorithm (Pons & Latapy, 2005).
The LASSO uses a tuning parameter to remove spurious connections by applying a penalization approach to the inverse covariance matrix. In this way, partial correlation values smaller than a threshold are estimated as exactly zero. The tuning parameter is selected by minimizing the extended Bayesian information criterion (EBIC) proposed by Chen and Chen (2008), which enables the researcher to control the sparsity of the network (Foygel & Drton, 2010). The LASSO is an important part of network modeling because it determines the eventual network structure and yields parsimonious, more interpretable models. In EGA models, a graphical extension of the LASSO is used, referred to as GLASSO. As an alternative to GLASSO, the Triangulated Maximally Filtered Graph (TMFG) has been proposed. This approach builds a triangulation that maximizes a score function, so that the data become organized in a meaningful structure and modeling becomes possible. Detailed explanations and formulations can be found in Massara, Di Matteo and Aste (2016).
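The penalization idea can be illustrated with a small sketch. This is not the EBIC-tuned GLASSO used in EGAnet; it is a hedged Python example using scikit-learn's GraphicalLasso with a fixed penalty on hypothetical two-cluster data, to show how weak partial correlations are shrunk toward exactly zero.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(42)
n, p = 500, 6

# Hypothetical data: two weakly related blocks of three items each
latent = rng.normal(size=(n, 2))
loadings = np.zeros((2, p))
loadings[0, :3] = 0.8
loadings[1, 3:] = 0.8
X = latent @ loadings + rng.normal(scale=0.6, size=(n, p))

# Penalized estimation of the inverse covariance (precision) matrix;
# alpha plays the role of the LASSO tuning parameter (here fixed, not
# selected via EBIC as in EGAnet)
model = GraphicalLasso(alpha=0.1).fit(X)
prec = model.precision_

# Partial correlations recovered from the penalized precision matrix
d = np.sqrt(np.diag(prec))
pcor = -prec / np.outer(d, d)
np.fill_diagonal(pcor, 0.0)
```

With a sufficiently large penalty, the cross-block entries of `pcor` are estimated as exactly zero, which is what makes the resulting network sparse and interpretable.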
As cited above, EGA was first proposed by Golino and Epskamp (2017). In that paper, they compared the performance of EGA with five traditional factor retention methods: (a) very simple structure (VSS; Revelle & Rocklin, 1979); (b) the minimum average partial procedure (MAP; Velicer, 1976); (c) the fit of different numbers of factors, from 1 to 10, via BIC and EBIC; (d) Horn's parallel analysis (PA; Horn, 1965); and (e) the Kaiser-Guttman eigenvalue-greater-than-one rule (Guttman, 1954).
In that study, the methods were compared using simulated datasets across different conditions: sample size (100, 500, 1000 and 5000), number of factors (2 and 4), number of items per factor (5 and 10) and correlation between dimensions (.2, .5 and .7). The datasets were generated with two- and four-dimensional structures and dichotomous items, and the methods were evaluated in terms of their rate of extracting the true number of dimensions. According to the findings, EGA performed better than the traditional factor retention methods, especially when the datasets had four dimensions with five items each, and EGA was the only method giving satisfactory results in all conditions. All in all, that study confirmed the superiority of EGA over the traditional methods under some conditions and showed that EGA is suitable for multidimensional datasets.
Multidimensional datasets were preferred in that study because, at the time, the EGA framework could be used only with multidimensional datasets; a recent revision has allowed the examination of unidimensional datasets, eliminating this practical limitation. There are several important reasons to examine unidimensionality in tests. First, unidimensionality is needed to justify calculating the α coefficient for the overall test (Dunn, Baguley, & Brunsden, 2014). In addition, unidimensionality indicates the presence of a common underlying cause or a coherent set of homogeneous causes (DeVellis, 2017). Based on these facts, Golino and Epskamp (2017) recommended testing the performance of EGA with unidimensional datasets composed of polytomously scored items.
Considering the richness of the outputs EGA provides for evaluating the psychometric properties of scales (such as centrality measures, node strength measures, item stability statistics and the entropy fit index; Golino & Christensen, 2020), test developers can be expected to use EGA with increasing frequency. In addition, some psychological traits, such as depression (Beard et al., 2016), anxiety (Fisher et al., 2017) and addiction, are measured on the basis of the symptoms they rely on. DiFranza and colleagues (2002) suggested considering these symptoms as interconnecting networks rather than as indicators caused by latent traits, and such an understanding of psychopathological symptoms may contribute more to our understanding of disorders (Beard et al., 2016). For these reasons, it is fair to assume that the use of EGA will increase in the future.

Purpose of the Study
In this regard, the aim of this study was to compare the performance of EGA with that of traditional factor retention methods when the data are unidimensional and the items are scored in a polytomous response format.

Data Simulation Procedure
In the current study, three conditions were manipulated: the sample size (250, 500, 1000 and 3000), the number of items (5, 10, 20) and the internal consistency level (α = 0.7 and α = 0.9). The conditions were determined by taking into account the features of scales in the existing psychology literature. The related literature shows that the number of items in unidimensional measurement tools varies: for example, the Satisfaction with Life Scale (Diener, Emmons, Larsen, & Griffin, 1985) consists of five items, while the Center for Epidemiologic Studies Depression Scale (Radloff, 1977) consists of twenty items. For this reason, the number of items in the simulated datasets was allowed to vary across these observed values (5, 10, 20). In addition, a lower threshold of .7 has been proposed for considering a test reliable (Nunnally, 1978), while an α level above .90 is regarded as good. Accordingly, half of the datasets were simulated with α at the lower threshold (α = 0.7) and the other half with an α level regarded as good (α = 0.9). Finally, a sample size of n=250 is generally regarded as the minimum for applying factor retention methods (Cattell, 1978), so the simulated datasets had a sample size of at least 250, with n=500, n=1000 and n=3000 as the further conditions. Based on these facts, 24 conditions were created in a 4x3x2 design. In line with the main aim of the study, all datasets were simulated with a unidimensional structure and with items scored on a 1-5 scale.
For each condition, data simulation was repeated 100 times to obtain more stable results, yielding 2400 datasets in total. The results reported in this study reflect the arithmetic average over these iterations. The data simulation was performed with the mirt package (Chalmers, 2012) in R (R Core Team, 2019).
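The simulation itself was carried out in R with mirt; purely to illustrate the kind of data involved, the following is a hedged numpy sketch (not the authors' actual code) that generates unidimensional graded-response data scored 1-5, with hypothetical discrimination and threshold values.

```python
import numpy as np

def simulate_graded(n_persons, n_items, n_cats=5, a=1.5, seed=0):
    """Simulate unidimensional graded-response data scored 1..n_cats.

    A hypothetical stand-in for the mirt-based simulation in the paper:
    one latent trait, a common discrimination, ordered thresholds.
    """
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=n_persons)  # single latent trait
    # Ordered category thresholds per item (sorted ascending)
    b = np.sort(rng.normal(size=(n_items, n_cats - 1)), axis=1)
    # P(score > k) under the graded response model
    logits = a * (theta[:, None, None] - b[None, :, :])
    p_gt = 1.0 / (1.0 + np.exp(-logits))
    # Inverse-CDF sampling: P(score > k) is decreasing in k, so one
    # uniform draw per response gives the correct category
    u = rng.uniform(size=(n_persons, n_items, 1))
    scores = 1 + (u < p_gt).sum(axis=2)
    return scores

X = simulate_graded(500, 10)
```

A 100-replication condition in the paper's design would correspond to calling such a generator with 100 different seeds per cell of the 4x3x2 design.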

Analysis Procedure
EGA analyses were carried out using the EGAnet package available in the R statistical environment (Golino & Christensen, 2020). The tuning parameter for GLASSO was determined based on the EBIC to obtain a sparser network; in this study, this parameter was set at 0.5, the default option in EGAnet. The nFactors package (Raiche, 2010) was used to apply the optimal coordinates (OC), acceleration factor (AF), parallel analysis (PA) and Kaiser-Guttman rule (KR1) factor retention methods.
The assessment of how accurately the correct number of dimensions was extracted was based on an extraction accuracy index and a bias index, following Garrido, Abad and Posada (2016). The factor extraction accuracy index was calculated in two stages: (1) coding a correct estimation of the true number of factors as 1 and an incorrect estimation as 0, and (2) taking the arithmetic mean of the coded scores. For instance, when 100 datasets were analyzed, if the true number of factors was extracted for 50 datasets, the accuracy index was computed as 0.5. The bias index, in turn, was calculated by subtracting the true number of dimensions from the estimated number. For instance, for a unidimensional dataset, if the estimated number of dimensions is 1 the bias is 0, while if the estimated number is 2 the bias is 1. Therefore, a bias value of 0 indicates that the correct number of dimensions was extracted perfectly, while bias values far from 0 indicate poor performance of the corresponding method. As with the accuracy index, the bias values in the results section represent the arithmetic mean over 100 iterations.
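The two indices can be expressed compactly. The following is a minimal sketch (the function name is ours, not from the paper): accuracy is the mean of the 1/0 correct-estimate codes, and bias is the mean signed difference between estimated and true dimensionality.

```python
import numpy as np

def accuracy_and_bias(estimated, true_dims=1):
    """Accuracy: proportion of replications recovering the true number
    of dimensions. Bias: mean of (estimated - true) across replications."""
    est = np.asarray(estimated)
    accuracy = np.mean(est == true_dims)
    bias = np.mean(est - true_dims)
    return accuracy, bias

# Toy example: 4 replications of a unidimensional design, one
# replication overestimating the dimensionality
acc, bias = accuracy_and_bias([1, 1, 2, 1], true_dims=1)  # acc 0.75, bias 0.25
```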

RESULTS
The average accuracy index values and corresponding standard deviations over 100 iterations are given in Table 1. When the sample size was 250 and the datasets contained five items, all of the methods estimated the correct number of factors perfectly, regardless of the α level. When the number of items was increased to ten and the α level was 0.7, EGA (GLASSO) extracted the unidimensional structure for 79% of the datasets, while this rate was 49% for EGA (TMFG); both EGA algorithms outperformed the traditional KR1 method. When the α level rose to 0.9, EGA (GLASSO) estimated the correct number of dimensions for 99% of the datasets, whereas the percentage for EGA (TMFG) dropped to 9%; the average accuracy rate of the other four traditional methods was 100%. Thus, EGA (GLASSO) yielded results comparable with the traditional methods when the α level was 0.9. Finally, for datasets containing twenty items, the accuracy rate of EGA (GLASSO) was 2% and 52% for α levels of 0.7 and 0.9, respectively, whereas the accuracy rates of EGA (TMFG) were 0% for both α levels. The only method EGA (GLASSO) outperformed was KR1, while EGA (TMFG) yielded the worst accuracy rates.
For the n=500 sample size condition, all of the methods perfectly estimated the unidimensional structure when the datasets contained five items, regardless of the α level. When the number of items was increased to 10 and the α level was 0.7, the average accuracy rates of EGA (GLASSO) and EGA (TMFG) were 0.99 and 0.45, respectively; EGA (GLASSO) outperformed the traditional KR1 method, while EGA (TMFG) yielded the lowest accuracy. When the α level increased to 0.90, EGA (TMFG) was the only method with an imperfect accuracy rate (22%). Finally, when the number of items was increased to 20, only AF estimated the true number of dimensions perfectly when α was 0.7, while AF and PA performed perfectly when the α level was 0.90; the EGA methods yielded the worst accuracy rates.
For the n=1000 sample size condition, all of the methods extracted the correct number of dimensions perfectly when the datasets contained five items. For datasets with ten items, EGA (TMFG) was imperfect, with accuracy rates of 0.59 and 0.26 depending on the α level. Finally, when the number of items was 20, the accuracy rates of EGA (GLASSO) were 68% and 99% for α levels of 0.7 and 0.9, respectively, while EGA (TMFG) failed to recover the correct structure in any dataset.

Journal of Measurement and Evaluation in Education and Psychology
For datasets where the sample size was 3000, the accuracy rate of EGA (GLASSO) was 99% when α was 0.7 and the number of items was 20, while it was 100% in all other conditions. For the EGA (TMFG) method, the accuracy rates for datasets with ten and twenty items fell to 77% and 0% when α was 0.7, and to 36% and 0% when α was 0.9; EGA (TMFG) thus yielded the lowest accuracy rates when the number of items was 10 or 20. For the KR1 method, the accuracy rate was 3% for datasets where α was 0.7 and the number of items was 20. The OC, AF and PA methods achieved a 100% accuracy rate under all conditions. As can be inferred, the relative performance of EGA against the traditional factor retention methods changed dramatically with the number of items, and for most conditions the GLASSO algorithm was superior to the TMFG algorithm.

The calculated bias values for the factor retention methods across conditions are given in Table 2. When the datasets contained five items, EGA (GLASSO) provided unbiased estimates of the correct number of dimensions. When the number of items was increased to 10 with a sample size of n=250, the bias values were 0.33 and 0.01 for α levels of 0.7 and 0.9, respectively. When the sample size was increased to 500, EGA (GLASSO) yielded biases of 0.01 and 0 for α levels of 0.7 and 0.9, and for n=1000 and n=3000 it yielded no bias when the item number was 10.
For the datasets containing twenty items with a sample size of n=250, the bias value was 2.41 for an α level of 0.7 and 1.39 for an α level of 0.90. The bias value of 1.39 had a very large standard deviation, indicating variation across the datasets in the calculated bias. As the sample size was increased to 500, 1000 and 3000, the calculated bias values decreased compared with the n=250 condition. Similar changes were observed for EGA (TMFG) across the conditions, although EGA (TMFG) generally performed worse than EGA (GLASSO). The other traditional estimation methods provided almost perfect results, especially when the sample size was n=1000 or n=3000.

After calculating the accuracy rates and bias values, a series of factorial ANOVAs was performed to examine the effects of the manipulated conditions on each factor retention method. For this analysis, the raw estimated number of dimensions was used as the dependent variable, and only the eta squared (η2) effect sizes and the significance levels of the ANOVAs were reported. For the significance levels, ** denotes significance at p<0.01 and * at p<0.05. The η2 values show the magnitude of the differences between the conditions for each method; according to Cohen (1988), η2 values of 0.14 and above can be regarded as a "large" effect size. The effect sizes for the AF method could not be computed because this method perfectly estimated the true number of dimensions for all 2400 datasets.
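For reference, η² is the between-condition sum of squares divided by the total sum of squares. The following is a simplified one-way sketch (the paper's factorial ANOVA partitions variance across multiple crossed conditions and their interactions, which this one-way version does not).

```python
import numpy as np

def eta_squared(groups):
    """One-way eta squared: SS_between / SS_total. A simplified,
    hypothetical stand-in for the factorial effect sizes in the paper."""
    all_vals = np.concatenate(groups)
    grand = all_vals.mean()
    ss_total = ((all_vals - grand) ** 2).sum()
    ss_between = sum(len(g) * (np.mean(g) - grand) ** 2 for g in groups)
    return ss_between / ss_total

# Estimated dimension counts under two hypothetical item-number conditions
eta = eta_squared([np.array([1, 1, 1, 2]), np.array([2, 3, 2, 3])])
```

Values of η² near 0 indicate that the manipulated condition barely moved the estimated dimensionality (a robust method), while values of 0.14 and above correspond to Cohen's "large" benchmark.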
For the rest of the methods, neither the unique effects of the conditions examined nor their two-way and three-way interactions had a large effect size for the EGA (GLASSO) method; similar results were observed for the OC and PA methods. On the other hand, the item number condition had a large effect size for the EGA (TMFG) method. Finally, for the KR1 method, large effect sizes were observed for each of the conditions examined, and their two-way and three-way interactions were also significant.

DISCUSSION and CONCLUSION
The current study aimed to compare the effectiveness of EGA with that of traditional methods in extracting the true number of dimensions when the data are unidimensional and composed of polytomous items. This aim was based on Golino and Epskamp's (2017) recommendation, and a literature review showed that no study had yet addressed it. Unlike that study, the current study also included the OC and AF methods for comparison; these methods are relatively new compared with more traditional methods such as PA and KR1, and their inclusion is believed to add to the existing knowledge on the effectiveness of EGA.
As a result of this study, it was observed that EGA (GLASSO), like the other methods, perfectly extracted the unidimensional structure for datasets with five items, even for sample sizes as small as 250; a similar finding was obtained for EGA (TMFG). On the other hand, as the number of items increased, the performance of both EGA (GLASSO) and EGA (TMFG) decreased. Even when the sample size was 3000 and the reliability level was 0.9, EGA (TMFG) could not extract the correct number of dimensions with high accuracy if there were ten or more items in the dataset. For the n=500 and n=1000 sample size conditions, EGA (GLASSO) yielded comparable accuracy rates only if the reliability level was 0.9, while its performance decreased when the reliability dropped to 0.7 and the datasets contained twenty items.
Comparing the methods in general, AF perfectly extracted the actual dimensional structure regardless of the conditions, and its use by researchers in future studies is strongly recommended. Overall, the EGA (GLASSO) algorithm outperformed the EGA (TMFG) algorithm. According to the factorial ANOVA results, no unique or interaction effects were observed for the EGA (GLASSO) method; similar findings were obtained for the OC and PA methods. These three methods can thus be considered the most robust across the conditions tested. Although these statistics could not be calculated for AF, it provided perfect results under all conditions, so it is also reasonable to consider this method robust. On the other hand, a "large" effect size was observed for the EGA (TMFG) method for the item number condition; that is, the number of items negatively affects the performance of EGA (TMFG) regardless of the other conditions. The poor performance of the TMFG algorithm is understandable, because it performs better when bootstrap procedures are used simultaneously.
Finally, for the KR1 method, "large" effect sizes were observed for all conditions and their two-way and three-way interactions. Accordingly, the KR1 method was the least robust method within the context of the conditions examined in this study. This finding is in line with past literature (Velicer, Eaton & Fava, 2000; Ruscio & Roche, 2012).
This study is one of the few comparing EGA's factor retention effectiveness with that of traditional methods. Contrary to the findings of Golino and Epskamp (2017), EGA (GLASSO) was not found to be clearly superior to the traditional methods. This result implies that EGA (GLASSO) may not be a suitable alternative when the data are unidimensional; researchers should reserve EGA (GLASSO) for unidimensional scales with fewer items, higher internal consistency and a large sample size. EGA (TMFG), on the other hand, should not be an option for researchers across the wide range of conditions considered in the current study.
All in all, more research is needed to examine the effectiveness of EGA under different conditions. For example, examining EGA's effectiveness in datasets with different ability distributions would enrich the existing literature. In addition, in this study the effectiveness of the methods was evaluated only in terms of the number of factors; future studies are encouraged to evaluate the performance of EGA in terms of estimating the true factor loadings.