The Turkish Adaptation of the Statistics Anxiety Scale for Graduate Students

This study aimed to adapt into Turkish the Statistics Anxiety Scale (SAS) developed for graduate students by Faber, Drexler, Stappert and Eichhorn. The research was carried out with 375 students attending graduate education in any field in Turkey. The construct validity of the SAS was investigated via exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). Parallel analysis was also used to decide on the number of factors in the scale. In the EFA and parallel analysis, a unidimensional structure was obtained, in line with the factor analysis results for the original form of the SAS. However, since the original form of the SAS was designed with a three-dimensional structure of worry, avoidance and emotionality in mind, both the unidimensional and the three-dimensional structures were tested in CFA. The fit indices reported in CFA were within acceptable limits for both models. In the reliability analysis, the Cronbach's Alpha internal consistency coefficient was calculated as .91 for the whole scale, and as .91, .83, and .91 for the worry, avoidance and emotionality dimensions, respectively. The item correlations exceeded the lower limit of .30 for all items in the scale. The Ferguson Delta statistic, which provides evidence for the discriminatory power of the entire scale, was .98. These results suggest that the Turkish form of the SAS yields valid and reliable measures.


INTRODUCTION
One of the most important stages of the scientific research process is analysing the collected data with appropriate methods (Gürbüz & Şahin, 2017). The appropriate method of data analysis depends on how the data were collected and on the research questions. In the most general sense, data are analysed through descriptive analysis or content analysis in a qualitative study (Yıldırım & Şimşek, 2016), whereas statistical techniques are used in quantitative studies. In this context, a researcher conducting a quantitative study needs to be knowledgeable about statistics. This does not mean that a researcher conducting a qualitative study needs no knowledge of statistics, because such knowledge is necessary not only for analysing one's own data but also for following the literature and understanding published studies (Tan, 2016). For this reason, statistics is considered an instrument that complements scientific research (Sutarso, 1992), and anybody doing scientific work is expected to be trained in statistical techniques as well as research methods (Erkuş, 2011). Consequently, at least one statistics course is compulsory in almost all graduate programmes in the social, educational, and behavioural sciences. Yet taking a statistics course can turn into a negative experience for many graduate students (Collins & Onwuegbuzie, 2007). As a result, most students postpone statistics-related courses for as long as possible and prefer to take them in their last semester (Roberts & Bilderbeck, 1980). Such behaviours displayed by students towards statistics are referred to as statistics anxiety.
A review of the relevant literature shows that many studies of statistics anxiety have been conducted in the social sciences, especially in the last 30 years (Beurze, Donders, Zielhuis, Vegt & Verbeek, 2013). The notable results of these studies can be summarized as follows: students with a weak mathematical background or limited education in mathematics have higher statistics anxiety (Baloğlu, 2003; Baloğlu & Zelhart, 2004; Primi & Chiesi, 2018; Roberts & Saxe, 1982; Wilson, 1997; Zeidner, 1991); statistics anxiety correlates positively with the tendency to put off assignments in graduate education (Onwuegbuzie, 2004); students consider statistics a barrier to an academic career (Onwuegbuzie, 1997b as cited in Rodarte-Luna & Sherry, 2008); and reading skills significantly affect statistics anxiety (Collins & Onwuegbuzie, 2007). Studies of the effects of demographic variables such as gender and age, on the other hand, have produced differing findings. Sutarso (1992) found no significant difference between male and female students' statistics anxiety; Baloğlu (2003), Benson (1989) and Rodarte-Luna and Sherry (2008), however, found that female students had significantly higher statistics anxiety than male students. While Beurze et al. (2013) found that statistics anxiety did not differ according to age, Baloğlu (2003) found that statistics anxiety increased with age.

Measuring Statistics Anxiety
Earlier studies of statistics anxiety used measurement tools derived from mathematics anxiety scales (Pan & Tang, 2005). The statistics anxiety scale developed by Pretorius and Norman (1992) and the statistics anxiety inventory developed by Zeidner (1991) are examples of such tools (Chiesi, Primi & Carmona, 2011). Later studies, however, emphasised that mathematics anxiety and statistics anxiety are related but distinct constructs, and the validity of statistics anxiety scales based on mathematics anxiety scales was therefore questioned (Onwuegbuzie & Wilson, 2003). As a result, scales intended to measure statistics anxiety directly were developed. The most frequently used of these is the Statistics Anxiety Rating Scale, which was developed by Cruise et al. (1985) and whose psychometric properties were analysed more recently by Baloğlu (2002); Chew, Dillon and Swinbourne (2018); Hanna, Shevlin and Dempster (2008); Liu, Onwuegbuzie and Meng (2011); Maat and Rosli (2016); Nesbit and Bourne (2018); and Teman (2013). This five-point Likert-type scale contains 51 items and six subscales: worth of statistics, interpretation anxiety, test and class anxiety, computational self-concept, fear of asking for help, and fear of statistics teachers. In their review, Onwuegbuzie and Wilson (2003) stated that the Statistics Anxiety Rating Scale (Cruise et al., 1985) was the best known and most widely used scale on the subject. However, the scale is very long and covers constructs such as attitude and self-concept in addition to anxiety (Chiesi et al., 2011), which paved the way for studies aiming to develop more practical measurement tools that measure only statistics anxiety. One of those studies was performed by Vigil-Colet, Lorenzo-Seva and Condon (2008).
The researchers aimed to contribute to the literature a measurement tool that contained items reflecting only statistics anxiety and that was short enough to use easily. In accordance with this purpose, they developed a 24-item, three-factor (test anxiety, asking for help anxiety and interpretation anxiety) statistics anxiety scale with a Spanish sample. Another contemporary measurement tool for statistics anxiety is the 17-item scale developed by Faber, Drexler, Stappert and Eichhorn (2018). The scale was developed with the participation of graduate students in educational sciences and in special education. A close examination of the items makes it clear that the audience is not restricted to students in the field of education. Hence, the scale is applicable to graduate students in diverse areas who come across statistics in the papers they read or in the research they do.

Statistics Anxiety Scales Available in Turkish Literature
A search for the concept of statistics anxiety (istatistik kaygısı) on Turkish pages in the Google search engine returns four different measurement tools. One of them is the Statistics Attitudes Scale developed by Köklü (1994). Based on a principal components analysis of this scale, the researcher concluded that it can be treated as having either a single factor or four factors, and named one of the factors in the four-factor solution statistics anxiety. The second scale was developed by Köklü (1996) and the third by Yaşar (2014). The one developed by Köklü (1996) is intended to measure statistics anxiety directly. The scale developed by Yaşar (2014), on the other hand, was prepared to measure attitudes towards statistics, and statistics anxiety is only one of its five factors. What the scales developed by Köklü (1994, 1996) and Yaşar (2014) have in common is that they are all aimed at undergraduate students and contain no items corresponding to the basic components of graduate education, such as reading scientific articles and conducting and presenting scientific research. The fourth measurement tool found on the Turkish pages of the Google search engine is the Statistics Anxiety Rating Scale. Yet an examination of the studies using this scale revealed no mention of a Turkish adaptation. That is to say, even though there were studies in Turkish using the Statistics Anxiety Rating Scale (Baloğlu & Zelhart, 2004; Baloğlu, Koçak & Zelhart, 2007), they were performed in Texas in the USA with the original form of the scale. No studies using a Turkish adaptation of the scale were available.

Purpose of the Study
The objectives and contents of statistics courses taught at undergraduate and graduate levels are different. The main reason for this difference is related to the competencies that graduates should have. At the undergraduate level the topics such as basic concepts of statistics, reading and interpretation of tables and graphs, calculation of descriptive statistics, calculation and interpretation of simple correlation coefficients are covered. On the other hand, at the graduate level individuals are expected to carry out the statistical process from start to finish by planning a scientific research and so the scopes expand. In other words, the graduate student is a researcher who is accepted as an expert in the related field. For this reason, statistical anxiety scales for graduate students must contain items that correspond to the basic elements of graduate education such as reading, conducting and presenting scientific studies.
Because the content of statistics courses differs between the undergraduate and graduate levels, scales measuring anxiety, attitude or self-efficacy towards statistics at these levels will inevitably differ as well. In this sense, using statistics anxiety scales developed for undergraduate students to measure the statistics anxiety of graduate students is not considered appropriate. When the Turkish literature was examined from this perspective, it was seen that the measurement tools developed to determine statistics anxiety were limited to scales for undergraduate students. Therefore, a Turkish scale suitable for determining graduate students' statistics anxiety was needed. In this context, the present study aims to adapt into Turkish the Statistics Anxiety Scale (SAS) developed by Faber et al. (2018) for graduate students.

METHOD
This research, which aims to adapt the SAS into Turkish, is a descriptive study. Descriptive research aims to present and interpret the current situation as it is. Such studies give a snapshot of beliefs, thoughts, emotions and behaviours at a given time and place (Stangor, 2010). Descriptive research can be quantitatively or qualitatively oriented. Typical features of quantitatively oriented descriptive research are generating numerical data, selecting a sample that can represent a large population, providing inferential and explanatory information, gathering standardized information by applying the same measurement tool to all participants, and capturing data mostly from scales, multiple-choice tests, questionnaires and similar instruments (Cohen, Manion & Morrison, 2007). In view of these features, studies that develop, adapt or revise measurement tools can be described as quantitatively oriented descriptive studies.

Study Group
Participants were reached in three different ways. First, the scale was administered face to face to students who had taken the statistics course and were continuing their graduate education in the faculty where the researchers work. The number of participants to whom the scale was administered face to face was 25. Second, the researchers searched Google Scholar for the terms master student and doctoral student and limited the search results to 2019. In this way, articles with at least one postgraduate student among the authors were identified. These articles were then reviewed to see whether they contained statistical analyses or whether their field required statistical knowledge. If an article contained statistical analyses, or belonged to a field (educational sciences, field education, biostatistics, etc.) whose authors are expected to have knowledge of statistics, the e-mail address of each author at graduate level was recorded and the scale was sent to that author by e-mail. Third, university websites were scanned, and the e-mail addresses were recorded of research assistants whose resumes indicated that they were continuing their graduate education and whose graduate programme required statistical knowledge; the scale was delivered electronically to these research assistants. The number of participants who answered the scale electronically was 350. In total, 375 participants continuing graduate education at a university in Turkey were reached. Of the participants, 233 (62.10%) were female and 142 (37.90%) were male. The participants' ages ranged between 22 and 57 (X̄ = 30.06, SD = 5.58), but two of them did not indicate their age.
The distribution of the participants according to the institute where they were registered, their stage of graduate education and whether they had taken a statistics course is shown in Table 1. The majority (73.87%) of the participants in the study group were registered in a graduate programme in the basic field of educational sciences and teacher training. There were also graduate students registered in programmes as diverse as medical training, tourism and hotel management, private law, and finance. They were included in the study group because they also needed knowledge of statistics in their graduate courses and in their scientific studies.

Data Collection Tool
The research data were collected with the SAS, which was developed by Faber et al. (2018) and which this study aims to adapt into Turkish. The scale is a four-point Likert-type scale and contains 17 items.
The scale has no reverse-scored items. The original form of the scale was developed with a three-dimensional structure in mind. Table 2 shows information on this three-dimensional structure. Although the scale was designed with three factors, as shown in Table 2, the principal components analysis could not statistically separate the three anxiety components, and the SAS therefore had a single-factor structure. In the unidimensional structure, the explained variance was 43.59% and the factor loadings of the scale items ranged from .49 to .76. The reliability of the measures obtained with the SAS was tested with the Cronbach's Alpha internal consistency coefficient, which was .92. The corrected item-total correlations calculated for item discrimination were reported to range between .44 and .70. Faber et al. (2018) stated that the statistically single-factor structure of the SAS does not prevent interpretation at the subscale level and that subscale scores can be evaluated in addition to the total score. SAS scores range from 17 to 68. High scores on both the whole scale and the subscales indicate a high level of statistics anxiety.

Translating the Scale into Turkish
As the first step in adapting the scale into Turkish, the researchers who had developed the original form of the scale were contacted. Günter Faber was sent an e-mail on 10 November 2018 requesting permission for the Turkish adaptation of the scale. His e-mail approving the adaptation was received on 11 November 2018, and the adaptation process then began.
The first step in the adaptation process was to translate the scale from English into Turkish. Four methods can be used to translate a measurement tool from the source language into the target language: judgmental single-translation, judgmental back-translation, statistical single-translation and statistical back-translation (Hambleton & Bollwark, 1991). The present study used the judgmental single-translation method. In this method, one or more translators translate the scale from the source language into the target language; another group then compares the original form with the translated form to determine whether the two are linguistically equivalent and revises the translation if necessary (Hambleton & Kanjee, 1993). Accordingly, the items of the SAS were translated into Turkish by five experts: three in measurement and evaluation, one in social studies education, and one in curriculum and instruction. A separate English language expert was not needed because the expert in curriculum and instruction was a graduate of English Language Teaching. After the five experts had translated the scale independently of each other, the translations were brought together and the Turkish equivalents judged to reflect the items best were chosen. The Turkish form was then presented, together with the original form, to two other experts, who were asked to examine whether the two forms were equivalent. Both experts stated that the two forms were generally equivalent. One of them stated that item 15 did not fully reflect the original and proposed a revision for that item. The researchers adopted the proposed revision and changed the translation accordingly.
As in the original version, a four-point rating was adopted in the Turkish version of the scale, and the scale categories were labelled absolutely disagree (1), slightly agree (2), quite agree (3) and absolutely agree (4). To test the intelligibility of the translations, the scale was administered to three research assistants who were studying for their PhD. After the three research assistants reported that the scale items were clear and comprehensible, the Turkish form of the SAS (Appendix A) was ready for use. Because it was difficult to reach a large sample of graduate students, the researchers thought it unlikely that two separate study groups could be reached, one for a pilot and one for the actual application. Consequently, after the intelligibility of the scale items had been tested on a small group, the actual application of the scale was started; no pilot study was included.

Data Collection and Analysis
The research data were collected online between 27 November 2018 and 5 February 2019. The psychometric properties tested for the measures collected with the Turkish form of the SAS were construct validity, internal consistency reliability and discrimination power. Exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) were conducted for the construct validity of the SAS, and parallel analysis was additionally used to determine the number of factors. Studies in the literature (Fabrigar, Wegener, MacCallum & Strahan, 1999; Macfarlane, Meach & Leroy, 2014; Raykov & Marcoulides, 2011) recommend that EFA and CFA be conducted with data obtained from different samples, because EFA involves some subjective decisions by the researcher. Since the EFA is based on a single sample, it is critical to retest the factor structure obtained in EFA on fresh data. For this purpose, the data set is randomly split in half, so that the first half is used for EFA and the second half for CFA. Essentially, CFA tries to recreate the structure found in EFA in a different data set. Hence, before EFA and CFA were performed, the data set was divided into two according to the participant numbers: the data files with odd numbers were used for EFA and the data files with even numbers were used for CFA. Thus, there were 188 participants in the data set used for EFA and 187 participants in the data set used for CFA. The data set used in EFA was also used in the parallel analysis, because parallel analysis uses the eigenvalues obtained from EFA when deciding the number of factors (Pallant, 2005).
Before the analyses were started, the skewness and kurtosis coefficients were examined to get an idea of the distribution of the data. Table 3 shows the skewness and kurtosis coefficients obtained for the whole scale and the subscales of the SAS in the data sets on which EFA and CFA were conducted.
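The odd/even split described above can be sketched as follows. This is only an illustration: the participant numbers 1 to 375 are assumed stand-ins for the actual data files, which are not part of this paper.

```python
# Sketch of the odd/even split; IDs 1..375 are assumed stand-ins
# for the participant numbers of the real data files.
participant_ids = list(range(1, 376))

# Odd-numbered files go to EFA (and parallel analysis), even-numbered to CFA.
efa_ids = [pid for pid in participant_ids if pid % 2 == 1]
cfa_ids = [pid for pid in participant_ids if pid % 2 == 0]

print(len(efa_ids), len(cfa_ids))
```

With 375 participants this gives the 188 and 187 cases reported for the EFA and CFA data sets.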

The skewness and kurtosis coefficients in Table 3 are all within the ±2 range. In a perfectly symmetrical normal distribution, the coefficients of skewness and kurtosis are equal to zero. As a rule of thumb, however, skewness and kurtosis values within ±2 are interpreted as showing no significant deviation from normality (Bachman, 2004). Accordingly, it can be said that the research data meet the assumption of normality.
Another indicator that can provide evidence for the normality of the research data is the number of participants in the study group. Kirk (2007) points out that in large enough samples the data approach a normal distribution and that a sample of 100 people is sufficient for this. Similarly, Waternaux (1976) found that when the sample size was over 100, the effect of the skewness and kurtosis of the data on the results of the analysis was reduced, and that the effect almost completely disappeared in samples of over 200. Therefore, both the calculated skewness and kurtosis coefficients and the size of the study group support the conclusion that the research data are suitable for a normal distribution.
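The ±2 rule of thumb can be illustrated with a short sketch. It uses the simple moment-based coefficients (excess kurtosis, so a normal curve scores zero); statistical packages such as SPSS apply small-sample corrections, so their values will differ slightly. The scores below are hypothetical and are not the study data.

```python
from statistics import fmean

def skewness_and_kurtosis(values):
    """Return moment-based skewness and excess kurtosis (0 for a normal curve)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

def within_rule_of_thumb(values, limit=2.0):
    """Check whether both coefficients fall inside the +/- limit band."""
    skew, kurt = skewness_and_kurtosis(values)
    return abs(skew) <= limit and abs(kurt) <= limit

# Hypothetical total scores, for illustration only.
scores = [17, 18, 18, 20, 22, 25, 27, 30, 34, 41, 52]
print(skewness_and_kurtosis(scores), within_rule_of_thumb(scores))
```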
After the skewness and kurtosis coefficients had been examined, the suitability of the data for factor analysis was checked. For this purpose, the Kaiser-Meyer-Olkin (KMO) coefficient and the results of the Bartlett test were examined. The KMO value of .94 exceeded the lower limit of .60 (Büyüköztürk, 2010), and the Bartlett test was significant (χ² = 2536.07, df = 136, p < .001). These results showed that the data were appropriate for factor analysis. EFA was then conducted with the principal components method. When factor loadings were interpreted in EFA, the value of .32 recommended by Tabachnick and Fidell (2007) was taken as the criterion.
After EFA, parallel analysis and CFA were conducted, in that order. Two models were tested in CFA: the three-factor structure on which the original version of the SAS was based, and the single-factor structure obtained in the EFA of both the original and the Turkish forms of the SAS. RMSEA, SRMR, CFI, IFI, RFI, NFI and NNFI (TLI) were used to determine whether the tested models were confirmed and which model fitted the data better. Given Kline's (2016) explanation that using the χ²/df value as a criterion for model fit lacks a strong logical and statistical foundation, this fit index was not taken into consideration in the study. The acceptable ranges of the fit indices examined are presented in Table 4.
[Table 4. Acceptable ranges of the fit indices; table body not recovered, source note: Hancock and Mueller (2013)]
In CFA, factor loadings were assessed in addition to model-data fit. As in EFA, the criterion of .32 was used to decide whether the factor loading of an item was sufficient. After the analyses for construct validity had been completed, the reliability analysis was carried out. The reliability of the measures in the Turkish form of the SAS was calculated with the Cronbach's Alpha internal consistency coefficient. Values of .70 and above (Tezbaşaran, 1997) were interpreted as evidence for the reliability of the measures. The discrimination of the SAS items in the Turkish sample was analysed with corrected item-total correlations, and items with correlation values above .30 (Field, 2009) were considered sufficiently discriminating. The Ferguson Delta statistic was used to determine the discriminatory power of the SAS as a whole. In contrast to EFA, parallel analysis and CFA, the Ferguson Delta, reliability and item analyses were performed on the data from all 375 participants in the study group. The LISREL 8.54 package was used for CFA, and the IBM SPSS 22 package was used for EFA, reliability and item analysis.
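As a quick consistency check on the reported Bartlett test, the degrees of freedom of the test of sphericity for p items equal the number of distinct off-diagonal correlations. For the 17 SAS items this reproduces the reported df:

```latex
\[
df = \frac{p(p-1)}{2} = \frac{17 \times 16}{2} = 136
\]
```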
Parallel analysis was done by using

RESULTS
This section includes analysis outputs for the psychometric properties of the Turkish form of the SAS. The findings obtained from the statistical analyses done for construct validity, reliability and discrimination are offered below under relevant headings.

Construct Validity
First, EFA was performed for the construct validity of the SAS; the findings are shown in Table 5. The results of EFA demonstrated that the Turkish version of the SAS had a single-factor structure, like the original version. The single-factor structure explained 59% of the variance. As is clear from Table 5, the factor loadings of the scale items range between .60 and .87. The single-factor structure obtained in EFA was supported by the parallel analysis results. In parallel analysis, developed by Horn (1965), average eigenvalues are calculated from correlation matrices of randomly generated data with the same number of variables and participants as the real data (Yavuz & Doğan, 2015). The number of factors is determined by the number of steps at which the eigenvalues obtained from the actual data are greater than the eigenvalues estimated from the random data (O'Connor, 2000). According to Table 6, the first eigenvalue obtained from the actual data is greater than the first eigenvalue estimated from the random data. For the second eigenvalue, the value estimated from the random data is higher. Thus, the single-factor structure of the scale was also confirmed by parallel analysis. Following EFA and parallel analysis, CFA was conducted. The first model tested in CFA was the three-factor structure (worry, avoidance and emotionality) that was considered when the original version of the SAS was developed. The fit indices reported for the three-factor structure as a result of CFA are given in Table 7.
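The retention rule of parallel analysis can be sketched as below. This covers only the comparison step, not the simulation of random data, and the eigenvalues are hypothetical placeholders rather than the Table 6 values.

```python
def factors_to_retain(actual_eigenvalues, random_eigenvalues):
    """Count leading factors whose actual eigenvalue exceeds the random-data mean."""
    count = 0
    for actual, random_mean in zip(actual_eigenvalues, random_eigenvalues):
        if actual <= random_mean:
            break  # stop at the first factor that random data can match
        count += 1
    return count

# Hypothetical eigenvalues for illustration; NOT the values from Table 6.
actual = [10.03, 1.05, 0.80]
random_means = [1.55, 1.43, 1.34]
print(factors_to_retain(actual, random_means))
```

With these placeholder values only the first actual eigenvalue exceeds its random counterpart, which is the pattern the text describes for a single-factor solution.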

The fit indices in Table 7 mean that the three-factor model is confirmed. The measurement model obtained for the three-factor structure of the Turkish version of the SAS is shown in Figure 1. In Figure 1, it is evident that the factor loadings range between .65 and .85 for the worry factor, between .52 and .84 for the avoidance factor and between .81 and .84 for the emotionality factor. As can be seen in Figure 1, a modification was applied by correlating the error variances of item 3 and item 4 in the avoidance dimension. Item 3 refers to selecting another course instead of statistics, and item 4 refers to choosing a topic that does not include statistics when presentation topics are shared out. Therefore, the statistical modification is supported theoretically. After the three-factor model, the single-factor model of the SAS was tested, because EFA had yielded a single factor in both the original version and the Turkish form even though the scale items had been written on the basis of a three-factor structure. The fit indices for the single-factor structure are given in Table 8. The values in Table 8 demonstrate that the measures obtained with the Turkish version of the SAS also fitted the single-factor model. The measurement model for the single-factor structure of the Turkish version of the SAS is shown in Figure 2. As is clear from Figure 2, the factor loadings in the single-factor model of the Turkish version range between .45 and .85. Also, as shown in Figure 2, in addition to the modification made in the three-factor model, the error variances of the eighth and ninth items of the scale were correlated. The eighth item concerns difficulties in understanding the statistical content of courses, while the ninth item concerns problems experienced in interpreting statistical tables. Accordingly, the modifications applied to improve model-data fit are also theoretically explainable.

Reliability Analysis
Because the Turkish version of the SAS fitted both the three-factor and the single-factor structure in CFA, the internal consistency coefficient was calculated not only for the whole scale but also for the subscales. The internal consistency coefficients calculated for the three factors of the scale and for the overall scale are shown in Table 9. Accordingly, the internal consistency coefficients range between .83 and .96.
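For readers who want to see how the coefficient is obtained, the following is a minimal sketch of Cronbach's Alpha using the standard variance-based formula. The responses are hypothetical four-point ratings, not the study data, and the function name is illustrative.

```python
from statistics import variance

def cronbach_alpha(responses):
    """responses: one list of item scores per respondent (same items for all)."""
    k = len(responses[0])  # number of items
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical ratings on three items from five respondents.
responses = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 4], [2, 3, 2]]
alpha = cronbach_alpha(responses)
print(round(alpha, 2), alpha >= 0.70)  # compare with the .70 criterion
```

The same function can be applied to the items of one subscale at a time to obtain subscale coefficients.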

Item Analysis
The corrected total item correlations (rjx) calculated to test the item discrimination index in the Turkish version of the scale are shown in Table 10. An examination of

Ferguson Delta Statistics
In addition to the item correlations, the Ferguson Delta (δ) statistic was used to demonstrate the discrimination of the SAS. According to this statistic, high variability in the scores obtained from the scale (heterogeneity of the group) shows that the measurement tool is discriminating (Zhang & Lidbury, 2013). To calculate the Ferguson Delta statistic, the observed variability in participants' scores is divided by the highest variability that could possibly be observed (Day & Bonn, 2011). δ = .00 when all the participants receive the same score on the scale, and δ = 1.00 when the variability between participants' scores is equal to the highest possible variability (Hankins, 2008). Kline (2000) states that Ferguson Delta corresponds to .93 in a normal distribution and suggests that the value of .90 should be taken as the criterion for the statistic. Equation 1 is used to calculate the Ferguson Delta statistic for measurement tools with more than two response options (Hankins, 2008).

\[
\delta = \frac{\left[1 + k(m-1)\right]\left(n^{2} - \sum_{i} f_{i}^{2}\right)}{n^{2}\,k\,(m-1)} \tag{1}
\]

n = sample size
f_i = frequency of each score
m = number of response categories
k = number of items

As is apparent from Equation 1, a frequency table must first be drawn up for the scores obtained from the measurement instrument in order to calculate the Ferguson Delta statistic (Ramsay & Reynolds, 2000). The frequencies of the scores that the 375 participants obtained on the SAS are shown in Table 11. Placing these frequencies in the formula along with the values k = 17, m = 4 and n = 375 gave a Ferguson Delta statistic of .98.

Table 11. Frequencies of the total SAS scores
Score  f     Score  f     Score  f     Score  f     Score  f
17     36    27     19    37     8     47     4     57     1
18     20    28     10    38     3     48     4     58     2
19     23    29     12    39     3     49     4     59     3
20     22    30     11    40     4     50     4     60     1
21     19    31     15    41     4     51     8     61     1
22     10    32     9     42     4     52     3     62     2
23     15    33     10    43     2     53     4     68     1
24     15    34     8     44     1     54     3
25     8     35     9     45     2     55     4
26     14    36     6     46     1     56     3
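The calculation can be retraced with a short sketch. It assumes the generalized form of the statistic attributed to Hankins (2008), δ = [1 + k(m − 1)](n² − Σf²) / [n² k (m − 1)], and uses the score frequencies as read from Table 11; the function name is illustrative.

```python
def ferguson_delta(frequencies, n_items, n_categories):
    """Generalized Ferguson Delta for scales with more than two response options."""
    n = sum(frequencies)                       # sample size
    span = n_items * (n_categories - 1)        # k(m - 1)
    sum_sq = sum(f * f for f in frequencies)   # sum of squared score frequencies
    return (span + 1) * (n * n - sum_sq) / (n * n * span)

# Total score -> frequency, as read from Table 11 (unobserved scores have f = 0).
table_11 = {
    17: 36, 18: 20, 19: 23, 20: 22, 21: 19, 22: 10, 23: 15, 24: 15, 25: 8, 26: 14,
    27: 19, 28: 10, 29: 12, 30: 11, 31: 15, 32: 9, 33: 10, 34: 8, 35: 9, 36: 6,
    37: 8, 38: 3, 39: 3, 40: 4, 41: 4, 42: 4, 43: 2, 44: 1, 45: 2, 46: 1,
    47: 4, 48: 4, 49: 4, 50: 4, 51: 8, 52: 3, 53: 4, 54: 3, 55: 4, 56: 3,
    57: 1, 58: 2, 59: 3, 60: 1, 61: 1, 62: 2, 68: 1,
}

delta = ferguson_delta(list(table_11.values()), n_items=17, n_categories=4)
print(round(delta, 2))  # prints 0.98
```

The frequencies sum to 375, and the result rounds to the reported value of .98, which exceeds the .90 criterion.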

The Interpretation of the SAS Scores
Because all of the items in the original form of the SAS also had sufficient factor loadings and discrimination values in the Turkish version, no item was removed from the scale. Thus, as in the original

DISCUSSION and CONCLUSION
In this study, the SAS developed by Faber et al. (2018) for graduate students was adapted into Turkish. The construct validity of the SAS was tested with EFA and CFA, and parallel analysis was also used to decide on the number of factors in the scale. EFA yielded a single-factor structure that explained 59% of the variance. The literature offers several criteria for the minimum acceptable proportion of explained variance. Bayram (2010) and Büyüköztürk (2010) state that the explained variance should be at least 30%, whereas Aksu, Eser and Güzeller (2017) consider values of 40% and above acceptable. According to Sönmez and Alacapınar (2016), however, the proportion of explained variance should be higher than the proportion of unexplained variance. The explained variance obtained in EFA meets all of these criteria. In addition, the factor loadings of all items in the SAS were above the threshold of .32 (Tabachnick & Fidell, 2007).
These results indicate that the construct validity was achieved in the Turkish version of the SAS. The single-factor structure found in EFA was also supported by the results of parallel analysis.
As in EFA, the CFA results provided evidence for the construct validity of the Turkish version of the SAS. According to the fit indices reported in CFA, both the three-factor structure (labelled worry, avoidance and emotionality) on which the original form of the scale was based and the unidimensional structure that emerged from EFA were confirmed. The factor loadings for both models were also above .32. Taken together with the findings from EFA and parallel analysis, these CFA results suggest that the three factors of the scale can be interpreted separately in addition to the total score, but that an evaluation based on the subscales alone, without a total anxiety score, would not be appropriate.
The internal consistency coefficients calculated in the reliability analysis for the subscales of the SAS and for the whole scale met the criterion of .70 (Pallant, 2005; Tekindal, 2009). Accordingly, the Turkish version of the SAS can be said to yield reliable measures. According to the item analysis results, the corrected item correlations met the threshold value of .30 (Erkuş, 2012) for all items in the SAS. The value found for the Ferguson Delta statistic also met the criterion of .90 (Kline, 2000). Therefore, it may be said that the SAS has sufficient discriminating power; that is, it is capable of distinguishing between graduate students with different levels of statistics anxiety. In conclusion, the results obtained in this study indicate that the statistics anxiety of graduate students can be measured in a valid and reliable way by using the SAS.

Recommendations for Further Studies
This study analysed the construct validity of the Turkish version of the SAS with EFA and CFA. Convergent and divergent validity analyses can be included in further studies. Because the reliability of the SAS was analysed only on the basis of internal consistency in this study, further studies are recommended to examine the test-retest reliability of the scale. In addition, since this study was conducted within the framework of classical test theory, the reliability and validity of the SAS could also be analysed on the basis of item response theory.
The SAS can be used in studies comparing the statistics anxiety levels of researchers who continue their graduate education in the educational, social and health sciences, field education or pure science. In this way, it can be determined whether the statistics anxiety of individuals attending graduate education differs significantly across fields; if a significant difference is detected, the reasons for the observed differences can be explored through qualitative analysis.