An Alternative To Likert Scale: Emoji

In the twenty-first century, emojis have come into wide use on communication platforms, and as a result they have begun to appear in scales. However, only a limited number of studies in the literature focus on the effect of using emojis instead of Likert-type response categories in scales. Therefore, this study examines the differences that may arise from using emoji and Likert-type response categories in scales. For this purpose, the Psychological Well-Being Scale was administered with 3-, 5-, and 7-point Likert-type and 3-, 5-, and 7-point emoji response categories to 341 students studying at two state universities located in different regions of Turkey. Exploratory and confirmatory factor analyses and reliability analyses were carried out on the data of the participants who answered the six forms with different response categories. The results showed no significant differences between the two formats in the exploratory and confirmatory factor analyses or in the reliability analyses. However, the correlational analyses showed that as the number of response categories increased, the correlations between the scores of the emoji and Likert-type scales decreased.


INTRODUCTION
Researchers frequently adopt scaling techniques such as Thurstone (1927), Guttman, and Likert scaling when developing self-report scales. The Thurstone scale consists of many items, and the items are rated by experts. In this scale, participants indicate whether they agree or disagree with each item. The Guttman scaling technique, on the other hand, is a response-based technique in which people can respond to a large number of items but are evaluated according to the answer they give to the strongest item in terms of the feature examined. Items are scaled according to the amount or importance of the feature being measured. Guttman scales differ from Thurstone scales in their cumulative aspect: a positive response to one level of the scale implies a positive response to all items below that level. Thurstone and Guttman scales are prepared to represent all levels of the feature, whereas in Likert-type scales the items are close to the endpoints of the measured feature. In a Likert-type scale, which is a person-oriented method, participants indicate their degree of agreement with many items. The rating can be made as strongly disagree, disagree, neutral, agree, and strongly agree, and scales can be formed with three, four, five, or seven categories. The scale may include an undecided option for cases in which the participant has no positive or negative feeling about the item. Unlike the Thurstone scale, Likert-type scales do not need expert judgment in the scoring process, which eliminates errors caused by experts. Likert-type scales are considered practical and reliable. In recent years, however, as a reflection of digitalization, emojis have come to be used as response categories for scale items. In the word emoji, e represents picture and moji represents character.
Emojis were created in 1998 by a Japanese communicator and have been in widespread use since 2010. In 2015, an emoji (face with tears of joy [😂]) was chosen as the word of the year by the Oxford Dictionary, which

Population and Sample
The accessible population of the research consisted of undergraduate students studying at two state universities, one in the Southeastern Anatolia region and the other in the Black Sea region. The study made no inference about the feature examined; it only compared the use of emojis and verbal expressions as response categories. For this reason, the convenience sampling method was adopted. In convenience sampling, a non-random sampling method, researchers reach out to the most accessible participants in order to prevent excessive loss of time and energy and to reduce study costs. The sample group consisted of 341 students, whose demographic characteristics are shown in Table 1. As Table 1 shows, 73.9% (n = 252) of the university students in the sample were female and 26.1% (n = 89) were male. The ages of the participants ranged between 18 and 41, with an average of 21.6 and a median of 21. Of all the participants, 83% (n = 283) studied at the faculty of education, 5.6% (n = 19) at the social sciences vocational school, 4.4% (n = 15) at the faculty of science and literature, 3.2% (n = 11) at the faculty of fine arts, and 3.9% (n = 13) at other faculties (dentistry, pharmacy, economics and administrative sciences, health sciences, tourism) and institutes (natural sciences). The sample consisted of 19.4% (n = 66) first-year, 43.4% (n = 148) second-year, 18.5% (n = 63) third-year, 16.7% (n = 57) fourth-year, and 2% (n = 6) other-year (preparatory year and fifth year) students.

Data Collection Tools
The data collection tools consisted of a questionnaire asking participants about their gender, university, faculty, and year, as well as the Psychological Well-Being Scale. The scale was developed by  and adapted to Turkish culture by Telef (2013). An examination of the psychometric properties of the Turkish form of the Psychological Well-Being Scale showed that the scale was unidimensional and that the explained variance was 42%. The factor loadings of the items varied between .54 and .76. The Cronbach Alpha reliability coefficient of the scale scores was .80, and the test-retest reliability coefficient was .86. To obtain evidence of criterion validity, correlations with a different psychological well-being scale and a needs satisfaction scale were examined. Correlation values of .56 and .73 were found with the psychological well-being and needs satisfaction scales, respectively. The Psychological Well-Being Scale consists of eight items, and the items are rated as 1 strongly disagree, 2 disagree, 3 slightly disagree, 4 neutral, 5 slightly agree, 6 agree, and 7 strongly agree.

Data Collection Procedure
The demographic information form and the Psychological Well-Being Scale were turned into online forms and administered to university students in a single session. The scale was prepared with 3-point (disagree, neutral, agree), 5-point (strongly disagree, disagree, neutral, agree, strongly agree), and 7-point (strongly disagree, disagree, slightly disagree, neutral, slightly agree, agree, absolutely agree) Likert-type response categories and with 3-point, 5-point, and 7-point emoji response categories.

Data Analysis
Before the analysis, the data set was examined, and no missing data were found. This study was carried out to compare the results of the exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) of the scales with Likert-type and emoji response categories. First, whether the data sets met the assumptions of factor analysis was examined. For that purpose, Mahalanobis distances were calculated to determine whether there were multivariate outliers in the data sets obtained with the Likert-type and emoji response categories from 341 participants. Cases whose Mahalanobis distances were significant at α = .001 were excluded from the data sets. Multicollinearity in the data sets was examined through tolerance value (TV), variance inflation factor (VIF), and condition index (CI) values. Multivariate normality was examined through Mardia's coefficient of multivariate kurtosis. The suitability of the data sets for EFA was investigated with the Kaiser-Meyer-Olkin (KMO) measure and Bartlett's test of sphericity. All values obtained for the data sets regarding these assumptions are presented in Table 2. Table 2 shows that the number of multivariate outliers in the data sets varied between 0 and 15. These outliers were removed from the data sets of 341 people. The tolerance values of all data sets were greater than .01, the variance inflation factors were less than 10, and the condition indexes were less than 30. Accordingly, it can be argued that there is no multicollinearity in the data sets (Tabachnick & Fidell, 2013). The KMO values were between .85 and .93. The acceptable minimum KMO value for factor analysis is specified as .60. Accordingly, the data sets have a sufficient sample size for EFA. Bartlett's sphericity test results were significant in all data sets.
Thus, it can be said that the correlation matrices obtained from the data sets differed from the identity matrix. Because the multivariate normality assumption was not met, the unweighted least squares (ULS) factor extraction method, which is more robust to violations of this assumption, was used for EFA. In CFA, the mean- and variance-adjusted unweighted least squares (ULSMV) estimation method was used. EFA and CFA were carried out using polychoric correlation matrices. Factor 10.10.03 (Lorenzo-Seva & Ferrando, 2020) was used for EFA, and Mplus software was used for CFA.
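The outlier screening described above can be illustrated with a minimal sketch. It assumes the common convention of comparing each squared Mahalanobis distance with a chi-square critical value whose degrees of freedom equal the number of items; the study does not report its exact procedure, so this is only one plausible reading. The function names and the small two-variable data set are hypothetical. For the eight-item scale used here, the tabulated critical value at α = .001 would be about 26.12.

```python
import math
from statistics import fmean


def covariance_matrix(rows):
    """Return column means and the sample covariance matrix (n - 1 denominator)."""
    n, p = len(rows), len(rows[0])
    means = [fmean(row[j] for row in rows) for j in range(p)]
    cov = [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in rows) / (n - 1)
            for j in range(p)] for i in range(p)]
    return means, cov


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [matrix[i][:] + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x


def squared_mahalanobis(rows):
    """Squared Mahalanobis distance of every case from the sample centroid."""
    means, cov = covariance_matrix(rows)
    distances = []
    for row in rows:
        dev = [value - mean for value, mean in zip(row, means)]
        weighted = solve(cov, dev)  # equals inverse(cov) @ dev
        distances.append(sum(d * w for d, w in zip(dev, weighted)))
    return distances


def flag_outliers(rows, critical_value):
    """Indices of cases whose squared distance exceeds the critical value."""
    return [i for i, d2 in enumerate(squared_mahalanobis(rows)) if d2 > critical_value]


# Hypothetical data: 30 ordinary cases on two variables plus one extreme case.
rows = [(i % 5, (3 * i) % 5) for i in range(30)] + [(40, -40)]
# For df = 2 the chi-square critical value has the closed form -2 * ln(alpha).
critical = -2 * math.log(0.001)
distances = squared_mahalanobis(rows)
outliers = flag_outliers(rows, critical)
print(outliers)
```

In this toy data set only the last case is flagged. With real item responses, the critical value would be taken from a chi-square table for the relevant number of items.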

Ethics Committee Approval
In this study, all rules required by the Higher Education Institutions Scientific Research and Publication Ethics Directive were followed. None of the actions stated under the title of Actions Against Scientific Research and Publication Ethics were taken.

RESULTS
In this section, the findings are presented in the order of the research questions.

Comparison of EFA Results of Data Obtained from Emoji and Likert Type Response Categories
The EFA results of the data obtained from the scales with Likert-type and emoji response categories were compared in terms of the explained variance and the factor loadings of the items. The results are presented in Table 3, which gives the factor loadings of the items in the scales rated with emojis and with Likert-type categories. The factor loadings of the two formats were very close to each other, and the explained variance rates were very similar. As the number of response categories increased, the explained variance increased. However, for scales with the same number of categories, the EFA results of the Likert-type and emoji data were very similar.
The Wilcoxon signed-rank test was applied to examine whether the factor loadings of the data obtained from the Likert-type and emoji scales differed significantly. No significant difference was found between the factor loadings of the data sets obtained with the Likert-type and emoji response categories for the 3-point (Z = -.70, p = .94), 5-point (Z = -.84, p = .40), or 7-point scales (Z = -1.40, p = .16).
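The paired comparison of factor loadings can be sketched as follows. This is a minimal illustration of the Wilcoxon signed-rank test using the large-sample normal approximation without a tie or continuity correction; statistical packages may apply such corrections or report the sign of Z differently, so values need not match the ones reported above exactly. The function name and the loadings are hypothetical and are not taken from Table 3.

```python
from math import sqrt
from statistics import NormalDist


def wilcoxon_signed_rank(x, y):
    """Return (Z, two-sided p) for paired samples via the normal approximation."""
    # Round to avoid spurious floating-point differences, then drop zero differences.
    diffs = [round(a - b, 6) for a, b in zip(x, y)]
    diffs = [d for d in diffs if d != 0]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        average_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1

    # Sum of ranks of the positive differences and its null mean and SD.
    w_plus = sum(rank for rank, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    sd_w = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean_w) / sd_w
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical loadings for eight items under the two response formats.
likert_loadings = [0.60, 0.65, 0.70, 0.72, 0.58, 0.66, 0.74, 0.69]
emoji_loadings = [0.62, 0.64, 0.73, 0.68, 0.63, 0.60, 0.67, 0.77]
z_value, p_value = wilcoxon_signed_rank(likert_loadings, emoji_loadings)
print(round(z_value, 2), round(p_value, 2))
```

With only eight pairs, an exact test is often preferred over the normal approximation; the approximation is used here because the results are reported as Z values.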

Comparison of CFA Results of Data Obtained from Emoji and Likert Type Response Categories
The CFA results obtained from the data sets with Likert-type and emoji response categories were compared with regard to the factor loadings of the items. The results are presented in Table 4.
Table 4 shows the CFA factor loadings of the scales with Likert-type and emoji response categories. The factor loadings obtained from the Likert-type and emoji scales with the same number of categories were very similar. The Wilcoxon signed-rank test was applied to examine whether the factor loadings differed between the data obtained from the emoji and Likert-type scales. No significant difference was found between the factor loadings obtained with the Likert-type and emoji ratings for the 3-category (Z = .00, p = 1.00), 5-category (Z = -.84, p = .40), or 7-category scales (Z = -1.40, p = .16). Table 5 presents the fit indices obtained from CFA. For the 3-category scales, the CFI value was .98 for the Likert-type rating and .94 for the emoji rating. A change in CFI is considered important when the difference between two CFI values is greater than .01 (Vandenberg & Lance, 2000). Accordingly, in terms of the CFI index, the 3-point Likert-type rating fits the data better than the 3-point emoji rating. For the 5- and 7-point scales, however, the ΔCFI values are less than .01.
For RMSEA, the difference is considered important when the value of ΔRMSEA is greater than .01. Accordingly, in terms of RMSEA, it can be concluded that the 3-point Likert-type rating fits the data better than the 3-point emoji rating. No comparable difference criteria are available for TLI and chi-square (Vandenberg & Lance, 2000). On the other hand, the differences between the statistics obtained from the Likert-type and emoji scales are not large enough to affect the model-data fit decision. In other words, if model-data fit is achieved in the data set obtained from the Likert-type scale, it is also achieved in the data set obtained from the emoji scale. Similarly, if model-data fit is not achieved in the Likert-type scale, it is not achieved in the emoji scale either. For instance, in the 3-point data sets, the CFI value is .94 for the emoji scale and .98 for the Likert-type scale. Since CFI and TLI values greater than .90 indicate that model-data fit is achieved (Vandenberg & Lance, 2000), this difference does not affect the decision about whether model-data fit is achieved in the emoji or Likert-type scales.
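The two decision rules used above can be written out as a short sketch. It applies the .01 difference criterion and the .90 cutoff stated in the text to the reported 3-point CFI values (.98 for Likert-type, .94 for emoji). The function names are hypothetical, and rounding the difference to three decimals before comparing is an assumption made to avoid floating-point artifacts.

```python
def fit_difference(value_a, value_b, threshold=0.01):
    """Return the absolute difference between two fit indices and whether
    it exceeds the threshold (.01 for both ΔCFI and ΔRMSEA in the text)."""
    delta = round(abs(value_a - value_b), 3)
    return delta, delta > threshold


def acceptable_fit(cfi, cutoff=0.90):
    """Model-data fit is treated as achieved when the index exceeds the cutoff."""
    return cfi > cutoff


# Reported CFI values for the 3-point forms.
cfi_likert_3, cfi_emoji_3 = 0.98, 0.94

delta_cfi, is_important = fit_difference(cfi_likert_3, cfi_emoji_3)
same_decision = acceptable_fit(cfi_likert_3) == acceptable_fit(cfi_emoji_3)
print(delta_cfi, is_important, same_decision)
```

The sketch makes the distinction in the paragraph explicit: the ΔCFI of .04 exceeds the criterion, yet both forms pass the .90 cutoff, so the fit decision is the same for the two formats.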

Investigation of The Relationships Between the Scores Obtained from Emoji and Likert Type Response Categories
The relationships between the scores obtained from the data sets with Likert-type and emoji response categories were examined by gender. The results are presented in Table 6. The correlations between the scores obtained from the emoji and Likert-type scales varied between .54 and .75 for females and between .72 and .80 for males. For both males and females, the correlations between the scores obtained from the emoji and Likert-type scales decreased as the number of categories increased.
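A correlation analysis of this kind can be sketched as follows, assuming Pearson correlations between each participant's total score on the Likert-type form and on the emoji form with the same number of categories. The text does not name the coefficient, so Pearson is an assumption, and the function names and records are hypothetical.

```python
from math import sqrt
from statistics import fmean


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mean_x, mean_y = fmean(x), fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlations_by_group(records):
    """records: iterable of (group, likert_total, emoji_total) tuples.
    Returns a dict mapping each group to its Likert-emoji correlation."""
    groups = {}
    for group, likert_total, emoji_total in records:
        likert_scores, emoji_scores = groups.setdefault(group, ([], []))
        likert_scores.append(likert_total)
        emoji_scores.append(emoji_total)
    return {group: pearson(xs, ys) for group, (xs, ys) in groups.items()}


# Hypothetical total scores on the 7-point forms (eight items, range 8 to 56).
records = [
    ("female", 40, 44), ("female", 35, 30), ("female", 50, 47),
    ("female", 28, 36), ("female", 45, 41),
    ("male", 38, 36), ("male", 30, 33), ("male", 52, 50), ("male", 44, 40),
]
by_gender = correlations_by_group(records)
print({group: round(r, 2) for group, r in by_gender.items()})
```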

Comparison of Reliability of Scores Obtained from Emoji and Likert Type Response Categories
The Cronbach Alpha coefficients obtained from the data sets with Likert-type and emoji response categories are presented in Table 7. The reliability coefficient increased as the number of categories increased, which is an expected result. The reliability coefficients of the scores obtained from the emoji and Likert-type scales were also very close to each other.
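For reference, the Cronbach Alpha coefficient compared above can be computed as in the following minimal sketch, which uses the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name and the small response matrix are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """responses: list of rows, one per participant, each with k item scores.
    Uses sample variances (n - 1 denominator) throughout."""
    k = len(responses[0])
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of five participants to four 5-point items.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [3, 3, 3, 4],
    [1, 2, 2, 1],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Running the same function on the Likert-type and emoji versions of a form with the same number of categories gives the pair of coefficients that Table 7 compares.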

DISCUSSION and CONCLUSION
The current study was conducted to examine the structures of scales consisting of Likert-type and emoji response categories. The EFA and CFA results obtained from the data of scales with the same number of categories showed that the structures were similar. In the EFA, the explained variance increased as the number of categories increased. However, similar results were obtained from the emoji and Likert-type data. When the EFA factor loadings were compared, there was not enough evidence that they differed significantly from each other. Therefore, the construct validity of the scales consisting of Likert-type and emoji response categories was found to be sufficient in terms of EFA. Based on this result, it can be argued that emoji response categories can be used instead of Likert-type response categories.

The CFA results showed that the fit indices were sufficient for both the emoji and Likert-type scale data. However, the fit indices decreased as the number of categories increased. Moreover, although the fit indices changed with the number of categories, the differences between the Likert-type and emoji response categories were not significant. The CFA factor loadings obtained from the emoji and Likert-type data did not differ significantly. Therefore, the current study results showed that the construct validity of the data obtained from both scale types was sufficient.
When the correlations between the emoji and Likert-type scales were examined, the correlations decreased as the number of categories increased. The results also showed that the highest correlation indicated a moderate relationship. Therefore, the same scale presented with Likert-type and emoji categories may not measure the same structure, or it may cause different reactions in participants. In particular, for female participants' seven-category Likert-type and emoji data, the correlation decreased to .54, suggesting that different characteristics are measured with the same items. Similar results were found by Setty, Srinivasan, Radhakrishna, Melwani, and Dr (2019). In another study (2018), it was stated that individuals interpreted the same emojis differently. For instance, some individuals rated the neutral facial expression as sad. The number of emojis used increases with the number of categories. Therefore, it can be stated that individuals do not perceive emojis in the same way as Likert-type verbal expressions, which resulted in the low correlations.
According to the research findings, there is no obstacle to the use of emoji response categories in scales. Scales with the same number of categories were very similar in terms of construct validity and internal consistency reliability coefficients. Therefore, emoji response categories can also be used in scale development studies. However, the relationships between the total scores were at a moderate level. This may be because of differences in the measured structures or because the emoji and Likert-type response categories caused different reactions in individuals.
The current study shows that 3-point emoji response categories can be used instead of 3-point Likert-type response categories. However, the correlation results for the 5- and 7-point emoji and Likert-type response categories were different. Since the use of emoji response categories is still new, the results of the present study should be compared with results from samples of different age groups and from different scales in order to contribute to the literature and to practitioners. Based on the findings of the current study, it can be stated that the data obtained from university students with Likert-type or emoji response categories have similar construct validity. However, it should be acknowledged that this study is limited to the instrument and the sample used.
According to the present study findings, the results obtained from the scales consisting of 3, 5, and 7 emoji and Likert-type response categories suggest that women and men attribute different meanings to the same emoji. Future research should examine the reasons for those attributions. In addition, this differentiation can be examined in depth with different age groups and with gender groups of equal or similar size. However, it should be kept in mind that this study is limited to the data obtained from the Psychological Well-Being Scale.
Considering that the use of emoji response categories in scales is new, future studies need to be conducted to examine whether the situations of indecision, which can be experienced in scales with 7 or more Likert response categories (verbal and numerical), can be prevented. Moreover, preschool and primary school students' literacy level and limitations need to be considered, and it should be investigated whether a more valid result can be obtained by using emoji reaction categories among these populations. Additionally, questions may also be read to illiterate individuals, and researchers