Development of Short Forms of Scales with Decision Tree Algorithms
E. AYDEMIR and F. KAYSI
ISSN: 2147-284X http://dergipark.gov.tr/bajece

Abstract: Scales and surveys are measurement tools developed to capture individuals' perceptions of specified topics. In some cases, the length of these tools may lower response rates. This study therefore aimed to develop short forms of scales using decision tree algorithms, so that short-form measurement tools performing similar functions can be designed and response rates increased. Predictions were made with decision trees, a data mining method, and analyses were run with decision tree algorithms to obtain short forms of three scales. The results showed a high correlation between the short and long forms of the scales, suggesting that the short forms are suitable for the same measurement purposes. Instead of scales consisting of 40, 20 and 20 items, the expected measurements can be made with a minimum of three and a maximum of 10 items, given an appropriate tree algorithm for each scale. The study suggests that similar work can be carried out for other frequently used scales, so that higher participation rates can be obtained.

I. INTRODUCTION
One of the conditions considered at the measurement stage is that the measurement be carried out efficiently. An efficient measurement requires a strong measurement tool [3]. Otherwise, consequences such as added cost, loss of time and failure to measure at the expected level may follow. In the social sciences, measurement tools mostly take the form of surveys and scales.
The length and subject of a measurement tool affect individuals' response rates. Various problems can arise while individuals respond to a measurement instrument, such as not answering all or part of it, or becoming bored or fatigued because of its length. Shorter scales are generally preferable because they place less burden on respondents [3], whereas long instruments may cause individuals to perceive them negatively. For this reason, researchers sometimes try to compensate for undesirable effects on response rates by using material or immaterial incentives [4]. Experimental studies have shown a strong negative linear relationship between survey length and response rates [5]. Sheehan [6] analyzed 31 studies and found small positive relationships between response rates and survey length. Overall, long measurement tools may cause response rates to decrease, which may lead to the views of the target community being underrepresented in studies.
In some studies, monetary incentives are offered to increase the response rates of measurement tools [5,7,8]. Such incentives should be large enough to compensate for long surveys [4]. Alternatively, measurement tools can be shortened, since shorter surveys have higher response rates than long ones [9]. In this context, scales that are considered long and likely to yield low response rates can be shortened. For example, the 18-item scale developed by Carlson, Kacmar, Wayne and Grzywacz [10] was shortened to six items by Kacmar et al. [11]. A comparison of the two scales concluded that both achieved the desired measurements [12]. A similar study was carried out for the Smartphone Addiction Scale developed by Kwon et al. [13]. The long version of this scale has six factors and 48 items, and Kwon, Kim, Cho and Yang [14] developed a 10-item short form. It was concluded that the long and short forms of the scales can be used for the same purpose. In addition, researchers use shortened scales to be able to examine more complex models without the risk of respondent fatigue and boredom [15]. In this way, both participants and researchers save time. Finally, there is a strong correlation between short-form and long-form scales [16]. In other words, the short forms developed serve the same function as the long forms.
In academic studies, scales can be shortened to reach better response rates. For frequently used scales, methods that reduce the number of items can therefore be preferred, and decision trees are one such method. The aim of this study is to produce short forms of three measurement tools by using decision trees. In this way, higher response rates can be obtained with shorter scale forms.
II. METHODOLOGY
In the study, analyses were carried out using the WEKA (Waikato Environment for Knowledge Analysis) software, with the aim of preparing short forms of three scales. The short forms produced by the study are personalized: the same scale may be shortened to three items for one respondent and to 10 items for another, depending on the respondent's answers to the items. As a result, each respondent answers fewer items than in the long form. The scales were analyzed with decision tree algorithms in WEKA. First, an Excel file was created with each respondent's answers to the questionnaire on one row. Each column of this file contains the answers to one question, and the last column contains the result obtained from the survey. The file was saved in CSV format and loaded into WEKA. The correlation coefficient, mean absolute error and root mean squared error were used to evaluate the performance of the algorithms.
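The data layout and the three evaluation measures described above can be sketched as follows. This is a minimal illustration in Python, not the WEKA workflow used in the study. The sample rows, the predicted values and all function names are hypothetical, and the last column is assumed here to be the mean of a respondent's item answers.

```python
import csv
import io
import math

# Hypothetical CSV in the layout described above: one respondent per row,
# one column per item, and the scale result in the last column.
# Assumption: the result is the mean of the item answers.
SAMPLE_CSV = """Q1,Q2,Q3,Q4,result
5,4,4,3,4.0
2,1,2,3,2.0
3,3,4,2,3.0
1,2,1,2,1.5
"""

def load_rows(text):
    """Return (item_answers, results) parsed from CSV text."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    items, results = [], []
    for row in reader:
        items.append([float(v) for v in row[:-1]])
        results.append(float(row[-1]))
    return items, results

def correlation(actual, predicted):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(actual)
    mean_a, mean_p = sum(actual) / n, sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

items, actual = load_rows(SAMPLE_CSV)
predicted = [3.8, 2.1, 3.2, 1.4]  # hypothetical short-form predictions
print(correlation(actual, predicted))
print(mean_absolute_error(actual, predicted))
print(root_mean_squared_error(actual, predicted))
```

A correlation close to 1 together with small error values would indicate that the predicted results track the long-form results closely.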
The first scale included in the study was the Mobile Phone Problem Usage (MPPU) scale developed by Tekin, Güneş and Çolak [17]. This scale has three factors and 20 items, and it aims to measure the mobile phone usage of university students. The MPPU data used for prediction in this study were collected from 195 participants on a five-point Likert scale [18]. The second scale was the Mobile Device Addiction (MDA) scale developed by Yıldırım, Sumuer, Adnan and Yıldırım [19]. This scale has four factors and 20 items, and it aims to measure individuals' levels of mobile device addiction. For this study, the MDA scale was administered to 363 participants on a five-point Likert scale. The third scale was the Attitude Towards Learning (ATL) scale developed by Kara [20]. The ATL scale has four factors and 40 items. The ATL data used in the study were collected from 164 participants on a five-point Likert scale [21]. Permission to use the scales was obtained from their owners via e-mail. Some information about the scales is given in Table I.
Although statistics is an important tool in data analysis, its use is limited in some cases. In such cases, data mining methods such as decision trees may be required for data analysis [22]. With data mining, predictions about a problem can be made from the available data. Decision trees are among the most common and effective methods for creating prediction models through data mining [23]. A decision tree model provides an approximation of a target function. It also helps decision makers determine the factors to consider in a decision and how these factors relate to the decision outcomes [24]. The most widely used decision tree methods are the C4.5, REP Tree, Random Forest, CART and Logistic Model Tree algorithms [25].
In this study, the M5P, RandomTree and REPTree algorithms were used; they predict numerical results and output the tree structure together with its branches. RandomTree selects a test at each node based on a given number of randomly chosen features and does not prune. REPTree builds a decision or regression tree using information gain or variance reduction and prunes it using reduced-error pruning; it is optimized for speed and sorts the values of numerical attributes only once. M5P learns with the M5 model tree approach and derives the rules.
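The variance-reduction criterion mentioned for regression trees can be illustrated with a short sketch. The code below is not WEKA's REPTree implementation; it only shows, under simplifying assumptions, how a single split can be chosen so that the scale result varies as little as possible within each branch. The respondent data, the function names and the use of midpoints between observed answers as thresholds are all assumptions made for illustration.

```python
def variance(values):
    """Population variance of a list of numbers (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def best_split(rows, targets):
    """Find the (item index, threshold) giving the largest variance reduction.

    rows    -- list of answer lists, one per respondent (Likert values 1-5)
    targets -- list of scale results, one per respondent
    Returns (item_index, threshold, reduction); item_index is None when
    no split reduces the variance.
    """
    n = len(rows)
    base = variance(targets)
    best = (None, None, 0.0)
    for item in range(len(rows[0])):
        answers = sorted({row[item] for row in rows})
        # Candidate thresholds are midpoints between adjacent observed answers.
        for low, high in zip(answers, answers[1:]):
            threshold = (low + high) / 2
            left = [t for row, t in zip(rows, targets) if row[item] < threshold]
            right = [t for row, t in zip(rows, targets) if row[item] >= threshold]
            weighted = (len(left) * variance(left) + len(right) * variance(right)) / n
            reduction = base - weighted
            if reduction > best[2]:
                best = (item, threshold, reduction)
    return best

# Hypothetical answers of six respondents to three items;
# the target is assumed to be the mean of each respondent's answers.
rows = [[1, 3, 2], [2, 3, 1], [1, 2, 3], [5, 3, 4], [4, 2, 5], [5, 3, 5]]
targets = [sum(r) / len(r) for r in rows]
item, threshold, reduction = best_split(rows, targets)
print(item, threshold, round(reduction, 3))
```

A full tree learner would apply this search recursively to each branch and then prune the result; those steps are omitted here.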

IV. FINDINGS
M5P, RandomTree and REPTree, which are decision tree algorithms, were used to prepare shorter forms of the ATL, MDA and MPPU scales. The analyses aimed to make the scales function similarly with fewer items, so the results also express the effectiveness of the short forms. The ranges in which the correlation values can fall and the interpretation of these ranges are given in Table II. In WEKA, the M5P, RandomTree and REPTree algorithms were used to estimate the average of the responses to each scale. The algorithms output the tree structure as well as the conditions on the tree branches. The results of the algorithms for the three scales are given in Table III. Table III shows that the highest success rate for all three scales was obtained with the M5P algorithm. The MDA and MPPU scales showed higher success rates than the ATL scale; nevertheless, the correlation coefficient of 0.86 for the ATL scale can be interpreted as very high. The analyses also revealed the tree structures of the scales. The decision trees were found to have more than 80 branches in total. Because the trees are very large and complex, they are not shown visually in the study.
Since the main aim of the study is to develop short forms of the scales, findings are needed on how many items are enough to produce similar results. Table IV shows the predicted minimum number of items that participants answer in the short form of each scale: ATL (40 items), MDA (20 items) and MPPU (20 items). For the ATL scale, the expected measurement for the entire scale can be achieved by answering only three items, provided that the short form starts with the correct item and is produced with the appropriate algorithm. In the other short forms of the ATL scale produced by the decision trees, the number of items ranges between four and nine. For the MDA scale, the expected measurement can be achieved by answering at least four items, and similar measurements can also be obtained by presenting participants with between five and 10 items. For the MPPU scale, the last scale of the study, the expected measurement can be achieved when participants answer at least three items, and similar measurements can also be obtained with between four and nine items. The tree of the MPPU scale is shown in Fig. 1. The part that starts with Q indicates the question number, and the part after the colon shows the result value. For example, the first question asked is question 13, and those who answer less than 4.5 are directed to question 17. Those who answer greater than 3.5 on question 17 are directed to question 2. Those who answer less than 2.5 on question 2 are directed to question 8, which is the last question for these respondents. Thus, these respondents complete the survey in 5 questions.
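The way a respondent is routed through such a tree can be sketched in a few lines of Python. The tree below is hypothetical and does not reproduce Fig. 1: the item numbers, thresholds and result values are invented, and the nested-dictionary representation and the function name are assumptions chosen for illustration.

```python
# Hypothetical tree, not the one in Fig. 1. An internal node asks one item
# and branches on a threshold; a leaf holds the predicted scale result.
TREE = {
    "item": 3, "threshold": 3.5,
    "low": {
        "item": 7, "threshold": 2.5,
        "low": {"value": 1.8},
        "high": {
            "item": 1, "threshold": 3.5,
            "low": {"value": 2.6},
            "high": {"value": 3.1},
        },
    },
    "high": {
        "item": 5, "threshold": 4.5,
        "low": {"value": 3.7},
        "high": {"value": 4.6},
    },
}

def administer(tree, answer_for):
    """Walk the tree, asking only the items on the respondent's path.

    answer_for -- callable taking an item number and returning the answer (1-5)
    Returns (predicted_result, list_of_items_asked).
    """
    asked = []
    node = tree
    while "value" not in node:
        answer = answer_for(node["item"])
        asked.append(node["item"])
        # Answers below the threshold follow the "low" branch.
        node = node["low"] if answer < node["threshold"] else node["high"]
    return node["value"], asked

# Stored answers of one respondent, used here only to simulate live responses.
answers = {1: 4, 3: 2, 5: 5, 7: 3}
result, asked = administer(TREE, answers.get)
print(result, asked)
```

Only the items on the respondent's own path are asked, which is why different respondents can finish the same scale with different numbers of items.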

V. DISCUSSION
Scales are often used in scientific studies to measure participants' levels regarding a situation of interest. The dimensions and items defined in these scales serve this purpose. In some cases, long scales may cause participants not to respond at all, to get bored, to leave items unanswered, or to respond in a way that does not fully reflect their perceptions. Financial or immaterial incentives are among the methods frequently used to prevent such problems [4,5,7,8]. Short forms of scales can also be preferred to reduce the problems caused by long scales. Because short forms place less burden on respondents, they are viewed more favorably [3]. Moreover, short measuring instruments have higher response rates than long ones [9]. In addition, studies conducted with both long and short forms of scales have shown that the desired measurements were completed successfully [12].
Short forms of the scales can be produced by choosing the appropriate decision tree and determining the appropriate item with which to start the measurement. Based on the respondent's answer to the first item, the decision tree determines the next item, and measurements can be made with a minimum of three and a maximum of 10 items for the scales in the study. The number of items to be answered may vary in order to obtain the most highly correlated result for each scale. This is natural, since participants can be expected to have different perceptions. With the long form of the ATL scale, the level of attitude towards learning is revealed through participants' responses to 40 items. According to the findings of this study, the same measurement can instead be obtained by answering at least three and at most nine items with the correct algorithm. Similar results can be seen in the six-item short form of an 18-item scale by Kacmar et al. [11], the 10-item short form of a 48-item scale by Kwon et al. [14], and the six-item short form of a 35-item scale by Sendjaya et al. [15]. Similar results hold for the MDA and MPPU scales in the study when the highest level of correlation is sought. These scales can also be shortened by selecting the appropriate decision tree and node.
The scales in the study can be shortened to between three and 10 items with appropriate nodes in an appropriate algorithm. The literature recommends shortened versions of scales, which are more flexible and require less time [26,28]. The length of measuring tools causes participants to become tired and bored, and it is described as an important research obstacle because it increases the potential for response bias [29]. It can therefore be stated that the scales carry out the expected measurements with between three and 10 items, since short forms of scales are suitable tools that can be used effectively in future studies [30]. In addition, with the short forms developed, the expected dimensions can be measured in a shorter time [31].
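The minimum and maximum number of items discussed above correspond to the shortest and longest root-to-leaf paths of a decision tree. The sketch below shows one way to compute this range. The tree is hypothetical, and its nested-dictionary form and the function name are assumptions for illustration, not output from WEKA.

```python
def item_count_range(node):
    """Return (fewest, most) items asked on any root-to-leaf path."""
    if "value" in node:  # a leaf asks no further item
        return 0, 0
    low_min, low_max = item_count_range(node["low"])
    high_min, high_max = item_count_range(node["high"])
    # The item at this node is asked on every path that passes through it.
    return 1 + min(low_min, high_min), 1 + max(low_max, high_max)

# Hypothetical tree: internal nodes name an item and a threshold,
# leaves hold a predicted scale result.
tree = {
    "item": 3, "threshold": 3.5,
    "low": {
        "item": 7, "threshold": 2.5,
        "low": {"value": 1.8},
        "high": {
            "item": 1, "threshold": 3.5,
            "low": {"value": 2.6},
            "high": {"value": 3.1},
        },
    },
    "high": {
        "item": 5, "threshold": 4.5,
        "low": {"value": 3.7},
        "high": {"value": 4.6},
    },
}

fewest, most = item_count_range(tree)
print(fewest, most)
```

Applied to a tree learned from a full scale, the same calculation would give the range of items a respondent may be asked in the short form.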

VI. CONCLUSIONS AND RECOMMENDATIONS
Short forms of scales can be developed by choosing an appropriate tree algorithm that yields a high correlation. In this way, instead of completing the long form, participants can answer very few items and obtain a similar result. There is a strong relationship between the short and long forms of scales, so short forms can be used reliably [16]. Thus, scale studies with shorter forms and higher response rates can be conducted. This approach gave appropriate results for all three scales in the study. As a result, instead of applying 20- or 40-item measurement tools to participants, correct results can be obtained with a minimum of three and a maximum of 10 items with the appropriate algorithm. The results obtained in the study were shared with the owners of the scales.
From 2012 to 2015, he was an Expert with the Istanbul Commerce University. Since 2017, he has been an Assistant Professor with the Computer Engineering Department, Kirsehir Ahi Evran University. He is the author of three books, more than 10 articles, and more than 40 conference presentations. His research interests include artificial intelligence, microcontrollers, databases and software.