Research Implications for Computer Science Education based on Darmstadt Model

Yasemin GÜLBAHAR 1, Filiz KALELİOĞLU* 2
1 Ankara University, Ankara, Turkey, gulbahar@ankara.edu.tr
2 Başkent University, Ankara, Turkey, filizk@baskent.edu.tr
* Corresponding Author: filizk@baskent.edu.tr

Article Info
Received: 19 October 2020
Accepted: 27 December 2020

Abstract
The purpose of the current study is to examine published studies in computer science education in a systematic way, and to present a history of the research and new research trends in this area. This research study reports the findings of the systematic literature review according to the educational relevant areas dimension of the Darmstadt Model. The procedures of systematic text analysis were performed as a qualitative content analysis. Prior to the systematic text analysis, the primary term ‘computer science education and K-12’ was searched for along with data in the abstract, title and keyword section for publications between 2013 and 2018 in the databases and digital libraries of Academic Search Complete, Business Source Complete, ERIC, Science Direct, and the IEEE Digital Library. A total of 87 articles formed the sample of the study. Although the current study was limited to the stated journal articles, it provides insight to the field by shedding light on important issues relevant to future research studies.


Introduction
In recent reports, there has been a call for changes in education based on a vision that focuses on future jobs and expected work-related skills. In the next five years, skills gaps will continue to be high as new job opportunities require new skills (World Economic Forum, phenomenon. Curriculum standards, curricula, differentiated implementations at various levels, and target groups have since seen a continual change that has brought about significant challenges during the process. The recent reports of the State of Computer Science Education Policy and Implementation (Code Advocacy Coalition, 2018), and by Blikstein (2018) about the state of the phenomenon, reported the findings of various implementations. The reports revealed a growing interest in, and the success of, these implementations by prioritising research on the teaching, learning, and assessment of computer science education. Blikstein (2018) specifically mentioned the need for ongoing and thorough research in order to facilitate more successful implementations in the areas of computer science concepts, programming tools design and experience, tools for formal learning environments, and other forms/paradigms of programming, as well as in the arena of the arts and creative computing. Blikstein (2018) also recommended improving the equitable participation of all students in computer science education. However, although there have been numerous efforts and initiatives like 'CS for All' in the US and 'Informatics for All' in Europe, the expected spread of access and learning outcomes is far from being observed, since the level of effort and achievement varies widely across different countries and autonomous education regions (Gretter, Yadav, Sands, & Hambrusch, 2019). Diverse and replicated research studies can lead to consistent results, and may thereby reveal the best approaches, which could help fill the gap between implementations at the national level and the pedagogical target groups.
Hubwieser, Armoni, and Giannakos (2015) postulated certain research questions that they considered both important and relevant to many different contexts, cases, and countries. These questions specifically addressed the topics of alignment of CS competencies, content and learning outcomes, didactical approaches, teaching methods, instructional media and materials, programming environments and languages with the target age group and school context. Furthermore, they structured and deepened their questioning by applying the Darmstadt Model (Hubwieser, 2013; Hubwieser et al., 2011) (see Figure 1) and posing even more questions.

Figure 1. The Darmstadt Model (Hubwieser, 2013; Hubwieser et al., 2011)

The Darmstadt Model is a three-dimensional model having educational relevant areas as one dimension, level of responsibility/range of influence as a second dimension, and the Berlin Model Top Dimension as a third dimension. Educational relevant areas are focused on the following 13 components:

• How did the studies vary in terms of their purpose and research design?
• What were the data collection methods and data analysis approaches in the studies?
• How did the studies vary in terms of their major findings?

Method
The purpose of the current study was to examine published studies in the area of computer science education in a systematic way, to present a history of the research, and to identify new research trends in this area. The procedures of systematic text analysis were performed as a qualitative content analysis. Prior to the systematic text analysis, the primary search term 'computer science education and K-12' was searched for in the abstract, title and keyword sections of publications between 2013 and 2018 in the databases and digital libraries of Academic Search Complete, Business Source Complete, Education Resources Information Center (ERIC), Science Direct, and the IEEE Digital Library.

Sample
Initially, 152 articles were accessed. In line with the aim of the current study, articles that did not address Computer Science Education conceptually, or did not include research in this field, were deliberately removed as they were not suited to the purposes of the study. The remaining 87 articles formed the sample of this study.

Data Analysis
The articles were divided equally between the two researchers. Each researcher then individually analysed their articles qualitatively according to predefined criteria. The predefined criteria were: title of the article, year of publication, number of authors, keywords, country of study, purpose of the study, variables, research design/type, target group, sample size/data size, data sources, data collection method, data analysis methods and techniques, and major findings.
During this process, the researchers met weekly in order to discuss their findings and progress. According to the criteria, the articles were coded by each researcher and frequency tables were formed. The tables and codes of each researcher were rechecked and combined into a single file in order to obtain a general picture of the combined results from the two researchers.
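The coding-and-combining step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the field names (`research_design`, `target_group`), and the sample values are hypothetical and are not taken from the study's actual coding files.

```python
from collections import Counter

# Hypothetical coded records: one dict per article, keyed by a subset of the
# predefined criteria. All values are invented for illustration only.
researcher_a = [
    {"research_design": "Experimental", "target_group": "Undergraduate"},
    {"research_design": "Survey", "target_group": "K-12"},
]
researcher_b = [
    {"research_design": "Experimental", "target_group": "K-12"},
    {"research_design": "Case study", "target_group": "Undergraduate"},
]


def frequency_table(records, criterion):
    """Count how often each code appears for one criterion."""
    return Counter(record[criterion] for record in records)


def combine_tables(*tables):
    """Merge per-researcher frequency tables into one combined table."""
    combined = Counter()
    for table in tables:
        combined.update(table)
    return combined


design_counts = combine_tables(
    frequency_table(researcher_a, "research_design"),
    frequency_table(researcher_b, "research_design"),
)

# Print codes from most to least frequent.
for code, count in design_counts.most_common():
    print(f"{code}: {count}")
```

Because the articles were split between the researchers rather than double-coded, summing the two tables is sufficient in this sketch; a double-coding design would instead call for an agreement check before merging.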
In the following section, the results of the inductive analysis are presented according to the research questions. Following detailed document analysis, some parts of the analysed data are transformed into numerical values and illustrated through the use of graphics and word clouds.

Numbers of Articles per Journal
When the studies were examined in terms of number of articles in each journal, it was seen that 29 were published in the Journal of Educational Computing Research, 26 in Informatics in Education, 13 in Computer Science Education and 19 in different journals (see Table 1). A total of 87 articles were systematically analysed within the scope of this research.

Year of Publication
When the year of publication of the articles was examined, there were four articles

Number of Authors
When the number of authors attributed to the articles was examined, it was found that 18 studies had a single author, 25 had two authors, 17 had three authors, 17 had four authors, four had five authors, and six had six or more authors. As can be deduced, roughly half of the articles (43 of 87) had only one or two authors, whereas only six articles were written by groups of six or more authors.
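As a quick consistency check on the author counts reported above, the figures can be tallied as follows. The dictionary keys are labels chosen for this sketch; the counts themselves are those stated in the text.

```python
# Number of studies by author count, as reported in the text.
author_counts = {"1": 18, "2": 25, "3": 17, "4": 17, "5": 4, "6+": 6}

total = sum(author_counts.values())
one_or_two = author_counts["1"] + author_counts["2"]
share = one_or_two / total

print(f"Total studies: {total}")
print(f"Studies with one or two authors: {one_or_two} ({share:.1%})")
```

The total matches the sample of 87 articles, and the one-or-two-author share comes out just under one half.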

Most Used Keywords
In examining the keywords describing the articles, many different keywords were found to have been used (see Figure 3). The most used keyword was Programming, which appeared in 44 of the studies, followed by Computational thinking in 21 articles. Other keywords used in the articles were Computing (n = 18), Teaching (n = 17), Teacher professional development (n = 3), Information technology (n = 3), and Teaching and learning strategies (n = 3).

Country of Study
In examining the country in which each of the studies was conducted, it was found that while many studies indicated the country of study, some did not include this information. Of the studies that included the country of study (see Figure 4), most were conducted in the USA (n = 21), followed by Turkey (n = 7). Other countries where studies were performed were Taiwan (n = 5), Brazil (n = 5), Greece (n = 4), Germany (n = 4), China (n = 3), Slovenia (n = 2), Lithuania (n = 2), Slovakia (n = 2), and the UK (n = 2).

Outcomes/effects: The effects of methodologies, tools and media on computer science education and on thinking skills.
Teaching methods: Techniques and teaching methods in computer science education such as game-based and problem-based instruction, plus virtual reality applications.
Media: Mostly focused on the design and development of games, software or tools used in programming instruction.
Curriculum issues: K-12 level curriculum proposals in computer science education, physical programming course design, or course design at the university level.
Intentions: Standards or competencies expected in computer science education.
Motivation: Teacher and student attitudes and their learning motivations.
Knowledge: Learning the concepts of computer science and ICTs.
Teacher qualification: Teacher education and development for computer science education.
Extracurricular activities: Competitions and contests, and their effects.
Educational system: Institutional structure studies mostly at the macro level.

Variables
In considering the variables examined in the studies, information about the variables was not clearly presented in most of the articles. However, from the available information, it was seen that the most researched variables were Prior grade in programming courses (n = 6), Types of tasks that students work on in research (n = 6), Perception (n = 6), Self-efficacy for learning computers or programming (n = 5), Attitudes towards learning programming or computer science (n = 4), Achievement in different courses (n = 4), Computational thinking (n = 4), Interest in CS (n = 4), and Learning Performance (n = 3).

Research Design/Type
When the studies were investigated in terms of their research design, it was seen that most relied upon quantitative rather than qualitative measures, with Experimental design the most preferred. Survey was the second most preferred approach, followed by Case study, and Mixed methods design (see Figure 6). There is some overlap between the concepts in this classification, such as Experimental design already being a Quantitative approach; this is because the concepts are presented as they were named by the authors of the studies.

Target Group
When the studies were explored in terms of the research target group, it was noted that higher education students, especially undergraduates, were considerably more prevalent than other groups (see Table 2), followed by studies that addressed different K-12 age groups, and studies on 5th-8th Grade groups. Three further studies, not shown in Table 2, had kindergarten children, professors, and parents, respectively, as their target group.

Sample Size
It is important for any research to select a valid representative sample of the population in order to reach generalisable conclusions. However, almost half of the studies were conducted with 100 or fewer participants (see Figure 7), which may constitute a general threat in terms of the validity and generalisability of the research findings. That said, several studies reached a wider target population with larger samples.

Data Sources
For any research study, the source of data is important in terms of many aspects such as triangulation, participant honesty, accessibility, and being free from bias. For most of the research carried out on computer science education, the students formed the primary data source (69%). Of the other studies, teachers were the data source for 10% of the articles, with documentation used in 8% of studies, and research databases in 5% of cases. Only 5% of the studies used more than one type of data source, which raises questions as to the reliability of the findings. Additionally, 3% of the studies had professors or parents as data sources.
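The percentages reported above can be checked for internal consistency with a short calculation. The category labels below are shorthand chosen for this sketch, and the per-category article counts are back-of-envelope estimates derived from the rounded percentages and the sample of 87; they are not figures reported by the study.

```python
SAMPLE_SIZE = 87  # number of articles in the sample, as stated in the text

# Share of studies per data source (percent), as reported in the text.
data_sources = {
    "Students": 69,
    "Teachers": 10,
    "Documentation": 8,
    "Research databases": 5,
    "More than one type": 5,
    "Professors or parents": 3,
}

total_percent = sum(data_sources.values())
print(f"Sum of reported percentages: {total_percent}%")

# Estimated article counts implied by the rounded percentages (approximate).
estimated_counts = {
    source: round(percent * SAMPLE_SIZE / 100)
    for source, percent in data_sources.items()
}
for source, count in estimated_counts.items():
    print(f"{source}: about {count} articles")
```

The percentages sum to 100, which is consistent with the categories having been treated as mutually exclusive.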

Data Collection Methods
When the studies were categorised in terms of their data collection methods, a variety were observed (see Figure 8). Survey, Questionnaire, and Achievement tests were the most commonly used data collection instruments for quantitative measures. In terms of qualitative measures, Interviews, Document analysis, Recordings, and Observations were also used for the purposes of data collection. There were also single studies that utilised PISA results, Usability tests, and Attitude scales as their data collection instruments, although these are not included in Figure 8. Given that most of the studies focused on the academic performance of students, the common use of instruments to grade and explore learning outcomes is not surprising.

Major Findings
The major findings of the research studies were classified according to the Darmstadt Model, based on their contribution to the field (see Figure 10). integrated designs for the cultivation of Computational Thinking, and the use of concept maps as an educational tool were reported either to be effective or to have caused significantly better learning of programming. In another study, teachers also recommended a variety of pedagogical strategies based on their individual experience: unplugged type activities, contextualisation of tasks, collaborative learning, developing computational thinking, and scaffolding programming tasks. It can be seen that numerous ways were put forward to accomplish the task of teaching programming, with some promising research findings.
The second most covered area of findings related to the category of 'Media', which generally investigated the effect of using different software on learning outcomes. Block-based programming (mostly Scratch and Alice), text-based programming experience, learning to program robots with developmentally appropriate tools, and the use of animations and algorithm visualisation tools were found to be mostly effective in reaching expected learning outcomes. However, one study reported that 'programming in Scratch platform did not cause any significant differences in the problem solving skills of primary school students' (Kalelioglu & Gulbahar, 2014, p. 33), whereas another study reported the low usage of various teaching aids in programming classes as an obstacle to learning.
Studies that focused on 'Sociocultural-Related Factors' were observed to have dealt with variables such as skills prior to and after having learnt programming, learning styles and knowledge map construction (Shaw, 2017), different generation and non-immigrant students, the performance of problem-solving ability, having prior programming experience (Kim, 2018; Veerasamy, D'Souza, Lindén, & Laakso, 2018), students' self-explanation quality, number of code edits (Liu, Zhi, Hicks, & Barnes, 2017), computational thinking, and spatial and reasoning ability. All of these variables were found to have some effect on either the learning outcomes or learner characteristics, except learning styles and knowledge map construction. Two studies reported no gender difference in terms of academic achievement in computer science, whereas problem-solving ability was associated positively with performance at the concrete operational stage in one study (Kožuh, Krajnc, Hadjileontiadis, & Debevc, 2018), and significantly correlated with students' self-explanation quality, number of code edits, and prior programming experience in another (Liu et al., 2017). Yet another study reported significant correlations of at least moderate intensity between computational thinking and spatial ability, reasoning ability, and problem-solving ability.
The 'Outcomes/Effects' category showed diverse research interests among the studies. Effects of persistence levels on self-efficacy beliefs (Lin, 2016), level of knowledge on specific subjects, increasing accessibility to CS resources, a measurement model, a model for building CS content as a scaffolder for higher-level learning in transdisciplinary settings, providing teachers with more computer-based training (Coleman, Gibson, Cotten, Howell-Moroney, & Stringer, 2016), and early access to computer science lessons were among the topics explored. There were also some research recommendations put forward based on literature reviews.
For the aspect of 'Teacher Qualification', several studies reported on the challenges that teachers face such as: isolation (Yadav, Gretter, Hambrusch, & Sands, 2016), lack of adequate computer science background (Yadav et al., 2016), limited professional development resources (Yadav et al., 2016), and limited knowledge of and experience with computer science. Menekse (2015) cited not only limited collaboration between educational organisations to develop computer science teachers' professional development, but also limited duration and lack of clear focus on discipline-specific pedagogical content knowledge for existing professional development programmes.
Under the 'Curriculum Issues' category, researchers' efforts proved successful in developing, implementing, and evaluating various curricula for different target groups such as teachers (Kucuk & Sisman, 2018) and students (Kynigos & Grizioti, 2018), based on different approaches such as robotics and 3D spaces (Kucuk & Sisman, 2018; Kynigos & Grizioti, 2018). One study stated that "Despite the recent revived interest in programming for K-12, little studies have been conducted to inform the researchers and educators on implementing suitable curriculum for the group of students" (Lye & Koh, 2014, pp. 59-60).
In the three studies under the "Knowledge" category, the authors sought to define constructs in order to reveal perceptions of parents toward programming (Kong, Li, & Kwok, 2018), dimensions for the assessment of computational thinking (Zhong, Wang, Chen, & Li, 2016), and students' use of computational thinking concepts in a Story-Writing-Coding context (Price & Price-Mohr, 2018). Hence, all of the studies were types of assessment of cognitive knowledge and abilities.
Research under the "Motivation" category revealed facts about students in STEM fields, indicators of student engagement (Benotti, Martinez, & Schapachnik, 2018), and the importance of taking more units in the computer science subject in order to increase student motivation (Lee, 2015). These studies thus considered which factors can affect motivation levels. There were also two studies grouped under "Extracurricular Activities" based on challenges for solving tasks, and two studies under the "Intentions" category that focused on competence areas for computer science education (Zendler, Klaudt, & Seitz, 2014). There was also one article under the category of "Educational System", where the authors (Aleksić & Ivanović, 2016) investigated the use of programming languages at different universities. They concluded that "Universities from Central and Eastern-European countries mainly based their study programmes on teaching C and C++ programming languages, while programmes of the Scandinavian universities were mainly based on Java" (p. 177). The researchers also added that while most of the programmes strongly supported the object-oriented paradigm of programming, introductory programming subjects were mainly based on the imperative paradigm. Additionally, no research studies were categorised as regarding 'Policies' or 'Examination/Certification'.

Discussion
When the studies were examined, most of the research articles between 2013 and 2018 were published in the Journal of Educational Computing Research and Informatics in Education. In terms of publication years, while the articles were distributed fairly evenly, 2018 saw almost twice as many studies published. The reason for this may be the increased interest in the area due to curricular studies in computer science education and changes to national education policies (Bocconi et al., 2016).
Although the keywords computer science education and K-12 were systematically searched, the keywords entered for the articles give an idea of the research topics published for computer science education. Whilst this is a very comprehensive field, it was observed that the keywords used in computer science education research focused on programming and computational-thinking skills, which reflects the nature of computer science education research at the K-12 level. It can also be said that, at the K-12 level, teaching methods, teaching tools and evaluation methods constitute the most researched topics.
As to which countries are undertaking research on computer science education, conclusions can be drawn based on examining the countries where the published research has taken place. It was seen that studies were conducted in America, Europe, and Asia. This result is supported by the Developing Computational Thinking in Compulsory Education report by Bocconi et al. (2016), in which it was stated that since 2012, many countries have carried out curriculum developments in order to integrate computer science education and computational thinking into compulsory education. Therefore, developments and changes in national education are naturally reflected in the published research. It can be concluded that researchers are generally more interested in the effects of different teaching methods and media for teaching programming, as well as in student characteristics. For educators, having an effective learning environment supported by appropriate pedagogical approaches is the most significant aspect of the teaching-learning process. Hence, reaching such a finding is not surprising. However, it is quite perplexing that such diverse teaching methods all have an effect, whether large or small, and are all somehow successful. This critique can also be applied to the findings about media: whatever media may be used, students are still likely to learn from it.
From decades of research on the effect of technology on learning, it is known that technology-rich or computer-based environments may or may not enhance learning, but generally a negative impact would not be expected. Thus, rather than investigating the effect of a certain piece of software, learning environment, or tool, it would perhaps be more useful to explore the appropriateness of these media according to learner age groups from a cognitive and constructive point of view.
One other issue may be the order of the software and tools used from kindergarten through to high school; starting with block-based then switching to text-based, or starting with robots and then switching to text-based etc. More empirical evidence is needed of which tool should be used at what age, and the same goes for pedagogy too. Which teaching methods should be used for specific age groups so as to increase academic performance?
Studies addressing these issues should be conducted with participants of different age groups. Among those investigated in the current research, only 17 studies were observed that undertook a comparison of different age groups.
Studies that attempt to define and assess the concepts of the field should be continued until an agreed-upon framework has been reached. Theories as didactical approaches are quite important, since they form a base for the academic research work of the future. Curriculum issues are also important in order to frame the knowledgebase for students of different ages. Hence, studies should be conducted by targeting different age groups. Personal constructs are what carry graduates on to a computer science-related profession; therefore, studies that reveal the interests and the cognitive and problem-solving abilities of students are needed in order to better support their decisions about their future career.
Research studies to fulfil the premise of "Policies", "Examination/Certification", and "Extracurricular Activities" should be reaching thousands of participants in order to properly suggest a policy or to improve the effectiveness of being "qualified" by way of examination. However, having a strong voice in this area is not easy, so it is unsurprising that no more than a few studies are published; or none, as in the case of the current study's sample. This situation does not imply any lack of necessity for such studies, but underlines the importance of handling issues of a wider audience and reaching sound conclusions. The Darmstadt model also has a dimension called "range of influence". Findings of a single study may not be true for a different content or culture, which makes research questionable as to its international validity and reliability. In order to reach sound conclusions at the policy level, collaborative research studies should be planned first at the national level and then internationally.

Conclusion
Although the current study was limited to 87 journal articles, it provides insight into the computer science education field by highlighting certain important issues for future research studies in the field. It is clear that there is a need to broaden the research about the teaching and learning of computer science at the K-12 level by also trying to improve the range of influence. Hence, encouraging researchers to work in teams rather than only as individuals may fulfil the premise of broadening the impact of future studies. Adding to the theory of knowledge, and reaching proven competencies, standards, curricula, teaching and assessment strategies, professional development approaches and instructional media and content, is as important as making them valid and reliable by being culture and context independent for educational systems in the sociocultural context.
On the other hand, all research studies should be suggesting ideas for those who may benefit from the results; after all, that is one of the primary reasons why research is undertaken and published. All studies should be contributing to at least one stakeholder in the field with at least one of their findings. But is that true for all research studies? What if one develops a new curriculum and proves its effectiveness based on research, but does not subsequently share the curriculum itself? What if one implements a teaching strategy or instructional media and reports its success, but never reveals the details? How can such research studies benefit others or contribute to the field? If anything is found to have a positive impact on the process or the product of study, then the details and its documentation should be publicly shared so that replicated studies may be conducted, either to validate or to falsify the original study, and thereby strengthen the literature of the field. Hence, there should be a knowledgebase of curricula, competencies, learning outcomes, assessment scales and inventories, achievement tests and tasks etc. Where a researcher adds to this base rather than producing a new curriculum, inventory, or survey, the findings of research studies are more likely to merge in order to present a more valid and reliable picture. Although systematic literature studies serve a similar purpose here, there should be a higher level of systematisation in order to distribute what is known from research studies to the wider academic world. Hence, organisations, conferences, and journals will be in a better position to provide researchers with future research topics in order to form a reliable knowledgebase on specific dimensions, and thereby prevent loss of time and effort in producing studies with little or no contribution to the field.
Research trends are changing according to technology, culture, and societal needs, and currently seem to be moving more towards computational thinking, teacher education and professional development, pedagogical aspects, gender and diversity issues, STEM/STEAM approaches, as well as physical computing and robotics. However, social, economic, and cultural barriers surrounding computing should also form an important focus for research, with studies carried out in collaboration with an interdisciplinary approach at the international level that subsequently informs the wider academic community (Blikstein, 2018). "Only in this way can we achieve the hoped for scale and sustainability, and realise the ultimate vision of generations of researchers, practitioners, and policy makers that have been trying, for the last 50 years, to bring CS to all students" (Blikstein, 2018, p. 35).

Acknowledgement
The researchers confirm that the data used in this study belong to the years before 2020.