<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.4 20241031//EN"
        "https://jats.nlm.nih.gov/publishing/1.4/JATS-journalpublishing1-4.dtd">
<article article-type="research-article" dtd-version="1.4">
    <front>

        <journal-meta>
            <journal-id>jmeep</journal-id>
            <journal-title-group>
                <journal-title>Journal of Measurement and Evaluation in Education and Psychology</journal-title>
            </journal-title-group>
            <issn pub-type="epub">1309-6575</issn>
            <publisher>
                <publisher-name>Association for Measurement and Evaluation in Education and Psychology</publisher-name>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id/>
            <article-categories>
                <subj-group xml:lang="en">
                    <subject>Item Response Theory</subject>
                </subj-group>
                <subj-group xml:lang="tr">
                    <subject>Madde-Cevap Kuramı</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>Assessing Pre-Service Teachers’ Competencies in Open-Ended Item Development: Self-, Peer, Instructor Assessment</article-title>
            </title-group>
            
            <contrib-group content-type="authors">
                <contrib contrib-type="author">
                    <contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-1991-1416</contrib-id>
                    <name>
                        <surname>Yavuz</surname>
                        <given-names>Emine</given-names>
                    </name>
                    <aff>Erciyes Üniversitesi</aff>
                </contrib>
                <contrib contrib-type="author">
                    <contrib-id contrib-id-type="orcid">https://orcid.org/0000-0003-2683-4997</contrib-id>
                    <name>
                        <surname>Şata</surname>
                        <given-names>Mehmet</given-names>
                    </name>
                    <aff>Van Yüzüncü Yıl Üniversitesi, Van Eğitim Yüksekokulu</aff>
                </contrib>
            </contrib-group>
                        
            <pub-date pub-type="pub" iso-8601-date="2026-04-01">
                <day>01</day>
                <month>04</month>
                <year>2026</year>
            </pub-date>
            <volume>17</volume>
            <issue>1</issue>
            <fpage>24</fpage>
            <lpage>41</lpage>
                        
            <history>
                <date date-type="received" iso-8601-date="2025-08-06">
                    <day>06</day>
                    <month>08</month>
                    <year>2025</year>
                </date>
                <date date-type="accepted" iso-8601-date="2026-03-09">
                    <day>09</day>
                    <month>03</month>
                    <year>2026</year>
                </date>
            </history>
                                        <permissions>
                    <copyright-statement>Copyright © 2010, Journal of Measurement and Evaluation in Education and Psychology</copyright-statement>
                    <copyright-year>2010</copyright-year>
                    <copyright-holder>Journal of Measurement and Evaluation in Education and Psychology</copyright-holder>
                </permissions>
            
            <abstract><p>This research focused on the role of self-, peer, and instructor assessments in evaluating pre-service teachers’ competencies in open-ended item development and aimed to determine whether there were rater biases in the scorings. It also examined differences in pre-service teachers’ scores according to gender and education type. The participants, 116 pre-service teachers, were asked to prepare one open-ended item as a performance task in the measurement and evaluation course. The items were scored by the pre-service teachers themselves, by their peers, and by the instructor using a holistic rubric. The many-facet Rasch model was used to analyze the data. The analysis showed that self-assessment was the most lenient rater type, instructor assessment had the most severe ratings, and there was no rater bias in the scoring. In addition, female pre-service teachers were more lenient than male pre-service teachers, and pre-service teachers studying in daytime classes were more lenient than those in evening classes.</p></abstract>
                                                            
            
            <kwd-group>
                <kwd>rater severity</kwd>
                <kwd>rater leniency</kwd>
                <kwd>rater bias</kwd>
                <kwd>open-ended items</kwd>
                <kwd>validity</kwd>
            </kwd-group>
                            
                                                                                                                        </article-meta>
    </front>
    <back>
                            <ref-list>
                                    <ref id="ref1">
                        <label>1</label>
                        <mixed-citation publication-type="journal">Adesiji, K. M., Agbonifo, O. C., Adesuyi, A. T., &amp; Olabode, O. (2016). Development of an automated descriptive text-based scoring system. British Journal of Mathematics &amp; Computer Science, 19(4), 1-14. https://doi.org/10.9734/BJMCS/2016/27558</mixed-citation>
                    </ref>
                                    <ref id="ref2">
                        <label>2</label>
                        <mixed-citation publication-type="journal">Almazroa, H., &amp; Alotaibi, W. (2023). Teaching 21st century skills: Understanding the depth and width of the challenges to shape proactive teacher education programmes. Sustainability, 15, 7365. https://doi.org/10.3390/su15097365</mixed-citation>
                    </ref>
                                    <ref id="ref3">
                        <label>3</label>
                        <mixed-citation publication-type="journal">Alver, B. (2005). The emphatic skills and decision-making strategies of the students of the department of guidance and psychological counseling, faculty of education. Journal of Social Science and Humanities Researches, (14), 19-34.</mixed-citation>
                    </ref>
                                    <ref id="ref4">
                        <label>4</label>
                        <mixed-citation publication-type="book">Anderson, L. W., &amp; Krathwohl, D. R. (Eds.). (2001). A taxonomy for learning, teaching, and assessing: A revision of Bloom&#039;s taxonomy of educational objectives. Pearson.</mixed-citation>
                    </ref>
                                    <ref id="ref5">
                        <label>5</label>
                        <mixed-citation publication-type="journal">Ateş, G. Ç., &amp; Köse, M. F. (2024). An analysis of university students’ academic achievement in relation to accommodation and various dependent variables. Journal of University Research, 7(3), 212-223. https://doi.org/10.32329/uad.1500037</mixed-citation>
                    </ref>
                                    <ref id="ref6">
                        <label>6</label>
                        <mixed-citation publication-type="book">Bandura, A. (1986). Social foundations of thought and action: A social cognitive theory. Prentice-Hall.</mixed-citation>
                    </ref>
                                    <ref id="ref7">
                        <label>7</label>
                        <mixed-citation publication-type="journal">Barrette, C. (2004). An analysis of foreign language achievement test drafts. Foreign Language Annals, 37(1), 58–70. https://doi.org/10.1111/j.1944-9720.2004.tb02173.x.</mixed-citation>
                    </ref>
                                    <ref id="ref8">
                        <label>8</label>
                        <mixed-citation publication-type="confproc">Bastarrica, M. C., &amp; Simmonds, J. (2019). Gender differences in self and peer assessment in a software engineering capstone course. IEEE/ACM 2nd International Workshop on Gender Equality in Software Engineering, Montreal, CA, May 2019. https://doi.org/10.1109/GE.2019.00014</mixed-citation>
                    </ref>
                                    <ref id="ref9">
                        <label>9</label>
                        <mixed-citation publication-type="journal">Birenbaum, M., Tatsuoka, K., &amp; Gutvirtz, Y. (1992). Effects of response format on diagnostic assessment of scholastic achievement. Applied Psychological Measurement, 16(4), 353-363. https://doi.org/10.1177/0146621692016004</mixed-citation>
                    </ref>
                                    <ref id="ref10">
                        <label>10</label>
                        <mixed-citation publication-type="book">Bond, T. G., &amp; Fox, C. M. (2015). Applying the Rasch model: Fundamental measurement in the human sciences (3rd ed.). Routledge.</mixed-citation>
                    </ref>
                                    <ref id="ref11">
                        <label>11</label>
                        <mixed-citation publication-type="book">Brookhart, S. M. (2014). How to design questions and tasks to assess student thinking. ASCD.</mixed-citation>
                    </ref>
                                    <ref id="ref12">
                        <label>12</label>
                        <mixed-citation publication-type="journal">Cabello, V. M., &amp; Topping, K. J. (2020). Peer assessment of teacher performance. What works in teacher education? International Journal of Cognitive Research in Science, Engineering and Education (IJCRSEE), 8(2), 121-132. https://doi.org/10.5937/IJCRSEE2002121C</mixed-citation>
                    </ref>
                                    <ref id="ref13">
                        <label>13</label>
                        <mixed-citation publication-type="journal">De Marsico, M., Sciarrone, F., Sterbini, A., &amp; Temperini, M. (2017). Supporting mediated peer-evaluation to grade answers to open-ended questions. EURASIA Journal of Mathematics Science and Technology Education, 13(4), 1085-1106. https://doi.org/10.12973/eurasia.2017.00660a</mixed-citation>
                    </ref>
                                    <ref id="ref14">
                        <label>14</label>
                        <mixed-citation publication-type="journal">Demir, M. K. (2012). Analyzing empathy skills of primary school teacher candidates. Buca Faculty of Education Journal, (33), 107-121.</mixed-citation>
                    </ref>
                                    <ref id="ref15">
                        <label>15</label>
                        <mixed-citation publication-type="journal">Dochy, F., Segers, M., &amp; Sluijsmans, D. (1999). The use of self-, peer and co-assessment in higher education. Studies in Higher Education, 24(3), 331-350. https://doi.org/10.1080/03075079912331379935</mixed-citation>
                    </ref>
                                    <ref id="ref16">
                        <label>16</label>
                        <mixed-citation publication-type="book">Ebel, R. L., &amp; Frisbie, D. A. (1991). Essentials of educational measurement. Prentice Hall.</mixed-citation>
                    </ref>
                                    <ref id="ref17">
                        <label>17</label>
                        <mixed-citation publication-type="book">Eckes, T. (2015). Introduction to many-facet Rasch measurement: Analyzing and evaluating rater-mediated assessments. Peter Lang.</mixed-citation>
                    </ref>
                                    <ref id="ref18">
                        <label>18</label>
                        <mixed-citation publication-type="journal">Eke, C. (2018). Analysis of objectives of high school physics curriculum according to the revised Bloom&#039;s taxonomy. Journal of Social Research and Behavioral Sciences, 4(6), 69-84.</mixed-citation>
                    </ref>
                                    <ref id="ref19">
                        <label>19</label>
                        <mixed-citation publication-type="journal">Ercoşkun, M. H., &amp; Nalçacı, A. (2008). The investigation of the empathic skills and democratic attitudes of the primary school teacher candidates. Milli Eğitim Dergisi, 37(180), 204-215.</mixed-citation>
                    </ref>
                                    <ref id="ref20">
                        <label>20</label>
                        <mixed-citation publication-type="journal">Ercoşkun, M. H., Dilekmen, M., Ada, Ş., &amp; Nalçacı, A. (2006). The investigation of empathic skills of the department of primary school teaching students as regards individual variations. Educational Academic Research, (13), 207–217.</mixed-citation>
                    </ref>
                                    <ref id="ref21">
                        <label>21</label>
                        <mixed-citation publication-type="journal">Erkayıran, O., Şenocak, S. Ü., &amp; Demirkıran, F. (2018). Investigation of empathic skill levels of nursing students in terms of some variables: A cross-sectional study. Journal of Nursing Science, 1(2), 01–04.</mixed-citation>
                    </ref>
                                    <ref id="ref22">
                        <label>22</label>
                        <mixed-citation publication-type="journal">Erman-Aslanoglu, A., Karakaya, İ., &amp; Sata, M. (2020). Evaluation of university students’ rating behaviors in self and peer rating process via many-facet Rasch model. Eurasian Journal of Educational Research, 20(89), 25-46. https://izlik.org/JA58KH69DL</mixed-citation>
                    </ref>
                                    <ref id="ref23">
                        <label>23</label>
                        <mixed-citation publication-type="journal">Falchikov, N., &amp; D. Boud. (1989). Student self-assessment in higher education: A meta-analysis. Review of Educational Research, 59(4), 395–430. https://doi.org/10.2307/1170205</mixed-citation>
                    </ref>
                                    <ref id="ref24">
                        <label>24</label>
                        <mixed-citation publication-type="journal">Falchikov, N., &amp; Goldfinch, J. (2000). Student peer assessment in higher education. A meta-analysis comparing peer and teacher marks. Review of Educational Research, 70(3), 287-322. https://doi.org/10.2307/1170785</mixed-citation>
                    </ref>
                                    <ref id="ref25">
                        <label>25</label>
                        <mixed-citation publication-type="journal">Fang, J.-W., Chang, S.-C., Hwang, G.-J., &amp; Yang, G. (2021). An online collaborative peer‑assessment approach to strengthening pre-service teachers&#039; digital content development competence and higher‑order thinking tendency. Education Tech Research Dev, 69, 1155–1181. https://doi.org/10.1007/s11423-021-09990-7</mixed-citation>
                    </ref>
                                    <ref id="ref26">
                        <label>26</label>
                        <mixed-citation publication-type="journal">Farrokhi, F., Esfandiari, R., &amp; Schaefer, E. (2012). A many-facet Rasch measurement of differential rater severity/leniency in three types of assessment. JALT Journal, 34(1), 79-101.</mixed-citation>
                    </ref>
                                    <ref id="ref27">
                        <label>27</label>
                        <mixed-citation publication-type="journal">Farrokhi, F., Esfandiari, R., &amp; Vaez Dalili, M. (2011). Applying the many-facet Rasch model to detect centrality in self-assessment, peer-assessment and teacher assessment. World Applied Sciences Journal, 15(11), 76-83.</mixed-citation>
                    </ref>
                                    <ref id="ref28">
                        <label>28</label>
                        <mixed-citation publication-type="book">Fraenkel, J. R., &amp; Wallen, N. E. (2005). How to design and evaluate research in education. McGraw-Hill.</mixed-citation>
                    </ref>
                                    <ref id="ref29">
                        <label>29</label>
                        <mixed-citation publication-type="journal">Genç, S. Z., &amp; Kalafat, T. (2010). Prospective teachers’ problem solving skills and emphatic skills. Journal of Theoretical Educational Science, 3(2), 135-147.</mixed-citation>
                    </ref>
                                    <ref id="ref30">
                        <label>30</label>
                        <mixed-citation publication-type="journal">Gielen, S., Dochy, F., &amp; Onghena, P. (2010). An inventory of peer assessment diversity. Assessment and Evaluation in Higher Education, 36(2), 137-155. https://doi.org/10.1080/02602930903221444</mixed-citation>
                    </ref>
                                    <ref id="ref31">
                        <label>31</label>
                        <mixed-citation publication-type="journal">Gierl, M. J., Latifi, S., Lai, H., Boulais, A. P., &amp; Champlain, A. D. (2014). Automated essay scoring and the future of educational assessment in medical education. Medical Education, 48, 950-962. https://doi.org/10.1111/medu.12517</mixed-citation>
                    </ref>
                                    <ref id="ref32">
                        <label>32</label>
                        <mixed-citation publication-type="journal">Goodrich, H. (1997). Understanding Rubrics: The dictionary may define &quot;rubric&quot;, but these models provide more clarity. Educational Leadership, 54(4), 14-17.</mixed-citation>
                    </ref>
                                    <ref id="ref33">
                        <label>33</label>
                        <mixed-citation publication-type="journal">Heritage, M. (2007). Formative assessment: What do teachers need to know and do? Phi Delta Kappan.  https://kappanonline.org/formative-assessment-heritage/</mixed-citation>
                    </ref>
                                    <ref id="ref34">
                        <label>34</label>
                        <mixed-citation publication-type="journal">Kane, J., Bernardin, H., Villanueva, J., &amp; Peyrefitte, J. (1995). Stability of rater leniency: Three studies. Academy of Management Journal, 38(4), 1036-1051. https://doi.org/10.2307/256619</mixed-citation>
                    </ref>
                                    <ref id="ref35">
                        <label>35</label>
                        <mixed-citation publication-type="journal">Kane, J. S., &amp; Lawler, E. E. (1978). Methods of peer assessment. Psychological Bulletin, 85(3), 555-586. https://doi.org/10.1037/0033-2909.85.3.555</mixed-citation>
                    </ref>
                                    <ref id="ref36">
                        <label>36</label>
                        <mixed-citation publication-type="journal">Karakaya, İ. (2015). Comparison of self, peer, and instructor assessments in the portfolio assessment by using the many-facet Rasch model. Journal of Education and Human Development, 4(2), 182-192. https://doi.org/10.15640/jehd.v4n2a22</mixed-citation>
                    </ref>
                                    <ref id="ref37">
                        <label>37</label>
                        <mixed-citation publication-type="journal">Kim, Y., Park, I., &amp; Kang, M. (2012). Examining rater effects of the TGMD-2 on children with intellectual disability. Adapted Physical Activity Quarterly, 29(4), 346-365. https://doi.org/10.1123/apaq.29.4.346</mixed-citation>
                    </ref>
                                    <ref id="ref38">
                        <label>38</label>
                        <mixed-citation publication-type="journal">Knoch, U., Read, J., &amp; von Randow, T. (2007). Re-training writing raters online: How does it compare with face-to-face training? Assessing Writing, 12(2), 26-43. https://doi.org/10.1016/j.asw.2007.04.001</mixed-citation>
                    </ref>
                                    <ref id="ref39">
                        <label>39</label>
                        <mixed-citation publication-type="confproc">Kyllonen, P. C. (2012). Measurement of 21st century skills within the common core state standards. Paper presented at the Invitational Research Symposium on Technology Enhanced Assessments, USA, May 2012.</mixed-citation>
                    </ref>
                                    <ref id="ref40">
                        <label>40</label>
                        <mixed-citation publication-type="journal">La Velle, L. (2019). The theory–practice nexus in teacher education: New evidence for effective approaches. Journal of Education for Teaching, 45(4), 369-372. https://doi.org/10.1080/02607476.2019.1639267</mixed-citation>
                    </ref>
                                    <ref id="ref41">
                        <label>41</label>
                        <mixed-citation publication-type="journal">Lejk, M., &amp; Wyvill, M. (2001). The effect of the inclusion of self-assessment with peer-assessment of contributions to a group project: A quantitative study of secret and agreed assessments. Assessment and Evaluation in Higher Education, 26(6), 551-561. https://doi.org/10.1080/02602930120093887</mixed-citation>
                    </ref>
                                    <ref id="ref42">
                        <label>42</label>
                        <mixed-citation publication-type="journal">Leonard, D. K., &amp; Jiang, J. (1999). Gender bias and the college predictors of the SATs: A cry of despair. Research in Higher Education, 40(4), 375-407.</mixed-citation>
                    </ref>
                                    <ref id="ref43">
                        <label>43</label>
                        <mixed-citation publication-type="journal">Li, H., Xiong, Y., Hunter, C. V., Guo, X., &amp; Tywoniw, R. (2020). Does peer assessment promote student learning? A meta-analysis. Assessment and Evaluation in Higher Education, 45(2), 193-211. https://doi.org/10.1080/02602938.2019.1620679</mixed-citation>
                    </ref>
                                    <ref id="ref44">
                        <label>44</label>
                        <mixed-citation publication-type="book">Linacre, J. M. (1989). Many-facet Rasch measurement. MESA Press.</mixed-citation>
                    </ref>
                                    <ref id="ref45">
                        <label>45</label>
                        <mixed-citation publication-type="software">Linacre, J. M. (2012). FACETS (Version 3.70.1) [Computer software]. https://www.winsteps.com/facgood.htm</mixed-citation>
                    </ref>
                                    <ref id="ref46">
                        <label>46</label>
                        <mixed-citation publication-type="software">Linacre, J. M. (2017). FACETS (Version 3.80.0) [Computer software]. https://www.winsteps.com/facgood.htm</mixed-citation>
                    </ref>
                                    <ref id="ref47">
                        <label>47</label>
                        <mixed-citation publication-type="confproc">Main, J. B., &amp; Sánchez-Peña, M. (2015). Student evaluations of team members: Is there gender bias? Paper presented at IEEE Frontiers in Education Conference (FIE), TX, USA, October 2015. https://doi.org/10.1109/FIE.2015.7344177</mixed-citation>
                    </ref>
                                    <ref id="ref48">
                        <label>48</label>
                        <mixed-citation publication-type="journal">Maslach, C., &amp; Jackson, S. E. (1981). The measurement of experienced burnout. Journal of Occupational Behavior, 2(2), 99–113. https://doi.org/10.1002/job.4030020205</mixed-citation>
                    </ref>
                                    <ref id="ref49">
                        <label>49</label>
                        <mixed-citation publication-type="book">Mertler, C. A. (2016). Classroom assessment: A practical guide for educators. Routledge.</mixed-citation>
                    </ref>
                                    <ref id="ref50">
                        <label>50</label>
                        <mixed-citation publication-type="report">Ministry of National Education (2017). Fen bilimleri dersi öğretim programı (ilkokul ve ortaokul 3, 4, 5, 6, 7 ve 8. sınıflar) [Science course curriculum (primary and secondary school 3rd, 4th, 5th, 6th, 7th and 8th grades)]. Ankara, Turkey.</mixed-citation>
                    </ref>
                                    <ref id="ref51">
                        <label>51</label>
                        <mixed-citation publication-type="journal">Mumpuni, K. E., Priyayi, D. F., &amp; Widoretno, S. (2022). How do students perform a peer assessment? International Journal of Instruction, 15(3), 751-766. https://doi.org/10.29333/iji.2022.15341a</mixed-citation>
                    </ref>
                                    <ref id="ref52">
                        <label>52</label>
                        <mixed-citation publication-type="journal">Myford, C. M., &amp; Wolfe, E. W. (2003). Detecting and measuring rater effects using many-facet Rasch measurement: Part I. Journal of Applied Measurement, 4(4), 386-422.</mixed-citation>
                    </ref>
                                    <ref id="ref53">
                        <label>53</label>
                        <mixed-citation publication-type="journal">Myford, C.M., &amp; Wolfe, E.W. (2000). Strengthening the ties that bind: Improving the linking network in sparsely connected rating designs. Educational Testing Service. https://doi.org/10.1002/j.2333-8504.2000.tb01832.x</mixed-citation>
                    </ref>
                                    <ref id="ref54">
                        <label>54</label>
                        <mixed-citation publication-type="journal">Myford, C.M., &amp; Wolfe, E.W. (2004). Detecting and measuring rater effects using many-facet Rasch measurement: Part II. Journal of Applied Measurement, 5(2), 189-227.</mixed-citation>
                    </ref>
                                    <ref id="ref55">
                        <label>55</label>
                        <mixed-citation publication-type="journal">Nilsson, P. (2013). What do we know and where do we go? Formative assessment in developing student teachers’ professional learning of teaching science. Teachers and Teaching, 19(2), 188-201. https://doi.org/10.1080/13540602.2013.741838</mixed-citation>
                    </ref>
                                    <ref id="ref56">
                        <label>56</label>
                        <mixed-citation publication-type="journal">Oluwatayo, J. A., &amp; Adebule, S. O. (2012). Assessment of teaching performance of student-teachers on teaching practice. International Education Studies, 5(5), 109-115. https://doi.org/10.5539/ies.v5n5p109</mixed-citation>
                    </ref>
                                    <ref id="ref57">
                        <label>57</label>
                        <mixed-citation publication-type="journal">Osman, S. (2021). Basic school teachers’ assessment practices in the sissala east municipality, Ghana. European Journal of Education Studies, 8(7), 44-74. https://doi.org/10.46827/ejes.v8i7.3801</mixed-citation>
                    </ref>
                                    <ref id="ref58">
                        <label>58</label>
                        <mixed-citation publication-type="confproc">Palmer, K., &amp; Richardson, P. (2003). On-line assessment and free-response input: A pedagogic and technical model for squaring the circle. Paper presented at Proc. 7th CAA Conference, Loughborough, UK, December 2003.</mixed-citation>
                    </ref>
                                    <ref id="ref59">
                        <label>59</label>
                        <mixed-citation publication-type="journal">Popham, W. J. (2009). Assessment literacy for teachers: Faddish or fundamental? Theory Into Practice, 48(1), 4–11. https://doi.org/10.1080/00405840802577536</mixed-citation>
                    </ref>
                                    <ref id="ref60">
                        <label>60</label>
                        <mixed-citation publication-type="journal">Rahmawati, Y., Ridwan, A., Hadinugrahaningsih, T., &amp; Soeprijanto. (2019, January). Developing critical and creative thinking skills through STEAM integration in chemistry learning. In Journal of Physics: Conference Series (Vol. 1156, p. 012033). IOP Publishing.</mixed-citation>
                    </ref>
                                    <ref id="ref61">
                        <label>61</label>
                        <mixed-citation publication-type="journal">Sadler, P. M., &amp; Good, E. (2006). The impact of self- and peer-grading on student learning. Educational Assessment, 11(1), 1-31.</mixed-citation>
                    </ref>
                                    <ref id="ref62">
                        <label>62</label>
                        <mixed-citation publication-type="journal">Sari, D. K., Dinata, P. A. C., &amp; Uspayanti, R. (2022). The teachers’ competencies to develop assessments for high school students in Merauke. Ishlah: Jurnal Pendidikan, 14(3), 3199-3206. https://doi.org/10.35445/alishlah.v14i</mixed-citation>
                    </ref>
                                    <ref id="ref63">
                        <label>63</label>
                        <mixed-citation publication-type="journal">Sasmaz-Oren, F. (2012). The effects of gender and previous experience on the approach of self and peer assessment: A case from Turkey. Innovations in Education and Teaching International, 49(2), 123–133. https://doi.org/10.1080/14703297.2012.677598</mixed-citation>
                    </ref>
                                    <ref id="ref64">
                        <label>64</label>
                        <mixed-citation publication-type="book">Şata, M. (2022). Açık uçlu sorular [Open-ended questions]. In İ. Karakaya (Ed.), Açık uçlu soruların hazırlanması, uygulanması ve değerlendirilmesi [Preparation, administration, and evaluation of open-ended questions] (pp. 1–11). Pegem.</mixed-citation>
                    </ref>
                                    <ref id="ref65">
                        <label>65</label>
                        <mixed-citation publication-type="journal">Şata, M., &amp; Karakaya, İ. (2021). Investigating the effect of rater training on differential rater function in assessing academic writing skills of higher education students. Journal of Measurement and Evaluation in Education and Psychology, 12(2), 163–181. https://doi.org/10.21031/epod.842094</mixed-citation>
                    </ref>
                                    <ref id="ref66">
                        <label>66</label>
                        <mixed-citation publication-type="journal">Shen, B., &amp; Bai, B. (2019). Facilitating university teachers’ continuing professional development through peer-assisted research and implementation teamwork in China. Journal of Education for Teaching, 45(4), 476–480. https://doi.org/10.1080/02607476.2019.1639265</mixed-citation>
                    </ref>
                                    <ref id="ref67">
                        <label>67</label>
                        <mixed-citation publication-type="book">Soland, J., Hamilton, L. S., &amp; Stecher, B. M. (2013). Measuring 21st century competencies: Guidance for educators. RAND Corporation.</mixed-citation>
                    </ref>
                                    <ref id="ref68">
                        <label>68</label>
                        <mixed-citation publication-type="journal">Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive Science, 12(2), 257–285. https://doi.org/10.1207/s15516709cog1202_4</mixed-citation>
                    </ref>
                                    <ref id="ref69">
                        <label>69</label>
                        <mixed-citation publication-type="journal">Takeda, S., &amp; Homberg, F. (2014). The effects of gender on group work process and achievement: An analysis through self- and peer-assessment. British Educational Research Journal, 40(2), 373–396. https://doi.org/10.1002/berj.3088</mixed-citation>
                    </ref>
                                    <ref id="ref70">
                        <label>70</label>
                        <mixed-citation publication-type="journal">Taş, U. E., Arıcı, Ö., Ozarkan, H. B., &amp; Özgürlük, B. (2016). PISA 2015 ulusal raporu [PISA 2015 national report]. Ankara, Turkey: Milli Eğitim Bakanlığı Yayınları.</mixed-citation>
                    </ref>
                                    <ref id="ref71">
                        <label>71</label>
                        <mixed-citation publication-type="journal">Torres-Guijarro, S., &amp; Bengoechea, M. (2017). Gender differential in self-assessment: A fact neglected in higher education peer and self-assessment techniques. Higher Education Research and Development, 36(5), 1072–1084. https://doi.org/10.1080/07294360.2016.1264372</mixed-citation>
                    </ref>
                                    <ref id="ref72">
                        <label>72</label>
                        <mixed-citation publication-type="journal">van Zundert, M., Sluijsmans, D., &amp; van Merriënboer, J. J. G. (2010). Effective peer assessment processes: Research findings and future directions. Learning and Instruction, 20(4), 270–279. https://doi.org/10.1016/j.learninstruc.2009.08.004</mixed-citation>
                    </ref>
                                    <ref id="ref73">
                        <label>73</label>
                        <mixed-citation publication-type="journal">Van Trieste, R. F. (1990). The relation between Puerto Rican university students’ attitudes toward Americans and the students’ achievement in English as a second language. Homines, (13–14), 94–112.</mixed-citation>
                    </ref>
                                    <ref id="ref74">
                        <label>74</label>
                        <mixed-citation publication-type="book">Vygotsky, L. S. (1978). Mind in society: The development of higher psychological processes. Harvard University Press.</mixed-citation>
                    </ref>
                                    <ref id="ref75">
                        <label>75</label>
                        <mixed-citation publication-type="journal">Wainer, H., &amp; Steinberg, L. S. (1992). Sex differences in performance on the mathematics section of the Scholastic Aptitude Test: A bidirectional validity study. Harvard Educational Review, 62(3), 323–336. https://doi.org/10.17763/haer.62.3.1p1555011301r133</mixed-citation>
                    </ref>
                                    <ref id="ref76">
                        <label>76</label>
                        <mixed-citation publication-type="journal">Wen, M. L., &amp; Tsai, C. (2008). Online peer assessment in an in-service science and mathematics teacher education course. Teaching in Higher Education, 13(1), 55–67. https://doi.org/10.1080/13562510701794050</mixed-citation>
                    </ref>
                                    <ref id="ref77">
                        <label>77</label>
                        <mixed-citation publication-type="journal">Wilson, F. R., Pan, W., &amp; Schumsky, D. A. (2012). Recalculation of the critical values for Lawshe’s content validity ratio. Measurement and Evaluation in Counseling and Development, 45(3), 197–210. https://doi.org/10.1177/07481756124402</mixed-citation>
                    </ref>
                                    <ref id="ref78">
                        <label>78</label>
                        <mixed-citation publication-type="journal">Winke, P., Gass, S., &amp; Myford, C. (2013). Raters’ L2 background as a potential source of bias in rating oral performance. Language Testing, 30(2), 231–252. https://doi.org/10.1177/0265532212456968</mixed-citation>
                    </ref>
                                    <ref id="ref79">
                        <label>79</label>
                        <mixed-citation publication-type="journal">Wolfe, E. W., &amp; McVay, A. (2012). Application of latent trait models to identify raters exhibiting score scale usage problems. Applied Measurement in Education, 25(2), 125–143.</mixed-citation>
                    </ref>
                                    <ref id="ref80">
                        <label>80</label>
                        <mixed-citation publication-type="journal">Yaz, Ö., &amp; Kurnaz, M. (2017). The examination of 2013 science curricula. International Journal of Turkish Education Science, 2017(8), 173–184.</mixed-citation>
                    </ref>
                                    <ref id="ref81">
                        <label>81</label>
                        <mixed-citation publication-type="journal">Yenen, E. T. (2021). Prospective teachers’ professional skill needs: A Q method analysis. Teacher Development, 25(2), 196–214. https://doi.org/10.1080/13664530.2021.1877188</mixed-citation>
                    </ref>
                                    <ref id="ref82">
                        <label>82</label>
                        <mixed-citation publication-type="journal">Yeşilçınar, S., &amp; Şata, M. (2021). Examining rater biases of peer assessors in different assessment environments. International Journal of Psychology and Educational Studies, 8(4), 136–151. https://izlik.org/JA45TW35KZ</mixed-citation>
                    </ref>
                            </ref-list>
                    </back>
    </article>
