Assessment of PISA 2012 Results With Quantile Regression Analysis Within The Context of Inequality In Educational Opportunity

The importance of inequality in educational opportunity has been increasing within education systems in recent years. In addition to quality in education, equality of opportunity is among the significant paradigms in countries of high educational performance. Thus, it is of utmost importance to research the relationship between the socio-economic characteristics of students and their achievement from the perspective of equality of opportunity. Filling the gap observed in the Turkish literature is among the objectives of the present study. The main objective of the study is to assess the socio-demographic characteristics that affect students' achievement in mathematics within the context of equality of educational opportunity for the PISA 2012 Turkey sample. Data analysis was conducted with quantile regression (QR) and classical linear regression (OLS). It was determined that students' family background, familiarity with information and communication technology, and school climate affected mathematics achievement. As parental education, educational resources at home, and the index of family wealth increased, mathematics achievement increased as well. Time of computer use had a negative effect on achievement in mathematics. Furthermore, the findings showed that the achievement of male students was higher than that of female students.


Introduction
The future of individuals and their future life standards are closely related to the education they receive. A higher level of education is significant in achieving higher standards of life. Thus, the main purpose of education with respect to personal development and the benefit of society is to elevate the development potential of individuals to the highest possible level. In other words, the education system should provide everyone with equal opportunities for success and for shaping their future, independent of individual and sociocultural characteristics, socioeconomic standing, health conditions or other factors, because a failure to achieve educational goals would negatively influence the individual throughout her or his life. Therefore, it is necessary to scrutinize the problem of inequality in education comprehensively. In their report titled "Turkish Education System Equality and Educational Achievement Report and Analysis," Oral and McGivney (2014) stated that "highest performing education systems are those that integrate quality and equality and could provide quality education opportunities for all children." The relationship between socioeconomic status and the educational level and success of the student has often been investigated in the literature. In the 1960s, James S. Coleman reported in his study titled "Equality of Educational Opportunity" that the familial characteristics of students in the United States had a significant impact on educational achievement. In the light shed by that study, equality of opportunity in education has been assessed from different perspectives since the 1960s, and studies have focused especially on reorganizing national education systems to provide better equality of opportunity. In a study they conducted, Heyneman and Loxley determined that the effect of school resources on educational achievement was more significant than individual traits in developed countries and that, as the income level of the country decreased, the impact of school resources on educational achievement increased. Likewise, PISA research, which also forms the scope of our study, has determined that in countries with low performance in PISA, educational achievement was closely related to the socioeconomic status of the family.
The objective of the study is to assess the PISA 2012 Turkey sample with respect to equality of opportunity in education and the socio-demographic features that affect achievement. Classical linear regression as well as quantile regression were used in this study. Classical linear regression results give the effect of all explanatory variables (socio-demographic variables) at the conditional mean of the score distribution. However, we also want to know the effect of the explanatory variables at different parts of the score distribution. Thus, we estimated quantile regressions (introduced by Koenker and Bassett, 1978) at the 5th, 25th, 50th, 75th and 95th quantiles. We followed the methods of Giambona and Porcu (2015), Santos (2007) and Fertig (2003b) in this matter. Apart from estimating the effects of variables at different parts of the distribution, quantile regressions have several other advantages compared to OLS (Ordinary Least Squares). First, they give less weight to outliers in the dependent variable than OLS does. Second, quantile regression is a more robust estimation method, because it allows the marginal effects of explanatory variables to differ across the quantiles of the dependent variable. Third, when error terms are non-normal, quantile regression estimators may be more efficient than OLS estimators. Finally, the semi-parametric nature of the approach relaxes the restriction that the parameters be constant across the entire distribution of the dependent variable (Rangvid, 2003: 12).
In the present study, a literature review of the studies that addressed PISA data with respect to equality of opportunity in education and quantile regression analysis is initially presented, and later on a detailed explanation of quantile regression analysis is given, followed by the actual quantile regression analysis to demonstrate the sociodemographic characteristics that affect the success of Turkey in PISA mathematics scores.

Literature Review
In recent years, along with the increasing importance of equality of opportunity in education, several studies have emerged that scrutinize inequality of opportunity in education with respect to PISA results. In one of these studies, Natkhov and Kozina (2012) reported a measure of inequality of educational opportunity for 72 countries, estimated as the share of the variation in 2009 PISA test scores explained by predetermined family characteristics. The results of this study showed that there was a negative relationship between inequality of educational opportunity and educational achievement. In countries where family background played a major role in determining individual progress, a lower mean educational achievement was found. Tansel (2015) scrutinized the effects of inequality of opportunity in education on educational achievement in Turkey over time. Tansel used the mathematics, science and reading achievement scores of 15-year-old students in PISA between 2003 and 2012. Study results demonstrated that inequality of opportunity in educational achievement declined marginally in Turkey over time and that the most important determinants of inequality in educational achievement were family background variables, which remained consistent over time.
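The idea of measuring inequality of opportunity as a "share of the variation in test scores explained by family characteristics" can be read as the R-squared of a regression of scores on background variables. The sketch below illustrates that reading with a single background variable; it is not the procedure of the cited study, and the function name `r_squared` and all data values are hypothetical.

```python
def r_squared(x, y):
    """Share of the variance of y explained by a simple linear regression on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    # For one regressor, R-squared equals the squared correlation coefficient.
    return sxy ** 2 / (sxx * syy)


# Hypothetical data: years of parental education for eight students.
parental_education = [5, 5, 8, 8, 11, 11, 15, 15]

# Hypothetical country A: scores follow family background closely.
country_a_scores = [400, 420, 440, 460, 480, 500, 520, 540]

# Hypothetical country B: scores depend only weakly on family background.
country_b_scores = [470, 430, 450, 490, 440, 500, 460, 480]

r2_a = r_squared(parental_education, country_a_scores)
r2_b = r_squared(parental_education, country_b_scores)

print(f"explained share, country A: {r2_a:.3f}")
print(f"explained share, country B: {r2_b:.3f}")
```

Under this reading, the hypothetical country A would be classified as having higher inequality of opportunity, because a larger share of the score variation is tied to a circumstance the student did not choose.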
In a study by Carvalho et al. (2013), a bi-dimensional (achievement-EOp and access-EOp) index and a Cobb-Douglas functional form were utilized. That study used recent PISA data for six Latin American countries and observed rank reversals with respect to orderings based on a single dimension. The authors hypothesized two dimensions of educational opportunity: (i) access, and (ii) achievement conditional on access. The two dimensions rely on different perspectives in the EOp literature, and each was therefore measured with a specific procedure before they were integrated to form a composite index.
By simply using the variance or the standard deviation of test scores, Ferreira and Gignoux (2011) measured the inequality of achievement among all countries that participated in the PISA 2006 survey. They utilized two alternative two-sample nonparametric procedures to assess the robustness of the inequality measure to sample selection biases and applied them to the four countries with the smallest sample coverage in PISA (as a share of the total number of 15-year-old students). Martins and Veiga (2010) investigated mathematics achievements of 15 EU member students in PISA based on socioeconomic inequalities using 2003 data and a concentration index (CI) to analyze the differences between countries. The results of the study demonstrated that the differences between these countries were significant. In Germany, Greece, Great Britain, Belgium, and Portugal, the inequality was greater, while in Sweden and Finland the results were reversed. According to the findings, socioeconomic factors explained 14.9% to 34.6% of the overall inequality in education.
In a study by Aydın, Uysal and Sarıer (2010), the authors compared the five most successful OECD countries with the five least successful in the PISA mathematical literacy exams of both 2003 and 2006, based on social justice and equality of opportunity in education. They scrutinized the data based on the financial and human resources invested in education and on the learning environment and organization in schools. Inadequacies in these dimensions were determined to have a negative impact on the achievement of students. As a result, the study showed that the conditions and opportunities of the students were not equal. The authors emphasized that, since the system could not provide equal opportunities and chances in education, it might not be possible to talk about success and failure.
Studies that assessed equality of educational opportunity with quantile regression analysis have made a significant contribution to the literature in recent years. In one of these studies, Beblavy et al. (2014) investigated a topic that had previously been the subject of extensive discussion. They analyzed the relationship between grouping based on ability and equality of educational opportunities for students with different performances in 4 OECD countries. Results demonstrated that in-classroom ability grouping could have a positive or negative impact on educational equality depending on the position of the student's performance in the PISA achievement distribution. Based on quantile regression results, low performers were affected most by inequality effects in Belgium, while high performers were the most affected in Austria. The study was not able to find a significant relationship for Finnish students, while the results for Hungary were similar to those in Austria. Fertig (2003a) conducted a study with 15-16-year-old German students in which the relationship between individual-level reading test scores and students' individual and family background information and the characteristics of their school and class was scrutinized. Several quantile regression analyses were conducted in the study, and the findings did not support several well-known explanations, such as a high level of school regulation or a substantial share of non-citizen students among the participants. On the contrary, findings demonstrated that schools with a more homogeneous body of students had better educational achievement.
A study by Schneeweis and Winter-Ebmer (2005) focused on educational production in Austria and the potential effect of peers on students' academic achievement, using PISA 2000 data for 15-16-year-old students to estimate peer effects. The estimates demonstrated that the socioeconomic composition of peer groups had significant positive effects on student achievement. Also, quantile regressions suggested that peer effects favored low-ability students; that is, students with lower skills benefited more from exposure to clever peers, while high-skill students were not much affected. It was also reported that social heterogeneity did not have substantial adverse effects on academic achievement. Using PISA 2000 reading test scores, Santos (2007) studied the factors and distribution of educational quality in Argentina. Survey weighted regressions and quantile regressions were used to estimate educational production functions at the mean of the distribution in the study. Study results showed that, since girls performed significantly better than boys, educational policies should address gender issues to improve mean reading scores. It was also determined that class size should not exceed 40-45 students and that quality resources such as libraries, laboratory equipment and multi-media technology should be provided for schools. Study findings suggested that the autonomy of teachers was also important in improving students' achievement, in addition to their relationships with students and their openness to institutional change. Lounkaew (2013) contributed to the educational equality debate by comparing urban and rural areas using the extensive set of variables available in Thai PISA data. Estimates of students' education production functions based on unconditional quantile regression showed that student and family contributions and school characteristics were not symmetrical across achievement distributions and gender. Oaxaca-Blinder decomposition results indicated the significance of nontangible school characteristics in explaining the differences between the achievements of urban and rural students. Achievement percentile decomposition exercises demonstrated the increasing role of these characteristics: as the achievement percentile increased, these characteristics explained the gap better. Giambona and Porcu (2015) examined the individual background characteristics that influenced the reading achievement of Italian 15-year-old students by analyzing 2009 OECD-PISA survey data with the quantile regression (QR) approach. Results demonstrated that the predictors had a significant impact on reading achievement and operated differently across quantiles. These findings suggested that different paths should be followed to improve the achievements of low and high performing readers. Specifically, certain predictors such as family background (parental education, computer availability at home, and availability of a desk for homework at home), the school program attended, and the region of student domicile all played a significant but different role when low and high performing readers were compared. For instance, parental education had a positive impact on student reading, general programs performed better than occupational or technical programs, and Northern regions performed better than Central-Southern regions, and all had different effects on the distribution of students' reading scores.

Alphanumeric Journal, Volume 4, Issue 2, 2016
The literature review revealed that no study assessing PISA scores with quantile regression has been conducted in Turkey. Thus, it was considered that the current study would contribute significantly to the literature. The Programme for International Student Assessment (PISA) is an international educational research program in which students' knowledge and skills in the fields of mathematics, science, and reading are assessed; the research is organized by the Organisation for Economic Co-operation and Development (OECD) in member and non-member nations. The program evaluates the extent to which students in the 15-year-old age group in OECD and other participating countries possess the basic knowledge and skills needed to participate in modern society (PISA 2009 National Preliminary Report, 2010: 1).

Data and Method
The PISA project aims to measure the proficiency of 15-year-old pupils attending formal education at the end of compulsory education in using their knowledge and skills in situations they could encounter in today's information society, rather than their level of learning achievement in curriculum subjects (mathematics, science and reading skills) (http://earged.meb.gov.tr).
In each period, the PISA project focuses on only one of the fields of reading skills, mathematics and science as the main field. However, assessments in the other two fields are included in the study as well. In a cycle of nine years, each of these fields becomes the main field once. The main field of the PISA application was reading skills in 2000, mathematics literacy in 2003, and science literacy in 2006. In 2009, a new nine-year cycle started and the application again focused on reading skills (PISA 2009 National Preliminary Report, 2010: 2). Finally, in the most recent report, PISA 2012, mathematics literacy was the primary field.
Since the primary field in the PISA 2012 application was achievement in mathematics, this field was selected as the dependent variable in the present study. The explanatory variables used in the study are common variables used extensively in the literature, and their definitions are presented in Table 1. The literature review was decisive in the selection of explanatory variables. In the literature, the explanatory variables are categorized under the titles of i) students' characteristics, ii) students' family background, iii) familiarity with information and communication technologies, and iv) school climate. It was assessed that these explanatory variables would be an indicator of equality/inequality when considered with respect to inequality of opportunity.
Table 1. Description of the Variables
*: These indices were computed by the OECD and are available in the PISA database.

Quantile Regression Analysis
As is known, the objective of classical regression analysis is to establish a relationship between a dependent variable (response variable) and independent variables (explanatory or predictor variables). In applications, the response variable for any fixed value of the explanatory variable is treated as a random variable, and its value is generally summarized by the arithmetic mean. Classical regression analysis focuses on the mean and utilizes a function that determines the conditional mean of the response for any fixed value of the explanatory variable (Hao and Naiman, 2013: 1).
Consider the following classical linear regression model:

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$$

where $y_i$ is a continuous response variable and $x_i$ is an explanatory variable. It is assumed that the random variable $\varepsilon_i$ has a normal distribution with zero mean and variance $\sigma^2$. After this model is fitted to the data, $E(y \mid x)$, the conditional mean of $y$ for a given $x_i$, is calculated.
The parameters of this model are found by minimizing the residual sum of squares:

$$\min_{\beta_0, \beta_1} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$$
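For a single explanatory variable, the least-squares minimization has a well-known closed-form solution. The following is a minimal sketch of that solution; the function names `ols_fit` and `rss` and the five data points are hypothetical and are not taken from the study.

```python
def ols_fit(x, y):
    """Return (intercept, slope) that minimize the residual sum of squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def rss(x, y, intercept, slope):
    """Residual sum of squares for a given line."""
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


# Hypothetical observations (illustrative values only).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols_fit(x, y)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")

# Any other line has a larger residual sum of squares than the fitted one.
print(rss(x, y, b0, b1) <= rss(x, y, b0 + 0.1, b1))
print(rss(x, y, b0, b1) <= rss(x, y, b0, b1 - 0.1))
```

The fitted line describes only the conditional mean of the response, which is the limitation that motivates quantile regression.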


Classical regression analysis poses no problem when its assumptions are satisfied, in other words under ideal conditions. However, these assumptions do not always conform to the real world. The response values corresponding to fixed values of the explanatory variables do not always follow a symmetrical distribution. Furthermore, the variances of these values could differ (heteroscedasticity). Heavy-tailed distributions and outliers could also be encountered (Jalali and Babanezhad, 2011: 1947). The general approach within the framework of classical regression analysis is to exclude the outliers from the analysis, which translates into losing valuable data in the social sciences. Quantile regression facilitates the understanding of outliers in the tails of the distribution instead of excluding them. Under these conditions, quantile regression becomes a more productive tool for analysis than classical regression. Quantile regression, similar to other regression models, aims to explain the relationship between the variables.
Quantile regression models are used to estimate conditional median functions and other conditional quantile functions. Quantile regression is the generalization of median regression to any chosen quantile. These regression models are less sensitive to extreme values and skewness than OLS. Quantile regression initially emerged as a robust regression technique that does not rely on the normality of the error terms, one of the classical assumptions, and it was designed to present a more comprehensive view of the regression relationship (Koenker, 2005: 112).
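The lower sensitivity to extreme values can be seen in the simplest case without covariates, where least squares yields the mean and median regression yields the median. The sketch below uses hypothetical score values, not PISA data, and compares how far each summary moves when one extreme observation is added.

```python
from statistics import mean, median

# Hypothetical mathematics scores (illustrative values only).
scores = [410, 425, 440, 455, 470, 485, 500]

# The same sample with one extreme value appended.
scores_with_outlier = scores + [900]

# The mean minimizes squared deviations, so the outlier pulls it strongly.
mean_shift = mean(scores_with_outlier) - mean(scores)

# The median minimizes absolute deviations and moves much less.
median_shift = median(scores_with_outlier) - median(scores)

print(f"shift in mean:   {mean_shift:.3f}")
print(f"shift in median: {median_shift:.3f}")
```

The same logic carries over to regression: squared residuals let a single distant observation dominate the fit, while absolute residuals limit its influence.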
The OLS method models the relationship between one or more explanatory variables and the conditional mean of the dependent variable Y given X = x. On the other hand, quantile regression, proposed by Koenker and Bassett (1978), provides a suitable method for modeling conditional quantile functions (Koenker and Hallock, 2001: 145). Quantile regression is especially useful in cases where the conditional quantiles vary. The method identifies the regression coefficients based on quantiles (Kurtoğlu, 2011: 33).
As stated by Mosteller and Tukey (1977, p. 266): "What the regression curve does is give a grand summary for the averages of the distributions corresponding to the set of x's. We could go further and compute several different regression curves corresponding to the various percentage points of the distributions and thus get a more complete picture of the set. Ordinarily this is not done, and so regression often gives a rather incomplete picture. Just as the mean gives an incomplete picture of a single distribution, so the regression curve gives a corresponding incomplete picture for a set of distributions." Since numerous quantiles are modeled in quantile regression, it is possible to understand in detail how the response variable is affected by the predictors.
For instance, the relationship between years served as a professor and salary for 459 professors in the USA is presented in Figure 1. The figure is plotted for the 0.25th, 0.50th, and 0.75th quantiles. The most interesting aspect of the figure is that, although the quantile regressions for the 0.25th and 0.50th quantiles turned downwards after 20 years of work as a professor, the regression for the 0.75th quantile turned upwards after 20 years. In short, these data, the figure and the results demonstrate that focusing on the mean in regression could be misleading in certain situations.
The median is a special quantile. The median divides a data set into two parts, quartiles into four, deciles into ten, and percentiles into one hundred. In general, all such dividing points are called "quantiles." Quantile regression models can be fitted to data by minimizing a generalized distance measure with algorithms based on linear programming. If the quantile is denoted by p (0 < p < 1), quantile regression minimizes the weighted sum of the (unsquared) distances to the fitted line: p is the weight for the points above the fitted line and 1 - p is the weight for the points below it. For instance, different conditional quantile functions could be fitted for quantile values such as p = 0.25, 0.50 and 0.75 (Hao and Naiman, 2013: 33).
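The weighting scheme can be checked numerically in the case without explanatory variables, where the "fitted line" is a single constant. The sketch below is illustrative only: the nine values and the helper names `check_loss`, `total_loss` and `best_candidate` are hypothetical. It searches over the observed values for the constant with the smallest weighted sum of distances and shows that the answer moves with p.

```python
def check_loss(residual, p):
    """Weight p for a point above the fitted value, 1 - p for a point below."""
    if residual >= 0:
        return p * residual
    return (p - 1) * residual  # residual is negative, so the product is positive


def total_loss(values, candidate, p):
    """Weighted sum of (unsquared) distances from the values to the candidate."""
    return sum(check_loss(v - candidate, p) for v in values)


def best_candidate(values, p):
    """Search the observed values for the one with the smallest total loss."""
    return min(sorted(values), key=lambda c: total_loss(values, c, p))


# Nine hypothetical observations (illustrative values only).
values = [12, 15, 17, 20, 24, 29, 35, 41, 50]

for p in (0.25, 0.50, 0.75):
    print(p, best_candidate(values, p))
```

With p = 0.50 both sides receive the same weight and the minimizer is the middle observation; smaller or larger values of p pull the minimizer towards the lower or upper part of the sample.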
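The basic quantile regression model specifies the conditional quantile as a linear function of the explanatory variables. The usual way to write the quantile regression model for the τth quantile is:

$$y_i = x_i'\beta(\tau) + \varepsilon_i(\tau), \qquad \mathrm{Quant}_\tau(y_i \mid x_i) = x_i'\beta(\tau)$$

The coefficients for the τth quantile (0 < τ < 1) of y are obtained by minimizing the weighted sum of absolute residuals:

$$\hat{\beta}(\tau) = \arg\min_{\beta} \sum_{i=1}^{n} \rho_\tau(y_i - x_i'\beta), \qquad \rho_\tau(u) = u\,(\tau - \mathbb{1}\{u < 0\})$$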
Here, τ is the conditional quantile of interest; all positive residuals receive a weight of τ, while the negative ones receive a weight of (τ - 1), so that every residual makes a non-negative contribution to the sum. Thus, each component of the quantile regression coefficient vector β(τ) provides an estimate of the marginal effect of the associated explanatory variable on the response at the τth quantile, controlling for the remaining variables.
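A short derivation, offered as a sketch rather than as part of the original analysis, shows why these weights target the τth quantile. It assumes, for illustration only, a single continuous random variable $Y$ with distribution function $F$ and density $f$, and a constant $c$ in place of the linear predictor; these symbols do not appear in the study. Write $\rho_\tau(u)$ for the weighted residual, equal to $\tau u$ for $u \ge 0$ and $(\tau - 1)u$ for $u < 0$. The expected loss is

$$R(c) = E[\rho_\tau(Y - c)] = \tau \int_{c}^{\infty} (y - c)\, f(y)\, dy + (1 - \tau) \int_{-\infty}^{c} (c - y)\, f(y)\, dy$$

and differentiating with respect to $c$ gives

$$R'(c) = -\tau\,[1 - F(c)] + (1 - \tau)\,F(c) = F(c) - \tau$$

Setting $R'(c) = 0$ yields $F(c) = \tau$, so the minimizer is the τth quantile of $Y$. With τ = 0.5 both weights are equal and the minimizer is the median.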
Thus, the quantile regression model used in the present study can be written as follows:

$$\mathrm{Quant}_\tau(Y_i \mid X_i) = X_i'\beta(\tau)$$

where $\mathrm{Quant}_\tau(Y_i \mid X_i)$ denotes the τth quantile of the math score $Y_i$ conditional on the vector of explanatory variables $X_i$.
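One way to see how a linear quantile model of this kind can be estimated is the sketch below. It is not the estimation routine used in the study, and it replaces linear-programming algorithms with a brute-force search that is practical only for one covariate and small samples: it relies on the known property that an optimal quantile regression line passes through at least two observations. The function names and the synthetic data are assumptions made for illustration.

```python
from itertools import combinations


def check_loss(u, tau):
    """Check function: tau * u for u >= 0 and (tau - 1) * u for u < 0."""
    return u * (tau - (1 if u < 0 else 0))


def fit_quantile_line(x, y, tau):
    """Return (intercept, slope) of the tau-th quantile line for one covariate.

    Every line through two observations is tried and the one with the
    smallest total check loss is kept.
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # two points with the same x do not define a line
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        loss = sum(check_loss(yk - intercept - slope * xk, tau)
                   for xk, yk in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]


# Synthetic heteroscedastic data: at each x there are five responses,
# 10 + 2x + k*x for k = -2..2, so the spread grows with x.
x, y = [], []
for xi in range(1, 11):
    for k in (-2, -1, 0, 1, 2):
        x.append(xi)
        y.append(10 + 2 * xi + k * xi)

fits = {tau: fit_quantile_line(x, y, tau) for tau in (0.1, 0.5, 0.9)}
for tau, (intercept, slope) in fits.items():
    print(f"tau = {tau}: intercept = {intercept:.2f}, slope = {slope:.2f}")
```

In this synthetic sample the offsets are symmetric around zero, so a regression on the mean has slope 2 at every x, whereas the fitted quantile lines have different slopes in the lower and upper parts of the distribution. This is the kind of difference across quantiles that the model above is meant to capture.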

Results
Table 2 presents the results obtained with OLS and QR from the 5th to the 95th percentile of the distribution of the students' math scores. The first column reports the selected covariates; columns 2-11 report the coefficients for the five main quantiles (0.05, 0.25, 0.50, 0.75 and 0.95) with their levels of significance; finally, the last two columns report the OLS coefficients and the related levels of significance.
According to Giambona and Porcu (2015), if the β coefficients of the quantile regression are significant and different from the OLS β coefficients, then the use of quantile regression is more efficient than regression on the mean. When the OLS and quantile regression results are compared, it is possible to argue that there were, in general, several differences between the results obtained with the two models.
A comparison of the model results for gender shows that, in both models, the gender variable significantly affected mathematics achievement and male students were more successful than female students. Quantile regression findings showed that the gender coefficient increased from the 0.05th quantile towards the 0.95th quantile. This finding was consistent with the main empirical results found in the literature, despite the fact that the gender gap decreased at the upper quantile of the distribution, as observed in the quantile regression coefficients. The mathematics achievement scores of students who never repeated a year at school were higher than those of students who did, and the impact of REPEAT increased across the quantiles.
It was observed that mathematics anxiety (ANXMATH) significantly affected achievement in both OLS and quantile regression, and quantile regression findings demonstrated that the effect of anxiety increased to a great extent at high achievement levels.
While the impact of the BELONG variable on mathematics performance was significant based on OLS regression results (p = 0.0068), the parameter of this variable was not significant based on quantile regression. Similarly, while the AGE AT ISCED1 variable was significant in OLS regression (p = 0.0034), it was insignificant based on quantile regression results. Based on OLS findings, there was a negative correlation between schooling age and mathematics achievement. This could be interpreted to mean that mathematics achievement decreases as the schooling age increases.
Disciplinary climate (DICIPLINE) was positively correlated with mathematics performance, and the effect increased across the quantiles based on OLS and quantile regression results. The highest parental education (PARED) had a significant effect on mathematics scores, and there was a positive correlation between the two. That is, the mathematics achievement score increased with the parents' education level. Home educational resources (HEDRES) had significantly positive effects on mathematics score, but at the 95th quantile the effect was not significant. There was a significant and positive correlation between the index of family wealth (WEALTH) and mathematics score based on OLS results; however, quantile regression results varied across the quantiles. Wealth was not significant and was negatively correlated with mathematics score at the 0.05th quantile, but its coefficient was significantly positive at the remaining four quantiles, and the correlation increased from the 0.25th to the 0.95th quantile.
There was no correlation between availability of computer resources at school (ICTSCHOLL) and mathematics score, whereas availability of computer resources at home (ICTHOME) had differing effects across the quantiles and in the OLS model. There was a significant and positive correlation between ICTHOME and mathematics score based on OLS findings, as well as at the 0.05th, 0.25th and 0.75th quantiles based on quantile regression findings. However, ICTHOME was not significantly correlated with mathematics score based on the median and the 0.95th quantile regression results. There was a negative correlation between the time of computer use (TIMEINT) and mathematics score for all results except the quantile regression at the 0.05th quantile (p = 0.8209). This finding demonstrates that mathematics score decreased with increased time on computers.
The school climate variables were analyzed in both regression models. Student-teacher relationship (STUDREL) did not affect mathematics achievement. Teacher behavior: formative assessment (TCHBEHFA) had a positive and significant effect only in the OLS and median regression results. TCHBEHSO had a negative and significant effect in all regression results, and the coefficient for this variable increased from the 0.05th quantile towards the 0.95th quantile.

Conclusion
In this study we analyzed the determinants of Turkish students' mathematics achievement using the PISA 2012 survey. We applied quantile regression (QR) to assess the impact of selected variables at more than one level of the mathematics achievement distribution. We selected the 0.05th, 0.25th, 0.50th, 0.75th and 0.95th quantiles for the quantile regression analysis. We also used ordinary least squares (OLS) regression for comparison with the quantile regression results. We found that students' characteristics, students' family background, familiarity with information and communication technology, and school climate were effective factors for math achievement in both the OLS and QR models. Furthermore, one aspect of the QR results was superior to the OLS findings: the capacity to analyze the effect of variables at different points of the distribution.
The gender variable affected mathematics achievement significantly, and male students had a higher level of achievement than female students. However, QR results showed that the gender gap decreased as the analysis moved to the upper quantiles of the distribution.
OLS and QR model findings demonstrated that mathematics anxiety (ANXMATH) had a significant impact on achievement and anxiety had an increasing effect in higher levels of achievement.
OLS and QR results showed that disciplinary climate (DICIPLINE) was positively correlated with mathematics achievement and that this effect increased across the quantiles. There was a positive and significant correlation between the highest parental education (PARED) and mathematics score. That is, the mathematics achievement score increased with the parents' education level. The estimated coefficients for home educational resources (HEDRES) had significant and positive effects on mathematics achievement with the exception of the 95th quantile. OLS results showed that there was a positive and significant correlation between the index of family wealth (WEALTH) and mathematics score, but quantile regression findings differed across the quantiles. Wealth did not have a significant effect at the 0.05th quantile and was negatively correlated with mathematics score; however, at the other four quantiles its coefficient was positive and significant, and it increased from the 0.25th to the 0.95th quantile.
There was no correlation between availability of computer resources at school (ICTSCHOLL) and mathematics score, while the effects of availability of computer resources at home (ICTHOME) differed between the OLS model and the different quantiles. The OLS results, as well as the quantile regression results at the 0.05th, 0.25th and 0.75th quantiles, suggested a positive and significant correlation between ICTHOME and mathematics score. However, there was no correlation based on the median and 0.95th quantile regression results. There was a negative and significant correlation between time of computer use (TIMEINT) and mathematics score for all regression results except the 0.05th quantile (p = 0.8209). It could be deduced that mathematics score decreased with the time spent on the computer.
Finally, the overall results demonstrated that the quantile regression approach was more robust than OLS, since it could analyze the effects at different ends of the distribution. This shows that the QR approach is significantly superior in demonstrating the differences in inequality of educational opportunity.
The PISA 2012 Turkey sample was used in the present study. The PISA 2012 Turkey sample included 4848 observations. Observations with missing values for the variables included in this study were excluded, which reduced the number of observations to 2962.
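Excluding observations that lack a value for any model variable is commonly called listwise (complete-case) deletion. The sketch below shows the idea on a few hypothetical records; the field name `MATH`, the helper `complete_cases` and all record values are assumptions for illustration, and only the counts 4848 and 2962 come from the text.

```python
# Hypothetical student records; None marks a missing value.
records = [
    {"MATH": 448.0, "ANXMATH": 0.3, "PARED": 8},
    {"MATH": 512.5, "ANXMATH": None, "PARED": 11},
    {"MATH": 390.2, "ANXMATH": 1.1, "PARED": None},
    {"MATH": 475.9, "ANXMATH": -0.4, "PARED": 15},
    {"MATH": 530.1, "ANXMATH": -0.9, "PARED": 13},
]

# Variables that enter the (hypothetical) model.
model_variables = ["MATH", "ANXMATH", "PARED"]


def complete_cases(rows, variables):
    """Keep only rows that have an observed value for every model variable."""
    return [row for row in rows
            if all(row.get(v) is not None for v in variables)]


kept = complete_cases(records, model_variables)
print(f"kept {len(kept)} of {len(records)} hypothetical records")

# Share of the original sample retained, using the counts reported in the text.
reported_retention = 2962 / 4848
print(f"reported retention: {reported_retention:.1%}")
```

With the reported counts, roughly 61% of the original sample remains in the analysis, which is worth keeping in mind when generalizing the results.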