Estimation of population mean under different stratified ranked set sampling designs with a simulation study: Application to BMI data

Arzu Ece Cetin and Nursel Koyuncu

In this article, we compare the performance of ratio-type estimators under several stratified sampling designs: stratified simple random sampling, stratified ranked set sampling, stratified double ranked set sampling and stratified median ranked set sampling. Under each design, we examine ratio-type estimators that use auxiliary variable information such as the coefficient of variation and the coefficient of kurtosis. To assess the estimators, we use a real data set on 800 people in Turkey in 2014, with body mass index (BMI) as the study variable and age and weight as auxiliary variables, stratified by gender. A simulation study is carried out to evaluate the ratio-type estimators under these designs, and their performances are compared in terms of mean squared error (MSE) and percent relative efficiency (PRE). The contribution of this study is a detailed simulation study, based on a real data set, that compares these stratified sampling designs.


Introduction
Ranked set sampling (RSS) was first introduced by McIntyre [11], and Dell and Clutter [3] showed that the mean of the RSS is an unbiased estimator of the population mean, whether or not there are errors in ranking. Subsequently, stratified ranked set sampling (SRSS) was suggested by Samawi and Muttlak [13] to obtain a more efficient estimator of the population mean. Samawi [14] proposed an efficient estimator in stratified ranked set sampling. Al-Saleh and Al-Kadiri [1] introduced the concept of double ranked set sampling (DRSS) and showed that the DRSS estimator is more efficient than the usual RSS estimator in estimating the finite population mean. Using SRSS, Samawi and Siam [15] obtained the performances of the combined and separate ratio estimators. Muttlak [12] suggested the median ranked set sampling (MRSS) method for estimating the population mean. Ibrahim et al. [5] suggested estimating the population mean using stratified median ranked set sampling (SMRSS). Al-Omari [2] suggested ratio estimation of the population mean using auxiliary information in simple random sampling (SRS) and MRSS. Following Kadilar and Cingi [6], Mandowara and Mehta [10] used SRSS instead of stratified simple random sampling (SSRS) and obtained more efficient ratio-type estimators. Koyuncu [7] proposed ratio and exponential type estimators in MRSS. Khan et al. [9] improved ratio-type estimators using stratified double ranked set sampling (SDRSS). Khan et al. [8] introduced efficient classes of ratio-type estimators of the population mean under SMRSS.
In this article, we compare the performance of the ratio-type estimators given by Mandowara and Mehta [10], the ratio-type estimators under SDRSS given by Khan et al. [9], and the efficient classes of ratio-type estimators of the population mean under SMRSS given by Khan et al. [8]. The aim is to compare the performance of these ratio-type estimators across the stratified sampling designs through a simulation study based on a real data set. The remainder of the paper is organized as follows. In Section 2, the stratified sampling designs are described, and their ratio-type estimators and MSE expressions are given. The simulation results are reported in Section 3. Finally, conclusions are drawn from these results in the last section.

Stratified Simple Random Sampling.
In stratified sampling, the population of $N$ units is first divided into $L$ subpopulations of $N_1, N_2, \ldots, N_L$ units. These subpopulations, also known as strata, do not overlap, and together they make up the whole population, i.e. $N_1 + N_2 + \cdots + N_L = N$. To obtain the full benefit from stratification, the values of $N_h$, $h = 1, 2, \ldots, L$, must be known. After the strata have been determined, a sample is drawn from each stratum. The sample sizes within the strata are denoted by $n_1, n_2, \ldots, n_L$, with $n = \sum_{h=1}^{L} n_h$. If a simple random sample is taken in each stratum, the whole procedure is described as stratified simple random sampling (SSRS). Let $y_{hi}$ and $x_{hi}$ denote the observed values of the variable of interest and the auxiliary variable, respectively, in the $h$th stratum. When the relationship between $X$ and $Y$ is positive, Hansen et al. [4] proposed the combined and separate ratio estimators, $\bar{y}_{RC}$ and $\bar{y}_{RS}$, respectively. Their mean squared errors (MSE), to the first degree of approximation, are expressed in terms of the following quantities. The correction term for the $h$th stratum is $\frac{1}{n_h} - \frac{1}{N_h}$. $C_{xh}$ and $C_{yh}$ are the population coefficients of variation of the auxiliary and study variables in the $h$th stratum, respectively, and $C_{xyh} = \rho\, C_{yh} C_{xh}$, where $\rho$ is the population correlation coefficient between the auxiliary and study variables. The remaining quantities are the population variances of the auxiliary and study variables in the $h$th stratum, the population covariance between the auxiliary variable and the variable of interest in stratum $h$, and the population ratio $R = \bar{Y}/\bar{X}$.
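For reference, the textbook forms of the combined and separate ratio estimators under SSRS, together with their first-order MSEs, can be sketched as follows. This is written in assumed notation and is not a transcription of the paper's own displays: $W_h = N_h/N$ (stratum weight), $\theta_h$ (a label for the correction term $1/n_h - 1/N_h$), $S_{yh}^2$, $S_{xh}^2$ and $S_{xyh}$ (stratum variances and covariance), and $R_h = \bar{Y}_h/\bar{X}_h$ (stratum ratio) are symbols introduced here for illustration.

```latex
% Assumed notation: W_h = N_h / N, theta_h = 1/n_h - 1/N_h,
% S_{yh}^2, S_{xh}^2, S_{xyh} = stratum variances and covariance,
% R = \bar{Y}/\bar{X}, R_h = \bar{Y}_h/\bar{X}_h.
\begin{align*}
\bar{y}_{RC} &= \frac{\bar{y}_{st}}{\bar{x}_{st}}\,\bar{X},
  \qquad \bar{y}_{st} = \sum_{h=1}^{L} W_h \bar{y}_h,
  \quad \bar{x}_{st} = \sum_{h=1}^{L} W_h \bar{x}_h, \\
\bar{y}_{RS} &= \sum_{h=1}^{L} W_h \,\frac{\bar{y}_h}{\bar{x}_h}\,\bar{X}_h, \\
\mathrm{MSE}(\bar{y}_{RC}) &\approx \sum_{h=1}^{L} W_h^2\,\theta_h
  \left( S_{yh}^2 + R^2 S_{xh}^2 - 2 R\, S_{xyh} \right), \\
\mathrm{MSE}(\bar{y}_{RS}) &\approx \sum_{h=1}^{L} W_h^2\,\theta_h
  \left( S_{yh}^2 + R_h^2 S_{xh}^2 - 2 R_h\, S_{xyh} \right).
\end{align*}
```

The only difference between the two MSEs is that the combined estimator uses the overall ratio $R$ while the separate estimator uses the stratum ratios $R_h$.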

Stratified Ranked Set Sampling.
In ranked set sampling, $r$ independent random sets, each of size $r$, are selected from the population, with each unit in a set selected with equal probability and without replacement. The members of each random set are ranked with respect to the study variable or the auxiliary variable. Then the smallest unit is selected from the first ordered set, the second smallest unit from the second ordered set, and so on, until the unit with the largest rank is chosen from the $r$th set. This cycle may be repeated $m$ times, so that $n = mr$ units are measured in total.
In stratified ranked set sampling, for the $h$th stratum of the population, first choose $r_h$ independent samples, each of size $r_h$, $h = 1, 2, \ldots, L$. Rank each sample, and use the RSS scheme to obtain $L$ independent RSS samples of size $r_h$, one from each stratum. Let $r_1 + r_2 + \cdots + r_L = r$. This completes one cycle of the stratified ranked set sample. The cycle may be repeated $m$ times until $n = mr$ elements have been obtained. For ratio estimation, the procedure is modified so that the $r_h$ independent samples of size $r_h$ drawn from the $h$th stratum consist of bivariate elements. Each sample is ranked with respect to one of the variables, $Y$ or $X$, and the RSS scheme is then applied as above to obtain $L$ independent RSS samples of size $r_h$, one from each stratum, which again completes one cycle. The ranking may be carried out on either $X$ or $Y$.
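The selection steps for one cycle can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the authors' simulation code: the function names `rss_cycle` and `srss_cycle` are hypothetical, the $r$ sets are formed from $r^2$ distinct units drawn without replacement, ranking is assumed to be error-free, and the toy strata are made-up values rather than the BMI data.

```python
import random

def rss_cycle(units, r, rank_key, rng):
    """Return one RSS cycle (r measured units) from a list of (y, x) pairs."""
    # Assumption: the r sets are formed from r*r distinct units
    # drawn without replacement.
    drawn = rng.sample(units, r * r)
    sets = [drawn[i * r:(i + 1) * r] for i in range(r)]
    # Rank each set and keep the i-th smallest unit of the i-th set.
    return [sorted(s, key=rank_key)[i] for i, s in enumerate(sets)]

def srss_cycle(strata, set_sizes, rank_key, rng):
    """Apply rss_cycle independently within each stratum."""
    return {h: rss_cycle(units, set_sizes[h], rank_key, rng)
            for h, units in strata.items()}

# Hypothetical toy strata of (y, x) pairs; not the BMI data.
rng = random.Random(0)
strata = {
    1: [(20.0 + 0.1 * i, 30.0 + i) for i in range(40)],
    2: [(22.0 + 0.1 * i, 25.0 + i) for i in range(40)],
}
# Rank on the auxiliary variable x (second component of each pair).
sample = srss_cycle(strata, {1: 4, 2: 5}, rank_key=lambda u: u[1], rng=rng)
print({h: len(s) for h, s in sample.items()})  # prints {1: 4, 2: 5}
```

Passing `rank_key=lambda u: u[0]` instead would rank on the study variable $Y$; repeating the call $m$ times gives the $n = mr$ measured units.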
When the ranking is on the variable $Y$, the SRSS for the $k$th cycle and the $h$th stratum is denoted by $\{(Y_{h(1)k}, X_{h[1]k}), \ldots, (Y_{h(r_h)k}, X_{h[r_h]k}) : k = 1, 2, \ldots, m;\ h = 1, 2, \ldots, L\}$, where $Y_{h(i)k}$ is the $i$th order statistic in the $i$th set for the study variable and $X_{h[i]k}$ is the $i$th judgment ordering in the $i$th set for the auxiliary variable. When the ranking is on $X$ instead, the formulas are the same with the roles exchanged: the ranked variable carries the index $(\,)$ and the other variable carries the index $[\,]$. Samawi and Siam [15] defined the combined and separate ratio estimators of the population mean under SRSS, $\bar{y}_{SS(c)}$ and $\bar{y}_{SS(s)}$, in terms of the unbiased SRSS estimators of the population means $\bar{Y}$ and $\bar{X}$.
The MSEs of $\bar{y}_{SS(c)}$ and $\bar{y}_{SS(s)}$ are obtained to the first degree of approximation. Following Samawi and Siam [15], Mandowara and Mehta [10] suggested modified ratio-type estimators of the population mean $\bar{Y}$ under SRSS for the case where the coefficient of variation $C_{xh}$ and the coefficient of kurtosis $\beta_{2h}(x)$ of the auxiliary variable $X$ in the $h$th stratum are known.

Stratified Double Ranked Set Sampling.
In stratified double ranked set sampling (SDRSS), for the $h$th stratum of the population, first choose $r_h^3$ units at random, $h = 1, 2, \ldots, L$. Arrange these units randomly into $r_h$ sets, each of size $r_h^2$. The RSS procedure is then applied to each set to obtain $r_h$ ranked set samples, each of size $r_h$. The RSS procedure is then applied again to these $r_h$ ranked set samples to obtain a DRSS sample of size $r_h$ in each stratum, giving $L$ independent DRSS samples and $r_1 + r_2 + \cdots + r_L = r$ observations in total. This completes one cycle of SDRSS. The whole process is repeated $m$ times to obtain the desired sample size $n = mr$. Following Samawi and Siam [15], Khan et al. [9] proposed a combined ratio-type estimator of the population mean $\bar{Y}$ under SDRSS, $\bar{y}_{R(StDRSS)SS}$, defined in terms of the unbiased SDRSS estimators of the population means $\bar{Y}$ and $\bar{X}$.
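The two-stage selection described above can be sketched for a single stratum as follows. This is an illustrative sketch only: `rss_from_sets` and `drss_cycle` are hypothetical helper names, the $r_h^3$ units are assumed to be drawn without replacement, ranking is assumed to be error-free, and the stratum values are made up.

```python
import random

def rss_from_sets(sets, rank_key):
    """Keep the i-th smallest unit (by rank_key) from the i-th set."""
    return [sorted(s, key=rank_key)[i] for i, s in enumerate(sets)]

def drss_cycle(units, r, rank_key, rng):
    """Return one DRSS cycle (r measured units) for a single stratum."""
    # Assumption: the r**3 units are drawn without replacement.
    drawn = rng.sample(units, r ** 3)
    # Split the drawn units into r sets of r*r units each.
    big_sets = [drawn[i * r * r:(i + 1) * r * r] for i in range(r)]
    first_stage = []
    for big in big_sets:
        # Each big set holds r small sets of size r; RSS on them gives r units.
        small_sets = [big[j * r:(j + 1) * r] for j in range(r)]
        first_stage.append(rss_from_sets(small_sets, rank_key))
    # Second stage: RSS applied to the r ranked set samples.
    return rss_from_sets(first_stage, rank_key)

# Hypothetical stratum of (y, x) pairs; not the BMI data.
rng = random.Random(1)
stratum = [(18.0 + 0.05 * i, 40.0 + 0.5 * i) for i in range(200)]
drss_sample = drss_cycle(stratum, r=4, rank_key=lambda u: u[1], rng=rng)
print(len(drss_sample))  # prints 4
```

Calling `drss_cycle` once per stratum gives one SDRSS cycle; the stratum must contain at least $r_h^3$ units for the draw without replacement to be possible.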
The MSE of the estimator $\bar{y}_{R(StDRSS)SS}$ is obtained to the first degree of approximation. Khan et al. [9] also suggested efficient classes of ratio-type estimators of the population mean under stratified double ranked set sampling. The MSEs of the estimators $\bar{y}_{R(StDRSS)SD}$, $\bar{y}_{R(StDRSS)KC}$, $\bar{y}_{R(StDRSS)US1}$ and $\bar{y}_{R(StDRSS)US2}$ are obtained to the first degree of approximation, with the estimators indexed by $i = 1, 2, 3, 4$ for $F = SD, KC, US1, US2$.
Stratified Median Ranked Set Sampling.
The MRSS procedure proposed by Muttlak [12] selects $r$ random samples, each of $r$ units, from the population and ranks the units within each sample with respect to a variable of interest. If the set size $r_h$ is odd, the $\left(\frac{r_h + 1}{2}\right)$th smallest ranked unit, i.e. the median of the sample, is selected for measurement from each sample. If $r_h$ is even, the $\left(\frac{r_h}{2}\right)$th smallest ranked unit is selected from each of the first $\frac{r_h}{2}$ samples and the $\left(\frac{r_h}{2} + 1\right)$th smallest ranked unit from each of the second $\frac{r_h}{2}$ samples. The cycle can be repeated $m$ times if needed to obtain a sample of $mr$ units. If MRSS is performed in each stratum instead of the RSS scheme described above, the method is known as stratified median ranked set sampling (SMRSS). Two cases arise, according to whether the set size in each stratum is odd or even; the number of strata itself may be odd or even and is immaterial. Following Ibrahim et al. [5], Khan et al. [8] proposed two efficient classes of ratio-type estimators for the finite population mean under SMRSS using known auxiliary information. The first class is defined in terms of the unbiased SMRSS estimators of the population means $\bar{Y}$ and $\bar{X}$.
Here $a_h$ and $b_h$ are known population parameters of the auxiliary variable, which can be the coefficient of variation, coefficient of skewness, coefficient of kurtosis and coefficient of quartiles, and $k = O, E$ denotes odd and even sample sizes, respectively.
The MSEs of the estimators are given separately for odd and even sample sizes. Khan et al. [8] proposed the estimators $\bar{y}_{R(StMRSSk)p}$, $p = 1, 2, \ldots, 5$, obtained from particular choices of the known population parameters $a_h$ and $b_h$ (coefficient of variation, coefficient of skewness, coefficient of kurtosis and coefficient of quartiles of the auxiliary variable), and their MSEs are likewise given for odd and even sample sizes. Khan et al. [8] proposed another class of ratio-type estimators under SMRSS, $\bar{y}_{R(StMRSSk)G}$, which depends on a scalar quantity $\omega$ and on $q_{[1h]}$ and $q_{[3h]}$, the first and third quartiles of the auxiliary variable in the $h$th stratum, respectively. The MSEs of $\bar{y}_{R(StMRSSk)G}$, up to the first order of approximation, are given for odd and even sample sizes. Particular values of the scalar $\omega$ yield the estimators $\bar{y}_{R(StMRSSk)6}$ and $\bar{y}_{R(StMRSSk)7}$ of Khan et al. [8], whose MSEs for odd and even sample sizes are indexed by $i = 1, 2$ for $G = 6, 7$.
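The median selection rule for odd and even set sizes can be sketched as below. The helper names `mrss_ranks` and `mrss_cycle` are hypothetical, ranks are 1-based as in the text, and the sets are assumed to be formed from $r^2$ distinct units drawn without replacement with error-free ranking.

```python
import random

def mrss_ranks(r):
    """Return the 1-based rank measured in each of the r sets under MRSS."""
    if r % 2 == 1:
        # Odd set size: the median, rank (r + 1) / 2, from every set.
        return [(r + 1) // 2] * r
    half = r // 2
    # Even set size: rank r/2 from the first r/2 sets,
    # rank r/2 + 1 from the remaining r/2 sets.
    return [half] * half + [half + 1] * half

def mrss_cycle(units, r, rank_key, rng):
    """Return one MRSS cycle (r measured units) for a single stratum."""
    # Assumption: the r sets are formed from r*r distinct units.
    drawn = rng.sample(units, r * r)
    sets = [drawn[i * r:(i + 1) * r] for i in range(r)]
    return [sorted(s, key=rank_key)[rank - 1]
            for s, rank in zip(sets, mrss_ranks(r))]

print(mrss_ranks(5))  # prints [3, 3, 3, 3, 3]
print(mrss_ranks(4))  # prints [2, 2, 3, 3]
```

Applying `mrss_cycle` independently in each stratum gives one SMRSS cycle.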

Simulation Study
In this section, a simulation study is conducted to investigate the performance of ratio-type estimators of the population mean under SSRS, SRSS, SDRSS and SMRSS. To assess the estimators, we use real data on 800 people in Turkey in 2014, with body mass index (BMI) as the study variable and age and weight as auxiliary variables. We investigated the effect of the correlation between the study variable $Y$ and the auxiliary variable $X$ for odd and even sample sizes, and we considered ranking on both $Y$ and $X$. The simulation study was performed first using BMI with age and then using BMI with weight. The correlation coefficient of BMI with age was 0.60 and that with weight was 0.86, so the sampling methods were compared under different correlations. In both cases, 10000 samples of size $r_h = 4, 5, 6, 7$ were selected from the $N = 800$ units using the SSRS, SRSS, SDRSS and SMRSS methods. The data set was stratified by gender ($h = 1, 2$). Estimators are compared in terms of mean squared error (MSE) and percent relative efficiency (PRE), both computed from the simulated samples. The PRE values of the mean estimators under the other sampling methods were calculated relative to the classical mean estimator under SSRS. Table 1 gives a statistical summary of the population for the BMI, age and weight variables, and Table 2 gives the corresponding summary by stratum. Simulation results with age as the auxiliary variable are given in Table 3 and Table 4, and results with weight as the auxiliary variable are given in Table 5 and Table 6.
Since the efficiency of the estimators depends on whether the sample size is odd or even, sample sizes of 4, 5, 6 and 7 are considered in the simulation study, and the results for all of them are given in every table. For body mass index ($Y$) and age ($X_1$), the MSE and PRE values of the estimators under SSRS, SDRSS, SRSS and SMRSS, calculated using the stratified population information, are given in Table 3 when ranking on $X$ and in Table 4 when ranking on $Y$. For body mass index ($Y$) and weight ($X_2$), the corresponding values are given in Table 5 when ranking on $X$ and in Table 6 when ranking on $Y$.
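The two comparison criteria can be sketched as follows. The definitions used here are the usual Monte Carlo ones and are an assumption about the exact expressions: the MSE is the average squared deviation of the simulated estimates from the true population mean, and the PRE is 100 times the ratio of the reference MSE (the classical mean estimator under SSRS) to the MSE of the estimator being assessed. The function names and the numbers in the usage lines are hypothetical.

```python
def mse(estimates, true_mean):
    """Average squared deviation of simulated estimates from the true mean."""
    return sum((e - true_mean) ** 2 for e in estimates) / len(estimates)

def pre(mse_reference, mse_estimator):
    """Percent relative efficiency with respect to a reference estimator."""
    return 100.0 * mse_reference / mse_estimator

# Hypothetical simulated estimates; not results from the paper.
true_mean = 25.0
ssrs_mean_estimates = [24.6, 25.5, 25.2, 24.7]
other_estimates = [24.9, 25.1, 25.0, 24.8]

mse_ref = mse(ssrs_mean_estimates, true_mean)
mse_new = mse(other_estimates, true_mean)
print(round(mse_new, 4), round(pre(mse_ref, mse_new), 1))
```

A PRE above 100 indicates that the estimator has a smaller MSE than the reference estimator.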
From Table 3, it can be seen that, for both odd and even sample sizes and ranking on age, the lowest MSE and the highest PRE were obtained with the $\bar{y}_{(StMRSSk)7}$ estimator proposed by Khan et al. [8].
From Table 4, it can be seen that, for both odd and even sample sizes and ranking on BMI, the lowest MSE and the highest PRE were obtained with the $\bar{y}_{(StMRSSk)7}$ estimator proposed by Khan et al. [8].
From Table 5, it can be seen that, for both odd and even sample sizes and ranking on weight, the lowest MSE and the highest PRE were obtained with the $\bar{y}_{(StMRSSk)6}$ estimator proposed by Khan et al. [8].
From Table 6, it can be seen that, for both odd and even sample sizes and ranking on BMI, the lowest MSE and the highest PRE were obtained with the $\bar{y}_{(StMRSSk)7}$ estimator proposed by Khan et al. [8].
Comparing the rankings on $X$ and $Y$, the PRE was highest and the MSE was lowest when ranking on $Y$, and the weight auxiliary variable gave the better results when ranking on $Y$.
Table 1. Population Information about Body Mass Index ($Y$), Age ($X_1$) and Weight ($X_2$) Variables
Table 2. Population Stratified Information about Body Mass Index ($Y$), Age ($X_1$) and Weight ($X_2$) Variables
Table 3. MSE and PRE values of estimators for even and odd sample sizes when ranking on variable Age ($X_1$)
Table 5. MSE and PRE values of estimators for even and odd sample sizes when ranking on variable Weight ($X_2$)

Conclusion
In this article, we compared the performance of population mean estimators under various stratified sampling methods from the literature: SSRS, SRSS, SDRSS and SMRSS. Under these methods, we examined ratio-type estimators that use auxiliary variable information such as the coefficient of variation and the coefficient of kurtosis. First, the methods and estimators were introduced. Then the MSE and PRE values of the estimators were obtained in a numerical example based on real data, and the simulation results were interpreted. The simulation study was performed first using BMI with age and then using BMI with weight, for odd and even sample sizes and for ranking on both $X$ and $Y$, so that the same sampling methods could be compared under different correlations. According to the simulation results, the best sampling method was found to be SMRSS.