THE INCLUSION PROBABILITIES OF MEDIAN RANKED SET SAMPLING UNDER DIFFERENT SELECTION PROCEDURES

YAPRAK ARZU OZDEMIR AND FIKRI GOKPINAR

In this paper, we developed generalized formulas to compute the inclusion probabilities of a median ranked set sample in a finite population setting under the Level 0 and Level 2 sampling procedures given by Deshpande et al. (2006). We also compared the inclusion probabilities of these sampling procedures with those of the Level 1 procedure given by Ozdemir and Gokpinar (2008) under different population and sample sizes.

Received by the editors Dec. 12, 2009, Accepted: Feb. 26, 2010.
2000 Mathematics Subject Classification. Primary 62D05; Secondary 65C60.
Key words and phrases. Finite Population, Inclusion Probability, Median Ranked Set Sampling.
© 2010 Ankara University


1. Introduction
McIntyre [4] introduced a sampling design called Ranked Set Sampling (RSS), which is more efficient than Simple Random Sampling (SRS) for estimating the population mean. RSS is preferred in fields such as environmental science, ecology, agriculture and medicine, in which measuring the sampling units in terms of the variable of interest is difficult or expensive in terms of cost, time and other factors.
In order to increase the efficiency of RSS, modified RSS designs have been suggested for different distribution types. Some of the modified RSS designs are Median RSS (MRSS), Extreme RSS (ERSS) and Multistage RSS (MSRSS) ([5], [8], [1]). MRSS is used to reduce the errors in ranking and to increase the efficiency of the estimator for symmetric unimodal distributions, such as the normal distribution. These studies were based on the assumption of an infinite population setting. In recent years, RSS has also been investigated under a finite population setting. Takahasi and Futatsuya ([9], [10]) were the first authors to give finite population theory in RSS. Al-Saleh and Samawi [2] gave an adjusted selection procedure for RSS. Also, Deshpande et al. [3] described three different sample selection procedures in RSS, called Level 0, Level 1 and Level 2, to construct nonparametric confidence intervals for the quantiles of a finite population. The Level 1 sampling procedure is equivalent to Al-Saleh and Samawi's [2] adjusted selection procedure.
In SRS, all units in a population have the same inclusion probability. However, in RSS and its modifications, some units in the population may have different inclusion probabilities. In sampling theory, the inclusion probabilities give an insight into how the RSS designs control the inclusion of the units in the sample. Moreover, the unequal inclusion probabilities must be taken into account in order to obtain reasonable estimates of population parameters. Estimators like the Horvitz-Thompson (HT) estimator, which is unbiased for the population mean and total for any sampling design, often depend on the inclusion probability of each unit in the sample. The HT estimator of the population total $T$ is defined as

$$\hat{T}_{HT} = \sum_{i} \frac{u_i}{\pi_i},$$

where the sum runs over the units in the sample and $\pi_i$ is the inclusion probability of the unit $u_i$ in the sample. This estimator is often feasible even for very complex sampling designs. In order to calculate the HT estimator in RSS, the inclusion probability of each unit in the population and its rank with respect to the variable of interest must be known. The rank of each unit in the population can be determined by using a concomitant variable which is highly correlated with the variable of interest. However, calculating the inclusion probabilities in RSS is hard and complex. Al-Saleh and Samawi [2] obtained the inclusion probabilities with respect to their adjusted selection procedure in a finite population setting only when the sample size was 2 or 3.
Using the same selection procedure, Ozdemir and Gokpinar [6] derived a generalized formula for computing the inclusion probabilities in RSS for any sample and population size. Ozdemir and Gokpinar [7] developed a new formula to calculate the inclusion probabilities of population units based on the MRSS design for any sample and population size under the Level 1 sampling procedure. However, in practical use it may be necessary to know the inclusion probabilities for the Level 0, Level 1 and Level 2 sampling procedures. In this study, we developed generalized formulas for the inclusion probabilities under the Level 0 and Level 2 sampling procedures and shortened the formulas for the inclusion probabilities under the Level 1 sampling procedure given by Ozdemir and Gokpinar [7]. We also compared the Level 0, Level 1 and Level 2 sampling procedures with the SRS procedure.
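As a small illustration of how the inclusion probabilities enter the HT estimator described above, the following Python sketch evaluates the sum of $u_i/\pi_i$ over the sampled units. The function name `horvitz_thompson_total` and the toy numbers are assumptions made for this example only; they do not come from the paper.

```python
from typing import Sequence


def horvitz_thompson_total(values: Sequence[float],
                           incl_probs: Sequence[float]) -> float:
    """Horvitz-Thompson estimate of the population total.

    values     -- measurements u_i of the sampled units
    incl_probs -- inclusion probabilities pi_i of the same units
    """
    if len(values) != len(incl_probs):
        raise ValueError("values and incl_probs must have the same length")
    if any(not 0 < p <= 1 for p in incl_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    # Each sampled unit is weighted by the inverse of its inclusion probability.
    return sum(u / p for u, p in zip(values, incl_probs))


# Toy example (hypothetical data): SRS of 2 units from a population of 4,
# so every unit has inclusion probability 2/4 = 0.5.
estimate = horvitz_thompson_total([10.0, 30.0], [0.5, 0.5])
print(estimate)
```

Under an MRSS design the list `incl_probs` would hold the unequal probabilities $\pi_N(k)$ of the sampled units instead of a constant.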
The paper is organized as follows: In Section 2, we give generalized formulas for computing the inclusion probabilities in MRSS under the Level 0, Level 1 and Level 2 sampling procedures. In Section 3, the inclusion probabilities under these sampling procedures based on MRSS and SRS are compared for different sample and population sizes. Section 4 draws the main conclusions.

2. Inclusion Probabilities of the Units
Let $u_1 < u_2 < \cdots < u_N$ be the distinct ordered population units. Suppose that $n = mr$ units are to be chosen from this population with MRSS, where $m$ and $r$ are the set and cycle sizes, respectively. We use a modified version of the "Level 0 sampling", "Level 1 sampling" and "Level 2 sampling" procedures given in Deshpande et al. [3].
In order to calculate the inclusion probabilities based on the MRSS design for any set size $m$, cycle size $r$ and population size $N$, the required definitions are given as follows:
$A_{c,j}$: the event of choosing $u_k$ in the $c$th cycle and $j$th selection,
$y_{c,j}$: the unit selected in the $c$th cycle and $j$th selection.
Using these definitions, the inclusion probability of the $k$th unit $u_k$, denoted $\pi_N(k)$ for $k = 1, 2, \ldots, N$, is expressed in terms of two quantities: $\pi_N^{(c)}(k)$, the inclusion probability of $u_k$ in the $c$th cycle ($c = 1, 2, \ldots, r$), and $\pi_N^{(c,j)}(k)$, the inclusion probability of $u_k$ in the $j$th selection ($j = 1, 2, \ldots, m$) of the $c$th cycle.
2.1. Inclusion Probabilities under the Level 0 Sampling Procedure. In the Level 0 sampling procedure, the units of each set are selected without replacement, but all of them, including the measured unit, are returned to the population before the next selection. Consequently, a unit can be observed more than once in the final sample. The algorithm of the Level 0 sampling procedure is given below.

Case 1 (Odd set size $m$): In the $j$th selection,
1. A simple random sample of size $m$ is selected without replacement from the population.
2. The sampled units are ranked with respect to the variable of interest and the $\frac{m+1}{2}$th order statistic is selected for measurement.
3. All $m$ units are returned to the population.
4. Steps 1-3 are repeated for $j = 1, 2, \ldots, m$ to obtain a sample of size $m$.

Case 2 (Even set size $m$): In the $j$th selection,
1. A simple random sample of size $m$ is selected without replacement from the population.
2. The sampled units are ranked with respect to the variable of interest. The $\frac{m}{2}$th order statistic is selected for measurement when $j \le \frac{m}{2}$, and the $\frac{m+2}{2}$th order statistic is selected when $j > \frac{m}{2}$.
3. All $m$ units are returned to the population.
4. Steps 1-3 are repeated for $j = 1, 2, \ldots, m$ to obtain a sample of size $m$.

The entire cycle in both cases may be repeated, if necessary, $r$ times to produce a median ranked set sample of size $n = mr$.
In this sampling procedure, since the units selected for ranking and measurement are all returned to the population, the probabilities of $A_{c,j}$ do not depend on previous selections and can be written as follows.

Case 1:
$$P(A_{c,j}) = P(A) = \frac{\binom{k-1}{\frac{m-1}{2}}\binom{N-k}{\frac{m-1}{2}}}{\binom{N}{m}}, \qquad c = 1, \ldots, r;\ j = 1, \ldots, m.$$

Case 2:
$$P(A_{c,j}) = \begin{cases} P(A_1) = \dfrac{\binom{k-1}{\frac{m}{2}-1}\binom{N-k}{\frac{m}{2}}}{\binom{N}{m}}, & j \le \frac{m}{2}, \\[2ex] P(A_2) = \dfrac{\binom{k-1}{\frac{m}{2}}\binom{N-k}{\frac{m}{2}-1}}{\binom{N}{m}}, & j > \frac{m}{2}, \end{cases} \qquad c = 1, \ldots, r;\ j = 1, \ldots, m.$$

In Case 1, $u_k$ is not selected in a given selection with probability $1 - P(A)$. In Case 2, $u_k$ is not selected with probability $1 - P(A_1)$ when $j \le \frac{m}{2}$ or $1 - P(A_2)$ when $j > \frac{m}{2}$. Hence the probability that $u_k$ is not selected in any of the $m$ selections of the $r$ cycles is $(1 - P(A))^{rm}$ for Case 1 and $(1 - P(A_1))^{rm/2}(1 - P(A_2))^{rm/2}$ for Case 2. Therefore $1 - (1 - P(A))^{rm}$ and $1 - (1 - P(A_1))^{rm/2}(1 - P(A_2))^{rm/2}$ are the probabilities of choosing $u_k$ at least once over all $r$ cycles and $m$ selections. So $\pi_N(k)$ can be written as

$$\pi_N(k) = \begin{cases} 1 - (1 - P(A))^{rm}, & m \text{ odd}, \\ 1 - (1 - P(A_1))^{rm/2}(1 - P(A_2))^{rm/2}, & m \text{ even}. \end{cases}$$

2.2. Inclusion Probabilities under the Level 1 Sampling Procedure. In the Level 1 sampling procedure, the units of each set are selected without replacement and all of them except the measured one are returned to the population. Consequently, a unit cannot be observed more than once in the final sample, but it can be used again for ranking. The algorithm of the Level 1 sampling procedure is given below.

Case 1 (Odd set size $m$): In the $j$th selection,
1. A simple random sample of size $m$ is selected without replacement from the population.
2. The sampled units are ranked with respect to the variable of interest and the $\frac{m+1}{2}$th order statistic is selected for measurement.
3. The other $m - 1$ units are returned to the population.
4. Steps 1-3 are repeated for $j = 1, 2, \ldots, m$ to obtain a sample of size $m$.

Case 2 (Even set size $m$): In the $j$th selection,
1. A simple random sample of size $m$ is selected without replacement from the population.
2. The sampled units are ranked with respect to the variable of interest. The $\frac{m}{2}$th order statistic is selected for measurement when $j \le \frac{m}{2}$, and the $\frac{m+2}{2}$th order statistic is selected when $j > \frac{m}{2}$.
3. The other $m - 1$ units are returned to the population.
4. Steps 1-3 are repeated for $j = 1, 2, \ldots, m$ to obtain a sample of size $m$.

The entire cycle in both cases may be repeated, if necessary, $r$ times to produce a median ranked set sample of size $n = mr$.
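To obtain the inclusion probabilities under the Level 1 sampling procedure, some further definitions are required. In this procedure, the units used only for ranking are returned to the population but the measured unit is not, so the probabilities of $A_{c,j}$ depend on previous selections. For this reason, we must know how many of the previously measured units are greater or smaller than $u_k$. Let $l_{c,j}$ indicate whether the unit selected in the $c$th cycle and $j$th selection is greater or smaller than $u_k$, as follows:

$B^{l_{c,j}}_{c,j} = B^{0}_{c,j}$: the event $\{y_{c,j} > u_k\}$ in the $c$th cycle and $j$th selection,
$B^{l_{c,j}}_{c,j} = B^{1}_{c,j}$: the event $\{y_{c,j} < u_k\}$ in the $c$th cycle and $j$th selection.

The number of units smaller than $u_k$ chosen in the selections previous to the $j$th selection in the $c$th cycle is defined by

$$a_1 = \sum_{c'=1}^{c-1}\sum_{j'=1}^{m} l_{c',j'} + \sum_{j'=1}^{j-1} l_{c,j'},$$

and the number of units greater than $u_k$ chosen in the selections previous to the $j$th selection in the $c$th cycle is defined by

$$b_1 = (c-1)m + j - 1 - a_1.$$

Thus,

$$\pi_N^{(c,j)}(k) = \sum P\left(A_{c,j} \cap B^{l_{c,j-1}}_{c,j-1} \cap B^{l_{c,j-2}}_{c,j-2} \cap \cdots \cap B^{l_{1,1}}_{1,1}\right),$$

where the summation includes all $2^{(c-1)m+j-1}$ possible permutations of $(l_{c,j-1}, \ldots, l_{c,1}, \ldots, l_{1,m}, \ldots, l_{1,1})$. By the multiplication rule, each term is the product of the conditional probabilities given below.

Case 1: When the set size $m$ is odd, in cycle $c$ before the $j$th selection, $(c-1)m + j - 1 = a_1 + b_1$ units have already been measured and removed, so the number of remaining units in the population is $N - (a_1 + b_1)$. These remaining units contain $k - a_1 - 1$ units smaller than $u_k$ and $N - b_1 - k$ units greater than $u_k$. In the $j$th selection, we first consider the probability of selecting a unit $y_{c,j}$ greater than $u_k$ ($l_{c,j} = 0$) given that $u_k$ has not been selected prior to cycle $c$ and selection $j$. The selected median exceeds $u_k$ exactly when at most $\frac{m-1}{2}$ of the $m$ sampled units are smaller than or equal to $u_k$, so this probability can be computed from

$$P\left(B^{0}_{c,j} \mid B^{l_{c,j-1}}_{c,j-1} \cap \cdots \cap B^{l_{1,1}}_{1,1}\right) = \sum_{i=0}^{\frac{m-1}{2}} \frac{\binom{k-a_1}{i}\binom{N-b_1-k}{m-i}}{\binom{N-(a_1+b_1)}{m}}.$$

When $l_{c,j} = 1$, the probability of choosing a unit smaller than $u_k$ in the $c$th cycle and $j$th selection, given that $u_k$ is not included in the previous selections, is

$$P\left(B^{1}_{c,j} \mid B^{l_{c,j-1}}_{c,j-1} \cap \cdots \cap B^{l_{1,1}}_{1,1}\right) = \sum_{i=\frac{m+1}{2}}^{m} \frac{\binom{k-a_1-1}{i}\binom{N-b_1-k+1}{m-i}}{\binom{N-(a_1+b_1)}{m}}.$$

Finally, for choosing $u_k$ in the $c$th cycle and $j$th selection, the sample must contain $u_k$ together with $\frac{m-1}{2}$ units smaller than $u_k$ and $\frac{m-1}{2}$ units greater than $u_k$. Thus, the probability of choosing $u_k$ in the $c$th cycle and $j$th selection is given by

$$P\left(A_{c,j} \mid B^{l_{c,j-1}}_{c,j-1} \cap \cdots \cap B^{l_{1,1}}_{1,1}\right) = \frac{\binom{k-a_1-1}{\frac{m-1}{2}}\binom{N-b_1-k}{\frac{m-1}{2}}}{\binom{N-(a_1+b_1)}{m}}.$$

Case 2: When the set size $m$ is even, the probabilities $P(B^{0}_{c,j} \mid \cdot)$ and $P(B^{1}_{c,j} \mid \cdot)$ can be obtained in a fashion similar to Case 1. However, these probabilities depend on whether $j \le \frac{m}{2}$ or not. The probability of selecting a unit greater than $u_k$ is

$$P\left(B^{0}_{c,j} \mid B^{l_{c,j-1}}_{c,j-1} \cap \cdots \cap B^{l_{1,1}}_{1,1}\right) = \begin{cases} \displaystyle\sum_{i=0}^{\frac{m}{2}-1} \frac{\binom{k-a_1}{i}\binom{N-b_1-k}{m-i}}{\binom{N-(a_1+b_1)}{m}}, & j \le \frac{m}{2}, \\[3ex] \displaystyle\sum_{i=0}^{\frac{m}{2}} \frac{\binom{k-a_1}{i}\binom{N-b_1-k}{m-i}}{\binom{N-(a_1+b_1)}{m}}, & j > \frac{m}{2}. \end{cases}$$

When $l_{c,j} = 1$, the probability of selecting a unit smaller than $u_k$ is

$$P\left(B^{1}_{c,j} \mid B^{l_{c,j-1}}_{c,j-1} \cap \cdots \cap B^{l_{1,1}}_{1,1}\right) = \begin{cases} \displaystyle\sum_{i=\frac{m}{2}}^{m} \frac{\binom{k-a_1-1}{i}\binom{N-b_1-k+1}{m-i}}{\binom{N-(a_1+b_1)}{m}}, & j \le \frac{m}{2}, \\[3ex] \displaystyle\sum_{i=\frac{m}{2}+1}^{m} \frac{\binom{k-a_1-1}{i}\binom{N-b_1-k+1}{m-i}}{\binom{N-(a_1+b_1)}{m}}, & j > \frac{m}{2}. \end{cases}$$

Finally, for choosing $u_k$ in the $c$th cycle and $j$th selection, when $j \le \frac{m}{2}$ the sample must have $\frac{m}{2} - 1$ units smaller than $u_k$ and $\frac{m}{2}$ units greater than $u_k$. Moreover, when $j > \frac{m}{2}$, the sample must have $\frac{m}{2}$ units smaller than $u_k$ and $\frac{m}{2} - 1$ units greater than $u_k$. As a result, the probability of choosing $u_k$ in the $c$th cycle and $j$th selection is given by

$$P\left(A_{c,j} \mid B^{l_{c,j-1}}_{c,j-1} \cap \cdots \cap B^{l_{1,1}}_{1,1}\right) = \begin{cases} \dfrac{\binom{k-a_1-1}{\frac{m}{2}-1}\binom{N-b_1-k}{\frac{m}{2}}}{\binom{N-(a_1+b_1)}{m}}, & j \le \frac{m}{2}, \\[3ex] \dfrac{\binom{k-a_1-1}{\frac{m}{2}}\binom{N-b_1-k}{\frac{m}{2}-1}}{\binom{N-(a_1+b_1)}{m}}, & j > \frac{m}{2}. \end{cases}$$

Using these formulas, the inclusion probabilities for all the units in the population can be derived easily.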
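The Level 1 calculation can be organised as a recursion rather than an explicit sum over all $2^{(c-1)m+j-1}$ permutations, because the conditional probabilities depend on the history only through the counts $(a_1, b_1)$. The Python sketch below follows that idea under two stated assumptions: ranking is perfect, and, since a measured unit is never returned, the events $A_{c,j}$ are mutually exclusive, so their probabilities are summed over all selections to give $\pi_N(k)$. The names `median_rank` and `level1_inclusion_prob` are introduced here for illustration and are not taken from the paper.

```python
from fractions import Fraction
from math import comb


def median_rank(m: int, j: int) -> int:
    """Rank of the order statistic measured in selection j (1-based)."""
    if m % 2 == 1:
        return (m + 1) // 2
    return m // 2 if j <= m // 2 else m // 2 + 1


def level1_inclusion_prob(k: int, N: int, m: int, r: int = 1) -> Fraction:
    """Inclusion probability of u_k under Level 1 MRSS (exact arithmetic)."""
    if N < m + r * m - 1:
        raise ValueError("population too small for Level 1 sampling")
    # states maps (a1, b1) -> probability of that history with u_k unmeasured
    states = {(0, 0): Fraction(1)}
    total = Fraction(0)
    for _cycle in range(r):
        for j in range(1, m + 1):
            q = median_rank(m, j)
            nxt = {}
            for (a, b), w in states.items():
                s, g = k - 1 - a, N - k - b        # units left below / above u_k
                denom = comb(s + g + 1, m)         # = C(N - (a1 + b1), m)
                # u_k is measured: q-1 smaller and m-q greater units join it
                p_a = Fraction(comb(s, q - 1) * comb(g, m - q), denom)
                # measured unit is smaller than u_k (l = 1)
                p_b1 = Fraction(sum(comb(s, i) * comb(g + 1, m - i)
                                    for i in range(q, m + 1)), denom)
                # measured unit is greater than u_k (l = 0)
                p_b0 = Fraction(sum(comb(s + 1, i) * comb(g, m - i)
                                    for i in range(q)), denom)
                total += w * p_a
                for key, p in (((a + 1, b), p_b1), ((a, b + 1), p_b0)):
                    if p:                          # drop impossible histories
                        nxt[key] = nxt.get(key, Fraction(0)) + w * p
            states = nxt
    return total


probs = [level1_inclusion_prob(k, 5, 3) for k in range(1, 6)]
print([str(p) for p in probs])
```

For $N = 5$, $m = 3$ and $r = 1$ the result is $(0, 1, 1, 1, 0)$: only $u_2$, $u_3$ and $u_4$ can ever be a set median, and three distinct units must be measured, so each of them is included with certainty. More generally, the probabilities sum to $mr$ over the population, which gives a quick check of any implementation.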

2.3. Inclusion Probabilities under the Level 2 Sampling Procedure. In the Level 2 sampling procedure, the units of each set are selected without replacement and none of them is returned to the population. Consequently, a unit cannot be observed more than once in the final sample and cannot be used again for ranking. The algorithm of the Level 2 sampling procedure is given below.

Case 1 (Odd set size $m$): In the $j$th selection,
1. A simple random sample of size $m$ is selected without replacement from the population.
2. The sampled units are ranked with respect to the variable of interest and the $\frac{m+1}{2}$th order statistic is selected for measurement.
3. No units are returned to the population.
4. Steps 1-3 are repeated for $j = 1, 2, \ldots, m$ to obtain a sample of size $m$.

Case 2 (Even set size $m$): In the $j$th selection,
1. A simple random sample of size $m$ is selected without replacement from the population.
2. The sampled units are ranked with respect to the variable of interest. The $\frac{m}{2}$th order statistic is selected for measurement when $j \le \frac{m}{2}$, and the $\frac{m+2}{2}$th order statistic is selected when $j > \frac{m}{2}$.
3. No units are returned to the population.
4. Steps 1-3 are repeated for $j = 1, 2, \ldots, m$ to obtain a sample of size $m$.

The entire cycle in both cases may be repeated, if necessary, $r$ times to produce a median ranked set sample of size $n = mr$.
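The three procedures differ only in which units go back to the population after each selection. The following Python sketch makes that difference explicit. It assumes perfect ranking (the population values themselves are used as the ranking variable) and distinct values; the function `mrss_sample` and its `level` argument are names chosen for this illustration, not part of the paper.

```python
import random


def mrss_sample(population, m, r=1, level=0, rng=None):
    """Draw a median ranked set sample of size m*r under Level 0, 1 or 2."""
    if level not in (0, 1, 2):
        raise ValueError("level must be 0, 1 or 2")
    rng = rng or random.Random()
    pool = list(population)          # units currently available for selection
    measured = []
    for _cycle in range(r):
        for j in range(1, m + 1):
            # Step 1: simple random sample of size m without replacement.
            drawn = sorted(rng.sample(pool, m))
            # Step 2: pick the median order statistic for this selection.
            if m % 2 == 1:
                rank = (m + 1) // 2
            else:
                rank = m // 2 if j <= m // 2 else m // 2 + 1
            y = drawn[rank - 1]
            measured.append(y)
            # Step 3: decide which units are returned to the population.
            if level == 1:           # only the measured unit is withheld
                pool.remove(y)
            elif level == 2:         # no drawn unit is returned
                for u in drawn:
                    pool.remove(u)
            # level == 0: every drawn unit is returned, pool is unchanged
    return measured


sample_l0 = mrss_sample(range(1, 21), 3, level=0, rng=random.Random(0))
sample_l1 = mrss_sample(range(1, 21), 3, level=1, rng=random.Random(0))
sample_l2 = mrss_sample(range(1, 21), 3, level=2, rng=random.Random(0))
```

Only the Level 0 sample can contain the same unit twice; Level 2 additionally requires a population of at least $rm^2$ units, since every drawn set is discarded.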
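To obtain the inclusion probabilities under the Level 2 sampling procedure, some definitions are required. Let $t_{c,j}$ denote the number of units smaller than $u_k$ among the $m$ units drawn in the $c$th cycle and $j$th selection, with $t_{c,j} = 0, 1, 2, \ldots, m$. Then $B^{t_{c,j}}_{c,j}$ is the event that $t_{c,j}$ units smaller than $u_k$ are chosen in the $c$th cycle and $j$th selection, the remaining $m - t_{c,j}$ drawn units being greater than $u_k$.

The number of units smaller than $u_k$ chosen in the selections previous to the $j$th selection in the $c$th cycle is defined by

$$a_2 = \sum_{c'=1}^{c-1}\sum_{j'=1}^{m} t_{c',j'} + \sum_{j'=1}^{j-1} t_{c,j'},$$

and the number of units greater than $u_k$ chosen in the selections previous to the $j$th selection in the $c$th cycle is defined by

$$b_2 = m\left[(c-1)m + j - 1\right] - a_2.$$

Thus,

$$\pi_N^{(c,j)}(k) = \sum P\left(A_{c,j} \cap B^{t_{c,j-1}}_{c,j-1} \cap B^{t_{c,j-2}}_{c,j-2} \cap \cdots \cap B^{t_{1,1}}_{1,1}\right),$$

where the summation includes all $(m+1)^{(c-1)m+j-1}$ possible permutations of $(t_{c,j-1}, \ldots, t_{c,1}, t_{c-1,m}, \ldots, t_{c-1,1}, \ldots, t_{1,m}, \ldots, t_{1,1})$.

The conditional probability for the Level 2 sampling procedure, $P(B^{t_{c,j}}_{c,j} \mid B^{t_{c,j-1}}_{c,j-1} \cap B^{t_{c,j-2}}_{c,j-2} \cap \cdots \cap B^{t_{1,1}}_{1,1})$, is different from the conditional probabilities calculated for the Level 1 sampling procedure and is independent of Cases 1 and 2, as given below:

$$P\left(B^{t_{c,j}}_{c,j} \mid B^{t_{c,j-1}}_{c,j-1} \cap \cdots \cap B^{t_{1,1}}_{1,1}\right) = \frac{\binom{k-1-a_2}{t_{c,j}}\binom{N-k-b_2}{m-t_{c,j}}}{\binom{N-(a_2+b_2)}{m}}.$$

However, the probability $P(A_{c,j} \mid B^{t_{c,j-1}}_{c,j-1} \cap B^{t_{c,j-2}}_{c,j-2} \cap \cdots \cap B^{t_{1,1}}_{1,1})$ depends on these cases, so it is given separately for Cases 1 and 2 as follows.

Case 1:
$$P\left(A_{c,j} \mid B^{t_{c,j-1}}_{c,j-1} \cap \cdots \cap B^{t_{1,1}}_{1,1}\right) = \frac{\binom{k-1-a_2}{\frac{m-1}{2}}\binom{N-k-b_2}{\frac{m-1}{2}}}{\binom{N-(a_2+b_2)}{m}}.$$

Case 2:
$$P\left(A_{c,j} \mid B^{t_{c,j-1}}_{c,j-1} \cap \cdots \cap B^{t_{1,1}}_{1,1}\right) = \begin{cases} \dfrac{\binom{k-1-a_2}{\frac{m}{2}-1}\binom{N-k-b_2}{\frac{m}{2}}}{\binom{N-(a_2+b_2)}{m}}, & j \le \frac{m}{2}, \\[3ex] \dfrac{\binom{k-1-a_2}{\frac{m}{2}}\binom{N-k-b_2}{\frac{m}{2}-1}}{\binom{N-(a_2+b_2)}{m}}, & j > \frac{m}{2}. \end{cases}$$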
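As with Level 1, the Level 2 sum over $(m+1)^{(c-1)m+j-1}$ permutations can be evaluated by a recursion, here on the single count of units smaller than $u_k$ already removed; the count of removed larger units follows because every earlier selection removed exactly $m$ units. The Python sketch below is one way to do this under the assumptions of perfect ranking and $N \ge rm^2$. The identifiers `median_rank` and `level2_inclusion_prob` are hypothetical names used for this example, and the per-selection probabilities are summed because a unit can be measured at most once.

```python
from fractions import Fraction
from math import comb


def median_rank(m: int, j: int) -> int:
    """Rank of the order statistic measured in selection j (1-based)."""
    if m % 2 == 1:
        return (m + 1) // 2
    return m // 2 if j <= m // 2 else m // 2 + 1


def level2_inclusion_prob(k: int, N: int, m: int, r: int = 1) -> Fraction:
    """Inclusion probability of u_k under Level 2 MRSS (exact arithmetic)."""
    if N < r * m * m:
        raise ValueError("Level 2 sampling needs at least r*m*m units")
    # states maps a (smaller units removed so far) -> probability that this
    # history occurred and u_k is still in the population
    states = {0: Fraction(1)}
    total = Fraction(0)
    done = 0                                   # selections completed so far
    for _cycle in range(r):
        for j in range(1, m + 1):
            q = median_rank(m, j)
            nxt = {}
            for a, w in states.items():
                b = m * done - a               # larger units removed so far
                s, g = k - 1 - a, N - k - b    # units left below / above u_k
                denom = comb(s + g + 1, m)     # = C(N - (a + b), m)
                # u_k is drawn and is the measured median of its set
                total += w * Fraction(comb(s, q - 1) * comb(g, m - q), denom)
                # u_k is not drawn: t smaller and m - t larger units removed
                for t in range(m + 1):
                    ways = comb(s, t) * comb(g, m - t)
                    if ways:
                        nxt[a + t] = (nxt.get(a + t, Fraction(0))
                                      + w * Fraction(ways, denom))
            states = nxt
            done += 1
    return total


probs = [level2_inclusion_prob(k, 20, 3) for k in range(1, 21)]
print([round(float(p), 4) for p in probs])
```

Histories in which $u_k$ is drawn but not measured are simply dropped from `states`, since under Level 2 such a unit can no longer enter the sample.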

3. Comparison of the Sampling Procedures
In this section, we investigate the effects of the sample size $m$ and population size $N$ on the inclusion probabilities of the elements in the population for the Level 0, Level 1 and Level 2 sampling procedures. The inclusion probabilities are calculated using MATLAB 7.0. The calculated inclusion probabilities for the Level 0, Level 1 and Level 2 sampling procedures are compared with the inclusion probabilities derived from SRS with the same sample and population sizes. It is well known that the inclusion probability of every element in the population is $m/N$ for SRS. For this comparison, Figures 1-6 are constructed for $N = 20, 50, 500$ and sample size $m = 3, 4$.

As shown in Figures 1 to 6, the inclusion probabilities of all the population units are equal in SRS when the sample and population sizes are fixed. In all MRSS procedures, the $\frac{m-1}{2}$ units at the extremes have zero inclusion probabilities for all sample and population sizes. In addition, the middle values of the population units have greater inclusion probabilities than the other units in the population. When the set size is 3, the inclusion probabilities are symmetric around the median of the population. However, when the set size is 4, this property does not hold, since the second greatest unit is chosen in the first two selections and the third greatest unit in the last two selections. For odd set sizes, on the other hand, the median of each set is chosen in all selections. Furthermore, when the sample size increases, the inclusion probability of any unit in the population increases for all the sampling designs considered. For all set sizes, the inclusion probabilities of the middle values under the Level 2 sampling procedure are greater than under the other sampling procedures, whereas the inclusion probabilities of the extreme values under the Level 1 sampling procedure are greater than under the others. When the set size is 3, the differences between the inclusion probabilities under the Level 0, Level 1 and Level 2 sampling procedures are symmetrical, but when the set size is 4 these differences are not symmetrical. For all sample sizes, the Level 0 sampling procedure gives the units the smallest inclusion probabilities. Also, when the population size increases, the inclusion probabilities under the Level 0, Level 1 and Level 2 sampling procedures approach each other.
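Part of this comparison can be reproduced with a few lines of Python. The sketch below evaluates only the closed-form Level 0 probability of Section 2.1, $1 - (1 - P(A))^{rm}$ for odd $m$ and $1 - (1 - P(A_1))^{rm/2}(1 - P(A_2))^{rm/2}$ for even $m$, and sets it beside the SRS value $m/N$. It is not the authors' MATLAB code; the function names `level0_inclusion_prob` and `compare_with_srs` are assumptions made for this illustration.

```python
from math import comb


def level0_inclusion_prob(k: int, N: int, m: int, r: int = 1) -> float:
    """Closed-form Level 0 inclusion probability of the k-th ordered unit."""

    def p_select(smaller: int, greater: int) -> float:
        # P(u_k is the measured unit of one set): u_k plus the required
        # numbers of smaller and greater units among the m drawn units.
        return comb(k - 1, smaller) * comb(N - k, greater) / comb(N, m)

    if m % 2 == 1:
        h = (m - 1) // 2
        return 1 - (1 - p_select(h, h)) ** (r * m)
    h = m // 2
    p1 = p_select(h - 1, h)      # selections j <= m/2 (lower median)
    p2 = p_select(h, h - 1)      # selections j >  m/2 (upper median)
    return 1 - (1 - p1) ** (r * h) * (1 - p2) ** (r * h)


def compare_with_srs(N: int, m: int):
    """Rows (k, Level 0 probability, SRS probability m/N) for every unit."""
    return [(k, level0_inclusion_prob(k, N, m), m / N) for k in range(1, N + 1)]


table = compare_with_srs(20, 3)
for k, p_level0, p_srs in table:
    print(f"{k:2d}  {p_level0:.4f}  {p_srs:.4f}")
```

For $N = 20$ and $m = 3$ the printed table shows a zero probability for $u_1$ and $u_{20}$ and values above $m/N$ for the middle units, in line with the description of the figures.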

4. Conclusions
MRSS, a modification of RSS, is more efficient than RSS for estimating the population mean, as long as the underlying distribution is symmetric. In MRSS, there are three sampling procedures (Level 0, Level 1 and Level 2) which can be used for sample selection. In this study, we provided formulas for calculating the inclusion probabilities in MRSS under these three sampling procedures. These inclusion probabilities indicate that the Level 2 sampling procedure generates greater inclusion probabilities for the middle values in the population than the other procedures. By using the Level 2 sampling procedure, we get more information about the population than with the other procedures.
