10.2. The use of weighting in statistical analysis
In an analysis of a survey based on random selection, if the survey does not use simple random sampling (with or without replacement), then participants are not all equally likely to be selected from the population. In this case, to achieve unbiased estimation, a case weight should be assigned to each participant returning data. These sampling weights should be inversely proportional to each participant's probability of selection and should sum, over all the participants, to the sample size. Sampling weights can only be calculated if probability sampling is used, because only then are the selection probabilities known. The software package SPSS, for example, allows a weighted analysis to be carried out.
For example, if a stratified sample is used, based on geographical area, then for a case sampled from stratum i the sampling case weight is given by (N_i/N)/(n_i/n) = (N_i n)/(n_i N), where N is the population size, N_i is the size of stratum i in the population, n_i is the number of people sampled from stratum i and n is the total sample size. This requires knowledge of which area or stratum a respondent comes from. Numbering the questionnaires, and recording the area in a column beside the questionnaire number in the data spreadsheet, is probably the best way to ensure that the required information is available. Including appropriate questions in the questionnaire can also make it feasible to set up weights for a weighted analysis of the data. However, not all participants may respond to these questions, so it is safer to record the information in advance.
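The stratum weight formula can be illustrated with a minimal Python sketch. The function name `design_weights`, the area labels and all of the figures are hypothetical and chosen only for illustration; they do not come from any particular survey.

```python
def design_weights(stratum_pop, stratum_sample):
    """Return the sampling weight (N_i * n) / (n_i * N) for each stratum.

    stratum_pop maps stratum -> N_i (population size of the stratum);
    stratum_sample maps stratum -> n_i (number sampled from the stratum).
    """
    N = sum(stratum_pop.values())     # population size
    n = sum(stratum_sample.values())  # total sample size
    weights = {}
    for stratum, N_i in stratum_pop.items():
        n_i = stratum_sample[stratum]
        if n_i <= 0:
            raise ValueError(f"no cases sampled from stratum {stratum!r}")
        weights[stratum] = (N_i * n) / (n_i * N)
    return weights


# Hypothetical figures: two areas, with 50 people sampled from each.
population = {"north": 600, "south": 400}
sample = {"north": 50, "south": 50}

w = design_weights(population, sample)
# Sum of the weights over all sampled cases; this should equal n.
total = sum(w[s] * sample[s] for s in sample)
print(w, total)
```

In this made-up example the "north" area is under-represented in the sample relative to the population, so its cases receive a weight above 1, and the weights summed over all sampled cases return the total sample size.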
Weights can also be used to allow for unit non-response, i.e. where some people do not respond at all. These weights are inversely proportional to the probability of responding. So in stratum i, each person would have a non-response weight of n_i/r_i, where n_i is the number of people selected from stratum i and r_i is the number of people responding from that stratum. This and more sophisticated methods are discussed in Lehtonen and Pahkinen (2004).
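As a rough sketch of the simple non-response adjustment, the following Python function divides the number of people selected in each stratum by the number who responded. The function name `nonresponse_weights`, the stratum labels and the counts are hypothetical, used only for illustration.

```python
def nonresponse_weights(selected, responded):
    """Return the non-response weight (selected / responded) for each stratum.

    selected maps stratum -> number of people selected;
    responded maps stratum -> number of people who responded.
    """
    weights = {}
    for stratum, n_selected in selected.items():
        n_responded = responded.get(stratum, 0)
        if n_responded <= 0:
            # The weight is undefined when nobody in the stratum responded.
            raise ValueError(f"no respondents in stratum {stratum!r}")
        weights[stratum] = n_selected / n_responded
    return weights


# Hypothetical figures: 50 people selected per area, uneven response.
selected = {"north": 50, "south": 50}
responded = {"north": 40, "south": 25}

nr = nonresponse_weights(selected, responded)
print(nr)
```

A stratum with a lower response rate receives a larger weight, so that its respondents also stand in for the selected people who did not respond. A stratum with no respondents at all cannot be handled this way, which is why the sketch raises an error in that case.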
The weights for sample design and the weights for non-response can both be used at once, by multiplying the two columns of weights together and rescaling so that the new weights add to the sample size.
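The multiplication and rescaling step might look like the following Python sketch. It assumes that "sample size" here means the number of responding cases in the data file; the function name `combine_weights` and the weight values are hypothetical.

```python
def combine_weights(design, nonresponse):
    """Multiply two columns of weights and rescale to sum to the number of cases.

    design and nonresponse are lists with one weight per responding case,
    given in the same case order.
    """
    if len(design) != len(nonresponse):
        raise ValueError("weight columns must have the same length")
    if not design:
        raise ValueError("weight columns must not be empty")
    raw = [d * r for d, r in zip(design, nonresponse)]
    # Assumption: the target total is the number of responding cases.
    scale = len(raw) / sum(raw)
    return [x * scale for x in raw]


# Hypothetical columns: 40 respondents from one area, 25 from another.
design_col = [1.2] * 40 + [0.8] * 25
nonresponse_col = [1.25] * 40 + [2.0] * 25

final = combine_weights(design_col, nonresponse_col)
print(final[0], final[-1], sum(final))
```

Rescaling changes only the overall level of the weights, not their ratios, so weighted means and proportions are unaffected by it. It matters mainly for software that treats the sum of the weights as the number of cases.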
In a multi-cultural survey, in which different sampling designs may have been used to select participants in different countries, different weight calculations will be needed for respondents from each constituent country, and this requires detailed knowledge of the different survey designs which were used to select the samples. In practice this information may not be readily available.