- In random sampling from a population, units are selected by a probability mechanism.
- Simple random sampling from a finite population gives every item the same probability of selection, but in less simple methods the probabilities need not be the same.
- For example, in stratified sampling a random sample is taken from each stratum, but the strata are usually of different sizes.
- Other methods include cluster sampling and multi-stage sampling, in which primary units (for example geographical units such as villages) are selected at random from all those available and these units are either studied completely or subsampled.
- Exact estimation methods for means, totals or proportions can be developed for a method of sampling that is based on probability rules.
- However, these sampling methods require setting up carefully and this can be very time-consuming and expensive.
- Non-random sampling methods are usually much quicker, particularly quota sampling in which interviewers are typically sent to central points, such as shopping areas, and given a quota of people to be interviewed.
- These are specified by characteristics, such as age-group or sex or voting intentions, which can be discovered by a few simple questions so that the specified number (quota) in each sub-group of the population can be obtained.
- There is no restriction on which actual individuals in each sub-group shall be interviewed, and the easiest to obtain (the most co-operative) will usually be included in the sample.
- Bias often results from this. In addition, the people to be found in the shopping area (if that is indeed where the interviews take place) at the time of the survey are usually not representative of the whole population of the town.
- Analysis of quota samples has to use the methods based on probability because no others are available.
- Systematic sampling is done from a population whose members are listed in some standard order (such as alphabetical).
- It consists of choosing a random starting point among the first k items of the list and then selecting every kth item after it, where k = N/n = (population size)/(sample size).
- Systematic sampling (with random starting point) is much quicker and simpler than pure random sampling.
- There may be refusals, but this is true of any method of choosing individuals, random sampling included.
- Provided enough is known about possible regular trends in the list used, this method does have a reasonable theoretical base.
- If there are no trends, a systematic sample might behave as if it were a simple random sample, though strictly speaking it is not.
- Sometimes the methods for cluster samples can be used for analysis, if there are no trends.
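The systematic procedure described above (random start, then every kth item, with k = N/n) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `systematic_sample` and the list `names` are hypothetical, and k is taken as the integer part of N/n so that the selection never runs past the end of the list.

```python
import random


def systematic_sample(items, n, rng=None):
    """Select n items: a random start among the first k, then every k-th item."""
    if rng is None:
        rng = random.Random()
    N = len(items)
    if not 0 < n <= N:
        raise ValueError("need 0 < n <= N")
    k = N // n                  # sampling interval (integer part of N/n)
    start = rng.randrange(k)    # random starting position in 0 .. k-1
    return [items[start + i * k] for i in range(n)]


# Hypothetical ordered list of N = 100 names; a sample of n = 10 gives k = 10.
names = [f"person_{i:03d}" for i in range(1, 101)]
sample = systematic_sample(names, 10, random.Random(1))
print(sample)
```

Only the starting point is random; once it is fixed, the rest of the sample is determined. That is why a regular trend or cycle in the list whose period matches k can bias the result.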
What is meant by random sampling? What is its importance?
- In its simplest form, random sampling is a sampling procedure by which each member of a population has an equal chance of being included in the sample.
- Random sampling guards against selection bias and, for a sufficiently large sample, tends to give a representative sample.
- There are several types of random sampling.
- In simple random sampling, not only each item in the population but each sample has an equal probability of being picked.
- In systematic sampling, items are selected from the population at uniform intervals of time, order, or space (as in picking every one-hundredth name from a telephone directory).
- Systematic sampling can easily be biased, for example when the amount of household garbage is always measured on Mondays (which includes the weekend garbage).
- In stratified sampling, the population is divided into strata (such as age groups) and a proportionate number of elements is picked at random from each stratum. In cluster sampling, the population is divided into clusters (such as blocks of a city), some clusters are picked at random, and elements are taken only from the selected clusters.
- Stratified sampling is used when the variations within each stratum are small in relation to the variations between strata.
- Cluster sampling is used when the opposite is the case.
- In what follows, we assume simple random sampling.
- Sampling can be from a finite population (as in picking cards from a deck without replacement) or from an infinite population (as in picking parts produced by a continuous process or cards from a deck with replacement).
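The difference between sampling from a finite population (without replacement) and from an effectively infinite one (with replacement) can be illustrated with the deck-of-cards example. The sketch below uses Python's standard `random` module; the variable names and the fixed seed are assumptions made only for the illustration.

```python
import random

rng = random.Random(42)  # fixed seed only so the sketch is repeatable

# A 52-card deck: a small finite population.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [f"{rank} of {suit}" for suit in suits for rank in ranks]

# Without replacement: a card can appear at most once in the sample,
# and every possible 5-card sample is equally likely (simple random sampling).
hand_without = rng.sample(deck, 5)

# With replacement: each draw is made from the full deck, so repeats are
# possible. This behaves like sampling from an infinite population.
hand_with = rng.choices(deck, k=5)

print(hand_without)
print(hand_with)
```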
How can a random sample be obtained?
- A random sample can be obtained
- (1) by a computer programmed to generate random numbers,
- (2) from a table of random numbers, and
- (3) by assigning a number to each item in a population, recording each number on a separate slip of paper, mixing the slips of paper thoroughly, and then picking as many slips of paper and numbers as we want in the sample.
- The last method of obtaining a random sample is very cumbersome with large populations and may not give a representative sample because of the difficulty of thoroughly scrambling the pieces of paper.
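In practice, method (1) amounts to letting the computer do the numbering and mixing of method (3). The sketch below is one possible illustration, not a prescribed procedure: the function name `draw_slips` and the `households` list are hypothetical.

```python
import random


def draw_slips(population, n, rng=None):
    """Number the items 1..N, shuffle the numbers, and keep the first n."""
    if rng is None:
        rng = random.Random()
    slips = list(range(1, len(population) + 1))  # one numbered "slip" per item
    rng.shuffle(slips)                           # the thorough mixing step
    drawn = slips[:n]                            # pick as many slips as needed
    return [population[number - 1] for number in drawn]


# Hypothetical population of 500 households, sample of 20.
households = [f"household_{i}" for i in range(1, 501)]
chosen = draw_slips(households, 20, random.Random(7))
print(chosen)
```

Unlike slips of paper, the shuffle does not become harder to carry out as the population grows.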
- Sampling can provide reliable information at far less cost than a census. With probability samples (described in the next chapter), you can quantify the sampling error from a survey. In some instances, an observation unit must be destroyed to be measured, as when a cookie must be pulverized to determine the fat content. In such a case, a sample provides reliable information about the population; a census destroys the population and, with it, the need for information about it.
- Data can be collected more quickly, so estimates can be published in a timely fashion. An estimate of the unemployment rate for 2005 is not very helpful if it takes until 2015 to interview every household.
- Finally, and less well known, estimates based on sample surveys are often more accurate than those based on a census because investigators can be more careful when collecting data. A complete census often requires a large administrative organization, and involves many persons in the data collection. With the administrative complexity and the pressure to produce timely estimates, many types of errors can be injected into the census. In a sample, more attention can be devoted to data quality through training personnel and following up on non-respondents. It is far better to have good measurements on a representative sample than unreliable or biased measurements on the whole population.
- (Step 1): Ask “What is expected of the sample, and how much precision do I need?” What are the consequences of the sample results? How much error is tolerable? If your survey measures the unemployment rate every month, you would like your estimates to be very precise indeed so that you can detect changes in unemployment rates from month to month. A preliminary investigation, however, often needs less precision than an ongoing survey. Instead of asking about required precision, many people ask, “What percentage of the population should I include in my sample?” This is usually the wrong question to be asking. Except in very small populations, precision is obtained through the absolute size of the sample, not the proportion of the population covered.
- (Step 2): Find an equation relating the sample size n and your expectations of the sample.
- (Step 3): Estimate any unknown quantities and solve for n.
- (Step 4): If you are relatively new at designing surveys, you will find at this point that the sample size you calculated in step 3 is much larger than you can afford. Go back and adjust some of your expectations for the survey and try again. In some cases, you will find that you cannot even come close to the precision you need with the resources you have available; in that case, perhaps you should consider whether you should even conduct your study.
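As an illustration of steps 2 and 3, consider one common case that the steps above do not spell out: estimating a proportion from a simple random sample to within a margin of error e. The sketch assumes the usual large-sample relation e = z * sqrt(p(1 - p)/n), which gives n0 = z^2 p(1 - p)/e^2, with an optional finite population adjustment n = n0/(1 + n0/N). The function name, the default z = 1.96 (roughly 95% confidence), and the conservative guess p = 0.5 are assumptions for the example, not values taken from the text.

```python
import math


def sample_size_for_proportion(margin, p=0.5, z=1.96, population=None):
    """Sample size needed to estimate a proportion to within +/- margin."""
    # Step 2: equation relating n to the desired precision (SRS assumed).
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    # Optional finite population adjustment; matters only for small N.
    if population is not None:
        n0 = n0 / (1 + n0 / population)
    # Step 3: round up, since n must be a whole number of units.
    return math.ceil(n0)


n_unbounded = sample_size_for_proportion(0.03)
n_million = sample_size_for_proportion(0.03, population=1_000_000)
n_small = sample_size_for_proportion(0.03, population=2_000)
print(n_unbounded, n_million, n_small)
```

The required sample size is almost the same for a population of a million as for an unbounded one, and it drops noticeably only when the population is small. This matches the point in step 1 that precision depends on the absolute size of the sample rather than on the fraction of the population covered. If the result is unaffordable (step 4), widening `margin` is the adjustment this sketch supports.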