Photo by Edu Grande on Unsplash

Sampling

Focussing your efforts to understand the bigger picture

  • Will you vote in the next election? And if the answer is yes:
  • Who are you voting for?
  • How would you make sure individuals are only surveyed once?
  • How long is it going to take to query everyone?
  • How do you poll everyone: in person, via phone, via mail, via email?

The Sampling Process

First Step: Sample Frame Definition

Photo by Martin Péchy on Unsplash

Second Step: Sampling Method Selection

Probability Sampling:

  • Random Sampling:
    Every member has the same chance to be selected.
  • Systematic Sampling:
    Every member is assigned a number, and selection follows a numerical rule: all the odd numbers; every 10th member, starting from the 3rd position; etc. It is vital to ensure the initial list contains no hidden pattern that would skew the sample. Imagine our list of voters alternates female-male: sampling only the even numbers would select a single gender, which is far from a representative sample.
  • Stratified Sampling:
    The population is divided into meaningful sub-categories, called strata. Each stratum is then sampled, either randomly or systematically. The sample size per stratum depends on the size of that stratum. In our example, the electors could be stratified by age group or ethnicity.
  • Cluster Sampling:
    Unlike stratified sampling, where members are homogeneous within each group, here the groups are formed arbitrarily and the homogeneity is between the groups: each group should resemble the others. The sample is obtained by selecting a random set of groups. We could group our voters by county, for example. Note that because the population density varies significantly between counties, this would not be the best approach.

Why use probability sampling:

  • Quick and easy to implement.
  • Does not require a high level of expertise in the studied field.
  • Reduces sample bias and systematic error.
  • Sample accurately represents the population, which allows inferences to be generalised to the whole sample frame.
  • Works well with a diverse population.
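
To make the four probability sampling methods above more concrete, here is a minimal Python sketch. The voter list, its fields (id, gender, county) and the helper function names are all hypothetical and chosen for illustration only; the list is deliberately built to alternate female-male so that the hidden-pattern pitfall of systematic sampling shows up.

```python
import random
from collections import defaultdict

def simple_random_sample(population, n, rng):
    """Random sampling: every member has the same chance of being selected."""
    return rng.sample(population, n)

def systematic_sample(population, step, start=0):
    """Systematic sampling: every `step`-th member, beginning at index `start`."""
    return population[start::step]

def stratified_sample(population, key, fraction, rng):
    """Stratified sampling: sample each stratum at random, in proportion to its size."""
    strata = defaultdict(list)
    for member in population:
        strata[key(member)].append(member)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))
    return sample

def cluster_sample(population, key, n_clusters, rng):
    """Cluster sampling: pick whole groups at random and keep all of their members."""
    clusters = defaultdict(list)
    for member in population:
        clusters[key(member)].append(member)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [m for c in chosen for m in clusters[c]]

# Hypothetical voter list (an assumption, not real data):
# alternating female/male, spread evenly over 5 made-up counties.
rng = random.Random(42)
voters = [
    {"id": i, "gender": "F" if i % 2 == 0 else "M", "county": f"county_{i % 5}"}
    for i in range(1000)
]

random_sample = simple_random_sample(voters, 100, rng)
every_tenth = systematic_sample(voters, step=10, start=2)  # 3rd position, then every 10th
by_gender = stratified_sample(voters, lambda v: v["gender"], 0.1, rng)
two_counties = cluster_sample(voters, lambda v: v["county"], 2, rng)

# Hidden-pattern pitfall: an even step on an alternating list only ever hits one gender.
genders_in_systematic = {v["gender"] for v in every_tenth}
```

In this toy setup the systematic sample contains a single gender, while the stratified sample keeps the female-male split of the full list. The cluster sample returns every voter from the two selected counties, which only works well when the counties resemble each other.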

Non-probability Sampling:

  • Convenience Sampling:
    The sample is whatever is convenient to collect, without too much effort. In our example, you could ask your friends who they will vote for, or publish a survey online and wait for (willing) people to answer.
  • Snowball Sampling:
    You ask 10 of your friends, each one of them asks 10 of theirs, and so on.
  • Purposive Sampling:
    As a scientist, you choose the sample because, based on your expertise, you believe it is representative.
  • Quota Sampling:
    Like stratified sampling, the population is segmented by characteristics. However, an arbitrary number of elements is then selected from each stratum. In our example, you establish the sub-groups, then select 500 men and 500 women between 25 and 45 years old.

Why use non-probability sampling:

  • Quicker and cheaper to implement.
  • Used when some parameters, such as the sample frame, are unknown.
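
Quota sampling, as described above, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the respondents are randomly generated stand-ins for whoever happens to answer, and the names quota_sample and group_of are hypothetical. Respondents are accepted in arrival order until each quota is full, so no random selection is involved.

```python
import random

def quota_sample(stream, quotas, key):
    """Accept respondents in arrival order until every group's quota is filled."""
    counts = {group: 0 for group in quotas}
    sample = []
    for person in stream:
        group = key(person)
        if group in counts and counts[group] < quotas[group]:
            sample.append(person)
            counts[group] += 1
        if counts == quotas:
            break  # all quotas reached, stop surveying
    return sample

# Hypothetical respondents (made-up data, in the order they happen to answer).
rng = random.Random(0)
respondents = [
    {"id": i, "gender": rng.choice(["F", "M"]), "age": rng.randint(18, 90)}
    for i in range(10_000)
]

def group_of(person):
    """Only respondents aged 25 to 45 count towards a quota, grouped by gender."""
    if 25 <= person["age"] <= 45:
        return person["gender"]
    return None

# 500 women and 500 men between 25 and 45 years old, as in the example above.
sample = quota_sample(respondents, {"F": 500, "M": 500}, group_of)
```

Because the first eligible respondents are always taken, the result depends entirely on who turns up first, which is the main reason this method is cheaper but less representative than stratified sampling.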

Third Step: Sample Size Definition

Photo by National Cancer Institute on Unsplash
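
One common way to size a sample when estimating a proportion (such as the share of voters backing a candidate) is Cochran's formula with a finite population correction. The sketch below assumes simple random sampling; the function name and the default values (95% confidence, 5% margin of error, p = 0.5 as the most conservative guess) are illustrative choices, not recommendations from this article.

```python
import math

def sample_size(population_size, margin_of_error=0.05, z=1.96, p=0.5):
    """Cochran's formula for a proportion, with finite population correction.

    z is the z-score for the desired confidence level (1.96 for about 95%),
    p is the expected proportion (0.5 gives the largest, safest sample size).
    """
    # Sample size for an infinitely large population
    n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    # Shrink it for a finite population of known size
    n = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n)

small_town = sample_size(1_000)        # hypothetical electorate of 1,000
large_region = sample_size(1_000_000)  # hypothetical electorate of 1,000,000
```

The required sample grows very slowly with the population: multiplying the electorate by a thousand adds only about a hundred respondents at the same confidence level and margin of error.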

Fourth Step: Data Collection

Final Words
