What Is Random Sampling and How to Run It in R


We will walk you through how to run random sampling in R, but first let's discuss what random sampling is and why it is useful.

Right now we are in the middle of an election. If you are wondering how pollsters predict who will win, then you are going to be interested in random sampling. Since it is nearly impossible to ask everyone in America who they will vote for in the presidential election, researchers rely on sampling: they select a subset of the population and study it, either through an experiment or an observational study.

The group selected should be representative of the population and not biased in a systematic manner. For example, a group made up only of women in California would not accurately reflect the opinions of the entire population for the presidential election. Therefore, randomization is typically employed to achieve an unbiased sample.
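To make the idea of systematic bias concrete, here is a minimal sketch in R. The population is entirely made up for illustration (two hypothetical regions, "A" and "B", with invented support levels); it is not polling data. It compares a sample drawn from one region only against a random sample drawn from everyone.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

# Hypothetical population: two regions with different levels of support.
# All numbers here are invented for illustration.
population <- data.frame(
  region  = rep(c("A", "B"), times = c(20000, 80000)),
  support = c(rbinom(20000, 1, 0.70),   # region A: about 70% support
              rbinom(80000, 1, 0.45))   # region B: about 45% support
)

true_share <- mean(population$support)

# Biased sample: 1,000 people drawn from region A only
region_a     <- population[population$region == "A", ]
biased_share <- mean(region_a$support[sample(nrow(region_a), 1000)])

# Random sample: 1,000 people drawn from the whole population
random_share <- mean(population$support[sample(nrow(population), 1000)])

round(c(true = true_share, biased = biased_share, random = random_share), 3)
```

In this made-up setup the region-only estimate should land far from the true share, while the random sample should land close to it.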

Random sampling (more precisely, simple random sampling) is a sampling technique in which we select a group of subjects (a sample) for study from a larger group (a population). Each individual is chosen entirely by chance, and each member of the population has an equal chance of being included in the sample. As a result, every possible sample of a given size has the same chance of selection.
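The "equal chance" property can be checked with a small simulation. The sketch below assumes a toy population of the numbers 1 to 10 and samples of size 4 (both chosen only for illustration), draws many samples, and records how often each member is included.

```r
set.seed(123)  # arbitrary seed for reproducibility

population  <- 1:10
sample_size <- 4

# Number of distinct samples of size 4 from 10 members
n_possible <- choose(length(population), sample_size)   # 210

# Draw many samples and record how often each member is included
n_draws <- 20000
draws <- replicate(n_draws, sample(population, sample_size))  # 4 x n_draws matrix
inclusion_rate <- table(factor(draws, levels = population)) / n_draws

round(inclusion_rate, 3)   # each value should be close to 4/10 = 0.4
```

Each of the 210 possible samples is equally likely, so every member should appear in roughly 40% of the draws.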

Now that you know what random sampling is, we can show you how to run it in R.

A few housekeeping notes on the sample function in R:
sample takes a sample of the specified size from the elements of x, either with or without replacement.

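Usage: sample(x, size, replace = FALSE, prob = NULL)

Here x is the vector to sample from (or a single positive integer n, meaning 1:n), size is the number of items to draw (by default, the length of x), replace controls whether an item can be drawn more than once, and prob is an optional vector of probability weights, one per element of x.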
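The prob argument is not used in the examples that follow, so here is a short sketch of it. The weighted coin (heads about 70% of the time) is a made-up example, not something from the R documentation.

```r
set.seed(10)  # arbitrary seed

# x: the values to draw from
coin <- c("heads", "tails")

# replace = TRUE is required here because size exceeds length(x).
# prob gives one weight per element of x: heads about 70% of the time.
flips <- sample(coin, size = 1000, replace = TRUE, prob = c(0.7, 0.3))

table(flips) / length(flips)   # observed shares of heads and tails
```

When prob is left as NULL, every element of x is equally likely on each draw.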

> set.seed(1)
> sample(1:10)
[1]  3  4  5  7  2  8  9  6 10  1

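The above is a random sample of size 10 from the numbers 1 to 10. Because no size was given, sample returns all 10 values in random order. set.seed was used so that we can reproduce the same random sample later. Note that the exact numbers you see depend on your R version: R 3.6.0 changed the default sampling algorithm, so newer versions return different values for the same seed unless you call RNGkind(sample.kind = "Rounding") first. See ?set.seed in R to learn more.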
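A quick way to see what set.seed does is to draw the same sample twice with the same seed and compare the results. This is a minimal sketch; the variable names are ours.

```r
set.seed(1)
first <- sample(1:10)

set.seed(1)
second <- sample(1:10)

identical(first, second)   # TRUE: same seed, same permutation

# Without resetting the seed, the generator has moved on,
# so this draw will almost always differ from the first two.
third <- sample(1:10)
```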

> set.seed(2)
> sample(1:10, 4)
[1]  2  7  5 10

The above is a random sample of size 4 drawn from the numbers 1 to 10.
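In practice you often want a random sample of rows from a data set rather than of plain numbers. One common approach, sketched here with R's built-in mtcars data set (our choice for illustration), is to sample row indices and then subset.

```r
set.seed(2)  # arbitrary seed

# sample(n, k) with a single number n draws k values from 1:n,
# so this picks 5 distinct row indices
rows <- sample(nrow(mtcars), 5)

# Keep only the sampled rows
car_sample <- mtcars[rows, ]

car_sample[, c("mpg", "cyl")]
```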

> sample(1:10, replace = TRUE)
[1] 10 10  2  9  5  6  6  3  8  2

The above is again a random sample of size 10 from the numbers 1 to 10, but this time we allowed numbers to repeat by adding replace = TRUE.
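One practical consequence of replace is the largest sample you can ask for. The sketch below tries to draw 15 values from 1 to 10 both ways. The message returned in the error handler is our own wording, not R's.

```r
# Without replacement, the sample cannot be larger than the population,
# so this call raises an error, which we catch and replace with a message.
result <- tryCatch(
  sample(1:10, 15),
  error = function(e) "cannot draw 15 values from 10 without replacement"
)
result

# With replacement, any size is allowed and values may repeat
big_sample <- sample(1:10, 15, replace = TRUE)
length(big_sample)              # 15
anyDuplicated(big_sample) > 0   # TRUE: 15 draws from 10 values must repeat
```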