1.2. Basic statistical concepts

 

Population

A population is the entire group of individuals, objects, or events that we are interested in studying. Studying an entire population is often too difficult or expensive, so we usually select a sample from it instead. For example, if we want to know the average height of all women in India, the population is all women in India.

Examples:

  • All registered voters in India
  • All iPhone users in the world
  • All students enrolled in a university

Sample

A sample is a subset of the population that we observe and collect data from. In the previous example, measuring the height of every woman in India would be impractical, so we would instead measure the height of a sample of women. For instance, we could measure the height of 1,000 randomly selected women from across India and use that information to make inferences about the population. The number of observations in the sample, here 1,000, is called the sample size.

Examples:

  • A random sample of 1,000 registered voters in India (sample size = 1,000)
  • A convenience sample of 100 iPhone users in a certain region (sample size = 100)
  • A systematic sample of 500 students enrolled in a university (sample size = 500)
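To make the idea of drawing a sample concrete, here is a minimal Python sketch. It assumes a made-up population of 100,000 heights generated from a normal distribution; the mean of 152 cm and spread of 6 cm are arbitrary illustrative values, not real statistics. The helper names `simple_random_sample` and `systematic_sample` are hypothetical and are defined here only for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 100,000 synthetic heights in cm (not real data).
population = [rng.gauss(152.0, 6.0) for _ in range(100_000)]


def simple_random_sample(values, n, rng):
    """Draw n items without replacement; every item is equally likely."""
    return rng.sample(values, n)


def systematic_sample(values, n, rng):
    """Pick a random start, then take every k-th item, where k = len(values) // n."""
    k = len(values) // n
    start = rng.randrange(k)
    return values[start::k][:n]


random_sample = simple_random_sample(population, 1_000, rng)
every_kth_sample = systematic_sample(population, 500, rng)

print("Sample sizes:", len(random_sample), len(every_kth_sample))
print("Population mean:", round(statistics.mean(population), 2))
print("Random sample mean:", round(statistics.mean(random_sample), 2))
```

With a sample of 1,000, the sample mean should land close to the population mean, which is why a well-chosen sample can stand in for the whole population. A convenience sample is not shown because it depends on which units happen to be easy to reach, and that cannot be simulated meaningfully without further assumptions.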

Variable

A variable is any characteristic or attribute that can take on different values. For example, in a study measuring the effect of smoking on lung cancer, smoking status (smoker or non-smoker) would be a variable. In another study measuring the impact of exercise on weight loss, weight would be a variable.

Examples:

  • Age of a person
  • Gender of a person
  • Income of a household
  • Height of a tree
  • Number of siblings a person has

Data types:

There are two main types of data: Categorical and Numerical.

Categorical data are data that fall into categories or groups, such as gender, race, or favourite colour.

Numerical data are data that represent a quantity or measurement, such as weight, age, or income. Numerical data can be further classified as either discrete or continuous.

Discrete data are data that can only take on separate, countable values, such as the number of children in a family (0, 1, 2, and so on, but never 2.5). Continuous data are data that can take on any value within a range, such as height or weight.

Examples:

  • Categorical: Gender, Hair colour, Political affiliation, Type of car
  • Numerical (Discrete): Number of children in a family, Number of pets in a household, Number of employees in a company
  • Numerical (Continuous): Height of a person, Weight of a person, Temperature, Age
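The classification above can be sketched in Python. This is only a rough heuristic under a stated assumption: that the way a value is stored (text, whole number, or decimal) reflects its data type. That assumption can fail, for example when a postal code is stored as a whole number even though it is categorical. The function name `classify` and the records below are hypothetical.

```python
def classify(values):
    """Guess the data type of a variable from how its values are stored.

    Heuristic only: text -> categorical, whole numbers -> discrete,
    decimals (possibly mixed with whole numbers) -> continuous.
    """
    if all(isinstance(v, str) for v in values):
        return "categorical"
    # bool is a subclass of int in Python, so exclude it explicitly.
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "numerical (discrete)"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "numerical (continuous)"
    return "unknown"


# Hypothetical records: each key is a variable, each dict is one observation.
records = [
    {"hair_colour": "black", "children": 2, "height_cm": 158.4},
    {"hair_colour": "brown", "children": 0, "height_cm": 162.0},
    {"hair_colour": "black", "children": 1, "height_cm": 149.7},
]

for variable in records[0]:
    column = [row[variable] for row in records]
    print(variable, "->", classify(column))
```

In practice, deciding the data type is a judgment about what the variable means, not just how it is stored, so a check like this is best treated as a first guess.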
