1.2. Basic statistical concepts
Population
A population is the entire group of
individuals, objects, or events that we are interested in studying. Studying an
entire population is often too difficult or expensive, so we usually select
a sample from it instead. For example, if we want to study the average height
of all women in India, the population is all women in India.
Examples:
- All registered voters in India
- All iPhone users in the world
- All students enrolled in a university
Sample
A sample is a subset of the population
that we observe and collect data from. In the previous example, it would be
impractical to measure the height of every single woman in India, so we would
instead measure the height of a sample of women. For instance, we could measure
the height of 1,000 randomly selected women from across India and use that
information to make inferences about the population. The number of individuals
in the sample, here 1,000, is called the sample size.
Examples:
- A random sample of 1,000 registered voters in India (sample size = 1,000)
- A convenience sample of 100 iPhone users in a certain region (sample size = 100)
- A systematic sample of 500 students enrolled in a university (sample size = 500)
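The relationship between a population, a sample, and the sample size can be sketched in Python. This is only an illustration: the population below is simulated with made-up heights (the mean of 155 cm and spread of 6 cm are assumptions chosen for the example, not real measurements), and the variable names are hypothetical.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 simulated heights in cm.
# The numbers 155 and 6 are illustrative assumptions, not real data.
population = [random.gauss(155, 6) for _ in range(100_000)]

# Simple random sample: every individual has the same chance of selection.
sample_size = 1_000
random_sample = random.sample(population, sample_size)

# Systematic sample: pick a random start, then take every k-th individual.
systematic_size = 500
step = len(population) // systematic_size
start = random.randrange(step)
systematic_sample = population[start::step]

population_mean = statistics.mean(population)
sample_mean = statistics.mean(random_sample)

print(f"Random sample size:     {len(random_sample)}")
print(f"Systematic sample size: {len(systematic_sample)}")
print(f"Population mean height: {population_mean:.2f} cm")
print(f"Sample mean height:     {sample_mean:.2f} cm")
```

The sample mean will usually land close to the population mean, which is why a well-chosen sample can be used to make inferences about a population that is too large to measure in full.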
Variable
A variable is any characteristic or
attribute that can take on different values. For example, in a study measuring
the effect of smoking on lung cancer, smoking status (smoker or non-smoker)
would be a variable. In another study measuring the impact of exercise on weight
loss, weight would be a variable.
Examples:
- Age of a person
- Gender of a person
- Income of a household
- Height of a tree
- Number of siblings a person has
Data types
There are two main types of data: Categorical
and Numerical.
Categorical data are data that fall into categories or groups,
such as gender, race, or favourite colour.
Numerical data are data that represent
a quantity or measurement, such as weight, age, or income. Numerical data can
be further classified as either discrete or continuous.
Discrete data are data that can only
take on certain values, such as the number of children in a family. Continuous
data are data that can take on any value within a range, such as height or
weight.
Examples:
- Categorical: Gender, Hair colour, Political affiliation, Type of car
- Numerical (Discrete): Number of children in a family, Number of pets in a household, Number of employees in a company
- Numerical (Continuous): Height of a person, Weight of a person, Temperature, Age
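The classification above can be sketched as a small Python helper. This is a rough heuristic under stated assumptions: it treats text values as categorical, whole numbers as discrete, and decimal numbers as continuous. The function name classify_column and the sample values are hypothetical, and the heuristic would misjudge cases such as numeric codes (for example, postal codes) that are really categories.

```python
def classify_column(values):
    """Guess the data type of a list of values from their Python types.

    Assumption: str -> categorical, int -> discrete, float -> continuous.
    """
    if not values:
        return "unknown"
    if all(isinstance(v, str) for v in values):
        return "categorical"
    # bool is a subclass of int in Python, so exclude it explicitly.
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "numerical (discrete)"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "numerical (continuous)"
    return "unknown"


# Hypothetical variables with made-up values, one per data type.
dataset = {
    "hair_colour": ["black", "brown", "black"],
    "number_of_children": [0, 2, 3],
    "height_cm": [152.4, 160.1, 158.0],
}

for name, values in dataset.items():
    print(f"{name}: {classify_column(values)}")
```

In practice the type depends on what the variable measures and not only on how it is stored, so a check like this is a starting point to be confirmed by looking at the data.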