Statistics
Statistics are everywhere, and in this course we will learn methods for analyzing them.

You are bombarded with data and statistics every day. In fact, according to the sources listed below, 90% of the world's data has been created in the last 2 years, and we produce 2.5 quintillion bytes of data every day! Data appears on your television and computer screens, in advertisements, on radio news reports, in newspapers and magazines, and on websites. You have to deal with streams of data at work, and then again when you get home. The ability to assess the accuracy and relevance of data is one of the most important skills to possess in the Information Age.
Sources for statistics:
https://www.mediapost.com/publications/article/291358/90-of-todays-data-created-in-two-years.html
https://www.forbes.com/sites/bernardmarr/2018/05/21/how-much-data-do-we-create-every-day-the-mind-blowing-stats-everyone-should-read/#51840bb260ba
Data
What is data? How is it classified?
Data can be defined as facts. They may be numerical or not. Data can be collected on almost anything. The characteristic you are collecting data about is called a ‘variable’ (since the observed value can vary). For example, the ‘variable’ being studied could be height, which is numerical or ‘quantitative’. Or the variable could be hair colour, which is not numerical or ‘qualitative’.
Quantitative vs Qualitative is One Way to Categorize Types of Data
Some examples:
Quantitative (or numerical) data | Qualitative (or categorical) data |
---|---|
| |
Qualitative data can be further classified as ‘ordinal’ if it can be ranked (e.g., poor, fair, good, very good), or ‘nominal’ if it cannot be ranked (e.g., eye colour).
Quantitative (i.e., numerical) values can be further classified as discrete or continuous.
Discrete:
whole numbers only (fractions or decimals are not possible). Example: number of children in a family
Continuous:
whole numbers, fractions, or decimals. Example: an outside temperature of 10.5 degrees
Variable | Quantitative or qualitative variable? | If quantitative, is it a discrete or a continuous variable? |
---|---|---|
a) height of a bicycle | Quantitative | Continuous |
b) age of a cat to the nearest year | Quantitative | Discrete |
c) volume of juice in a can | Quantitative | Continuous |
d) names of countries a person has visited | Qualitative | N/A |
e) letter grade in a course | Qualitative | N/A |
f) eye colour | Qualitative | N/A |
g) how someone feels about the current government | Qualitative | N/A |
h) whether someone drinks coffee, tea, both or neither in the morning | Qualitative | N/A |
i) the number of cars you can see from your window | Quantitative | Discrete |
j) whether you have ever gone fishing | Qualitative | N/A |
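The classification rules above can be sketched in a few lines of Python. This is only a rough illustration, not a definitive method: the function name `classify` is made up for this example, and it looks only at the values as they were recorded. A continuous variable that happens to be recorded in whole numbers would be labelled discrete here, just as the age of a cat to the nearest year is in the table.

```python
def classify(values):
    """Classify a list of recorded values as quantitative or qualitative,
    and, if quantitative, as discrete or continuous.

    Hypothetical helper for illustration; it judges only the recorded values.
    """
    if not values:
        raise ValueError("need at least one value to classify")

    def is_number(v):
        # True/False answers are categories, not measurements
        return isinstance(v, (int, float)) and not isinstance(v, bool)

    # Qualitative: at least one value is not a number
    if not all(is_number(v) for v in values):
        return ("qualitative", None)

    # Discrete: every recorded value is a whole number
    if all(float(v).is_integer() for v in values):
        return ("quantitative", "discrete")

    # Otherwise fractions or decimals occur, so the data are continuous
    return ("quantitative", "continuous")


# Made-up sample values for three of the variables discussed above
print(classify([2, 0, 3, 1]))             # number of children in a family
print(classify([10.5, 12.0, 9.8]))        # outside temperature in degrees
print(classify(["blue", "brown", "green"]))  # eye colour
```

Whether qualitative data are ordinal or nominal cannot be decided from the values alone, so the sketch does not attempt it.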
Sources of Data
Part of working with data involves assessing their source and reliability.
Primary source = data you collect
Secondary source = data someone else collected

The benefit of primary data is that you know exactly how they were collected. The problem with secondary data is that their reliability, accuracy, and integrity are often uncertain. Who collected the data? Are the data biased? How old are the data? Can the data be verified, or do they have to be taken on faith? All of these questions can be difficult to answer if you did not collect the data yourself.
Examples of secondary sources: scientific research papers, news reports, Wikipedia, textbooks, websites, etc.
However, some secondary sources are better than others. For example, academic institutions tend to present original information in a less biased way. Textbooks and some encyclopedias can be good as well. The worst secondary sources tend to be those advocating particular opinions, especially personal blogs. It can be hard to determine whether a bias exists in a secondary source, so it is always good to confirm the information with another source.
Can you think of a reliable secondary source that was mentioned earlier in this learning activity?
If you answered Statistics Canada, then you are correct!
Examples
Categorize each data source as primary or secondary.
One Variable vs. Two Variable Data:
Example: One Variable Studies | Example: Two Variable Studies |
---|---|
Is an individual colour blind or not? | Relationship between having a pet and a person’s emotional or physical health. |
Heights of a representative sample of adult Canadians | Relationship between the level of pollution in a city and the average life span of the residents. |
Favourite colour | Relationship between the proportion of Internet subscribers in a neighbourhood and voting patterns in the neighbourhood. |
The number of variables collected affects the way the data are analyzed.
In this unit we will investigate the way two variable data are analyzed. (One variable data analysis is studied in Unit 2.)
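One way to picture the difference is in how the data might be stored. The short Python sketch below is only an illustration: all values and variable names are made up. One variable data hold a single value per person, while two variable data hold a matched pair of values per person.

```python
from collections import Counter

# One variable data: one value per person (made-up favourite colours)
favourite_colour = ["blue", "red", "blue", "green", "blue"]

# A simple one variable summary: count how often each value occurs
tally = Counter(favourite_colour)
print(tally.most_common())

# Two variable data: a matched pair per person (made-up age and course average)
age_and_average = [(15, 78), (16, 82), (17, 75)]

# Each pair must stay together; splitting gives two lists in the same order,
# ready to be compared (for example, on a scatter plot)
ages, averages = zip(*age_and_average)
print(ages)
print(averages)
```

Keeping each pair together is the design choice that matters: the relationship between the two variables is lost if the two lists are ever sorted or shuffled separately.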
Practice Questions:
Variability in Data and Sampling:
In general there is variability in data. Variability can come from two sources:
1. Inherent variability
2. Measurement variability
Inherent Variability: refers to the natural variety of responses among the members of the ‘sample’ surveyed (a smaller group chosen to represent the larger ‘population’). Inherent variability is minimized by choosing appropriate sampling methods, but it cannot be avoided.
Measurement Variability: refers to variability that arises from minor differences in the procedure for taking measurements (human or mechanical). This variability can be minimized by careful experimental design, but it cannot be totally avoided.
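The two sources of variability can be compared with a small Python sketch. All numbers below are made up for illustration, and the helper `spread` (the largest value minus the smallest) is just one simple, assumed way to describe how much a set of values varies.

```python
# Made-up heights (cm) of five DIFFERENT people:
# the differences between them are inherent variability
sample_heights = [158.0, 171.5, 165.2, 180.3, 169.0]

# Made-up repeated measurements (cm) of ONE person:
# the differences between them are measurement variability
repeated_measurements = [171.4, 171.6, 171.5, 171.3, 171.7]


def spread(values):
    """Return the largest value minus the smallest value."""
    return max(values) - min(values)


print("Spread between people:", round(spread(sample_heights), 1))
print("Spread between repeated measurements:", round(spread(repeated_measurements), 1))
```

In this invented example the spread between people is much larger than the spread between repeated measurements, but neither is zero, which matches the point that both kinds of variability can be reduced but not eliminated.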
Examples:
For each situation, determine:
i. the population
ii. the variable being researched
iii. whether the variable is quantitative or qualitative
iv. if quantitative, whether the data are continuous or discrete
a. You are hired by a restaurant to determine how often each customer visits the restaurant each week.
- all customers of the restaurant
- number of visits per week
- quantitative since number of visits is a number
- discrete since the count will always be a whole number (no decimals)
b. The transit commission hires you to record the time between buses at a specific stop.
- buses visiting the bus stop
- time between buses
- quantitative since you are measuring time which is a number
- continuous since time is continuous (can have fractions/decimals)
c. You conduct a study on whether residents in city A have more disposable income than those in city B.
- population of city A and city B
- disposable income
- quantitative since income is a number
- continuous since money can include decimal values (although they are usually rounded to 2 decimal places)
d. You survey members of your community for their opinion on the proposed name of the new community center.
- residents of your community
- opinion on the proposed name of the community center
- qualitative since you are asking their opinion
e. You collect data on the number of cars parked on your street at the top of each hour.
- people who park their car on your street
- the number of cars parked at the top of each hour
- quantitative since you are counting the number
- discrete since you can’t have a fraction of a car
f. You are a marine biologist studying the biodiversity in Long Lake by identifying the species of fish in the lake caught by anglers.
- fish in the lake (of various species)
- species of fish caught by anglers
- qualitative since the species of fish is a name not a number
Ways to Depict 2 Variable Data:
Once data have been collected, they need to be presented effectively to communicate the patterns they contain. The most common ways to show patterns in data are to present them in tables or graphs. Data showing income distribution for an Ontario city are shown below as both a table and graph:
Table:
Annual income | Percentage |
---|---|
Less than $10 000 | 22% |
$10 000 to $25 000 | 29% |
$25 000 to $50 000 | 30% |
$50 000 to $100 000 | 16% |
Greater than $100 000 | 3% |
Graph:
Graphs are often used to display data so that patterns can be observed quickly and easily and then used to draw conclusions. There are many types of graphs to choose from (e.g., pie graph, bar graph, histogram, line graph, scatter plot, etc.).
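As a minimal sketch, the income table above can be turned into a simple text bar graph with Python. The percentages come from the table; the names `income_distribution` and `bar_graph` are made up for this example, and each `#` is assumed to stand for one percentage point.

```python
# Income distribution from the table above: (annual income, percentage)
income_distribution = [
    ("Less than $10 000", 22),
    ("$10 000 to $25 000", 29),
    ("$25 000 to $50 000", 30),
    ("$50 000 to $100 000", 16),
    ("Greater than $100 000", 3),
]

# A quick check: the percentages of a complete distribution should total 100
total = sum(pct for _, pct in income_distribution)
print("Total:", total, "%")


def bar_graph(rows):
    """Return one line of text per row; each '#' represents 1%."""
    width = max(len(label) for label, _ in rows)  # align the bars
    return [f"{label.ljust(width)} | {'#' * pct} {pct}%" for label, pct in rows]


for line in bar_graph(income_distribution):
    print(line)
```

Even in this plain form, the longest and shortest bars stand out more quickly than the corresponding numbers do in the table.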
Examples of reading other types of graphs:
Another common graph is a bar graph.
Can you interpret the trends in this graph?
Another commonly used graph is the pictograph, where symbols represent counts. This graph describes the population in various Ontario towns. The stick person represents 1000 people.
For each question, select the best answer.
Self-check
In this learning activity we have been introduced to statistics and why they are important in our lives. Since statistics involves data, we need to classify data into different types. We looked at these different types of data: quantitative vs qualitative; discrete vs continuous; primary vs secondary; and one vs two variable data. We also looked at the variability that occurs within data, and began looking at how two variable data are displayed.
Complete these practice questions to be sure that you can:
- classify different types of data
- understand why there is variability present within data sets
- read a variety of graphs for trends
Practice Questions:
(Check your answers with those provided.)
1. As an ongoing component of this course you will create a glossary of the many terms we will encounter throughout the units. Each term should have a clear description of the meaning of the term, and an example if appropriate.
Example:
Quantitative (i.e., numerical) data - data that are represented using numbers; also referred to as numerical data (e.g., height).
Begin your glossary of terms with this learning activity and any relevant terms it includes.
4. Provide an example of secondary source data, and explain the difference between it and primary source data.
Answer: Examples will vary. Primary source data are data that you have collected yourself (i.e., you conducted the survey or made the measurements and recorded the data), whereas secondary source data have been collected by someone other than you.
5. You have been given the task of investigating students’ opinions of their school. You plan to ask them two questions:
- Do you enjoy going to school? Yes, No, or Undecided
- What is your overall average in your courses? _______ What is your age? _______
a) Explain why the first question is one variable data, and describe how you might present the results.
Answer: The first question is one variable data since it asks for only one response to one question. The results could be displayed as a pictograph or a pie chart.
b) Explain why the second question is two variable data, and describe how you might present the results.
Answer: The second question is two variable data since it collects two pieces of information from each person. The results could be graphed in a scatter plot to look for any trend between age and overall course average.
Note - The final assessment in Unit 4 for this course is based on the material from Units 1 and 2. The first part will be introduced at the end of Unit 1, and the second part at the end of Unit 2. This will allow time for you to collect data and analyze them using the skills from Units 1 and 2, and to prepare a written report and presentation of your findings.
Reflection:
To determine your understanding of the concepts in this learning activity reflect on these questions:
- Can I classify types of data in a variety of ways?
- Can I describe why there is variation within data sets?
- Can I describe how to present one and two variable data and identify trends in the data?
If you have answered ‘no’ to any of these questions, go back and read through the relevant examples.
If you still have questions, please message your teacher or search for additional examples online.
Reflection

In this course you are an independent, self-directed learner. Consider this definition of a self-directed learner:
Self-directed learners are aware of how they learn best. They are confident and know when to ask for support. Self-directed learners set goals and make realistic plans to meet those goals. In other words, they make a commitment to their own learning and take responsibility for it. Remember you are an independent learner, but you are not alone. At any time you can reach out to your Academic Officer at the ILC for support.
As a self-directed learner, track your progress on these:
Rate your understanding on a scale of five to one.
Five means “I have a thorough understanding.” One means “I am confused.”

You will benefit from keeping an organized notebook, either digital or paper-based. Throughout the course, you will be prompted to reflect on your learning and document evidence of your growth.
Now that you have completed the first learning activity, go back and make notes. Consider adding definitions or terminology. You may even want to start creating your own personal word wall. A word wall is a type of glossary where you add the words you want to remember, including definitions and helpful examples.