Statistics
Statistics are everywhere, and in this course we will learn methods for analyzing the statistics we are bombarded with every day.
According to the sources listed below, 90% of the world's data was created in the last 2 years, and we produce 2.5 quintillion bytes of data every day! Data appear on your television and computer screens, in advertisements, on radio news reports, in newspapers and magazines, and on websites. You have to deal with streams of data at work, and then again when you get home. The ability to assess the accuracy and relevance of data is one of the most important skills to possess in the Information Age.
Sources for statistics:
https://www.mediapost.com/publications/article/291358/90oftodaysdatacreatedintwoyears.html
https://www.forbes.com/sites/bernardmarr/2018/05/21/howmuchdatadowecreateeverydaythemindblowingstatseveryoneshouldread/#51840bb260ba
Data
What is data? How is it classified?
Data can be defined as facts, which may or may not be numerical. Data can be collected on almost anything. The characteristic you are collecting data about is called a ‘variable’ (since the observed value can vary). For example, the variable being studied could be height, which is numerical or ‘quantitative’. Or the variable could be hair colour, which is not numerical or ‘qualitative’.
Quantitative vs Qualitative is One Way to Categorize Types of Data
Some examples:
Quantitative (or numerical) data 
Qualitative (or categorical) data 



Qualitative data can be further classified as ‘ordinal’ if it can be ranked (e.g., poor, fair, good, very good), or ‘nominal’ if it cannot be ranked (e.g., eye colour).
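The difference between ordinal and nominal data can be sketched in a few lines of Python. This is only an illustration: the names RATING_ORDER and sort_ordinal are made up for this example, and the survey responses are invented. The ranking itself is the rating scale given above.

```python
# Ordinal data have a natural ranking; nominal data (e.g., eye colour) do not.
# RATING_ORDER and sort_ordinal are hypothetical names used only in this sketch.
RATING_ORDER = ["poor", "fair", "good", "very good"]  # scale from the text


def sort_ordinal(responses, order=RATING_ORDER):
    """Return the responses sorted from lowest to highest rank."""
    rank = {label: position for position, label in enumerate(order)}
    return sorted(responses, key=lambda label: rank[label])


survey = ["good", "poor", "very good", "fair", "good"]  # invented responses
print(sort_ordinal(survey))
```

Nominal data such as eye colour have no equivalent of RATING_ORDER, so there is no meaningful way to sort them from lowest to highest.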
Quantitative (i.e., numerical) values can be further classified as discrete or continuous.
Discrete: whole numbers only (fractions or decimals are not possible). Example: number of children in a family.
Continuous: whole numbers, fractions, or decimals are possible. Example: an outside temperature of 10.5 degrees.
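As a rough sketch, the classification rules above could be written as a small Python function. The function name classify_values is hypothetical, and the approach is only a heuristic: it looks at the recorded values, whereas the real classification depends on the variable itself (heights recorded to the nearest centimetre would look discrete here even though height is continuous).

```python
def classify_values(values):
    """Guess the data type of a list of observations (a rough heuristic only)."""
    # Anything that is not a plain number is treated as qualitative.
    # bool is a subclass of int in Python, so it is excluded explicitly.
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    if not numeric:
        return "qualitative"
    # Whole numbers only suggests a count, i.e., discrete data.
    if all(float(v).is_integer() for v in values):
        return "quantitative, discrete"
    return "quantitative, continuous"


print(classify_values([2, 0, 3, 1]))          # children in a family
print(classify_values([10.5, 12.0, -3.2]))    # outside temperatures
print(classify_values(["brown", "blue"]))     # eye colours
```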
Is each variable quantitative or qualitative? If quantitative, is it discrete or continuous?

Variable | Quantitative or qualitative? | Discrete or continuous?
a) height of a bicycle | Quantitative | Continuous
b) age of a cat to the nearest year | Quantitative | Discrete
c) volume of juice in a can | Quantitative | Continuous
d) names of countries a person has visited | Qualitative | N/A
e) letter grade in a course | Qualitative | N/A
f) eye colour | Qualitative | N/A
g) how someone feels about the current government | Qualitative | N/A
h) whether someone drinks coffee, tea, both or neither in the morning | Qualitative | N/A
i) the number of cars you can see from your window | Quantitative | Discrete
j) whether you have ever gone fishing | Qualitative | N/A
Sources of Data
Part of working with data involves assessing the source and its reliability.
Primary source = data you collect
Secondary source = data someone else collected
The benefit of primary data is that you know exactly how they were collected. The problem with secondary data is that the reliability, accuracy, and integrity of the data are often uncertain. Who collected the data? Are the data biased? How old are the data? Can the data be verified, or do they have to be taken on faith? All of these questions can be difficult to answer if you did not collect the data yourself.
Examples of secondary sources: scientific research papers, news reports, Wikipedia, textbooks, websites, etc.
However, some secondary sources are better than others. For example, academic institutions tend to present original information in a less biased way. Textbooks and some encyclopedias can be good as well. The worst secondary sources tend to be those advocating particular opinions, especially personal blogs. It can be hard to determine whether a bias exists in a secondary source, so it is always good to confirm the information with another source.
Can you think of a reliable secondary source that was mentioned earlier in this learning activity?
If you answered Statistics Canada, then you are correct!
One-Variable vs. Two-Variable Data:
Example: One-Variable Studies | Example: Two-Variable Studies
Is an individual colour blind or not? | Relationship between having a pet and a person’s emotional or physical health.
Heights of a representative sample of adult Canadians | Relationship between the level of pollution in a city and the average life span of the residents.
Favourite colour | Relationship between the proportion of Internet subscribers in a neighbourhood and voting patterns in the neighbourhood.
The number of variables collected affects the way the data are analyzed.
In this unit we will investigate the way two-variable data are analyzed. (One-variable data analysis is studied in Unit 2.)
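One way to picture the difference is in how the data would be stored. The sketch below is only an illustration in Python: all of the values are invented, and the variable names are hypothetical. One-variable data hold a single value per individual, while two-variable data hold a pair of values per individual.

```python
from collections import Counter

# One-variable data: a single value recorded per person (invented responses).
favourite_colours = ["blue", "red", "blue", "green", "blue", "red"]
colour_counts = Counter(favourite_colours)  # tally how often each colour occurs
print(colour_counts.most_common())

# Two-variable data: a pair of values recorded per city (invented numbers).
# Each tuple is (pollution index, average life span in years).
city_data = [(12, 82.1), (35, 79.4), (58, 76.8)]
for pollution, life_span in city_data:
    print(f"pollution index = {pollution:>3}, life span = {life_span} years")
```

A tally is a natural summary for one variable, whereas keeping the values paired is what makes it possible to look for a relationship between two variables.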
Variability in Data and Sampling:
In general there is variability in data. Variability can come from two sources:
1. Inherent variability
2. Measurement variability
Inherent variability: refers to the variety of responses possible from the ‘sample’ surveyed (a smaller group representing the larger ‘population’). Inherent variability can be minimized by choosing appropriate sampling methods, but it cannot be avoided entirely.
Measurement variability: refers to variability arising from minor differences in the procedure for taking measurements (human or mechanical). This variability can be minimized by careful experimental design, but it cannot be avoided entirely.
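The two sources of variability can be illustrated with a small simulation. This Python sketch rests entirely on assumptions made for the example: the heights are randomly generated (centred on 170 cm), the measurement error is assumed to be about half a centimetre, and the helper name value_range is hypothetical. The range (largest value minus smallest value) is used as a simple measure of spread.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch gives repeatable numbers


def value_range(values):
    """Spread of the data: largest value minus smallest value."""
    return max(values) - min(values)


# Inherent variability: different individuals genuinely differ.
# Simulated (not real) heights in cm for a sample of 50 people.
true_heights = [rng.gauss(170, 8) for _ in range(50)]

# Measurement variability: measuring the SAME person 50 times gives
# slightly different readings (assumed error of roughly 0.5 cm).
one_person = true_heights[0]
repeat_readings = [one_person + rng.gauss(0, 0.5) for _ in range(50)]

inherent_range = value_range(true_heights)
measurement_range = value_range(repeat_readings)
print(f"range across individuals:    {inherent_range:.1f} cm")
print(f"range across repeat readings: {measurement_range:.1f} cm")
```

In this simulated setup, the spread across individuals is much larger than the spread across repeat readings, but neither is zero, which matches the point that both kinds of variability can be reduced but not eliminated.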
Examples:
For each situation determine
i. the population
ii. the variable being researched
iii. whether the variable is quantitative or qualitative
iv. if quantitative, whether the data are continuous or discrete
a. You are hired by a restaurant to determine how often each customer visits the restaurant each week.
i. all customers of the restaurant
ii. number of visits per week
iii. quantitative, since the number of visits is a number
iv. discrete, since the count will always be a whole number (no decimals)
b. The transit commission hires you to record the time between buses at a specific stop.
i. buses visiting the bus stop
ii. time between buses
iii. quantitative, since you are measuring time, which is a number
iv. continuous, since time is continuous (can have fractions/decimals)
c. You conduct a study on whether residents in city A have more disposable income than those in city B.
i. residents of city A and city B
ii. disposable income
iii. quantitative, since income is a number
iv. continuous, since money can include decimal values (although they are usually rounded to 2 decimal places)
d. You survey members of your community for their opinion on the proposed name of the new community center.
i. residents of your community
ii. opinion on the proposed name of the community center
iii. qualitative, since you are asking for their opinion
e. You collect data on the number of cars parked on your street at the top of each hour.
i. people who park their car on your street
ii. the number of cars parked at the top of each hour
iii. quantitative, since you are counting the number
iv. discrete, since you can’t have a fraction of a car
f. You are a marine biologist studying the biodiversity in Long Lake by identifying the species of fish in the lake caught by anglers.
i. fish in the lake (of various species)
ii. species of fish caught by anglers
iii. qualitative, since the species of fish is a name, not a number
Ways to Depict 2 Variable Data:
Once data have been collected, they need to be presented effectively to communicate the patterns they contain. The most common ways to show patterns in data are to present them in tables or graphs. Data showing income distribution for an Ontario city are shown below as both a table and graph:
Table:
Annual income | Percentage
Less than $10 000 | 22%
$10 000 to $25 000 | 29%
$25 000 to $50 000 | 30%
$50 000 to $100 000 | 16%
Greater than $100 000 | 3%
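The same table can be turned into a simple text bar graph. The Python sketch below uses the percentages from the table; the names income_distribution and text_bar_chart are hypothetical, and drawing one "#" per percentage point is just a choice made for this illustration. It also checks that the percentages add up to 100%.

```python
# Income distribution from the table above (percentages of residents).
income_distribution = {
    "Less than $10 000": 22,
    "$10 000 to $25 000": 29,
    "$25 000 to $50 000": 30,
    "$50 000 to $100 000": 16,
    "Greater than $100 000": 3,
}

# The categories cover everyone, so the percentages should total 100.
total = sum(income_distribution.values())
print(f"total: {total}%")


def text_bar_chart(table, symbol="#"):
    """Return one line per category, with one symbol per percentage point."""
    width = max(len(label) for label in table)  # align the bars
    return [
        f"{label:<{width}} | {symbol * pct} {pct}%"
        for label, pct in table.items()
    ]


for line in text_bar_chart(income_distribution):
    print(line)
```

Reading the bars makes the pattern easy to see: the three lowest income groups have bars of similar length, and the bar for the highest income group is by far the shortest.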
Graph:
Graphs are often used to display data so that patterns can be observed quickly and easily and then used to draw conclusions. There are many types of graphs to choose from (e.g., pie graph, bar graph, histogram, line graph, scatter plot, etc.).
Examples of reading other types of graphs:
Another common graph is a bar graph.
Can you interpret the trends in this graph?
Another commonly used graph is the pictograph, where symbols represent counts. This graph describes the population in various Ontario towns. The stick person represents 1000 people.
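Reading a pictograph is a matter of counting symbols and multiplying by what each symbol represents. The Python sketch below shows this arithmetic under stated assumptions: the scale of 1000 people per symbol comes from the text, but the town names and populations are invented placeholders, the function names are hypothetical, and populations are assumed to be whole multiples of 1000 (any remainder is simply dropped).

```python
PEOPLE_PER_SYMBOL = 1000  # from the text: one stick person = 1000 people


def pictograph_row(town, population, symbol="*"):
    """Draw one row of a pictograph; remainders under 1000 are dropped."""
    return f"{town}: {symbol * (population // PEOPLE_PER_SYMBOL)}"


def read_pictograph(symbol_count):
    """Convert a count of symbols back into a population."""
    return symbol_count * PEOPLE_PER_SYMBOL


# Invented towns and populations, used only to show the idea.
towns = {"Town A": 3000, "Town B": 5000, "Town C": 2000}
for town, population in towns.items():
    print(pictograph_row(town, population))

print(read_pictograph(5))  # five stick people
```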
Self-check
In this learning activity we have been introduced to statistics and why they are important in our lives. Since statistics involves data, we need to classify data into different types. We looked at these different types of data: quantitative vs qualitative; discrete vs continuous; primary vs secondary; and one-variable vs two-variable data. We also looked at the variability that occurs within data, and began looking at how two-variable data are displayed.
Complete these practice questions to be sure that you can:
 classify different types of data
 understand why there is variability present within data sets
 read a variety of graphs for trends
Practice Questions:
(Check your answers with those provided.)
1. As an ongoing component of this course you will create a glossary of the many terms we will encounter throughout the units. Each term should have a clear description of the meaning of the term, and an example if appropriate.
Example:
Quantitative data: data that are represented using numbers; also referred to as numerical data (e.g., height).
Begin your glossary of terms with this learning activity and any relevant terms it includes.
4. Provide an example of secondary source data, and explain the difference between it and primary source data.
Examples will vary. Primary source data is data that you have collected yourself (i.e., you conducted the survey or made the measurements and recorded the data), whereas secondary data has been collected by someone other than you.
5. You have been given the task of investigating students’ opinions of their school. You plan to ask them two questions:
 Do you enjoy going to school? Yes, No, or Undecided
 What is your overall average in your courses? _______ What is your age?___________
a) Explain why the first question is one-variable data, and describe how you might present the results.
Answer: The first question is one-variable data since it collects only one piece of information from each person. The results could be displayed as a pictograph or a pie chart.
b) Explain why the second question is two-variable data, and describe how you might present the results.
Answer: The second question is two-variable data since it collects two pieces of information from each person. The results could be graphed in a scatter plot to look for any trend between age and overall course average.
Note: The final assessment in Unit 4 for this course is based on the material from Units 1 and 2. The first part will be introduced at the end of Unit 1, and the second part at the end of Unit 2. This will allow time for you to collect data and analyze them using the skills from Units 1 and 2, and to prepare a written report and presentation of your findings.
Reflection:
To determine your understanding of the concepts in this learning activity, reflect on these questions:
 Can I classify types of data in a variety of ways?
 Can I describe why there is variation within data sets?
 Can I describe how to present one-variable and two-variable data and identify trends in the data?
If you have answered ‘no’ to any of these questions, go back and read through the relevant examples.
If you still have questions, please message your teacher or search for additional examples online.
Reflection
In this course you are an independent, self-directed learner. Consider this definition of a self-directed learner:
Self-directed learners are aware of how they learn best. They are confident and know when to ask for support. Self-directed learners set goals and make realistic plans to meet those goals. In other words, they make a commitment to their own learning and take responsibility for it. Remember you are an independent learner, but you are not alone. At any time you can reach out to your Academic Officer at the ILC for support.
As a self-directed learner, track your progress on these:
Rate your understanding on a scale of five to one.
Five means “I have a thorough understanding.” One means “I am confused.”
You will benefit from keeping an organized notebook, either digital or paper-based. Throughout the course, you will be prompted to reflect on your learning and document evidence of your growth.
Now that you have completed the first learning activity, go back and make notes. Consider adding definitions or terminology. You may even want to start creating your own personal word wall. A word wall is a type of glossary where you add the words you want to remember, including definitions and helpful examples.