Meg Gorzycki, Ed.D. and Alen Tersakyan
Purpose
Confronted with statistical information and representations such as bar graphs, tables, and charts, many students skim over the material, hoping either that it will be explained in class or that it will not appear on a test. Helping students understand this content is important for several reasons. Doing so can:
 Help students understand research in their field of study
 Help students discern the veracity of claims in advertising
 Help students understand the implications of research for their own wellbeing
 Help students become critical readers of political rhetoric
 Help students comprehend the complexity of the world
 Help students understand how statistical representations can embody biases
The purpose of this material is to identify ways instructors can help students understand representations of data and provide examples of exercises and assessments.
Statistical Literacy
Statistical literacy concerns the individual's ability to "decode" information that numerically describes phenomena, is presented in a statistical lexicon, and often uses graphic organizers to summarize data or illustrate patterns and relationships (Gal, 2004; Pfannkuch & Wild, 2004; Schield, 1999). At the basic level, students should be able to:
 Understand central tendency
 Interpret simple charts, graphs, and tables
 Describe the role of variables in research
 Define "sample population" and explain how it is determined
 Explain simple statistical information in narrative form
 Address the meaning and significance of data
What are the Basics?
Introductions to statistics typically address the following (Raykov & Marcoulides, 2013):
 Why understanding statistics and graphic organizers is important
 Understanding mean, mode, median (central tendency)
 Understanding standard deviation
 How to interpret graphs, charts, and tables
 How to identify variables and assess their impact
 How to understand correlations
 How to understand and generate a hypothesis
 How to identify methods of gathering data and their appropriate applications
When Should Instructors Teach Statistics?
Statistical information and representations are found in all disciplines. They routinely appear in scholarly reading in STEM fields, health sciences, social sciences, and economics. To determine whether students in your class will benefit from explicit instruction on statistics and understanding statistical representations, and to integrate skill-building into courses, instructors may take the following steps:
 Preview the assigned reading material and identify where students will encounter statistics and statistical representations, and anticipate whether or not these elements may be problematic for students
 Ask students whether they understand statistics and the way data is represented in their assignments, and determine whether their ability to put the data into their own words is sufficient or deficient
 Develop exercises and class activities that strengthen students' skills to interpret statistics and statistical representation
 Reinforce student learning by administering abundant formative assessments of their work
Sample Exercises
Case Study: The Grading System
A student became very distressed and angry when she learned that her instructor, Professor Owens, was going to assign test grades based on a point-percentage system whereby 97% and up was an “A+,” 94-96% was an “A,” 90-93% was an “A-,” and so forth. The student complained, “None of my other instructors do it this way. You should curve everything so that the grades reflect the students’ level of learning.”
The professor replied, “Are you asking me to grade things on a class bell curve?”
The student said, “Yes, I think that is the only fair way to grade.”
The professor confirmed that the student wanted grades to land along a curve, wherein just as many scores were above average as were below average. She then conducted an analysis and found that 31 students took an exam worth 100 points, and that the scores were as follows from highest to lowest: 98, 98, 98, 97, 96, 96, 82, 80, 74, 70, 70, 69, 69, 69, 68, 68, 67, 66, 63, 63, 63, 63, 58, 53, 53, 52, 50, 49, 49, 48, 44.
The mean score was 69, the mode was 63, and the median was 68. She produced a bell curve based on a normal distribution, with 69 as the average score. In a normal distribution, about 68.3% of the population (here, the professor’s 31 students who took the test) falls within one standard deviation of the mean, which in this case corresponds to scores ranging from 52 to 85.
The professor then pointed out that if the bell curve were used, she would have to award passing grades to students who got less than one half the material correct on the test. She felt that giving students passing grades for mediocre work was not appropriate, especially since the professional community and employers are counting on the professor to maintain high standards of knowledge and skill.
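The figures in this case study can be reproduced with a few lines of Python, which may be useful as a classroom demonstration. The sketch below uses only the 31 scores listed above. One assumption is ours rather than the case study's: the class is treated as a complete population, so the population standard deviation (`pstdev`) is used instead of the sample version.

```python
from statistics import mean, median, mode, pstdev

# The 31 exam scores listed in the case study, highest to lowest.
scores = [98, 98, 98, 97, 96, 96, 82, 80, 74, 70, 70, 69, 69, 69, 68, 68,
          67, 66, 63, 63, 63, 63, 58, 53, 53, 52, 50, 49, 49, 48, 44]

avg = mean(scores)     # arithmetic mean
mid = median(scores)   # middle value of the sorted scores
most = mode(scores)    # most frequently occurring score

# Assumption: the class is the whole population, so pstdev is used.
sd = pstdev(scores)

# Span covered by one standard deviation on either side of the mean.
low, high = avg - sd, avg + sd

# Count how many actual scores fall inside that span.
within = sum(low <= s <= high for s in scores)

print(f"n = {len(scores)}")
print(f"mean = {avg:.1f}, median = {mid}, mode = {most}")
print(f"standard deviation = {sd:.1f}")
print(f"mean +/- 1 SD spans {low:.1f} to {high:.1f}")
print(f"{within} of {len(scores)} scores fall in that span")
```

Comparing the last line of output with the 68.3% expected under a normal distribution gives students a concrete way to ask whether these scores are in fact bell-shaped.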
Instructional Suggestions
 Clearly communicate the grading system in the syllabus and present a rationale
 Take class time to review the concepts of central tendency (mean, mode, median), standard deviation, and normal distribution
 Facilitate class discussion about the advantages and disadvantages of using a bell curve to represent grades and to “norm” human behavior
The Teacher's Teacher
A student aspiring to be a school administrator submitted a research paper to his instructor, who gave the essay a “C.” Frustrated, the student claimed that the report was thorough, accurate, well-organized, and well-documented. The instructor disagreed and pointed to a graph entitled “National 8th Grade Reading Scores, 2007-2015,” which appeared as follows:
The expository writing related to the graph read in part:
The national average on the reading test for all students in 8th grade in 2015 was 265. The highest score possible on the reading test was 500, so a score of 265 means that on average, 8th graders only got 53% of the test questions right. The threshold for basic level reading skills is a score of about 243, while the threshold for proficient reading is 281, and a score of 323 or better represents advanced level reading. Thus, on average, 8th graders are reading at a level just below the level of proficiency.
The instructor then patiently asked the student a series of questions regarding his research, during which the student discovered:
 The statement about “all students in 8th grade” could not possibly be true. His report should have alerted readers that although the target population of 8th graders in 2015 was 3,911,000, a sample of 139,000 produced the data in his essay. This means that roughly 3.5% of the target population was represented in the testing.
 The statement about the average indicating that students on average got 53% of the test correct may or may not be true. The student did not report anything on how the test was scored, and so the assertion is something that might require further research.
 The graph itself is accurate but could be enhanced. The y-axis ranges from 260 to 269, and thus does not reveal the distance between the average and a perfect score; it also tends to exaggerate the jump in scores from 2011 to 2013. A graph illustrating the full range of possible scores would show readers that the average reading score has not changed dramatically over time and is rather flat.
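The arithmetic behind the instructor's questions is simple enough to check directly. The sketch below uses only the numbers quoted in the student's report and in the discussion above. The function name `reading_level` is our own, and the basic threshold is treated as exactly 243 even though the report gives it as approximate.

```python
# Figures quoted in the student's report and the instructor's critique.
average_score = 265
max_score = 500
sample_size = 139_000
target_population = 3_911_000

# The average as a share of the highest possible score. Whether this equals
# the share of questions answered correctly depends on how the test is
# scored, which the report does not say.
score_share = average_score / max_score

# The share of all 8th graders who were actually tested.
sampled_share = sample_size / target_population


def reading_level(score):
    """Classify a score using the thresholds quoted in the report.

    Assumption: the 'about 243' basic threshold is treated as exactly 243.
    """
    if score >= 323:
        return "advanced"
    if score >= 281:
        return "proficient"
    if score >= 243:
        return "basic"
    return "below basic"


print(f"average as share of maximum: {score_share:.0%}")
print(f"share of population sampled: {sampled_share:.2%}")
print(f"level of the average score:  {reading_level(average_score)}")
print(f"points short of proficient:  {281 - average_score}")
```

A calculation like this separates what the numbers support (the average lies between the basic and proficient thresholds) from what they do not (that students answered 53% of the questions correctly).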
The Scale of Graph
A student examined a graph in which researchers illustrated the differences between the test scores of two populations in a test. The experimental group received a high dose of caffeine shortly before the test, while the control group received no caffeine before the test. At a glance, the differences between the two scores appeared to be dramatic, and so the student concluded that caffeine has a tremendously adverse effect on test-taking. Review the following two graphs (Figure 1 and Figure 2), and then address the subsequent questions.
Figure 1: Average Test Scores of Students with and without Caffeine
Figure 2: Average Test Scores of Students with and without Caffeine
Questions:
 How does the format of the graph influence the scale of differences represented in the scores?
 What should readers do when they read graphs to avoid making errors in their interpretations of these differences?
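One way to make the effect of scale concrete is to compare how tall two bars are drawn when the vertical axis starts at zero and when it starts just below the data. The sketch below is illustrative only: the two averages and the function name `drawn_height_ratio` are hypothetical and are not taken from Figure 1 or Figure 2.

```python
def drawn_height_ratio(taller, shorter, axis_min=0.0):
    """Return how many times taller one bar is drawn than the other
    when the vertical axis starts at axis_min instead of zero."""
    if axis_min >= shorter:
        raise ValueError("axis_min must be below both values")
    return (taller - axis_min) / (shorter - axis_min)


# Hypothetical group averages (not taken from Figure 1 or Figure 2).
no_caffeine = 78.0
caffeine = 74.0

full_axis = drawn_height_ratio(no_caffeine, caffeine, axis_min=0.0)
cut_axis = drawn_height_ratio(no_caffeine, caffeine, axis_min=72.0)

print(f"axis from 0:  taller bar is drawn {full_axis:.2f} times as tall")
print(f"axis from 72: taller bar is drawn {cut_axis:.2f} times as tall")
```

The underlying difference is four points in both cases; only the starting point of the axis changes how large that difference looks on the page.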
The Hidden Variables
The previous study of caffeine’s influence on students’ test scores provides a second lesson on the importance of close reading and critical thinking. Read the following narrative from a fictitious report, and then discuss the questions that follow.
Researchers found that the ingestion of high doses of caffeine adversely impacts students’ test scores. They observed a consistent trend whereby those who had no caffeine prior to the test scored slightly better than those who had high doses of caffeine before the test, and the differences in test scores were sustained across all classes.
Questions
 The test scores appear to increase slightly by class level in both the experimental and control groups; what does this suggest about the results?
 What other variables might have affected the test results?
The Trouble with Tables
Situating data in tables provides readers with a quick way to understand research findings, but like graphs, they can be difficult to interpret. Take the following quiz that requires readers to interpret a set of tables, then review the answers and provide insight on how to accurately read each table. (Please note, these tables are not based on actual studies).

1. Among U.S. children ages 12-17 in 2005, 34.6% represents:
Percentage of Children Ages 12-17 in U.S. Who Weekly Search Internet Pornography
| Year | All | Male | Female | White | Black | Hispanic | Asian |
|------|------|------|--------|-------|-------|----------|-------|
| 1995 | 1.8 | 3.6 | 1.1 | 12.5 | 2.7 | 2.6 | 1.2 |
| 2000 | 4.4 | 6.7 | 2.6 | 16.7 | 9.5 | 5.6 | 1.8 |
| 2005 | 21.3 | 16.4 | 6.5 | 34.6 | 11.8 | 6.7 | 2.0 |
| 2010 | 48.4 | 37.7 | 18.6 | 42.5 | 21.7 | 10.2 | 4.7 |
| 2015 | 57.3 | 53.9 | 20.1 | 57.3 | 27.8 | 13.2 | 9.7 |
a. The percentage of children who search Internet pornography that are white
b. The percentage of white children in the survey of children’s Internet searches
c. The percentage of white children that searched Internet pornography
2. Which statement best describes the 18.3% (men, 1995, ages 30-34) in this table?
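Percent Distribution of First Marriages by Age and Gender
| Gender and Marital Status | Total | Under 20 | 20-24 | 25-29 | 30-34 | 35-45 | 46-60 | Over 60 |
|---------------------------|-------|----------|-------|-------|-------|-------|-------|---------|
| Men | | | | | | | | |
| 1980 | 100 | 22.4 | 37.6 | 21.3 | 6.2 | 5.7 | 4.8 | 2.0 |
| 1985 | 100 | 21.6 | 35.8 | 18.4 | 15.6 | 4.8 | 2.7 | 1.1 |
| 1990 | 100 | 18.5 | 30.4 | 27.7 | 15.2 | 4.3 | 2.7 | 1.2 |
| 1995 | 100 | 16.5 | 26.9 | 27.5 | 18.3 | 6.7 | 2.4 | 1.7 |
| Women | | | | | | | | |
| 1980 | 100 | 24.7 | 37.8 | 21.3 | 10.7 | 3.0 | 1.0 | 1.5 |
| 1985 | 100 | 22.1 | 36.2 | 19.4 | 15.4 | 4.1 | 1.7 | 1.1 |
| 1990 | 100 | 17.8 | 20.5 | 27.2 | 15.7 | 12.0 | 4.0 | 2.8 |
| 1995 | 100 | 15.2 | 20.9 | 29.7 | 17.6 | 8.4 | 6.4 | 1.8 |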
a. In 1995, 18.3% of all men got married between the ages of 30 and 34
b. In 1995, 18.3% of all men ages 30-34 who married were married for the first time
c. In 1995, 18.3% of first-time marriages for men were for men aged 30-34
3. Which assertion is true regarding the data in this table?
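2010 Auto Accidents in the U.S. Involving Cell Phones
| Driver’s Gender | White | African American | Hispanic | Asian | Native American | All |
|-----------------|--------|------------------|----------|-------|-----------------|--------|
| Male | 9,954 | 4,672 | 3,015 | 2,877 | 1,982 | 22,500 |
| Female | 4,557 | 3,201 | 1,394 | 1,415 | 933 | 11,500 |
| All | 14,511 | 7,873 | 4,509 | 4,192 | 2,915 | 34,000 |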
a. In 2010, there were a total of 34,000 auto accidents in the U.S.
b. In 2010, white female drivers were involved in nearly the same number of accidents as all Hispanic drivers, male and female combined
c. The last number in each column represents 100% of the population named atop the column
d. All of the above are true
e. Only a and c are true
One of the keys to understanding each of the three sample tables is to determine whether any of the cells in the table represents a total or 100 percent of a population, and if so, which cell that is. Here are some tips for each of the sample tables and the quiz questions.
 The correct answer for the first question regarding children and Internet searches is “C.” Note that each column in the table, except for “Year” and “All,” represents 100 percent of the children in the survey who fit that description. Hence, one way to correctly answer the question is to rephrase it: “Of all the white children ages 12-17, what percentage searched the Internet for pornography on a weekly basis?” Note also that none of the rows or columns adds up to 100 percent, or to a grand total for each category. This reinforces the fact that each percentage under a column heading is a percentage of that group as a whole. It also underscores that the table may not speak for all children ages 12-17, as the column headings do not include children of mixed race, Native Americans, or those of Middle Eastern descent.
 The correct response for the second table is “C.” It is helpful to focus on the title of the table and think about what is being measured. In this case, the data do not represent multiple marriages by age, and so answer “B” cannot be true. The research also did not study all men; it studied only the age at which men and women entered their first marriage, and it does not include people who never married, so answer “A” cannot be true either. The column labeled “Total” indicates that the cells to its right together account for 100 percent, so the percentages in each row (across the age-range columns) should add up to 100 percent. Readers should also note that this table is subdivided so that the data for first marriage by age are arranged by gender.
 The correct response for the third example is “C.” Readers should note that the column labeled “All” appears together with a row labeled “All.” This means that there is more than one total represented in the table. Just as the numbers in the last column may be added to reach a grand total, so too may the numbers in the last row. It is possible to derive percentages from these numbers, but they are not the focus of the table.
 Each table allows readers to quickly make comparisons of discrete populations in the studies. In addition, tables are useful in understanding trends over time, as illustrated in the first and second example.
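A quick way to tell what the cells of a table represent is to add up a row and see whether it reaches 100 percent. The sketch below does this for the first-marriage table and, for contrast, for the 2005 row of the Internet-search table. The values are copied from the sample tables above; the names `first_marriages` and `row_total` are our own.

```python
# Age-range percentages from the first-marriage table, keyed by gender and
# year. Each list runs from "Under 20" through "Over 60".
first_marriages = {
    ("Men", 1980): [22.4, 37.6, 21.3, 6.2, 5.7, 4.8, 2.0],
    ("Men", 1985): [21.6, 35.8, 18.4, 15.6, 4.8, 2.7, 1.1],
    ("Men", 1990): [18.5, 30.4, 27.7, 15.2, 4.3, 2.7, 1.2],
    ("Men", 1995): [16.5, 26.9, 27.5, 18.3, 6.7, 2.4, 1.7],
    ("Women", 1980): [24.7, 37.8, 21.3, 10.7, 3.0, 1.0, 1.5],
    ("Women", 1985): [22.1, 36.2, 19.4, 15.4, 4.1, 1.7, 1.1],
    ("Women", 1990): [17.8, 20.5, 27.2, 15.7, 12.0, 4.0, 2.8],
    ("Women", 1995): [15.2, 20.9, 29.7, 17.6, 8.4, 6.4, 1.8],
}

# The 2005 row of the Internet-search table, excluding "Year" and "All":
# Male, Female, White, Black, Hispanic, Asian.
internet_2005 = [16.4, 6.5, 34.6, 11.8, 6.7, 2.0]


def row_total(percentages):
    """Sum a row, rounded to one decimal place to absorb float error."""
    return round(sum(percentages), 1)


marriage_totals = {key: row_total(row) for key, row in first_marriages.items()}
internet_total = row_total(internet_2005)

for (gender, year), total in marriage_totals.items():
    print(f"{gender} {year}: row adds up to {total}")
print(f"Internet search table, 2005: row adds up to {internet_total}")
```

When a row adds up to 100, each cell is a share of one whole, as in the marriage table. When it does not, each cell is a percentage of its own group, as in the Internet-search table.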
Smoky Stacks
Stacked charts are sometimes difficult to interpret, in part because many readers are accustomed to seeing comparisons illustrated by separate bars in a graph rather than by segments within a single bar. Read the following chart (based on fictitious data) and answer the question that follows.

Which statement is accurate?
a. Twice as many people in Sweden as in Italy generally oppose ownership of semiautomatic weapons
b. About 30% of Russians generally support ownership of semiautomatic weapons with exceptions
c. There is a stronger consensus against public ownership of semiautomatic weapons in Japan than there is in France
d. About one in four Americans absolutely oppose citizens’ ownership of semiautomatic weapons without exception
The correct response to the question is “B.” In general, each column in a stacked graph represents 100 percent of a given population or thing. The difficulty in reading such graphs comes from the scattered middle ground. Notice how the middle responses to the question about gun ownership are not aligned across the graph. At a glance, it may seem that Japan has a stronger consensus on the matter than France, because more Japanese subjects than French subjects absolutely object to citizens’ ownership of semiautomatic weapons; but in both countries, only 4.5 to 5.5 percent support such ownership in any way, and in both cases, roughly 95 percent oppose such ownership in one form or another.
References
Gal, I. (2004). Statistical literacy: Meanings, components, responsibilities. In Dani Ben-Zvi and Joan Garfield (Eds.), The Challenge of developing statistical literacy, pp. 17-46. Netherlands: Kluwer Publishers.
Pfannkuch, M. & Wild, C. (2004). Towards an understanding of statistical literacy. In Dani Ben-Zvi and Joan Garfield (Eds.), The Challenge of developing statistical literacy, pp. 3-15. Netherlands: Kluwer Publishers.
Raykov, T., & Marcoulides, G. A. (2013). Basic statistics: An introduction with R. Lanham, MD: Rowman & Littlefield Publishers, Inc.
Schield, M. (1999). Statistical literacy: Thinking critically about statistics. Association of Public Data Users. Retrieved from: http://www.statlit.org/pdf/1999SchieldAPDU.pdf