Correlation doesn’t equal causation (but it does equal a lot of other things)

Correlation ≠ causation

Data are pieces of information, like the number of books checked out at the library or reference questions asked. Those pieces of information are simply points on a chart or numbers in a spreadsheet until someone interprets their meaning. People create charts and graphs so that we can visualize that meaning more easily. However, sometimes the visualization misleads us and we come to the wrong conclusions. Such is the case when we confuse correlation (a statistical measurement of how two variables move in relation to each other) with causation (a cause-and-effect relationship). In other words, we assume one thing is the result of the other when that might not be the case.

Strong correlation = predictability

The confusion often occurs when we see what’s called a strong correlation—when we can predict with a high level of accuracy the values of one variable based on the values of the other. As an example, let’s say we notice our library is busier during the hotter months of the year, so we start writing down the temperature and number of people in the library each day. Our two variables are temperature and number of people. A graph representing these data might look like this: 

This graph is called a scatterplot, and researchers often use it to visualize data and identify any trends that might be occurring. In this case, it looks like as the temperature increases, more people are visiting the library. We would call this a strong positive correlation, which means both variables are moving in the same direction with a high level of predictability.

Correlation = positive or negative; weak or strong

You can also have a strong negative correlation, which would show one value increasing as the other decreases. It would look something like this, where the number of housing insecure patrons in the library is decreasing as the temperature outside increases.

The closer the points are to forming a compact sloped line, the stronger the correlation appears. If the points were more scattered, but we could still see them trending up or down, we would call that a “weak” correlation. In a weak correlation the values of one variable are related to the other, but with many exceptions. 

Correlation = a statistical measurement known as r

Without getting too deep into statistical calculations, you can determine how strong a correlation is by the correlation coefficient, which is also called r. Values for r always fall between 1 and -1. 

  • The closer r is to 1, the stronger the positive correlation is. In the first example graph above, if r = 1, this would mean there is a uniform increase in temperature and patrons visiting the library, with no exceptions. An 80-degree day would always have more visitors than a 75-degree day. The points on the graph would form a straight line sloping up. 
  • The closer r is to -1, the stronger the negative correlation is. In the second example graph above, if r = -1, this would mean there is a uniform increase in temperature and decrease in housing insecure patrons visiting the library, with no exceptions. A 40-degree day would always have fewer housing insecure patrons than a 35-degree day. The points on the graph would form a straight line sloping down.
  • The correlation becomes weaker as r approaches 0, with a value of 0 meaning there is no linear relationship at all. Knowing the value of one variable tells us nothing about which way the other will move. In the first example above, if r = 0, one 80-degree day may have more visitors than a 40-degree day, whereas a second 80-degree day may have fewer visitors than a 40-degree day. There is no consistent pattern.
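
If you are curious what that calculation looks like, here is a minimal sketch in Python. The readings are invented for illustration (they are not the data behind the charts above), and pearson_r is a hypothetical helper name. It uses the standard formula for the Pearson correlation coefficient.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient r for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How much the two variables move together around their means.
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined when one variable never changes")
    return co_movement / sqrt(spread_x * spread_y)


# Hypothetical daily readings, made up for this example.
temperatures = [40, 50, 60, 70, 80]
all_visitors = [110, 120, 130, 140, 150]         # rises in lockstep with temperature
housing_insecure_visitors = [50, 45, 40, 35, 30]  # falls in lockstep with temperature
no_pattern_visitors = [120, 100, 140, 100, 120]   # no consistent trend

r_positive = pearson_r(temperatures, all_visitors)
r_negative = pearson_r(temperatures, housing_insecure_visitors)
r_none = pearson_r(temperatures, no_pattern_visitors)

print(round(r_positive, 3), round(r_negative, 3), round(r_none, 3))
```

The three invented datasets land at the three reference points described above: a perfect upward line, a perfect downward line, and no linear pattern. Real library data would fall somewhere in between.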

Correlation = an observed association 

Let’s focus on the first chart. If we did this calculation, we would find that r = 0.947. Should we conclude that high outside temperatures cause more people to visit the library? Does that mean we should crank up the air conditioning so we can draw in more visitors? Not so fast.

All we can conclude from these data is that there is an association between the outside temperature and people in the library. It’s a good first step to figuring out what is going on, but it’s not possible to conclude temperature causes people to visit or not visit the library. There could be other causes at play. We call these lurking variables.

Correlation (might) = something else entirely

A lurking variable is a variable that we have not measured, but that affects the relationship between the other two variables (outside temperature and number of people in the library). Warmer weather usually occurs in the summertime when kids are out of school. So the increase in the number of people could be because of your summer reading program and kids having more time to come visit. The temperature outside might also affect the hours of your library. Did you have to close often during the winter because of snowstorms? Maybe you operate longer hours in the summer because you know it’s busier that time of year.
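
To see how a lurking variable can produce a strong correlation all by itself, here is a small Python sketch with invented numbers. It assumes a simplified world where the only thing that changes attendance is whether school is out. The data and the pearson_r helper are hypothetical and chosen to make the point clearly.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_movement / sqrt(spread_x * spread_y)


# Invented data: (temperature, visitors) for days when school is in session...
school_temps = [30, 40, 50, 60]
school_visitors = [100, 110, 110, 100]
# ...and for summer days when school is out (the lurking variable).
summer_temps = [70, 80, 90, 100]
summer_visitors = [200, 210, 210, 200]

# Looking at all days together, temperature and visitors seem tightly linked.
r_all = pearson_r(school_temps + summer_temps, school_visitors + summer_visitors)

# Within each group, temperature tells us nothing about visitors.
r_school = pearson_r(school_temps, school_visitors)
r_summer = pearson_r(summer_temps, summer_visitors)

print(round(r_all, 2), round(r_school, 2), round(r_summer, 2))
```

In this made-up dataset the overall correlation is strong (about 0.87), yet it disappears once the days are split by whether school is out. Real data are rarely this tidy, but the pattern is the reason to look for unmeasured variables before drawing conclusions.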

The point of the previous example is to show that association does not imply causation. You could find support for a cause-and-effect link by asking patrons their reasons for coming to the library through surveys or interviews. However, only by conducting an experiment can you truly demonstrate causation.

Correlation = a starting point, not a conclusion

Before I leave you, there’s one very important point to make. Sometimes the best we can do is say there’s a correlation between these data and that’s it. In the real world, dealing with real people, it can be difficult or controversial to investigate causation through experiments. For instance, does education reduce poverty? There’s a strong correlation, but we can’t run an experiment where we educate one group of children and withhold education from another. Poverty is also a really complex issue and it’s difficult to control for all other interacting variables. In this case, and many others, researchers use the observed association as a first step in building a case for causation. 

LRS’s Between a Graph and a Hard Place blog series provides strategies for looking at data with a critical eye. Every week we’ll cover a different topic. You can use these strategies with any kind of data, so while the series may be inspired by the many COVID-19 statistics being reported, the examples we’ll share will focus on other topics. To receive posts via email, please complete this form.


What’s typical and why does it matter?

Average is one of those statistics that comes up a lot. What does it mean? How can we use it? What are its limitations? Today we’re going to talk about both the average, also known as the mean, and another statistic called the median. Means and medians are both ways to find out what’s typical and to compare multiple things.

It’s easier to understand what these two statistics tell us when you know how they are calculated. Don’t worry if you don’t think of yourself as a “math person.” We’re only going to use addition and division. 

What is it?

Let’s do the basic math of how a mean is calculated using an example. I have a storytime at my library, and I want to know the mean age of the children attending today. With their caregivers’ help, I find out the ages of the five children who are there: 3, 3, 4, 2, 3.

So, to calculate the mean age:
3+3+4+2+3 = 15 ← add up all the ages to make a total
15/5 = 3 ← divide the total by the number of children
The mean age is: 3

Why is it used?

The mean tells us what is typical for a group of values. It’s useful to know what a typical value is because it can help you compare multiple groups of values. Let’s say you think that one of your regular storytimes has younger children than another. You find out participants’ ages at your storytimes on Tuesdays and on Saturdays. After doing this for many months, you see a pattern: usually the typical age of participants on Tuesdays is three, but the typical age on Saturday is five. Because you have these data, you decide to start planning slightly different activities for the Tuesday and Saturday storytimes. Useful, right?

What can go wrong?

Means, like all statistics, have pros and cons. Outliers are unusual pieces of data that can really change the mean. Let’s say someone’s older sibling comes to storytime that same day we calculated the mean for already. So now our data are: 3, 3, 4, 2, 3, 9.

We calculate:
3+3+4+2+3+9 = 24 ← add up all the ages to make a total
24/6 = 4 ← divide the total by the number of children
The mean age is: 4

Four! Add one nine-year-old sibling, and the mean jumps all the way to four. Should you change the storytime to be more geared toward four-year-olds because this nine-year-old came once? No, probably not.

Enter the median. The median is another way of calculating a typical value, and it is less affected by an outlier. The median is the middle value in a data set. Or to put it another way, half of the values in the data set are at or above the median and half are at or below it.

To calculate the median, put the data in order from lowest to highest, and identify the middle value. Here are our data in order: 2, 3, 3, 3, 4, 9.

In this case, we have an even number of data points, so three and three are both middle values. If we had an odd number of data points, the middle value would be the median–end of calculation. When you have two middle values, you get to bring in your old friend mean to help figure out the median:

3+3 = 6 ← add up the two middle ages to make a total
6/2 = 3 ← divide the total by the number of middle ages 

Surprise, surprise. Three is our median! This is why the median can be so helpful. When there are outliers that will change the mean, the median is not impacted as much and is a more accurate indicator of what’s typical.
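
If you would rather let a computer do the arithmetic, here is a minimal sketch using Python’s built-in statistics module with the storytime ages from above. The variable names are just for illustration.

```python
from statistics import mean, median

# Ages of the five children at storytime.
ages = [3, 3, 4, 2, 3]
# The same group after the nine-year-old sibling joins.
ages_with_sibling = ages + [9]

mean_before = mean(ages)
median_before = median(ages)
mean_after = mean(ages_with_sibling)
median_after = median(ages_with_sibling)  # averages the two middle values

print("Before the sibling: mean", mean_before, "median", median_before)
print("After the sibling:  mean", mean_after, "median", median_after)
```

The mean moves from three to four when the outlier is added, while the median stays at three, which matches the hand calculations above.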

Cool math lesson, now what?

Means and medians are both ways to find out what’s typical and to compare multiple things. Here are some examples of how this comes up in everyday life.

What’s typical? Why would we want to know?

  • What is the mean temperature in Colorado in May? 
    • Should I keep a sweater out?
  • What’s the mean value of this car I might buy? 
    • Am I paying too much?

Are these two things similar or different? Why would we want to know?

  • What is the mean salary for library staff in one state compared to another state?
    • Maybe there’s a place that pays similarly where it doesn’t snow in the spring?
  • What was the mean ebook circulation in public libraries in 2019 compared to 2009?
    • Is ebook circulation increasing, decreasing, or staying the same?

The mean and median are a good place to start investigating a question to orient yourself. The key to using means and medians well is to not stop with them. They both indicate what is typical, but not the whole picture. It is important to check, like we discussed before, that what is being compared is actually comparable. The mean doesn’t necessarily take other variables into consideration. For example, comparing the mean salary for library staff in two different states doesn’t take into account the cost of living. The same salary could result in very different qualities of life in two different places. We’ll talk more about the importance of other variables soon. 

Any statistic is tied to the underlying data

Keep in mind that the accuracy of a statistic depends on the quality of the dataset it describes. How the data were collected, how much data were collected, and to what extent the data represent the subject all impact the quality. For example, if you collected the data by guessing children’s ages instead of asking, you can’t know if the mean is accurate because you don’t know if the underlying data are correct.

Even with accurate data, there are limits on the conclusions you can draw. In our example, the data we were collecting about the age of storytime participants would be helpful to your specific library, but you can’t conclude that the mean age of all participants in all Tuesday storytimes everywhere is three. We didn’t collect those data. We have no idea if your storytime on Tuesday is like other libraries’ Tuesday storytimes.

Numbers can’t do the thinking for you

Life is unpredictable and messy. You may have storytimes for months where the mean age is three, and then one week a bunch of two-year-olds come. Your mean will change then, and you have to decide what to do with that information. Do you want to adjust the storytime? Do you find that there’s a key developmental change between ages two and three and you’d like to market some storytimes for children two and younger? The statistics can help guide your decision, but they will never tell you what to do. You have to decide how to use the statistics and the other information you have to understand what’s happening and what you want to do.

The tip of the statistics iceberg

For the mean and median to be good measures of what’s typical, the dataset needs to meet some criteria. Those criteria get into probability and what the dataset looks like when it’s arranged a certain way (its distribution). For our purposes here, you don’t need to be deeply familiar with those concepts. If, however, you want to learn more, you could start here.

Do the Data Have an Alibi?

It’s hard to know who you can trust out there. “Fake news” is now a prevalent concept in society and the internet makes it possible for anyone to publish—well, anything. As library professionals, you’ve probably already acquired a great deal of experience determining the credibility of sources. Many of the same skills are needed to evaluate the credibility of data. So amid the bombardment of new (sometimes conflicting) data out there, here are some questions to ask when verifying sources and checking for bias. Although not fool-proof, these strategies can help you avoid misleading information and false claims.

Let’s establish something important upfront: ANY scientific paper, blog, news article, or graph can be wrong, no matter where it’s published, who wrote it, or how well supported the arguments are. Every dataset, claim, or conclusion is subject to scrutiny and careful examination. Sounds like a lot of work? As we incorporate these strategies into our everyday consumption of information, they become second nature, like checking your blind spots when changing lanes in traffic. You don’t have a sticky note on your dashboard reminding you to do it, but you do it without even thinking about it.

When reviewing any kind of research, we often think of the five Ws—where, when, who, what, and why. The same can apply here.

Question 1: Where are the data published? 

Imagine a spectrum of credibility with reputable scientific journals on one end and blogs on the other. Government, public sector, and news media sources fall somewhere in between. What distinguishes one source from another on the spectrum is their verification process. Most journal articles undergo peer-review, which checks if a researcher’s methods are sound and the conclusions consistent with the study’s results. But be aware of “predatory journals” that charge authors to publish their research. Here’s a nifty infographic to help tell the difference. 

On the other end of the spectrum, anyone can create a blog and publish unfounded claims based on questionable data. Somewhere between these two extremes are government and public-sector (think tanks, non-profits, etc.) sources. It wouldn’t make much sense for an organization to publicize data that don’t support its mission. So while the data might be accurate, they might not tell the whole story. On the other hand, objectivity serves as a foundation for journalism, but the speed at which journalists are forced to get the news out there means mistakes happen. For instance, a graph’s axis might be distorted to emphasize the attention-grabbing portion of the story, which misrepresents the data. When mistakes happen, it’s a good sign if the news source posts retractions and corrections when appropriate.

Question 2: When were the data collected?

I was looking at a dataset recently on the use of technology in a given population. Only 27 percent accessed the internet and 74 percent had a VCR in their home. In this case it’s probably safe to assume the data are outdated, but it might not always be so obvious. Additionally, some data become outdated faster than others. For instance, technology and medicine can change day to day, but literacy rates don’t skyrocket overnight. Always try to contextualize information within the time it was written. 

Question 3: Who are the authors?

Part of the reason we aren’t talking specifically about COVID-19 data is because we aren’t experts in the fields of public health or epidemiology. When looking at data, take a minute to assess who conducted the research. Google their name to find out if they’re an expert on the topic, have published other articles on the subject matter, or are recognized as a professional in a related field. 

Question 4: What are their sources?

Police don’t take a suspect’s alibi at face value; they corroborate the story. Sources should also corroborate their claims by citing other credible sources. Journal articles should have a lengthy works cited section and news articles should name or describe their sources. Charts and graphs usually cite a source somewhere underneath the visual. For primary data, try to find a methodology section to check things like how the data were collected and if there was a sufficiently large sample size.

Question 5: Why did they publish this? 

For years, one of the world’s largest soft drink companies funded their own research institute to publish studies that said exercise was far more important than diet in addressing obesity. Their motivation? As obesity levels rose globally, they wanted to ensure people would continue buying their products. In short, everyone has a motivation and it’s our job to uncover it. A good place to start is by tracing the money trail. Who is funding the publication or study? Do they have an incentive (political, business, financial, to gain influence, etc.) other than getting the data out there? Use this information to decide what kinds of bias might be impacting the findings.

Whew, that was a lot. Here’s a simple chart that summarizes the info if you need a review. Just remember, it’s hard to ever know if something is 100% accurate. By asking these questions, we aren’t just accepting information on its face, but rather taking the time to review it critically.


Habits of Mind for Working with Data

Welcome back! We’re excited to have you with us on this data journey. To work with data, it helps to understand specific concepts—what is per capita, what is an average, how to investigate sources. These are all valuable skills and knowledge that help you navigate and understand data. What you may not realize is that the mindset you use to approach data is just as important. That’s what this post is about: how to work with data and not melt your brain. I have melted my brain many times, and it can happen no matter how great your hard skills are. 

Imagine that working with data is a bit like working with electricity. Electricity is very useful and it’s all around us. At the same time, you can hurt yourself if you’re not careful. If you need to do something more involved than changing a light bulb, you should turn off the power and take off metal jewelry. Those are good habits that keep you safe. You need good habits to take care of yourself when you work with data too. Today I’m sharing four habits that I learned the hard way—by NOT doing them. Please learn from my mistakes and give them a try the next time you work with data.

Habit 1: Give yourself permission to struggle and permission to get help 

A big part of my job is teaching people to work with data. At the beginning, almost everyone feels self-conscious that they aren’t “numbers people.” Every time I work with a new dataset, I have a moment where I think, “What if I just look at these data forever and they’re gibberish to me?” I have to keep reminding myself that working with data is hard. If you look at a graph and think, “I have no idea what this says,” don’t assume that it’s beyond your comprehension. Talking to other people and asking them what they think is a vital tool for me. Sometimes they understand what’s going on and can explain it to me; other times they are equally confused. Either way, that feedback is helpful. When you get stuck, remember to be patient with yourself. Think about why you’re interested in what these data say, and focus on your curiosity about them. Working with data is a skill you can learn and get better at. It’s not a test of your intelligence. When it’s hard, that’s because working with data is hard. 

Habit 2: Acknowledge your feelings about the topic

It’s natural to have feelings about the world we live in, and data are a representation of our reality. Recently I did some research on suicide rates in rural Colorado, which is an important issue for libraries. I felt sad when I reviewed those data. Feelings can be even more tricky with data we collect about programs and services that directly involve us. Here at the State Library, we ask participants to complete a workshop evaluation whenever we provide training. When I get feedback that someone found my workshop useful, I am so excited. When I get feedback that they were bored, I feel bad. Check in with yourself when you’re working with data that may bring up negative feelings and take a break if you need to. Then see Habit 3.  

Habit 3: Like it or not, data provide an opportunity to learn

Don’t confuse your feelings about the topic with the value of the data or the data’s accuracy. We all have beliefs and values that impact how we see the world. That’s normal. At the same time, our beliefs can make certain data hard to swallow. If the data make us feel bad, and we wish they were different, it’s easy to start looking for reasons that the data are wrong. This applies both to data that directly involve us and large-scale, community data. Remember how I said I felt bad when I got feedback that someone was bored in my presentation? I still need to review and use those data. What if I read results from a national survey that a large percentage of people think libraries are no longer valuable? I don’t feel good about that, but it’s still true that the people surveyed feel that way. Try to think of the data like the weather. You can be upset about a snowstorm in April—but that doesn’t mean it’s not snowing. You could ignore that data and go outside in shorts and sandals, but you’re the one who suffers. Better to face the data and get a coat. Data—whether you like their message or not—give you an opportunity to learn, and often to make more informed and effective decisions. Acknowledge your feelings and then embrace that opportunity.

Habit 4: Take breaks

Between trying to understand what the data say, reminding yourself you’re smart and capable, and acknowledging your feelings about the topic, you can wear yourself out quickly. It’s important to take breaks, do something else, and come back when you’re ready. Think of analyzing data like running as fast as you can. You can run really fast for short periods of time, but you can’t run that fast all day every day. Learn to notice when the quality of your thinking is starting to deteriorate. Usually I reach a point when I start to feel more frustrated and confused, and I picture my synapses in workout clothes, and they’re all out of breath and refusing to get up and run more. That’s a good signal for me that it’s time to take a break.


Learning something new is hard. Many of us received very limited training in how to work with and understand data. As you learn these strategies, keep in mind that how you approach data is just as important as the hard skills you’re learning. Take care of yourself out there and we’ll see you back here next week.

Measuring Social and Emotional Learning Competencies in a Summer Learning Program


Denver Public Library (DPL), in collaboration with Library Research Service (LRS), was recently featured in School Library Journal. The article highlights DPL’s evaluation of their summer learning program and use of data to inform programmatic decision making. Below is a summary of the results. To learn more about their data collection methods, analysis, and application of findings, you can read the full article here.

Denver Public Library (DPL) knew anecdotally they were positively affecting the social and emotional learning (SEL) of their youngest patrons, but needed to find a way to measure it. So in 2017, when they began shifting from a summer reading program to a summer learning program, they wanted to take the opportunity to evaluate the program’s impact. Their new program, titled Summer of Adventure, aimed to build relationships and facilitate social and emotional learning in addition to addressing summer learning loss.

With the help of Library Research Service (LRS)’s research analyst, Katie Fox, DPL began focusing on outcomes (the impact a program has) over outputs (registration and attendance). Their outcome goal for the program was: “After Summer Academy, participants will gain or enhance their social and emotional skills.” Knowing they could not likely see measurable change in SEL skills during the month-long program, their evaluation question became: “What social and emotional skills do youth participants currently have?” By understanding what skills youth needed to build upon, DPL could learn more about how different types of programming could encourage positive SEL behaviors.

During the evaluation, DPL utilized various data collection methods, which presented some limitations and challenges. An analysis of the data revealed two key findings:

  1. Relationship building occurred more during unstructured rather than structured activities; and
  2. Youth participants showed the most positive self-management during moderately challenging activities allowing many ways to complete the product.

Library staff used this information to help make strategic decisions about future programs and communicate with external stakeholders and funders about the program’s value. DPL continues to adjust the program to better support SEL, intrinsic motivation, and life-long learning. To learn more, read the full article here.


How to Compare Apples to Oranges

As our brains process information, we constantly make comparisons. It’s how we decide if something is good or bad—by it being better or worse than something else. However, like apples and oranges, not all things can readily be compared, even if they appear similar enough on the surface. We often make this mistake with data because we want to be able to draw simple conclusions. But when our goal is accurate information, it’s imperative to look at presentations of data through a critical lens by applying these basic strategies.

So who’s better? 

Let’s say you wanted to determine whether Library A or B was doing a better job at reaching its community. To do so, you compare annual visits at both. This chart would lead you to conclude Library B has much more annual traffic and is therefore reaching more of its community than Library A. But are Library A and B comparable?

Library A serves a population of 5,400 while Library B serves a population of 30,500. When making comparisons among different populations, data should be represented in per capita measurements. Per capita simply means a number divided by the population. For instance, when we compare countries’ Gross Domestic Products (GDP), or value of economic activity, we usually express it as GDP per capita because it would be misleading to compare China’s GDP to that of Denmark. China’s GDP trounces Denmark’s, but that doesn’t mean Denmark’s economy is struggling. China is larger both in terms of the land it covers and the number of people that live there. It would be really weird if they had similar GDPs without the per capita adjustment. The same is true in this example. Take a look at how we draw an entirely different conclusion when total visits are expressed in a per capita measurement.

*Due to a 2-month closure, Library B’s data were collected over only a 10-month period

Now we can see that Library A has 18.5 visits per person (100,000/5,400), whereas Library B only has 6.6 visits per person (200,000/30,500). These are the same data, but expressed in more comparable terms.

Let’s say Library B also closed for two months to do some construction on their building. Therefore, their annual visits account for 10 months of operation, not 12. Contextual information like this – which has a direct effect on the numbers – needs to be clearly called out and explained, like in the example above.
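
The per capita arithmetic above can be sketched in a few lines of Python. The visit and population figures come from the example. The months-open values are an assumption for illustration (12 for Library A, 10 for Library B), and the per-month rate is one possible way to account for the closure, not something the charts above show.

```python
def visits_per_capita(total_visits, population):
    """Divide a total by the population it comes from."""
    if population <= 0:
        raise ValueError("population must be greater than zero")
    return total_visits / population


# Figures from the example; months_open is assumed for illustration.
libraries = {
    "Library A": {"visits": 100_000, "population": 5_400, "months_open": 12},
    "Library B": {"visits": 200_000, "population": 30_500, "months_open": 10},
}

per_capita = {}
per_capita_per_month = {}
for name, figures in libraries.items():
    rate = visits_per_capita(figures["visits"], figures["population"])
    per_capita[name] = rate
    # Dividing by months open puts both libraries on the same time scale.
    per_capita_per_month[name] = rate / figures["months_open"]

for name in libraries:
    print(name, round(per_capita[name], 1), round(per_capita_per_month[name], 2))
```

Even after adjusting for the two missing months, Library A still has more visits per person in this example. The closure explains part of the gap, but not all of it.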

Breaking it down…

To check for comparability, it’s helpful to keep three things in mind: completeness, consistency, and clarity.

Completeness: are the data comparing at least two things? 

It would be incorrect to say “Library B has 100,000 more visits.” More visits than…Library A? Than last year? Also be wary of results indicating that something is better, worse, etc. without stating what it is better or worse than.

Consistency: are the data being compared equivalent? And even if they appear equivalent, what information is needed to confirm this assumption?

One of the best examples of inconsistency occurs when comparing data from different populations, particularly when we focus on total counts. “Totals” are often a default metric because they’re simple for a range of audiences to understand, but they can be very misleading, like in the first chart above. By expressing the data as per capita measurements, we can account for population differences and create a basis of similarity. Additionally, even if data appear similar enough to compare, you also need to review how they were collected. Any reliable research will include these details (big red flag if it doesn’t!). For instance, it would be important to know that Library A and B were counting visits in the same way. If Library A counts visits during one week in the summer and multiplies that by 52, that wouldn’t be consistent with Library B, which counts during a week in the winter.

Clarity: Is it obvious and clear what is being compared? 

Data visualizations allow our brains to interpret information quickly, but that also means we may jump to conclusions. Be a critical data consumer by considering what underlying factors might also be at play. The second chart above clarifies that two months of data were missing from Library B. This could be one reason why Library B’s total visits per capita were so much lower than Library A’s. Also beware of unclear claims supposedly supported by the data, like “Library A has higher patron engagement than Library B.” Perhaps Library A defines engagement in terms of number of visits, but Library B’s definition is based on material circulation and program attendance. The data above do not provide enough information to support a comparable claim on engagement.

Comparisons are messy. Whether in library land or elsewhere, keep in mind that comparisons are always tricky, but also very useful. By engaging critically using the strategies above, we CAN compare apples and oranges. They are both fruit after all…

LRS’s Between a Graph and a Hard Place blog series provides strategies for looking at data with a critical eye. Every week we’ll cover a different topic. You can use these strategies with any kind of data, so while the series may be inspired by the many COVID-19 statistics being reported, the examples we’ll share will focus on other topics. To receive posts via email, please complete this form.

New blog series: Between a Graph and a Hard Place

Hello, world!

We can all agree that these are strange times we are living through. Here at the Library Research Service, we’ve been thinking about how we can help. What skills could we share that might be useful to library staff and our communities?

As library and information professionals, before this pandemic we already spent a lot of time thinking about information, what it means, and how reliable it is. Here at LRS, we are data geeks in addition to being regular library geeks, so we think about data a lot too—the good, the bad, and the misleading.

Critically analyzing information is what librarians are trained to do. We can’t help ourselves. For me, this means every time I talk to my mom and she shares a statistic with me, I ask her about her source. I’m not trying to be a pain. This is just how my mind works.

Right now, we are all seeing a lot of data about the pandemic, and it can be challenging to understand. And this is where we come in.

Let’s be clear: we are not epidemiologists, we are not medical doctors, we are not experts in public health. We are not going to provide data about COVID-19 or interpretations. There are already good resources for both, and we don’t think it would help to add our voices.

What we can do, and are going to do, is share strategies for looking at data with a critical eye. We’re going to cover a different strategy every two weeks, like thinking about the underlying data behind a visualization, identifying bias, evaluating the credentials of different experts, understanding that how the data are presented can impact how you perceive them, and finding multiple perspectives on the same information.

We will also discuss how to engage with data carefully, with your mental well-being in mind. Data can make us feel a lot of things, and we all need to take care of ourselves.

This series is inspired by the current situation, but the examples we will share will focus on other topics. You can use these strategies with any kind of data.

We look forward to seeing you here every other Wednesday and hope that these strategies are helpful in this time of information overload. In the meantime, if you’d like some less serious data about a situation that many of us can relate to right now, check out this pie chart.

If you want to subscribe to receive the blog posts from this series by email, please complete this form.

Every Day is Earth Day in Libraries

Half a century ago, Earth Day began as a grassroots effort to bring attention to environmental issues. Now, fifty years later, organizers are bringing the focus back to climate change, an admittedly enormous challenge, and urging everyone to take part in protecting and restoring our planet. As lending institutions, libraries have long understood their role as stewards of environmental responsibility. The Green Library Movement began in the early 1990s as a commitment to greening libraries by reducing their environmental impact on the planet. The movement gained popularity around 2003, and in 2019 the American Library Association (ALA) Council adopted sustainability as a professional core value.

Is the Green Library Movement Growing?

In the years since the movement took off, libraries around the world have reviewed their operations and programming to identify ‘greener’ methods. Buildings have been rebuilt or remodeled to include energy efficient design and physical materials have been replaced with digital mediums. However, much debate still remains over whether libraries are fully embracing the challenge. A 2012 study conducted in Finland discovered that up to 60 percent of respondents believed the components of environmental management had not been taken into account enough in their own libraries. When asked about everyday routines, more than half of libraries were turning off lights after 10 minutes, switching computers off at the end of the working day, and sorting waste products. However, only about 10 percent of respondents said their library’s printers print on both sides of the paper by default. Less than five percent said laptops were preferred in computer acquisitions.

Measuring a Library’s Carbon Emission

Given limited resources, it can be difficult for a library to prioritize green initiatives, especially if it can’t pinpoint where it is consuming the most energy. To provide a better understanding of an institution’s Global Warming Potential (GWP), students at the University of California, Berkeley developed a carbon emissions calculator specifically for academic libraries. While it can be adapted to public and school libraries, the tool fails to take into account newer technologies (e-books, 3D printers, sewing machines, etc.), as well as waste produced from programming and outreach. Using the engineering library at UC Berkeley as a case study, researchers found that the HVAC system was the largest consumer of annual power (~145,000 kWh). Of materials, volumes had the greatest GWP at 5,394 metric tons of CO2 equivalent per year.

Greening a Community, Not Just the Library

Even if libraries are not huge carbon emission offenders, they still play a pivotal role in introducing sustainability initiatives to their communities. As green buildings, they can demonstrate the use of solar panels or reflective roofing, educate the community about residential use of rain barrels using their own rainwater collection systems, and incorporate native plants into landscaping to reduce reliance on irrigation.

E-books are another popular way of reducing the carbon footprint. A study at Boston College found that the majority of environmental waste for both e-books and paper books originates before reaching the hands of the intended audience. However, paper books contribute significantly more waste during distribution, making them less environmentally friendly. According to the study, a user would have to access 33 e-books on a device before offsetting the carbon footprint of one printed counterpart.

Libraries can also publicize green initiatives through creative programming. For adults, one librarian suggests screening a documentary related to sustainability. For children’s programming, another librarian tries to find materials that can be reused or repurposed. She also refuses programs that produce single-use waste. Being conscious of a program’s environmental impact—and highlighting that success—can be key takeaways for patrons.

Every Day is Earth Day in Libraries

Climate change may be the theme of this year’s Earth Day celebration, but more and more, it is serving as a foundation for libraries. Whether through building construction, material use, or programming, multiple opportunities exist for libraries to become agents of change within their communities. Earth Day can be every day in a library.

Note: This post is part of our series, “The LRS Number.” In this series, we highlight statistics that help tell the story of the 21st-century library.

About 7 out of 10 US adults read a book last year, but among those without a high school degree it was 3 out of 10

About seven out of ten adults (72%) in the U.S. report that they read a book in the last 12 months. This percentage has stayed about the same since the Pew Research Center started conducting studies of adult reading habits in 2011, but it does vary depending on income and education.

In the most recent survey in 2019, nine out of ten college graduates (90%) said they read a book in the past year while only about three out of ten (32%) adults without a high school degree did. Higher percentages of women, Whites, those earning more than $75,000, and people living in urban areas reported reading a book in the past year. Males, Hispanics, those earning less than $30,000, and people living in rural areas reported lower rates of reading. The overall percentage of men who read a book decreased from 73% in 2018 to 67% in 2019.

The portion of people listening to audiobooks is on the rise, increasing from 14% in 2016 to 20% in 2019. This increase is particularly strong for college graduates and those who earn more than $75,000. Print books, however, are still the most popular way for people to read: 65% of the people who read a book in the last year read a print book. Another 25% of people reported reading an e-book in the past year. About 37% of adults read only print books.

The full report can be found here.

Note: This post is part of our series, “The LRS Number.” In this series, we highlight statistics that help tell the story of the 21st-century library.

How much is your library worth?

We can all agree that libraries are valuable to our communities, but exactly how much are they worth? Libraries are under increasing pressure to translate qualitative services into quantifiable impact. One approach is to determine the Return on Investment (ROI) a library provides to community members. Doing so communicates the value of public libraries in terms of dollars and cents.

Traditionally a business metric, ROI measures a business’s profitability. Simply put, it compares costs to profits and expresses the result as a ratio or percentage. For a public institution like a library, ROI demonstrates how much “value” is realized by the community for each dollar spent on services and materials. This includes:

  • The cost to use alternatives: the estimated amount of money that would have been spent to use an alternative if the library did not exist;
  • Lost use: for patrons who indicated they would not have tried to meet their needs with another source or would not have known where else to go, the estimated value of the direct benefit that they would not have received if the library didn’t exist;
  • Direct local expenditures: dollar figures for expenditures on goods and services within the library’s legal service area;
  • Compensation for library staff: the amount of annual compensation that staff members would not have received if the library didn’t exist; and
  • Halo spending: purchases made by library patrons from vendors and businesses that are located close to the library.

Some ROI methodologies also apply a dollar amount to patrons’ time, taking into account the time saved by not seeking materials or services elsewhere.
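
As a rough illustration of how the components listed above could be combined, the Python sketch below adds them up and divides by the amount invested in the library. This is a simplified assumption about the calculation, not the method used in any particular study, and every dollar figure and name in it (library_roi, the component labels, the investment amount) is hypothetical.

```python
def library_roi(benefit_components, investment):
    """Return the community value realized per dollar invested."""
    if investment <= 0:
        raise ValueError("investment must be positive")
    total_value = sum(benefit_components.values())
    return total_value / investment


# Hypothetical annual dollar values for each component described above.
components = {
    "cost_to_use_alternatives": 2_000_000,
    "lost_use": 500_000,
    "direct_local_expenditures": 800_000,
    "staff_compensation": 1_500_000,
    "halo_spending": 200_000,
}

# Hypothetical annual investment in the library.
annual_investment = 1_250_000

roi = library_roi(components, annual_investment)
print(f"Return: ${roi:.2f} for every $1.00 invested")
```

Real studies differ in which components they count and how they estimate each one, so two ROI figures are only comparable if they were built the same way.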

Two approaches are commonly used to calculate a library’s ROI: contingent valuation and market valuation, both of which have their strengths and weaknesses. Contingent valuation bases dollar values on the subjective perceptions of responding library users. However, within those subjective perceptions, patrons may include a more holistic experience that takes into account the value of having various needs met in one place. This method acknowledges that the value of a library is likely greater than the sum of the value of its individual resources and services. In contrast, market valuation bases dollar values on objective, “real world” values such as the use of electronic resources, material and book circulation, program attendance, reference services, and meeting room use. Perhaps the greatest advantage of this approach is that it can be pursued using readily available data, as opposed to contingent valuation, which relies on patron surveys and interviews.

A meta-analysis of findings from 38 previous library ROI studies found that, on average, the return value for public libraries is 4 to 5 times the amount invested. A study conducted by Library Research Service in 2009 found similar results in Colorado using a contingent valuation methodology. Although valuation findings should not necessarily be extrapolated out to a state or national level, overall they can—and do—show decision makers, patrons, and the public that libraries are a wise investment.

Note: This post is part of our series, “The LRS Number.” In this series, we highlight statistics that help tell the story of the 21st-century library.