Explainer: What is disaggregated data, and why don’t social purpose organizations have access to it?

In this year’s federal budget, the Liberals promised to create a national disaggregated data strategy. Will it help the sector?

Why It Matters

Over the past couple of years, there has been an increased demand for disaggregated data in Canada in order to identify and address social and racial inequities faced by vulnerable populations. But few understand what the term really means, how it works and why Canada is investing in collecting such data now.

Saying “disaggregated data” can be a bit of a tongue twister. 

But lately, it’s a term that seems to be on the tip of everyone’s tongue, or at least of anyone who understands what it means. 

Earlier this year, the federal government jumped on the disaggregated data bandwagon too, announcing that it would give Statistics Canada a whopping $172 million over five years to improve its collection and analysis of such data sets. 

So if you’re wondering, “What exactly is disaggregated data and why is the government willing to spend so much money on it in the midst of a global pandemic?”— we’ll break it down for you. 

 

Disaggregated data

Critical information about people’s experiences and the issues that affect them is conveyed through two types of data today: aggregated and disaggregated. 

Aggregated data are “big” datasets that group people together, generalizing information by and for the collective population whose data is collected and analyzed. 

To disaggregate data, on the other hand, means to break down the “big” or aggregated datasets into specific and detailed sub-categories that are representative of the different facets of human identity and experience — like gender, race, profession or level of education.
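For readers who work with data, the difference can be sketched in a few lines of Python. This is a minimal illustration, not any agency's actual method: the records, the field names ("group", "education", "employed") and the helper functions are all hypothetical, invented here only to show how one aggregate figure can hide a gap that a breakdown reveals.

```python
from collections import defaultdict

# Hypothetical survey records, invented purely for illustration.
# "group" and "education" stand in for any identity sub-categories.
records = [
    {"group": "A", "education": "college", "employed": True},
    {"group": "A", "education": "college", "employed": True},
    {"group": "A", "education": "high school", "employed": True},
    {"group": "A", "education": "high school", "employed": True},
    {"group": "B", "education": "college", "employed": True},
    {"group": "B", "education": "college", "employed": False},
    {"group": "B", "education": "high school", "employed": False},
    {"group": "B", "education": "high school", "employed": False},
]


def employment_rate(rows):
    """Return the share of rows marked as employed."""
    return sum(1 for row in rows if row["employed"]) / len(rows)


def disaggregate(rows, key):
    """Split rows by one sub-category and compute the rate for each part."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return {value: employment_rate(members) for value, members in groups.items()}


# Aggregated view: one number for everyone.
aggregate = employment_rate(records)

# Disaggregated views: the same records, broken down by sub-category.
by_group = disaggregate(records, "group")
by_education = disaggregate(records, "education")

print(aggregate)
print(by_group)
print(by_education)
```

In this made-up sample, the aggregate employment rate is 62.5 per cent. Broken down by group, it is 100 per cent for group A and 25 per cent for group B, a gap the single aggregate figure does not show.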

In 1989, Professor Kimberlé Crenshaw theorized and explained how the diverse sub-categories of human identity “intersect” with one another to make a person who they are. “Intersectionality” has since become a critical lens for viewing how the different aspects of a person’s identity influence not only their decision-making, but also their outcomes in society. 

The impact of colonialism and capitalism today is that racialized and marginalized communities across the globe face significantly more socio-economic challenges than their white and more privileged counterparts. 

One example of how disaggregated data illustrates these impacts of colonialism is systemic racism within American police forces. Black American men face a higher risk of death by police brutality than any other population in the country, according to a 2019 study. The study, which examined disaggregated data on police brutality-related deaths, found that one in every 1,000 Black men can expect to be killed in an event of police brutality in America. The same study noted that while people of colour were significantly more likely to face such discrimination and violence, white men and women were the populations least likely to be killed by police use of force. 

Disaggregated data that represents the experiences of marginalized and racialized communities is therefore critical to identifying and bridging historical and present gaps in racial and social equity faced by Canadians. 

But Crenshaw and “intersectionality”, along with most of the statistics that are disaggregated, are of American origin. 

So when March 2020 rolled around and COVID-19 struck, everybody found themselves facing a new set of challenges –– with “the most vulnerable populations” especially experiencing the “unprecedented impacts” of the pandemic, according to Statistics Canada.

But Canada, unlike its cross-border counterpart, was stumbling in the dark without certain key information with which to navigate the pandemic. 

 

Canada lacks the disaggregated data the social purpose sector needs

Did you know, for example, that Black people are the population most affected by multiple myeloma? Research shows that this cancer is twice as common and twice as deadly in people of African descent in America, according to a study conducted by the Multiple Myeloma Research Foundation.

Marcie Baron is the communications and marketing manager for Myeloma Canada, a national non-profit working to improve the quality of life of those living with this kind of cancer. She explains, “The statistics that we’ve been using actually come from the United States. And we’re drawing parallels, because there was really no reason to believe that statistically, the experience would be any different for Canadians than it is for Americans.”

Still, when the pandemic struck, Baron’s organization –– like the rest of Canada’s social impact sector that is working on the frontlines to bridge inequities –– faced a frightening lack of specific disaggregated data surrounding Canadians and COVID-19.

This year’s federal budget further confirmed: “At present, Canada lacks the detailed statistical data that governments, public institutions, academics and advocates need in order to take fully informed policy actions and effectively address racial and social inequities.” 

Without such information, marginalized populations risk being left out of important conversations about which causes need immediate attention and resources, according to Arlene MacDonald, a member of the Advisory Committee on the Charitable Sector (ACCS). 

MacDonald explains, “When you start to look at data which is then used in decision-making, it’s important to see who’s benefiting from a situation, and who’s not, and who’s impacted and who isn’t. That’s not very often told in the aggregate story. So I think disaggregated data really tells us where we need to be looking to begin to attend to the consequences and the unintended harm that we can do and have done by ignoring the disaggregated and the individual stories.”

Evidently, disaggregated data is a powerful tool for measuring social inequity, and therefore for informing crucial decisions about who should provide what support to which sections of society in order to help vulnerable populations in Canada. 

But Louise Binder, a health and policy advocate for Save Your Skin Foundation, insists that this tool comes with its own set of limitations and has the potential to become a double-edged sword.

 

Limitations and unintended harms

Binder says, “Everybody, unfortunately, seems to be in the data-gathering game these days, which means that no one in particular is responsible for collecting disaggregated data…”

She explains, “So we have a number of federal agencies that are collecting data, we have provincial agencies collecting data, we have public and private institutions collecting data — even insurance companies collect data. The problem is, there doesn’t appear yet to be an agreement or consensus amongst all these data collectors to collect and publish this data in a way that we [the social impact sector] can at least look at the datasets together and draw conclusions from it.”

The reason such an agreement may not be possible, Binder says, is that “No data should be collected without the consent of the person who gave it; and without their consent to the use of it, in whatever way you are planning to use it, no data should be collected.” And the likelihood of the masses consenting to mass collection of their personal information is slim.

She says that the data-collection process is further complicated by the fact that “many of these [vulnerable] populations don’t have trust in the system that exists — whether that is governmental or even non-governmental bodies sometimes,” following generations of systemic oppression. They therefore do not easily consent to giving their information away. 

“So then the data you have is not accurate, and it’s not useful because it’s not representative of these populations and you can end up painting a picture that isn’t true of the issues faced by these communities,” Binder explains. This limitation in collecting and using disaggregated data as a tool for measuring social inequity can therefore result in misinformation through inaccurate data — becoming a double-edged sword that can further harm already vulnerable populations. 

Furthermore, for marginalized and racialized populations who are in dire need of immediate support amidst the pandemic, data collection without purposeful follow-through is no better than not collecting the data in the first place. (If you’ve identified the problem, but won’t do anything about it, what was the point?) Which is why, MacDonald says, “I would like to see some consistency and intention in our data collection.”

Binder says that ultimately, “If we get underneath the issues faced by certain sections of society, and we are actually asking questions in order to support those populations by understanding how to help them, but also get at some of the underlying socio-economic reasons for their contexts –– if we’re really willing to do that, if we’re actually willing to do something about the issues that the data shows us, then it’s worth the disaggregation.”

Stay tuned for more of the Disaggregated Data in the Social Impact World special report, where we dive into topics like how social purpose organizations are collecting and using disaggregated data, how the lack of disaggregated data limited COVID response, and collecting such data ethically and responsibly. 
