When you're figuring out how to measure your data quality, you'll find plenty of guidance out there.
Much of it is framed in terms of dimensions of data quality. Dimensions are a useful framing device for conceptualizing data quality and for aggregating data quality measurements.
No set of data quality dimensions is recognized as a universal standard. That's OK, but it can make it hard to get started.
Here are our suggestions for the core data quality dimensions to start with. We've divided them into three related categories: completeness, correctness, and clarity.
To envision how these categories fit together, imagine that your data is a set of puzzle pieces.
To get value out of your data, you need to assemble the puzzle (that is, do the work of data quality).
Completeness = having all the pieces to complete the puzzle shape.
Correctness = having all the pieces be from the same puzzle.
Clarity = having the image on each puzzle piece be intact.
Key idea: Your data is describing something—people or places or things or some combination of those.
Completeness is about how fully your data describes those objects.
Key idea: The something your data is describing is a real-life something.
Correctness is about your data's fidelity to the real-life objects it is describing.
Key idea: The something your data is describing has more than one aspect, and it has connections to other objects, too—it's not just floating in a void.
Clarity is about understanding the different aspects of the entity and how they relate to each other, as well as how the entity relates to other entities.
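If it helps to see the three categories as checks you could actually run, here's a minimal sketch in Python. It isn't how any particular tool measures data quality. The records, field names, reference list, and function names are all invented for illustration. Treating correctness as "does the value appear in a trusted reference list?" and clarity as "do links between entities resolve?" are simplifying assumptions, since each category covers more than one check.

```python
# Hypothetical records: every field name and value here is invented.
customers = [
    {"id": 1, "email": "ana@example.com", "country": "PT"},
    {"id": 2, "email": None, "country": "PT"},
    {"id": 3, "email": "li@example.com", "country": "ZZ"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 4},  # points at a customer that doesn't exist
]

# Assumed reference list standing in for "real life".
VALID_COUNTRIES = {"PT", "US", "CN"}


def completeness(rows, field):
    """Share of rows where `field` is present and not None."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(field) is not None)
    return filled / len(rows)


def correctness(rows, field, valid_values):
    """Share of non-missing `field` values found in the reference set."""
    values = [row[field] for row in rows if row.get(field) is not None]
    if not values:
        return 0.0
    return sum(1 for value in values if value in valid_values) / len(values)


def clarity(child_rows, child_key, parent_rows, parent_key):
    """Share of child rows whose key matches an existing parent row."""
    if not child_rows:
        return 0.0
    parent_ids = {row[parent_key] for row in parent_rows}
    linked = sum(1 for row in child_rows if row.get(child_key) in parent_ids)
    return linked / len(child_rows)


email_completeness = completeness(customers, "email")
country_correctness = correctness(customers, "country", VALID_COUNTRIES)
order_link_clarity = clarity(orders, "customer_id", customers, "id")

print(f"completeness of email:   {email_completeness:.2f}")
print(f"correctness of country:  {country_correctness:.2f}")
print(f"clarity of order links:  {order_link_clarity:.2f}")
```

Each function returns a share between 0 and 1, so you can track the three categories side by side or roll them up however suits your reporting.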
If you feel like you've gotten everything you were looking for by this point, great! We're glad we could provide you with some direction.
But if you (or stakeholders you have to answer to) want a more specific breakdown, download the full ebook for more information about how we define the different dimensions.