Data governance best practices ensure that your federal agency’s data is being created, collected, stored, and accessed by the right people for the right purpose.
Which processes have you put in place to manage access, availability, usability, quality and security? Are they consistent with policies you’ve established internally? Is your organization governed by data regulations? How closely are you following them?
This article examines data governance best practices. Not all of them apply to every agency, but an important first step is to use them as a guide to defining and achieving data governance in your organization.
1. Start small, with the big picture in mind
Data governance adapts to two main drivers: everything that’s going on inside the federal agency, and everything that’s going on with your data and the technology managing that data. If changes in both of those drivers are not reflected in data governance, best practices won’t do you any good and your data governance effort won’t be successful.
From the start, your goal is to determine where the results of your effort will have the biggest impact and to articulate those results. The system you put in place to generate results must be comprehensive and scalable to meet the needs of the organization. It must also be adaptable to new requirements. Ten years ago, for example, data governance did not encompass data privacy rights, whereas now those rights are must-have requirements.
Most government agencies start where data most affects the agency, which is in their strategic decisions. For instance, a data warehouse is a natural point of aggregation of data from the entire organization. It makes a good starting point because it has natural boundaries that make it easy to size up. It also has natural links to the systems that produce data and feed it to the data warehouse.
Consider the scenario of examining your data, then classifying it as personally identifiable or sensitive. Through elements in the data warehouse, you can give yourself a level of protection and control at the point where data is being used to make decisions. You also gain natural, clearly defined lines for taking the classifications that support your private data and easily retrofitting them back to the systems they came from. It helps you get your arms around all the aspects of your data warehouse.
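The classification scenario above can be sketched in a few lines of Python. This is a minimal illustration, not a recommended implementation: the keyword lists, the table and column names, and the source system names are all hypothetical, and a real agency would derive its rules from its own policy rather than from column-name matching alone.

```python
from dataclasses import dataclass

# Hypothetical keyword rules; real rules would come from agency policy.
PII_KEYWORDS = ("ssn", "social_security", "date_of_birth", "email", "phone")
SENSITIVE_KEYWORDS = ("salary", "diagnosis", "clearance")


@dataclass(frozen=True)
class ColumnClassification:
    table: str
    column: str
    label: str          # "pii", "sensitive", or "unclassified"
    source_system: str  # system that feeds this column into the warehouse


def classify_column(name: str) -> str:
    """Label a column by keyword match on its name (PII takes precedence)."""
    lowered = name.lower()
    if any(keyword in lowered for keyword in PII_KEYWORDS):
        return "pii"
    if any(keyword in lowered for keyword in SENSITIVE_KEYWORDS):
        return "sensitive"
    return "unclassified"


def classify_warehouse(columns):
    """Classify an iterable of (table, column, source_system) tuples."""
    return [
        ColumnClassification(table, column, classify_column(column), source)
        for table, column, source in columns
    ]


def labels_by_source(classifications):
    """Group PII and sensitive labels by source system for retrofitting."""
    grouped = {}
    for item in classifications:
        if item.label != "unclassified":
            grouped.setdefault(item.source_system, []).append(
                (item.column, item.label)
            )
    return grouped
```

Grouping the labels by source system mirrors the point made above: once data is classified at the warehouse, the same classifications can be carried back to the systems the data came from.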
The priorities and drivers will be different for every organization. But what is common to all organizations is that the results must be consumable and deliverable in a reasonable time frame. You’re trying to prove the value of data governance with those results, and your effort must have enough impact to be measurable as a proof of concept (POC).
2. Build a case, set clear goals, then get stakeholders on board and educate them
That POC plays a role in building the larger agency case as you set about getting stakeholders on board with your data governance effort.
Typically, you’ll start with someone like the chief data officer (CDO), the chief analytics officer or the chief information officer (CIO). Their responsibilities include creating a data architecture that allows the agency to get the most out of its data while mitigating the risks of using, managing and storing that data. This will be somebody who emphasizes the need to take data governance seriously.
The role of the POC is as simple as demonstrating the impact and value of data governance. The POC will feed the agency case by pointing out where you are in reducing data discovery time — the start, stop and rework involved with data quality issues. You can put those same metrics in place to set expectations and measurements. In fact, two different agency cases arise: one for the POC, and another that will grow and build on the POC to justify the bigger picture.
Both cases should have clear goals and ways to measure progress. If team members don’t understand why you’re asking them to change, they probably won’t change. And in data governance, you’ll ask people to do a lot of things that aren’t in their primary job description, long before any benefits are realized and recognized. Getting their commitment and getting them on board is important, and it’s always a tough sell when you ask for pain now to get benefits later.
3. Expand the frame of reference
Although the agency case represents the value of data governance, it should not be so tactical that it’s limited in scope to data governance. Successful data governance efforts tend to align themselves with other strategic initiatives in the organization. They emphasize that data governance will support those initiatives and compound their benefits. They make it clear that the payback of data governance goes well beyond things like taking less time to find data, generate reports and respond to an audit.
Governance efforts are often regarded as having a frame around them, with all the value and benefit inside that frame. But it’s more useful to expand the frame of reference, zooming out to the level of the entire organization. Data governance is an enabler and risk mitigator for any initiative in the organization toward becoming more data-driven. Wherever managers and analysts are democratizing data — getting it into the hands of the people on the front lines in the agency — data governance is a valuable undertaking.
4. Define team roles
Your governance initiative should have a steering committee with a wide range of stakeholders:
- Senior management
- Analytics team
- IT security
- Data team
- Global risk and compliance
The interests of those stakeholders go well beyond data. But data is a big part of their remit and you want their focus because they all represent solution deployments that will benefit from good data governance.
The governance discussion you have with a technical person will differ greatly from the one you have with another stakeholder. The technical person will ask, “What do I need to do technically to make this happen?” Another stakeholder may ask, “What will I get out of it? What will be the impact on my day-to-day data operations?”
Once you’ve identified the stakeholders, your next step is to educate them on how data governance will benefit your organization. Set their expectations, clearly define how they will contribute and explain their responsibilities in the effort. Most of all, make sure to tell them how they will benefit from data governance.
5. Decide on in-house vs. external resources
If your organization hasn’t been through this initiative before, you may find that you don’t have sufficient in-house experience with governance best practices. In that case, bringing in outside experts who have helped other organizations successfully implement data governance is usually money well spent.
Spending on technology services can be a good investment in getting things up and running efficiently. But outside experience is also valuable, such as engaging experts to come in and lead data governance. They can help you build successful agency cases and guide your initiative using processes and techniques that have resulted in sustainable data governance programs elsewhere.
Note that, if your plans call for investing in any outside resources, it’s better to make that investment up front.
6. Focus on the operating model or data governance framework
You can take advantage of generalized frameworks that already have data governance best practices built in. Examples of those frameworks include the following:
- Non-Invasive Data Governance Framework – TDAN
- DAMA DMBOK — Data management body of knowledge functional framework
- DGI data governance framework
- McKinsey — Designing data governance
Your path to a scalable operating model is to blend a data governance framework with an understanding of the needs of your organization, knowing that the framework will change over time. New requirements, risks and opportunities will arise around your data, so your governance needs to be as flexible as it is comprehensive.
It used to be that standard data governance entailed nothing more than an agency glossary containing agency terms, agency rules and agency policy — that was your framework. Then, for example, when you put in place a data sharing agreement with a new data provider, you wanted to include it in your framework. Early data governance tools allowed you to attach a new element like an agreement but not to integrate all of its terms to their respective levels in your framework.
Modern tools, however, allow you to create new framework pieces and relate them to the rest of your framework. The new pieces become an integral part of the framework from that point forward. It’s an important capability that should be part of your data governance effort.
7. Map infrastructure, architecture and tools
Governance needs to be mapped down to specific points in your data architecture. Early data governance tools enabled you to accumulate lots of terms but didn’t map how those terms related to the actual database that people were querying. Real usefulness lies in going from the governance world down to the world of the physical data architecture and data infrastructure.
Without that connection, you’ll hear complaints like “I can’t get my Power BI reports to work correctly.” The problem is usually that the terms are not well mapped, so the reports aren’t running on the data that users think they’re running on.
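One way to picture that mapping is a lookup from glossary terms to physical columns, plus a check that flags terms a report uses but that are unmapped or point at columns that no longer exist. The sketch below is illustrative only: the glossary terms, column paths and function name are hypothetical and do not correspond to any particular governance tool or to Power BI itself.

```python
# Hypothetical glossary: business term -> fully qualified physical column.
GLOSSARY_TO_COLUMN = {
    "Active Case": "warehouse.cases.status_active",
    "Case Open Date": "warehouse.cases.opened_on",
}


def find_mapping_problems(report_terms, mapping, physical_columns):
    """Return {term: problem} for report terms that are not well mapped.

    report_terms: glossary terms a report relies on
    mapping: glossary term -> physical column path
    physical_columns: set of column paths that actually exist
    """
    problems = {}
    for term in report_terms:
        target = mapping.get(term)
        if target is None:
            problems[term] = "no mapping to a physical column"
        elif target not in physical_columns:
            problems[term] = f"mapped to missing column {target}"
    return problems
```

Running a check like this before a report is published is one way to catch the case where a report is not running on the data its users think it is.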
8. Establish metrics
The internal metrics to collect are the ones that help people do their job better with data. But also collect metrics that explain the impact of your data governance best practices to people who don’t work with that data directly. In particular, establish quality metrics based on the agency cases you started with.
In working out your metrics, you’ll find that one of the biggest things people currently want from their data is the ability to explain it: where it came from, what it measures, and why it’s important to a current agency case. When teams need to present a metric to a busy, high-ranking stakeholder and make a recommendation about next steps, the stakeholder’s first question will frequently be “Where did you get that number?” That’s why it’s important to be able to explain what that data measures and where it came from.
An organization’s data architecture should enable an answer like “This came from a data set that covers only this part of the agency because we haven’t been able to incorporate this other piece into it. The data quality score on this is 78%, which isn’t optimal. But statistical analysis shows few enough anomalies that we’re comfortable basing our decision on this number.” The ability to explain the background of where data comes from is how you build a data-driven culture.
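A data quality score like the one in that answer can be computed in many ways. The sketch below assumes one simple definition, the percentage of records that pass every check, and pairs it with a short provenance statement. The function names, the checks and the wording of the statement are hypothetical; an agency would substitute its own quality rules and metadata.

```python
def quality_score(records, checks):
    """Percentage of records that pass every check (0.0 for empty input)."""
    records = list(records)
    if not records:
        return 0.0
    passed = sum(1 for record in records if all(check(record) for check in checks))
    return 100.0 * passed / len(records)


def explain_metric(name, source, coverage, score):
    """Build a one-line answer to 'Where did you get that number?'"""
    return (
        f"{name} comes from {source}, which covers {coverage}; "
        f"its data quality score is {score:.0f}%."
    )


# Hypothetical checks: each takes a record (a dict) and returns True or False.
def has_case_id(record):
    return bool(record.get("case_id"))


def has_valid_duration(record):
    days = record.get("days_open")
    return isinstance(days, int) and days >= 0
```

Keeping the score next to the source and coverage description means the explanation travels with the number rather than living in someone’s head.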
9. Set standard definitions
A standard set of definitions provides a vocabulary for data governance that ensures everybody uses and understands the same terms in the same way.
That extends to having a standard set of definitions across the different types of data within your organization, which goes directly to data literacy. One of the main goals of governance is to increase everybody’s data literacy, and thereby increase strategic usage. Solid definitions and a mutual understanding of those definitions have an accelerating effect on data governance, so take advantage of standard, battle-tested ontologies and taxonomies.
10. Define controls
Closely related to team roles are the controls that surround ownership of and responsibility for data.
Who owns the data? Who’s responsible for describing and stewarding that data and making sure that it meets the standards of the organization? Usually, that person is the data steward, who sets up clearly defined workflows for who interacts with whom, and how to interact.
A typical example is that of introducing a new term to your agency glossary. Who would be the subject matter expert? Who would be the reviewer? Who would approve adding the term to your production governance system for use by the larger population? When people find a problem with the data, how do they create a request for remediation? Those are all controls that should be well documented, whether it’s the process of quality remediation or the process of making a new agency glossary entry easily accessible.
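The glossary example can be written down as a small workflow in which each step may be taken only by the role responsible for it. This is a sketch under assumptions: the status names, role names and the strictly linear sequence (subject matter expert, then reviewer, then approver) are hypothetical, and a real workflow would likely also handle rejection and rework.

```python
from dataclasses import dataclass, field

# Hypothetical workflow: (current status, acting role) -> next status.
TRANSITIONS = {
    ("draft", "subject_matter_expert"): "in_review",
    ("in_review", "reviewer"): "reviewed",
    ("reviewed", "approver"): "approved",
}


@dataclass
class GlossaryTerm:
    name: str
    definition: str
    status: str = "draft"
    history: list = field(default_factory=list)  # (old, new, actor) tuples


def advance(term, actor, role):
    """Move a term to its next status if the role is allowed to act on it."""
    key = (term.status, role)
    if key not in TRANSITIONS:
        raise PermissionError(
            f"{role} cannot act on a term in status {term.status!r}"
        )
    new_status = TRANSITIONS[key]
    term.history.append((term.status, new_status, actor))
    term.status = new_status
    return term
```

Recording who moved the term at each step gives the documented trail that the controls above call for.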
11. Communicate with stakeholders
Remember those stakeholders you cultivated early in the process? Continue communicating with and educating them throughout your initiative. Give them the success stories. Let them know what has been achieved because of following data governance best practices.
This is an opportunity to set up communities in which people can build less-formal, ad hoc relationships around data governance. If Stacey and Brian are separately running regular reports against a data set, it’s valuable for them to compare notes. In organizations that form that kind of community, data governance becomes a social network of sorts and a driver of tribal knowledge.
12. Identify areas of impact
The best success stories clearly show how data governance best practices make day-to-day operations better than they were before. Suppose that it used to take you a day and a half to locate all the data and put out a report. Now, with the self-service and one-stop shopping of governance, it takes you only a couple of hours to produce the report. The area of impact is not limited to your time and salary; think about the multiplier effects, the time and salary of the people who consume that report. They negotiate terms and make product decisions based on the information in the report.
As valuable as that multiplier effect may be, data governance efforts often fail because they are ranked at a low priority in the organization. At the heart of a successful effort is establishing and maintaining the perception of being a valuable undertaking. Identifying areas of impact is an important part of distinguishing data governance as a must-have rather than as a nice-to-have, with all the reasons why.
Conclusion — The result of following data governance best practices
Data governance is a marathon, not a sprint. It is a practice, not a project.
The result of following data governance best practices is the creation of another living, breathing aspect of your organization that will provide benefits and dividends down the road. The facts are that no federal government agency gets there in a day and there is rarely a finish line. Those are inconvenient facts for many people; hence the importance of setting expectations early on.
Data governance has evolved from the era when your organization built all its own systems and ran them in house. You had control over the creation of most of that data. As more data comes from outside (marketing data, social media, sentiment data, unstructured feeds from devices), you have far less control over its creation. As a result, data governance can help untangle and organize the data coming in from various sources.
For people all across an organization, data governance makes it easier to find and understand data. It ensures that, when you need a piece of information to meet a deadline, you’re not stymied because the data owner is on vacation. By implementing these governance best practices, organizations can take better advantage of their data and enable their teams for success.