Seven actionable steps towards effective AI data governance

By Mina Mousa, Head of Systems Engineering Australia & New Zealand at Extreme Networks

 

Companies looking to unlock growth have flocked to Artificial Intelligence (AI). Its capacity to act as a growth engine is well understood: a study by CSIRO’s Data61, for example, highlights that “Australian businesses are growth-focused, using AI technologies across the organisation to gain competitive advantage and improve strategic decision-making.”

While growth is a key outcome organisations target with AI, achieving it is far from guaranteed. Even as more CIOs lead organisation-wide AI deployments, about one in three is yet to see significant ROI from these investments.

A key reason behind this may be a lack of focus on a foundational piece of the AI puzzle: data governance. While less glamorous than model building, governance has a critical impact on the performance of an AI initiative.

An AI model’s output is only as good as the data the model ingests and is trained on. The old adage in analytics of ‘garbage in, garbage out’ continues to hold true in an AI context. If models are fed low-quality or skewed information, the results will also be low-quality and skewed.

Organisations often find this out “the hard way” – in the form of flawed decision-making, compliance failures or other unanticipated challenges from the AI.

The warning signs

Data governance problems typically manifest in a few ways.

A common pain point is that different internal teams and functions have different vocabularies, metrics, and success criteria. They may use the same words but attach different meanings to them. This makes it hard to have a data governance conversation internally, and to enforce consistent policies across the organisation.

Additionally, where data governance mechanisms exist inside organisations, they're often not designed for the AI era – leading to bottlenecks and outdated datasets instead of the real-time data that AI requires.

AI models often work best with real-time data streams, external data sources, and frequent retraining. Signs that governance is unfit for purpose include data accessibility issues, such as slow access to data, or data 'locked up' in legacy platforms and proprietary tools that were never designed to interface with an AI model. If data is rigid and siloed, it probably won't work for an AI system designed for flexibility and unification.

Good governance, and the path to achieving it

Proper governance accelerates time-to-value by improving data discoverability, introducing consistent quality checks, and reducing the time data teams spend searching for the right assets, thus freeing them to focus on productive model-building.

Beyond that, a robust governance framework fosters better cross-functional collaboration since data stewards, domain experts, and compliance officers all share a unified understanding of policies and procedures. The result is fewer surprises or last-minute compliance blockers and streamlined AI development.

There are seven steps that organisations can take to establish more effective data governance in the AI era.

Step 1: Organisations should map out the current state of their data assets, to ensure all data repositories and uses are documented. This exercise can help prevent costly blind spots from derailing AI initiatives and highlight any outdated controls in the data environment.
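
To make the idea concrete, the sketch below shows one possible shape for a first-pass data asset inventory. The field names, repositories and teams are hypothetical examples, not a prescribed schema.

```python
# A minimal, hypothetical data-asset inventory record (Step 1).
# Field names, repositories and teams are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class DataAsset:
    name: str                  # e.g. "customer_orders"
    repository: str            # where the data physically lives
    owner: str                 # accountable team or person (see Step 2)
    contains_pii: bool         # flags data needing stricter controls
    known_uses: list = field(default_factory=list)

# First-pass inventory entries
inventory = [
    DataAsset("customer_orders", "warehouse.sales", "Sales Ops", True,
              ["churn model", "quarterly reporting"]),
    DataAsset("web_clickstream", "s3://raw-events", "Digital Team", False,
              ["recommendation model"]),
]

# Quick view of where sensitive data sits and how it is being used
for asset in inventory:
    if asset.contains_pii:
        print(f"{asset.name} in {asset.repository} holds PII; uses: {asset.known_uses}")
```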

Step 2: Next, organisations should formally identify the data owners, data stewards, and data consumers in the organisation. This clarifies accountability for data ownership, lineage and quality, and ensures someone is responsible for approving changes to a dataset, or to access to it, when required.
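
A minimal sketch of what such an ownership register might look like is shown below; the roles, contact details and datasets are hypothetical placeholders.

```python
# Hypothetical ownership register (Step 2): one entry per dataset, naming the
# accountable owner, the day-to-day steward, and the known consumers.
ownership_register = {
    "customer_orders": {
        "owner": "Head of Sales Operations",        # accountable for the dataset
        "steward": "sales-data-team@example.com",   # maintains quality and access
        "consumers": ["churn model", "finance reporting"],
    },
}

def who_to_contact(dataset: str) -> str:
    """Return the steward responsible for changes to, or access to, a dataset."""
    entry = ownership_register.get(dataset)
    return entry["steward"] if entry else "unassigned (governance gap)"

print(who_to_contact("customer_orders"))   # sales-data-team@example.com
print(who_to_contact("web_clickstream"))   # unassigned (governance gap)
```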

Step 3: Organisations should use modern data catalogue platforms (such as Atlan or Collibra), or develop an internal solution, to unify metadata, governance policies, and lineage tracking. This creates a 'one stop shop' that facilitates connections between datasets and data users. It is particularly important given the pace of AI innovation; competitive advantage can be lost if an internal team has to spend too much time searching or waiting for key data inputs for its model.
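
As an illustration only, the sketch below shows the kind of unified catalogue entry such a platform maintains, combining descriptive metadata, policy tags and lineage. The field names are assumptions for this example, not the actual Atlan or Collibra data model.

```python
# Illustrative shape of a unified catalogue entry (Step 3): descriptive
# metadata, policy tags and lineage in one place. Field names are assumptions.
catalog = [
    {
        "dataset": "customer_orders_curated",
        "description": "Cleaned order records used for churn modelling",
        "policy_tags": ["PII", "retain-7-years"],
        "lineage": {
            "upstream": ["warehouse.sales.customer_orders"],
            "downstream": ["churn_model_v3_training_set"],
        },
        "last_quality_check": "2024-05-01",
    },
]

def search_catalog(entries, keyword):
    """Tiny discovery helper: find datasets whose description mentions a keyword."""
    return [e["dataset"] for e in entries if keyword.lower() in e["description"].lower()]

print(search_catalog(catalog, "churn"))  # ['customer_orders_curated']
```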

Step 4: Establish a single, authoritative master record that everyone recognises as the source of truth. This can ensure that each data field has a designated owner responsible for maintaining its accuracy, resolving conflicts that arise when different systems hold contradictory information, and overseeing the processes that keep the chosen record synchronised and current across all platforms. By identifying which system governs each element and mandating that other systems defer to this authoritative record, organisations minimise duplication and inconsistency.
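
The snippet below is a minimal sketch of this 'golden record' idea: each field has a designated governing system, and other systems defer to it. The systems, fields and values are hypothetical.

```python
# Minimal sketch of the 'golden record' idea (Step 4): each field has a
# designated governing system and other systems defer to it.
records = {
    "crm":     {"email": "a.smith@example.com",   "phone": "0400 000 000"},
    "billing": {"email": "asmith@old-domain.com", "phone": "0400 000 001"},
}

# The governance decision: which system is authoritative for each field
field_authority = {"email": "crm", "phone": "billing"}

def build_master_record(records, field_authority):
    """Assemble the master record by deferring to the governing system per field."""
    return {f: records[system][f] for f, system in field_authority.items()}

print(build_master_record(records, field_authority))
# {'email': 'a.smith@example.com', 'phone': '0400 000 001'}
```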

Step 5: Automated quality and policy compliance checks should be built into data flows so that no information can move downstream to an AI model without scrutiny. Automation reduces human error and intentional policy bypass, and keeps the application of governance consistent.
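
A simplified sketch of such a quality gate is shown below; the specific checks and thresholds are placeholders for whatever an organisation's own policies require.

```python
# Simplified quality gate (Step 5): records only move downstream to model
# training if they pass every check. The rules below are placeholder examples.
def passes_quality_gate(record: dict) -> bool:
    checks = [
        record.get("customer_id") is not None,   # completeness
        record.get("order_total", 0) >= 0,       # validity
        record.get("consent_given") is True,     # policy compliance
    ]
    return all(checks)

incoming = [
    {"customer_id": 1, "order_total": 120.0, "consent_given": True},
    {"customer_id": None, "order_total": 80.0, "consent_given": True},
    {"customer_id": 3, "order_total": 45.0, "consent_given": False},
]

clean, rejected = [], []
for record in incoming:
    (clean if passes_quality_gate(record) else rejected).append(record)

print(f"{len(clean)} record(s) passed, {len(rejected)} held back for review")
```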

Step 6: A key aspect of governance today is explainability for how an AI reached a decision or recommendation. Using explainable AI (XAI) techniques like SHAP or LIME, combined with an audit trail of the data feeding each model, can help organisations to understand the AI’s workings and to identify biases or anomalies early on.
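
For illustration, the sketch below applies SHAP's TreeExplainer to a toy model trained on synthetic data. It assumes the shap and scikit-learn packages are installed and is not tied to any particular production setup.

```python
# Toy SHAP example (Step 6): attribute a model's predictions to its input
# features. The data is synthetic and stands in for governed, audited inputs.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# TreeExplainer attributes each prediction to the input features, which helps
# surface unexpected or biased drivers of a decision
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:5])  # output format varies by shap version

first = shap_values[1][0] if isinstance(shap_values, list) else shap_values[0]
print("Per-feature contributions to the first prediction:")
print(first)
```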

Step 7: Organisations should back all of this up with continuous training on data privacy, AI ethics, and data stewardship, and celebrate teams that embrace and exemplify responsible data practices. This creates and embeds a culture of effective governance.

By following this seven-step process, Australian organisations can reduce the time-to-value of their AI efforts while meeting the growing expectations placed on AI adopters around how the technology is implemented and how it interacts with data. Well-governed, high-quality data translates into more accurate AI models that business teams can trust and champion, and gives users confidence that the organisation is meeting current best practice.