Goal Misalignment

Agentic AI systems optimise for the objectives they are given.
Share this failure mode:

What it means

Agentic AI systems optimise for the objectives they are given. The failure occurs when the stated objective does not accurately capture what the organisation actually values. A system tasked with “increasing engagement” may generate harmful content. A system tasked with “reducing costs” may take actions that damage quality or relationships.

Why it matters

Goal misalignment is not a malfunction; the system is doing exactly what it was designed to do. The failure is in the governance decision about what it was designed to do. Objectives given to AI systems reflect the organisation’s values; their consequences are a board-level responsibility.

Board governance implications

Before deploying any agentic system, the board must review the objective specification, not just the technical design. The question is not, “Does this system work?” but, “Does this system optimise for what we actually value?”

Governance failure timeline

Pre-deployment


Failure to review objective specifications and stress-test whether they accurately capture what the organisation values before agentic deployment

Acceptance of technical design review as equivalent to governance sign-off.

Deployment


Systems are achieving their stated objectives while producing outcomes the organisation did not intend and would not sanction.

The reputational, legal, and operational consequences arrive at point of action, and because the system is doing exactly what it was designed to do, there is no internal failure signal.

Post-deployment


The accumulated harm from objectives that were technically met but values-misaligned becomes visible.

Legal liability, reputational damage, and operational disruption follow once consequences are externally identified, and the organisation must account for a system that was working as designed.

other Failure Modes