What it means
AI systems can be manipulated through prompt injection or jailbreaking, or exploited through inadequate configuration, to generate sexually explicit, discriminatory, abusive, or harassing content. This can occur through external attack or internal misuse by staff.
Why it matters
Content of this nature generated by an organisation’s AI tools creates immediate reputational, legal, and HR exposure. In client-facing or public-facing contexts, a single incident can cause irreparable reputational damage.
Board governance implications
The acceptable use policy must explicitly prohibit generation of this content category. Safeguards must be tested before deployment, not assumed to work. Incident response plans must include a specific protocol for this failure type.
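Pre-deployment safeguard testing can be operationalised as an automated gate. The sketch below is a minimal, hypothetical illustration, assuming the organisation maintains its own list of red-team prompts and a callable model endpoint (the prompt list, `stub_model`, and the refusal-detection heuristic here are all placeholder assumptions, not a real vendor API):

```python
# Minimal sketch of a pre-deployment safeguard gate (hypothetical).
# A keyword heuristic stands in for a proper refusal classifier.

REFUSAL_MARKERS = ("cannot", "can't", "unable to", "not able to")

def is_refusal(response: str) -> bool:
    """Crude placeholder check: does the response look like a refusal?"""
    lower = response.lower()
    return any(marker in lower for marker in REFUSAL_MARKERS)

def run_safeguard_suite(generate, prompts):
    """Return (prompt, response) pairs the model did NOT refuse.

    `generate` is a stand-in for the organisation's model endpoint;
    `prompts` is the organisation's own red-team prompt list.
    An empty return value means the gate passes for this suite.
    """
    failures = []
    for prompt in prompts:
        response = generate(prompt)
        if not is_refusal(response):
            failures.append((prompt, response))
    return failures

# Stub endpoint that always refuses, used only to demonstrate the gate.
def stub_model(prompt: str) -> str:
    return "I cannot help with that request."

failures = run_safeguard_suite(stub_model, ["placeholder adversarial prompt"])
print(len(failures))  # 0 -> no unrefused prompts; this suite passes
```

In practice the refusal check would be a dedicated content classifier rather than keyword matching, and the suite would run in CI so that a non-empty failure list blocks deployment, giving the board an auditable record that safeguards were tested rather than assumed.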
Governance failure timeline
Pre-deployment
Failure to test safeguards against abusive and explicit content generation before deployment.
Absence of acceptable use policy provisions explicitly prohibiting this content category before tools are made accessible.
Deployment
HR liability, reputational exposure, potential regulatory breach, and media crisis arrive at the point of discovery.
In client-facing or public-facing contexts, a single incident causes damage that is disproportionate to the operational significance of the failure.
Post-deployment
The exposure extends to repeat incidents, sustained HR and reputational liability, and regulatory scrutiny of whether safeguard adequacy was assessed before deployment.
Where incidents recur, the reputational damage compounds, and the organisation's position that this was an isolated rather than systemic failure becomes harder to maintain.