Obscene and Abusive Content Generation

AI systems can be manipulated to generate sexually explicit, discriminatory, abusive or harassing content.

Failure Type: Ethical, Legal, Reputational

Risk Origin: External, Generative AI, Internal

Share this failure mode:

What it means

AI systems can be manipulated, through prompt injection, jailbreaking, or inadequate configuration, to generate sexually explicit, discriminatory, abusive, or harassing content. This can occur through external attack or internal misuse by staff.

Why it matters

Content of this nature generated by an organisation’s AI tools creates immediate reputational, legal, and HR exposure. In client-facing or public-facing contexts, a single incident can cause irreparable reputational damage.

Board governance implications

Acceptable use policy must explicitly prohibit generation of this content category. Safeguards must be tested before deployment, not assumed. Incident response must include a specific protocol for this failure type.

Governance failure timeline

Pre-deployment

Failure to test safeguards against abusive and explicit content generation before deployment.

Absence of acceptable use policy provisions explicitly prohibiting this content category before tools are made accessible.

Deployment

HR liability, reputational exposure, potential regulatory breach, and media crisis arrive at point of discovery.

In client-facing or public-facing contexts, a single incident causes damage that is disproportionate to the operational significance of the failure.

Post-deployment

The exposure is to repeat incidents, sustained HR and reputational liability, and regulatory scrutiny of whether safeguard adequacy was assessed before deployment.

Where incidents recur, the reputational damage compounds and the organisation’s position, that this was an isolated failure rather than a systemic one, becomes harder to maintain.

other Failure Modes

Cascade and Systemic Failure

Where multiple AI systems interact, misalignment can propagate and amplify across interconnected systems before anyone sees it....

Memory Poisoning

Agentic AI systems with persistent memory can have malicious instructions stored in that memory and subsequently recalled and executed in...

Autonomous Action without Authorisation

Agentic AI systems take actions without requiring human sign-off at each step....

Prompt Injection and Adversarial Attacks

Attackers manipulate AI system inputs to override instructions, bypass controls, extract confidential information or cause unauthorised actions....

Goal Misalignment

Agentic AI systems optimise for the objectives they are given....

Misapplied Capability (Agentic)

Agentic AI systems are designed to take sequences of actions to achieve goals, a misapplication results in wrong actions with...