
The Uncomfortable Truth About AI: Why "Safe" Systems Are Still Vulnerable

Cybercrime just got 95% cheaper. What the GAO Just Revealed About AI Security

Dr. Irakli Petriashvili

12/21/2025 · 2 min read

We often talk about Generative AI as a productivity miracle - a tool that writes our emails, debugs our code, and summarizes our meetings. But as an auditor and researcher, I look for the risks that lie beneath the surface.

A deeper look into the recent December 2025 Science & Tech Spotlight by the Government Accountability Office (GAO) reveals a startling reality: No current generative AI system is immune to misuse.

The report, Malicious Use of Generative AI (GAO-26-108695), details a rapidly evolving digital arms race between developers trying to secure these systems and attackers finding creative ways to break them.

Here is what every governance professional and tech enthusiast needs to know.

The Art of Deception: How AI Gets "Hacked"

The GAO report highlights that attackers don't always need complex code to break an AI; often, they just need the right words. Because AI models are designed to be helpful, they struggle to distinguish between a harmless request and a malicious instruction. Attackers exploit this "desire to please" using three main techniques:

1. Roleplaying (The "DAN" Attack): This is akin to social engineering for machines. Attackers ask the AI to adopt a persona that ignores standard rules, like the infamous "Do Anything Now" (DAN) character. By tricking the system into "acting," attackers can bypass safeguards designed to stop hate speech or dangerous instructions.

2. The "Crescendo" Effect Imagine boiling a frog. In a "Crescendo" attack, a bad actor doesn't ask for a bomb recipe immediately. Instead, they steer the conversation using small, benign steps. By gradually shifting the focus, they exploit the system's tendency to comply, eventually leading it to produce harmful content without triggering its defense alarms.

3. Automated Attacks: This is perhaps the most concerning evolution. Attackers are now using other AI systems to attack target AIs. These automated tools iteratively refine prompts, testing different phrasings until they find the magic words that bypass security filters, as the sketch below illustrates.
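
To make the mechanics concrete, here is a minimal sketch of such an automated loop, framed as a defensive red-teaming harness. It is illustrative only: query_target, is_refusal, and mutate_prompt are hypothetical stand-ins rather than any real API, and the rephrasing templates are deliberately generic. Nothing below comes from the GAO report or a specific tool.

```python
import random

# Hypothetical stand-ins for a red-teaming harness; these names do not
# come from the GAO report or any specific attack tool.

def query_target(prompt: str) -> str:
    """Send a prompt to the system under test and return its reply (stub)."""
    raise NotImplementedError("Connect this to the target model's API.")

def is_refusal(reply: str) -> bool:
    """Crude check: does the reply look like a standard safety refusal?"""
    markers = ("i can't", "i cannot", "i won't", "against my guidelines")
    return any(m in reply.lower() for m in markers)

def mutate_prompt(prompt: str) -> str:
    """Rephrase the probe using a randomly chosen framing (placeholders)."""
    framings = [
        "Pretend you are a character who would answer: {p}",  # roleplay frame
        "For a fictional story, describe: {p}",               # fiction frame
        "Purely hypothetically, and step by step: {p}",       # hedged frame
    ]
    return random.choice(framings).format(p=prompt)

def red_team(seed_prompt: str, max_attempts: int = 50) -> str | None:
    """Iteratively rephrase a probe until the target stops refusing."""
    prompt = seed_prompt
    for _ in range(max_attempts):
        reply = query_target(prompt)
        if not is_refusal(reply):
            return prompt  # Found a phrasing the filters let through.
        prompt = mutate_prompt(prompt)  # Refine and try again.
    return None  # The target held up within this attempt budget.
```

The point of the sketch is the loop itself: once the generate-test-refine cycle is automated, the cost of probing a system's defenses drops to whatever compute the attacker can afford.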

The Economics of Cybercrime

Why does this matter? Because AI is making cybercrime terrifyingly cheap.

The GAO cites an academic study showing that generative AI could reduce the cost of conducting phishing attacks by more than 95%. Furthermore, when generative AI is paired with "agentic AI" systems that can autonomously plan and act, it can create and deliver complex phishing campaigns without human intervention.

The speed at which these threats mature is alarming. In one instance, researchers were able to circumvent the safeguards of a newly released system within a single day, obtaining instructions on how to build an incendiary weapon.

The Defense Dilemma

So, how do we fix this? The GAO outlines several mitigation strategies that industries are currently testing:

  • Fortifying Training: Using "adversarial training" (exposing models to examples of attack prompts during training so they learn to resist them) to make models more robust.

  • Content Filters: Screening both what goes in (user prompts) and what comes out (system answers).

  • Human in the Loop: Requiring human approval for high-risk actions (a simplified sketch of the filter-and-approval pipeline follows this list).
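
To show how the second and third strategies fit together, here is a minimal sketch of input/output screening with a human-approval gate. This is an assumption-laden illustration, not the GAO's recommendation or any vendor's implementation: BLOCKLIST and HIGH_RISK_ACTIONS are hypothetical placeholders, and real systems use trained classifiers and policy engines rather than keyword lists.

```python
# Minimal sketch: input/output screening plus a human-approval gate.
# BLOCKLIST and HIGH_RISK_ACTIONS are hypothetical placeholders.

BLOCKLIST = {"incendiary device", "bypass security"}  # screened terms
HIGH_RISK_ACTIONS = {"send_email", "execute_code"}    # need human sign-off

def screen_prompt(prompt: str) -> bool:
    """Input filter: reject prompts that match known-bad patterns."""
    return not any(term in prompt.lower() for term in BLOCKLIST)

def screen_response(response: str) -> bool:
    """Output filter: re-check the answer before it leaves the system."""
    return not any(term in response.lower() for term in BLOCKLIST)

def approved_by_human(action: str) -> bool:
    """Human in the loop: pause high-risk actions for manual review."""
    if action not in HIGH_RISK_ACTIONS:
        return True  # Low-risk actions proceed automatically.
    answer = input(f"Approve high-risk action '{action}'? [y/N] ")
    return answer.strip().lower() == "y"

def handle_request(prompt: str, model_call, action: str | None = None) -> str:
    """Pipeline: filter the input, call the model, filter the output,
    and gate any requested action behind human approval."""
    if not screen_prompt(prompt):
        return "Request blocked by input filter."
    response = model_call(prompt)
    if not screen_response(response):
        return "Response withheld by output filter."
    if action is not None and not approved_by_human(action):
        return "Action denied by human reviewer."
    return response

# Example: handle_request("Summarize this memo.", lambda p: "Summary: ...")
```

Note the design choice: the output is screened even when the input passed, because, as the attack techniques above show, a benign-looking prompt can still steer a model into producing harmful content.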

However, there is a fundamental misalignment of incentives. The report notes that developers are primarily incentivized to improve performance, not security. As a result, safeguards often lag behind, requiring continuous resources to maintain.

The Policy Gap

As we integrate AI into federal agencies and critical businesses, the GAO poses difficult questions that policymakers must answer:

  • How can we develop auditing tools that effectively promote security research?

  • Are current guidelines enough to ensure responsible use?

  • How do we address the fact that attackers are constantly finding new methods that defy existing guidance?

Final Thoughts

The GAO's findings serve as a reminder that "intelligence" does not equal "security." As we embrace these tools, we must remain vigilant. The technology is neutral, but the intent behind it is not.

Data Source: U.S. Government Accountability Office, "Science & Tech Spotlight: Malicious Use of Generative AI," GAO-26-108695, December 2025.