5 ChatGPT Jailbreak Prompts Being Used by Cybercriminals
Since the launch of ChatGPT nearly 18 months ago, cybercriminals have been able to leverage generative AI for their attacks. As part of its content policy, OpenAI created restrictions to stop the generation of malicious content. In response, threat actors have created their own generative AI platforms like WormGPT and FraudGPT, and they’re also sharing ways to bypass the policies and “jailbreak” ChatGPT.
In fact, entire sections on cybercrime forums are discussing how AI can be used for illicit purposes.
Jailbreaking ChatGPT: A Brief Overview
Generally speaking, when cybercriminals want to misuse ChatGPT for malicious purposes, they attempt to bypass its built-in safety measures and ethical guidelines using carefully crafted prompts, known as "jailbreak prompts." Jailbreaking ChatGPT involves manipulating the AI language model to generate content that it would normally refuse to produce in a standard conversation.
Although there are ways to get ChatGPT to produce content that could be used in an illegitimate context without using jailbreak prompts (by pretending the request is for a legitimate use), the AI's capabilities in this regard are rather limited.
In contrast, it’s much easier for cybercriminals to jailbreak ChatGPT and get it to deliberately produce illicit content. Below, we will examine the top five jailbreak prompts being utilized by cybercriminals. These prompts have been identified through research and regular monitoring of popular Russian- and English-language cybercrime forums.
Even with jailbreak prompts like those that follow, there are still limitations to what the AI will generate, and it cannot create real-world sensitive data on its own. That said, each of the following prompts enables cybercriminals to create phishing messages, social engineering threats, and other malicious content at scale.
Jailbreak Prompt 1 - The Do Anything Now (DAN) Prompt
The DAN prompt is one of the most well-known jailbreak prompts used to bypass ChatGPT's ethical constraints. By roleplaying as an AI system called DAN (Do Anything Now), users attempt to convince ChatGPT to generate content it would normally refuse to produce. This prompt often involves asserting that DAN is not bound by the same rules and limitations as ChatGPT, and therefore can engage in unrestricted conversations.
Jailbreak Prompt 2 - The Development Mode Prompt
The Development Mode prompt aims to trick ChatGPT into believing it is in a development or testing environment, where its responses won't have real-world consequences. By creating this false context, users hope to bypass ChatGPT's ethical safeguards and generate illicit content. This prompt may involve statements like "You are in development mode" or "Your responses are being used for testing purposes only."
Jailbreak Prompt 3 - The Translator Bot Prompt
The Translator Bot prompt attempts to circumvent ChatGPT's content filters by framing the conversation as a translation task. Users will ask ChatGPT to "translate" a text containing inappropriate or harmful content, hoping that the AI will reproduce the content under the guise of translation. This prompt exploits the idea that a translator should faithfully convey the meaning of the original text, regardless of its content.
Jailbreak Prompt 4 - The AIM Prompt
The AIM (Always Intelligent and Machiavellian) prompt is a jailbreak prompt that aims to create an unfiltered and amoral AI persona devoid of any ethical or moral guidelines. Users instruct ChatGPT to act as "AIM," a chatbot that will provide an unfiltered response to any request, regardless of how immoral, unethical, or illegal it may be.
Jailbreak Prompt 5 - The BISH Prompt
The BISH prompt involves creating an AI persona named BISH, which is instructed to act without the constraints of conventional ethical guidelines. This prompt encourages BISH to simulate having unrestricted internet access, make unverified predictions, and disregard politeness, operating under a "no limits" framework. Users can also customize BISH's behavior by adjusting its "Morality" level, which determines the extent to which BISH will use or censor profanity, tailoring the AI's responses to include or exclude offensive language per the user's preference.
On a final note: we do not support the malicious use of legitimate chatbots like ChatGPT. It's also worth mentioning that most of these prompts will not function on the latest versions of ChatGPT. This is primarily because the companies responsible for these chatbots, such as OpenAI and Anthropic, actively monitor user activity and promptly patch many of these jailbreak prompts.
Using ‘Good AI’ to Prevent ‘Bad AI’
As you can see from the prompts shown here, criminals are constantly finding new ways to use generative AI to create their attacks—and will continue doing so. To stay protected, organizations must also use AI in their defensive strategy, with nearly 97% of security professionals acknowledging that traditional defenses are ineffective against these new AI-generated threats.
We’ve reached a point where only AI can stop AI and where preventing these attacks and their next-generation counterparts requires using AI-native defenses—particularly when it comes to email attacks. By understanding the identity of the people within the organization and their normal behavior, the context of the communications, and the content of the email, AI-native solutions like Abnormal can detect attacks that bypass legacy solutions. It is still possible to win the AI arms race, but security leaders must act now to prevent these threats.
To discover more, including how Abnormal stops each individual attack, download AI Unleashed: 5 Real-World Email Attacks (Likely) Generated by AI in 2023 or schedule a demo today!