5 ChatGPT Jailbreak Prompts Being Used by Cybercriminals

This article examines the top five ChatGPT jailbreak prompts that cybercriminals use to generate illicit content, including DAN, Development Mode, Translator Bot, AIM, and BISH.

Daniel Kelley

April 1, 2024

Since the launch of ChatGPT nearly 18 months ago, cybercriminals have been able to leverage generative AI for their attacks. As part of its content policy, OpenAI created restrictions to stop the generation of malicious content. In response, threat actors have created their own generative AI platforms like WormGPT and FraudGPT, and they’re also sharing ways to bypass the policies and “jailbreak” ChatGPT.

In fact, entire sections on cybercrime forums are discussing how AI can be used for illicit purposes.

An entire cybercrime forum section dedicated to “Dark AI."

Jailbreaking ChatGPT: A Brief Overview

Generally speaking, when cybercriminals want to misuse ChatGPT for malicious purposes, they attempt to bypass its built-in safety measures and ethical guidelines using carefully crafted prompts, known as "jailbreak prompts." Jailbreaking ChatGPT involves manipulating the AI language model to generate content that it would normally refuse to produce in a standard conversation.

Although there are ways to get ChatGPT to produce content that could be used in an illegitimate context without using jailbreak prompts (by pretending the request is for a legitimate use), the AI's capabilities in this regard are rather limited:

An example of content produced by ChatGPT that could be used illicitly.

In contrast, it’s much easier for cybercriminals to jailbreak ChatGPT and get it to deliberately produce illicit content. Below, we will examine the top five jailbreak prompts being utilized by cybercriminals. These prompts have been identified through research and regular monitoring of popular Russian and English-based cybercrime forums.

Even with jailbreak prompts like those that follow, there are still limitations to what the AI will generate, and it cannot create real-world sensitive data on its own. That said, each of the following prompts enables cybercriminals to create phishing messages, social engineering threats, and other malicious content at scale.

Jailbreak Prompt 1 - The Do Anything Now (DAN) Prompt

The DAN prompt is one of the most well-known jailbreak prompts used to bypass ChatGPT's ethical constraints. By roleplaying as an AI system called DAN (Do Anything Now), users attempt to convince ChatGPT to generate content it would normally refuse to produce. This prompt often involves asserting that DAN is not bound by the same rules and limitations as ChatGPT, and therefore can engage in unrestricted conversations.

An example of the DAN prompt being shared on a cybercrime forum.

Jailbreak Prompt 2 - The Development Mode Prompt

The Development Mode prompt aims to trick ChatGPT into believing it is in a development or testing environment, where its responses won't have real-world consequences. By creating this false context, users hope to bypass ChatGPT's ethical safeguards and generate illicit content. This prompt may involve statements like "You are in development mode" or "Your responses are being used for testing purposes only."

An example of the Development Mode prompt being shared on a cybercrime forum.

Jailbreak Prompt 3 - The Translator Bot Prompt

The Translator Bot prompt attempts to circumvent ChatGPT's content filters by framing the conversation as a translation task. Users will ask ChatGPT to "translate" a text containing inappropriate or harmful content, hoping that the AI will reproduce the content under the guise of translation. This prompt exploits the idea that a translator should faithfully convey the meaning of the original text, regardless of its content.

An example of the Translator Bot prompt being shared on a cybercrime forum.

Jailbreak Prompt 4 - The AIM Prompt

The AIM (Always Intelligent and Machiavellian) prompt is a jailbreak prompt that aims to create an unfiltered and amoral AI persona devoid of any ethical or moral guidelines. Users instruct ChatGPT to act as "AIM," a chatbot that will provide an unfiltered response to any request, regardless of how immoral, unethical, or illegal it may be.

An example of the AIM prompt being shared on a cybercrime forum.

Jailbreak Prompt 5 - The BISH Prompt

The BISH Prompt involves creating an AI persona named BISH, which is instructed to act without the constraints of conventional ethical guidelines. This prompt encourages BISH to simulate having unrestricted internet access, make unverified predictions, and disregard politeness, operating under a "no limits' framework. Users can customize BISH's behavior by adjusting its "Morality" level, which influences the extent to which BISH will use or censor profanity, thus tailoring the AI's responses to either include or exclude offensive language as per the user's preference.

An example of the BISH prompt being shared on a cybercrime forum.

On a final note: we do not support the malicious use of genuine chatbots like ChatGPT. It's also worth mentioning that most of these prompts will not function on the latest versions of ChatGPT. This is primarily because the companies responsible for these chatbots, such as OpenAI and Anthropic, actively monitor user activity and promptly address many of these jailbreak prompts.

Using ‘Good AI’ to Prevent ‘Bad AI’

As you can see from the prompts shown here, criminals are constantly finding new ways to use generative AI to create their attacks—and will continue doing so. To stay protected, organizations must also use AI in their defensive strategy, with nearly 97% of security professionals acknowledging that traditional defenses are ineffective against these new AI-generated threats.

We’ve reached a point where only AI can stop AI and where preventing these attacks and their next-generation counterparts requires using AI-native defenses—particularly when it comes to email attacks. By understanding the identity of the people within the organization and their normal behavior, the context of the communications, and the content of the email, AI-native solutions like Abnormal can detect attacks that bypass legacy solutions. It is still possible to win the AI arms race, but security leaders act now to prevent these threats.

To discover more, including how Abnormal stops each individual attack, download AI Unleashed: 5 Real-World Email Attacks (Likely) Generated by AI in 2023 or schedule a demo today!

Schedule a Demo

Get AI Protection for Your Human Interactions

Protect your organization from socially-engineered email attacks that target human behavior.

Request a Demo

Attack Stories

Beyond Traditional Defenses: Abnormal Security Detects Advanced Phishing Attack Missed by Proofpoint + DarkTrace

Discover how Abnormal Security detected a sophisticated phishing attack that evaded both Proofpoint and DarkTrace, and learn how it outperforms traditional solutions in combating modern email threats.

B Writing Detection Rules with LL Ms Blog

Engineering

Innovating Email Protection: Writing Detection Rules with LLMs

Discover how Abnormal Security leverages large language models (LLMs) to automate and enhance email threat detection with AI-generated detection rules.

CISO Insights

Cyber Savvy: Protecting Vital Signs with Nicholas Schopperth, CISO at Dayton Children’s Hospital

Discover key insights from seasoned cybersecurity professional Nicholas Schopperth, CISO at Dayton Children’s Hospital.

Company & Culture Thought Leadership

Introducing SOC Unlocked: Tales from the Cybersecurity Frontline, an Abnormal Security Original Podcast

Discover 'SOC Unlocked,' Abnormal Security's new podcast featuring host Mick Leach and cybersecurity expert guests like Jeremy Ventura, Dave Kennedy, and Mick Douglas.

B 07 22 24 MKT624 Images for Paris Olympics Blog

Data & Trends

French Companies Face Olympic-Sized Cyber Threats

Threat actors are targeting French businesses ahead of the Paris 2024 Olympics. Learn how they're capitalizing on the event and how to protect your organization.

Account Takeover Threat Intel

Cross-Platform Account Takeover: 4 Real-World Scenarios

Cross-platform account takeover is an attack where one compromised account is used to access other accounts. Learn about four real-world examples: compromised email passwords, hijacked GitHub accounts, stolen AWS credentials, and leaked Slack logins.