Innovating Email Protection: Writing Detection Rules with LLMs

Discover how Abnormal Security leverages large language models (LLMs) to automate and enhance email threat detection with AI-generated detection rules.
July 26, 2024

At Abnormal, our marquee email security product is based on a detection engine. The goal of this engine is to detect cyberattacks among the billions of emails that make up normal business communications.

Our detection engine has many different components, including ML models, attack signature systems, and threat intelligence. One powerful component is our rule engine, which allows us to quickly detect messages with known indicators of compromise. As each message passes through our system, we extract thousands of rich signals describing everything about the message—content, sender, recipient, links, attachments, contextual information, and more.

Our rule engine exposes a DSL (domain-specific language) that can express combinations of attributes that should be treated as malicious, such as a bad domain or a malicious link. Once these attributes are available, an email security analyst can use the rule engine to write detection rules on top of them.

An example rule could be:

(
never_seen_sender = true AND
from_fqdn_age_in_days < 30 AND
attachment_extensions contains "eml" AND
body_text_contents contains "urgent_language"
)
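
To make the DSL concrete, here is a minimal sketch of how a rule like the one above might be evaluated against the signals extracted from a message. The dict-based schema, signal values, and helper function are illustrative assumptions, not Abnormal's internal rule engine.

# Sketch: evaluating the example rule against hypothetical extracted signals.
def example_rule(signals: dict) -> bool:
    """Return True if the message matches the example rule above."""
    return (
        signals.get("never_seen_sender") is True
        and signals.get("from_fqdn_age_in_days", float("inf")) < 30
        and "eml" in signals.get("attachment_extensions", [])
        and "urgent_language" in signals.get("body_text_contents", [])
    )

# Hypothetical signals for a single message
message_signals = {
    "never_seen_sender": True,
    "from_fqdn_age_in_days": 3,
    "attachment_extensions": ["eml"],
    "body_text_contents": ["urgent_language"],
}

print(example_rule(message_signals))  # True -> the message would be flagged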

The rule-writing process is relatively straightforward:

  1. Select a set of attacks that we want to detect.

  2. Study these attacks and craft a rule that matches them.

  3. Validate that the crafted rule matches the selected attacks.

  4. Validate that the rule does not flag any safe messages.

  5. Launch the rule.

[Figure: The rule-writing process]

Unfortunately, this process can be quite labor-intensive. Hackers develop new attacks at a blistering pace, and we currently see tens of thousands of net-new attacks arise every day. It's simply not feasible for a team of human analysts to keep up. For this reason, Abnormal Security has historically relied upon a machine learning approach in which deep neural networks directly analyze emails.

However, this end-to-end deep learning approach has downsides. For example, deep neural networks are far more difficult to interpret than rules. The growth of large language models and their stunning ability to transform unstructured data into structured data thus raises a question: can we use generative large language models to augment our core machine learning approach with interpretable, AI-authored detection rules for better detection?

LLM Rule Generation

Diving into the rule-writing process, we see that only step 2, writing a rule from a set of attacks, requires human intervention. Sourcing attacks, verifying that the rule does not flag safe messages, and launching the rule can each be fully automated.

We can think of step 2 as a translation task: the analyst translates a list of attack messages into the rule DSL. Since large language models like GPT-4 excel at exactly this kind of translation, we can automate the step with a simple one-shot prompt like the following:

You are an email security analyst tasked with writing an attack detection rule that flags a set of malicious email messages.

Here is the syntax of your rule engine:
[rule engine syntax]

Here is the set of malicious email messages that your rule should flag:
[attack messages]

Your rule:

The output of an LLM fed with this prompt should be a rule that we can test directly. This enables us to craft a rule-writing flow that is completely independent of human intervention.
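
Here is a minimal sketch of how that generation step could be wired up, using the OpenAI chat completions API as one possible backend. The model name, helper names, and prompt assembly are illustrative assumptions rather than Abnormal's production implementation.

# Sketch: automating step 2 by asking an LLM to translate attacks into a rule.
from openai import OpenAI

PROMPT_TEMPLATE = """You are an email security analyst tasked with writing an attack detection rule that flags a set of malicious email messages.
Here is the syntax of your rule engine:
{rule_engine_syntax}

Here is the set of malicious email messages that your rule should flag:
{attack_messages}

Your rule:"""

def generate_rule(rule_engine_syntax: str, attack_messages: list[str]) -> str:
    # Assemble the one-shot prompt from the syntax reference and the attacks.
    prompt = PROMPT_TEMPLATE.format(
        rule_engine_syntax=rule_engine_syntax,
        attack_messages="\n---\n".join(attack_messages),
    )
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output makes the rule easier to test
    )
    return response.choices[0].message.content.strip()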

[Figure: The LLM-powered rule-writing flow]

LLM Rule Generation as Machine Learning

One way to understand what's going on here is to frame this in the language of ML models. The model family is the rule DSL, the training data is the list of attacks, the training algorithm is LLM inference, the hyperparameters are the prompt, and the trained model is the generated rule. An interesting observation that arises from this perspective is that the training data does not contain any safe messages. The algorithm relies on the LLM's prior knowledge of what safe business communication looks like to enable the LLM to write a specific enough rule that flags the target attack messages without flagging any safe messages.

There are several ways to improve this knowledge and, therefore, the generated rule, including adding examples of good rules to the prompt or using a large language model that has been fine-tuned for security applications.
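
As a simple illustration of the first option, a handful of known-good rules can be placed in the prompt so the model imitates their specificity (few-shot prompting). The example rules and prompt block below are hypothetical.

# Hypothetical few-shot extension: prepend well-written rules to the prompt
# so the model imitates rules specific enough to avoid flagging safe mail.
EXAMPLE_RULES = [
    '(never_seen_sender = true AND attachment_extensions contains "html")',
    '(from_fqdn_age_in_days < 7 AND body_text_contents contains "payment_request")',
]

FEW_SHOT_BLOCK = (
    "Here are examples of well-written detection rules:\n"
    + "\n".join(EXAMPLE_RULES)
)

# FEW_SHOT_BLOCK can be inserted into the prompt (see the sketch above)
# ahead of the attack messages before calling the LLM.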

LLM Responsibilities vs. Software Responsibilities

Let's take a step back. One of the core decisions in any LLM-enabled application is the right place to draw the line between the responsibilities of the software system and the responsibilities of the LLM itself. In this framework, we have the following decision points:

  1. Selection of attacks to target

  2. Generation of a rule from these attacks

  3. Evaluation of the rule against the environment of all emails

  4. Interpretation of the evaluation results

Each of these steps could theoretically be powered by a "vanilla" software system or an agentic LLM-enabled system. The system described above offloads only #2 to an LLM and relies on vanilla software systems to handle #1, #3, and #4.

Intuitively, agentic LLM-enabled systems offer flexibility and power at the expense of reliability. That tends to be a poor tradeoff in the domain of evaluation, so steps #3 and #4 are best suited to a vanilla software system. This is especially true at an established organization like Abnormal Security, which has invested heavily in rule evaluation infrastructure.
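
To make that concrete, here is a minimal sketch of what the vanilla-software side of steps #3 and #4 might look like. The compile_rule helper, corpus inputs, and launch thresholds are assumptions for illustration, not Abnormal's evaluation infrastructure.

# Sketch: validating a generated rule against attack and safe-message corpora.
def evaluate_rule(rule_text, attack_messages, safe_messages, compile_rule):
    # compile_rule is a hypothetical helper that parses DSL text into a callable.
    rule = compile_rule(rule_text)

    # Step 3: the rule must match the attacks it was generated from.
    recall = sum(rule(m) for m in attack_messages) / len(attack_messages)

    # Step 4: the rule must not flag safe business email.
    false_positives = sum(rule(m) for m in safe_messages)

    return {
        "recall": recall,
        "false_positives": false_positives,
        # Illustrative launch criteria: catch (nearly) all targeted attacks
        # while flagging zero safe messages in the evaluation corpus.
        "launchable": recall >= 0.95 and false_positives == 0,
    }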

That said, step #1 lends itself much better to LLM involvement, and it is an area we plan to explore in the future. For example, we could imagine building a system in which one LLM generates attacks and another writes rules to stop them.

Enhancing Defense with LLM-Generated Rules

Rules and heuristics alone will never be enough to stay ahead of attackers. However, LLMs can make this strategy far more effective by relaxing the human-effort bottleneck while still providing additional controls. The benefit here is that LLM-generated rules can augment a behavioral AI engine, providing defense in depth and additional peace of mind. We can also extrapolate this pattern across domains: many techniques that are bottlenecked by labor intensity today will resurge as LLMs become more capable and widely adopted.

As a fast-growing company, we have lots of interesting engineering challenges to solve, just like this one. If these challenges interest you, and you want to further your growth as an engineer, we’re hiring! Learn more at our careers website.
