How Abnormal Enhanced Its Detection Platform with BERT Large Language Models (LLMs)

We are excited to share that Abnormal has recently deployed a BERT large language model (LLM), pretrained by Google on a large corpus of data, and applied it to stop advanced attacks.

Dallas Young

Dr. Dan Shiebler

October 12, 2022

Every day, threat actors become more sophisticated at exploiting human behavior in email-based attacks. They send text-based messages that lack malicious links, attachments, or any other known indicators of compromise (IOCs).

Though organizations are taking a more behavioral approach to detecting these sophisticated email attacks, attackers are always working to find the magic combinations of words and phrases that will allow their malicious emails to evade detection.

Many campaigns are now polymorphic: bad actors change a malicious email campaign’s content, subject line, sender name, or even the attack template itself. This lets them automate the modification of phishing emails to evade traditional signature-based email solutions, so that different versions of the same attack reach user inboxes.

We are excited to share that Abnormal has recently deployed a BERT large language model (LLM), pretrained by Google on a large corpus of data, and applied it to stop these new classes of attacks. By out-innovating cybercriminals, we can offer our customers the best possible defense against even the most sophisticated attacks.

Attacker Innovation: Language Permutations

Abnormal’s improved detection models with BERT make it easy to determine if two emails are similar and are part of the same polymorphic email campaign targeting an organization.

What looks different to a computer will often look quite similar to humans. For example, one bad actor impersonated the Geek Squad brand, a mobile PC repair service, to socially engineer recipients via email or a fraudulent call center.

Example Email 1: The bad actor varied the text with spelling variations like Geeks Protect 360 and Geeks Squad, while naming Geek Squad as the sender. The subscription number and the renewal date are randomly generated.

Example Email 2: The bad actor sent this second variation using Geek_Squad and Geek_Squad_Protect 360 with the randomized subscription number and replaced the expiration date with a subscription date reference.

Example Email 3: The bad actor pluralized the spelling variation of Geeks Protect 360 along with the body of the text message.

In the image below, you can see dynamically generated sender display names, domains, and IP addresses.

To obfuscate the campaign further, the bad actor randomized the email subject lines with variations like service update, order status, and many others not listed here. The IP addresses, email addresses (info@geeks-squad, care@geeks-squad, etc.), and sender domains (geeks-squad51[.]com, geek-squadupdate28[.]com, etc.) varied as well. This increases the probability that some of the email variations will bypass traditional email security controls and land in unsuspecting users’ inboxes.
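To make the idea of near-duplicate campaign emails concrete, here is a minimal Python sketch. It is not Abnormal's method and it does not use BERT: it swaps in a much simpler technique, character-trigram Jaccard similarity over normalized text, to show how cosmetic changes such as underscores, plurals, and random numbers can be collapsed before two messages are compared. The sample messages and all function names are invented for illustration.

```python
import re


def normalize(text: str) -> str:
    """Collapse cosmetic variation: case, separators, and random digits."""
    text = text.lower()
    text = re.sub(r"[_\-]+", " ", text)   # "Geek_Squad" -> "geek squad"
    text = re.sub(r"\d+", "#", text)      # subscription numbers, dates -> "#"
    return re.sub(r"\s+", " ", text).strip()


def trigrams(text: str) -> set:
    """Set of overlapping 3-character windows of the normalized text."""
    norm = normalize(text)
    return {norm[i:i + 3] for i in range(len(norm) - 2)}


def jaccard(a: str, b: str) -> float:
    """Shared trigrams divided by all trigrams seen in either text."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


# Hypothetical messages, written for this example only.
variant_1 = "Your Geeks Squad subscription 48213 will renew on 10/14/2022"
variant_2 = "Your Geek_Squad subscription 90017 will renew on 11/02/2022"
unrelated = "Quarterly budget review moved to Thursday afternoon"

print(jaccard(variant_1, variant_2))  # high: same template, cosmetic edits
print(jaccard(variant_1, unrelated))  # low: different content
```

A surface-level score like this only catches edits that keep most characters in place. An embedding-based system would instead compare vectors that capture meaning, which is the role BERT plays in the next section.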

Abnormal Innovation: Bidirectional Encoder Representations from Transformers (BERT) Learning

BERT is a large language model (LLM) pretrained by Google on a large text corpus. It interprets text based on the context of each word and learns word associations, which helps it recover the intended meaning of email conversations and stop new classes of attacks. Using these pretrained BERT LLMs gives Abnormal the ability to understand the content of a message and the intention of a possible attacker in a highly scalable manner.

For example, BERT can differentiate the context between sentences in which the word “trust” is used:

Sentence 1: He can’t trust you.
Sentence 2: She has a trust fund.
Sentence 3: There is no trust in this business relationship.

Because BERT understands the context, it returns a different embedding, a vector that encapsulates the meaning of the word, for each sentence. In sentence 2, BERT understands that the reference is to a trust fund (a legal entity) and not to relationship trust with an individual (sentence 1) or an organization (sentence 3).
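The difference between a fixed, per-word vector and a context-dependent one can be illustrated with a toy Python sketch. This is not how BERT computes embeddings (BERT uses learned Transformer layers); as a stand-in, the "contextual" vector here is simply a count of the words that share a sentence with the target word. The function names are hypothetical, and the sentences are the three "trust" examples above.

```python
import math
import re
from collections import Counter


def static_vector(word: str) -> Counter:
    """One fixed vector per word, regardless of the sentence it appears in."""
    return Counter({word: 1})


def contextual_vector(word: str, sentence: str) -> Counter:
    """Toy context-dependent vector: counts of all words in the sentence."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if word not in tokens:
        raise ValueError(f"{word!r} does not occur in the sentence")
    return Counter(tokens)


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


s1 = "He can't trust you."
s2 = "She has a trust fund."

# A static lookup gives "trust" the same vector in both sentences.
print(cosine(static_vector("trust"), static_vector("trust")))

# The context-dependent vectors differ because the surrounding words differ.
print(cosine(contextual_vector("trust", s1), contextual_vector("trust", s2)))
```

With a static lookup the two uses of "trust" are indistinguishable, while the context-dependent vectors are far apart. BERT's learned embeddings make the same kind of distinction, but based on meaning rather than raw word overlap.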

Utilizing BERT provides significant advantages over other natural language processing models, including:

The BERT model supercharges our artificial intelligence (AI) systems, which further enables Abnormal to understand attacker intent. Abnormal has fine-tuned the BERT models on its own data sets and built several additional models on top. This lets us extract the context of the email itself and reduce false positives (FPs) by recognizing when normal behavior merely looks similar to an attack. This is especially important for identifying payloadless, text-only email attacks like business email compromise (BEC) and variants such as vendor fraud, which are often one-off and custom-tailored to an individual or organization.
BERT analyzes text bidirectionally, using the words to both the left and the right of each word as context. This allows Abnormal to categorize and classify the content of even the most nuanced email messages accurately and at scale, without introducing processing delays.
Abnormal is always learning through a combination of supervised and unsupervised machine learning, continually adapting to language permutations and offering resilience to attacker obfuscation that readily bypasses traditional email solutions.
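The bidirectional point can be sketched in a few lines of Python. The snippet below is an illustration only, with hypothetical function names: it contrasts which tokens a strictly left-to-right model can use as context for a given position with what a bidirectional encoder such as BERT can use, ignoring details like subword tokenization and padding.

```python
def attention_mask(n: int, bidirectional: bool) -> list:
    """Row i marks with 1 the positions that token i may use as context."""
    return [
        [1 if bidirectional or j <= i else 0 for j in range(n)]
        for i in range(n)
    ]


def visible_context(tokens: list, index: int, bidirectional: bool) -> list:
    """Tokens that the token at `index` can see under the given mask."""
    row = attention_mask(len(tokens), bidirectional)[index]
    return [tok for tok, allowed in zip(tokens, row) if allowed]


tokens = ["she", "has", "a", "trust", "fund"]
i = tokens.index("trust")

# Left-to-right: "trust" is read before "fund" is available as context.
print(visible_context(tokens, i, bidirectional=False))

# Bidirectional: "trust" is read with "fund" in view, which disambiguates it.
print(visible_context(tokens, i, bidirectional=True))
```

In the left-to-right case the word that settles the meaning ("fund") arrives too late to influence how "trust" is represented, which is why access to both sides of the context matters for short, nuanced messages.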

The end result is that Abnormal’s industry-leading detection is even faster, more accurate, and more efficient, especially when combined with our other models and algorithms. This will allow your IT security teams to spend less time tracking down malicious or unwanted emails, and your end users will be more productive.

Why Risk-Adaptive Behavioral Detection Matters

Attackers today invest in learning about the unique identities within an organization and their relationships in order to launch socially engineered, targeted inbound email attacks. Instead of relying on malicious links and attachments, they exploit trusted relationships between internal and external identities and employ endless permutations of words, characters, and phrases to evade detection.

Abnormal’s detection engine employs a unique approach to find these attacks. It analyzes every email from every identity across thousands of contextual signals, to build risk-aware detection models that stop all types of inbound email attacks, from business email compromise to account takeovers to supply chain fraud and more.

Three key pillars enable Abnormal’s unique approach:

Cloud Native API Architecture: Ability to analyze both North-South and East-West traffic flow, and ingest large data sets about an organization and its suppliers.
Behavioral Knowledge Engine: Ability to identify and understand relationships, communication frequency, and typical behaviors to build a profile that’s specific to the organization.
Anomaly Detection: Ability to apply the risk models to channels (like email) by aggregating risk signals, analyzing contextual signals, and automatically remediating attacks.

When customers use Abnormal to protect email, they benefit from best-in-class detection customized to their organization’s norms and behavior, which is always learning and improving.

Want to learn more about how Abnormal detects the hardest-to-find attacks? Request a demo today.
