
AI Security Mailbox: Augmenting a Customer-Facing Product with Generative AI

Learn how Abnormal Security leverages large language models (LLMs) to enhance security awareness and automate SOC teams’ workflows with AI Security Mailbox.
August 30, 2024

Designed to automate the entire user-reported email workflow, Abnormal’s AI Security Mailbox relieves security operations center (SOC) teams of hours of manual triage and gives them time for higher-impact work that better protects their company. From report collection and analysis to malicious campaign remediation and end-user notification, AI Security Mailbox enables SOC teams to efficiently manage threats while enhancing overall organizational security.

Security operations analysts primarily use Abnormal Security’s products to monitor and respond to cybersecurity threats and incidents. AI Security Mailbox is unique, however, in that every end user in an organization protected by Abnormal can interact with it.

Leveraging a privately deployed large language model (LLM), AI Security Mailbox not only saves SOCs countless hours but also personalizes training for end users, transforming them into active participants in the fight against cyber threats. Explore how AI Security Mailbox is transforming email security with cutting-edge technology.

How AI Security Mailbox Works

If an employee receives a message they think is suspicious, they can submit it to their company’s designated phishing mailbox. From this point, AI Security Mailbox takes the following actions behind the scenes:

  • Ingests the report and extracts the original email’s content and metadata

  • Analyzes the report based on the behavioral data of the employee and their organization

  • Renders a judgment on whether the email is safe, spam, malicious, or a phishing simulation

If an email is determined to be spam or malicious, AI Security Mailbox automatically remediates all emails in the same campaign (i.e., all of the emails sent by the malicious actor to employees across the organization).
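To make the flow concrete, here is a minimal Python sketch of the pipeline. The names (Report, analyze, remediate_campaign, notify) are hypothetical stand-ins for Abnormal’s internal systems, not the actual implementation:

from dataclasses import dataclass
from enum import Enum

class Judgment(Enum):
    SAFE = "safe"
    SPAM = "spam"
    MALICIOUS = "malicious"
    SIMULATION = "phishing_simulation"

@dataclass
class Report:
    reporter: str     # employee who submitted the suspicious email
    subject: str
    body: str
    campaign_id: str  # groups all copies of the same attack

# Stubs standing in for Abnormal's internal analysis and remediation systems.
def analyze(report: Report) -> Judgment:
    return Judgment.MALICIOUS  # behavioral analysis of reporter + org happens here

def remediate_campaign(campaign_id: str) -> None:
    print(f"pulling all emails in campaign {campaign_id}")

def notify(reporter: str, explanation: str) -> None:
    print(f"emailing {reporter}: {explanation}")

def handle_report(report: Report) -> Judgment:
    judgment = analyze(report)
    # Spam and malicious verdicts trigger org-wide remediation of the campaign.
    if judgment in (Judgment.SPAM, Judgment.MALICIOUS):
        remediate_campaign(report.campaign_id)
    # Every reporter receives a personalized, LLM-generated explanation.
    notify(report.reporter, f"We judged this email as {judgment.value}.")
    return judgment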

AI Security Mailbox utilizes a large language model (LLM) deployed privately and securely in our cloud to generate a personalized message explaining the judgment. It then sends an email to the reporter with said message. These notifications serve as a cybersecurity education tool for employees within an organization.

[Image: AI Security Mailbox example response]

If the user has follow-up questions about the judgment or cybersecurity best practices in general, they can reply to the email, and the LLM will use the email thread content as context to provide answers.

[Image: AI Security Mailbox example thread reply]

Security teams can customize the content and tone of initial and follow-up LLM responses, as shown below. For example, security teams can configure the custom instructions to use a more formal tone with VIPs or respond in a language other than English.

[Image: AI Security Mailbox analyst configurations]

Prompt Construction

We construct the prompt for generating initial and follow-up responses to the reporter using a few key components (a rough assembly sketch follows the list):

  1. The email and the associated judgment.

  2. An outline for the content based on the judgment (i.e., whether the email was safe, spam, malicious, or a phishing simulation).

  3. A list of cybersecurity best practices sourced from experts within Abnormal to help steer any recommendations provided by the LLM.

  4. Any custom instructions defined by customers, which are configured by security teams via the AI Security Mailbox settings page on the Abnormal Portal.

  5. The complete contents of the email thread, if the user replies with a follow-up question.
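Assembling these components might look like the sketch below. This is an illustrative reconstruction based on the list above, not our production code; the function name, outline text, and section ordering are assumptions:

def build_prompt(
    email_summary: str,
    judgment: str,  # "safe" | "spam" | "malicious" | "phishing_simulation"
    best_practices: list[str],
    custom_instructions: str = "",
    thread_history: str = "",
) -> str:
    """Assemble the response-generation prompt (illustrative structure only)."""
    # Judgment-specific outlines keep responses consistent (see the
    # itemized-reframing discussion later in this post).
    outlines = {
        "malicious": "- 1 sentence stating the email was judged malicious\n"
                     "- A bulleted list of the key signals behind the judgment",
        "safe": "- 1 sentence stating the email appears safe\n"
                "- 1 sentence on what to do if the reporter is still unsure",
    }
    sections = [
        f"Reported email:\n{email_summary}\nJudgment: {judgment}",
        "Write a summary covering the following points:\n" + outlines.get(judgment, ""),
        "Cybersecurity best practices to draw on:\n"
        + "\n".join(f"- {p}" for p in best_practices),
    ]
    if custom_instructions:
        sections.append(f"Customer instructions:\n{custom_instructions}")
    if thread_history:  # present only when the user replies with a follow-up
        sections.append(f"Prior thread:\n{thread_history}")
    return "\n\n".join(sections)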

Use Cases

Customers have specialized their AI auto-responses in a variety of ways. For example:

  • Including IT help desk phone numbers and email addresses in the custom instructions to ensure employees are directed to the appropriate contacts for more complex questions.

  • Specifying the company’s general IT best practices and procedures (e.g., password length requirements or approved SSO software).

  • Adding custom instructions to send reporter notifications in multiple languages to fulfill legal requirements for all official email communications. (Yes, AI Security Mailbox speaks multiple languages!)

Why Use Email as an LLM Interface?

From internal communications and vendor interactions to managing contracts and invoices, email remains a central hub for critical information exchange in corporate environments.

This makes email an ideal interface for an LLM-powered product, as it fits directly into existing workflows. Email enables organizations to organically provide employees with personalized training on phishing and security awareness when they report a suspicious message, as opposed to requiring them to periodically watch videos or take courses.

Key Technical Challenges

When we first developed our prompt for generating LLM responses, we noticed that the outputs were often very long and filled with complex technical details. We also noticed that the LLM suffered from the “lost in the middle” phenomenon, occasionally failing to follow an instruction in the prompt.

To solve this problem, we used a prompting technique called itemized reframing. In this technique, we provide a concise, bulleted list describing an exact outline of what we want the LLM to output.

You are an email security analyst tasked with explaining the judgment of an email
to a user who reported it to be suspicious.
Write a summary covering the following points:
- 1 sentence stating...
- A bulleted list consisting of:
  - 1 sentence explaining...
  - 1 sentence describing...

After updating our prompt to use itemized reframing, we saw the outputs become far more concise, consistent, and easier to understand.

Another prompting problem we faced was constraining the LLM to discuss only topics related to cybersecurity. While we could have fine-tuned a model on a corpus of cybersecurity-related data, we found that we could achieve our desired output quality through prompting alone: we simply provided a set of cybersecurity best practices, crowdsourced from industry experts within the company, directly in the prompts.

We also observed that the LLM is more likely to follow the instruction to focus on cybersecurity-related questions when given affirmative directives, such as "Only answer questions related to cybersecurity," rather than negative ones like "Do not answer unrelated questions."

Because LLMs are stochastic by nature, we recognized that the model itself can be a point of failure and a target for misuse in the AI Security Mailbox product. We went through a comprehensive internal process, including red-teaming sessions and company-wide dogfooding, to identify potential risks and implement mitigations. This process gave us confidence in the robustness and usability of the product.

Risk: We are unable to generate an LLM response due to an outage.
Mitigation: We fall back to sending a default template response to the reporter. A future enhancement is a mechanism that sends the prompt to another LLM if our primary one fails.

Risk: We exceed the context window of the model.
Mitigation: We tokenize the content of the email thread before sending it to the LLM and put defensive measures in place if the token length exceeds the context window.

Risk: Over time, we see degradations in the model’s responses.
Mitigation: In addition to human evaluation, we perform LLM-as-a-judge grading to assess response quality against predetermined criteria (e.g., the response must not be too technically complex). If a substantial decline in quality scores is observed, we reassess and refine the prompt. (See this blog post for more details on our methodology.)

Risk: It’s unclear how the product will perform when rolled out to many users, as LLM-powered products are non-deterministic by nature.
Mitigation: We dogfooded the product company-wide, launching it internally first so that all employees could interact with (and pressure-test) it. This strategy allowed us to catch bugs and make prompt improvements early on, giving us confidence that we would deliver a delightful product to our customers.
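As a rough illustration of the first two mitigations, the sketch below truncates the thread to a token budget and falls back to a template on failure. It assumes an OpenAI-style tokenizer via tiktoken and a hypothetical context window; our private deployment’s actual tokenizer, limits, and fallback logic may differ:

import tiktoken  # assuming an OpenAI-style tokenizer; the private LLM's may differ

MAX_CONTEXT_TOKENS = 8000  # hypothetical context window, not the real limit
FALLBACK_TEMPLATE = (
    "Thanks for your report. Our system judged this email as {judgment}. "
    "Your security team has been notified."
)

def fit_thread_to_window(thread_text: str, budget: int = MAX_CONTEXT_TOKENS) -> str:
    """Truncate the oldest part of the thread if it would overflow the window."""
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode(thread_text)
    if len(tokens) <= budget:
        return thread_text
    # Keep the most recent tokens: the newest messages matter most for follow-ups.
    return enc.decode(tokens[-budget:])

def respond(thread_text: str, judgment: str, call_llm) -> str:
    try:
        return call_llm(fit_thread_to_window(thread_text))
    except Exception:
        # Outage or model failure: fall back to a deterministic template.
        return FALLBACK_TEMPLATE.format(judgment=judgment)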

We also learned that post-processing LLM outputs is crucial, as this allows us to handle deterministic tasks outside the model. By doing so, we reserve the LLM’s unique capabilities for tasks it's best suited for, improving overall system efficiency and reliability.

We decided to perform post-processing in the following ways (sketched after the list):

  1. We prompt the LLM to output markdown and convert the markdown output to HTML. This allows us to render the email in the format expected by our email-sending infrastructure.

  2. We check the LLM output for unsafe links and will not send the output if detected. This is important because malicious emails often contain harmful links, and we want to prevent including those in the LLM response.
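A minimal sketch of both steps is below, using the open-source markdown package for the HTML conversion and a toy is_flagged lookup in place of a real URL-reputation check; none of this is our production code:

import re
import markdown  # the open-source `markdown` package

def is_flagged(url: str) -> bool:
    # Toy stand-in for a real URL-reputation lookup.
    return "evil.example" in url

def contains_unsafe_link(text: str) -> bool:
    urls = re.findall(r"https?://[^\s)\"<>]+", text)
    return any(is_flagged(url) for url in urls)

def postprocess(llm_output: str) -> str | None:
    """Convert the LLM's markdown to HTML; return None to suppress sending."""
    if contains_unsafe_link(llm_output):
        return None
    return markdown.markdown(llm_output)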

[Image: AI Security Mailbox flowchart]

Using Generative AI to Fight Cybercrime

By automating manual processes and providing personalized, actionable insights, AI Security Mailbox enables organizations to stay ahead of evolving threats while empowering employees with the knowledge they need to remain vigilant. AI Security Mailbox is the first of many generative AI products that Abnormal is developing for our customers and our own internal use. We’re excited about giving security teams superpowers to do more in less time and building a future where good AI protects us from malicious AI.

Thank you to Richard Wang, Shrivu Shankar, Lavania Nair, HK Jayakumar, and Alethea Toh for all your hard work delivering this launch!


As a fast-growing company, we have lots of interesting engineering challenges to solve. If this work sounds interesting to you, we’re hiring! Visit our careers page to learn more.
