chat
expand_more

Software Design Patterns for a Distributed Codebase

As Abnormal grows, we have to maintain a scalable codebase across all areas of engineering to prevent issues around testing, maintainability, and documentation. When organizations scale, a common problem is that the codebase becomes cluttered, with multiple teams writing different code that accomplishes the same task.
November 3, 2021

As Abnormal grows, we have to maintain a scalable codebase across all areas of engineering to prevent issues around testing, maintainability, and documentation. When organizations scale, a common problem is that the codebase becomes cluttered, with multiple teams writing different code that accomplishes the same task. While Abnormal is in this phase of rapid growth, we want to avoid this problem, so designing code that is extendable and generalizable has taken on a higher priority.

Ideally, our codebase contains minimal unused or deprecated modules, leverages common and reusable modules as opposed to one-off solutions, and is written entirely in the same voice. This is challenging because there are varying levels of engineers contributing to the codebase at any point in time, and we want to avoid complexity in order to move with velocity.

One way we’ve tackled this challenge is through implementing design patterns, which are standardized systems that present a way of architecting a solution based on the usage and constraints of the problem being solved. In this post, we present examples of a few such design patterns we follow at Abnormal.

Strategy Design Pattern

The strategy design pattern bundles together several different implementations of the same task. This is particularly useful when there are many different strategies for the same task given different preconditions, where only one strategy should be executed at runtime.

Our attachment signals builder uses this design pattern pretty extensively. At the core, we have several different builders for different types of attachments, such as HTMLAttachmentSignalsBuilder for HTML attachments and DocxAttachmentSignalBuilder for Microsoft Word documents. We must have multiple because we don’t know which SignalsBuilder to use until we know what type of attachment a message contains.

Strategy design pattern diagram

In this diagram, the attachment signals builder acts as the context class, storing the different SignalsBuilders that will be used to extract signals from an attachment.

The AbstractAttachmentSignalsBuilder specifies the base functionality of any signals builder but does not provide any implementations. Each subclass provides the implementation of the base class. This design pattern allows ML engineers to extend our attachment signals build capability to existing and new attachment types without having to change the AttachmentSignalsBuilder.

Template Design Pattern

This design pattern is a pretty common one in our codebase. It defines a template class with some boilerplate functionality, while also specifying other functions that have to be extended by subclasses. It doesn’t rely on a context class or an interface to specify which subclass to use, unlike the strategy design pattern from before.

There are several components of our detection system which rely on deterministic rules for making decisions on a message. Though each component implements different business logic, they all require some shared functions.

Template design pattern diagram

The framework class RulesBasedModel defines the reusable components, as well as hook functions that are not implemented in the framework class but are implemented in the subclasses. The developer builds new subclasses using the shared boilerplate code from the parent class.

The template design pattern is useful in cases like these, where two or more classes demonstrate similar functionality, but not enough to use a single interface for both.

Adapter Design Pattern

The adapter design pattern communicates between two interfaces that would otherwise not be able to communicate with one another. It has a pretty straightforward use case: you use this design pattern to build adapters to translate between two otherwise incompatible interfaces.

Our detection system uses several types of models developed on third-party libraries including SKLearn, Keras, and others. Each detector uses its own library-specific model during prediction, which means the way that data feeds into the model varies from detector to detector. Therefore, we created adapters for each library’s model. Below is an example of the adapter for SKLearn models.

Adapter design patter diagram

In this example, SKLearnModelWrapper is an adapter between the model wrapper and the third-party SKLearn model object. The ModelWrapper can now be used for any SKLearn model through this adapter. Developers can also easily extend the system to any ML library by creating an adapter for that third-party model object.

Composite Design Pattern

Composite objects are objects that are composed of other objects. This pattern is useful in cases where we have hierarchical structures within composite objects, such as tree-like structures.

For one of our detectors, user attributes are extracted through nested logical expressions. These expressions form a hierarchy given the nature of the nesting, and a composite design pattern can be used to specify a single structure regardless of the size of the expression.

The example below uses the following classes to illustrate this:

  • ProcessedEventPredicate: The base class of an operator. When this class is called, it runs check(), which executes the operation.

  • ProcessedEventPredicate subclasses: Each operation extends the ProcessedEventPredicate class by implementing the logic in check(). Each subclass also contains a predicates list for the predicates it performs the operation on. This list can also contain either boolean values or other ProcessedEventPredicate subclasses.

composite design pattern diagram

When the nested logical expression is run in an extractor, the composite design pattern ensures that all of the operators and operands are compatible with the extraction method. This makes it easier to use operators and operands as abstractions, especially in a nested structure.

Key Takeaways

As engineers, we are constantly solving complex problems. To maintain a sense of consistency across teams working on the same codebase, we need some high-order values to abide by. Here are some of the general principles we like to follow:

  • Single Responsibility: Every module, class, or function in a computer program should have responsibility over a single part of that program's functionality, which it should encapsulate. All of these services should be narrowly aligned with that responsibility.
  • Open/Close: Software entities should be open for extension, but closed for modification. We want to build our codebase such that we can extend its functionality without having to change its source code.
  • Liskov Substitution: Objects of a superclass should be replaceable with objects of their subclasses without breaking the application. This requires the objects of the subclasses to behave in the same way as the objects of the superclass. Any subclass must be able to replace the usage of a superclass in the codebase.
  • Interface Segregation Principle: Clients should not be forced to depend on interfaces that they don’t use. Interfaces should be developed to suit a very specific client need—we never want a client to rely on an interface with functions that they will not use.
  • Dependency Inversion Principle: High-level modules should not depend on low-level modules and abstractions should not depend on details. Details should depend on abstractions.
  • Composition over Inheritance: Too many levels of inheritance in a codebase can cause difficulties in testing, maintenance, and code extensibility. To reduce these sorts of problems, composition is generally recommended.
  • Pattern as a Last Resort: While patterns are great for reducing redundancy in extending existing functionality, code should not be instinctively written in the style of an existing pattern. Rather, after a new design is scoped out, it should only be fitted into a pattern if the use case and interface are identical. This allows us to prevent issues with tight coupling, complexity, and confusion that may arise from retrofitted code.

These principles, along with their corresponding design patterns, are just a few of the more commonly used ones at Abnormal. By following these principles, we ensure that our codebase maintains a sense of cohesion, extensibility, and clarity across teams at the company.

We hope this will give a good framework for anyone interested in using design strategies in their own work! Design patterns used in this blog are based on the principles described in Source Making.

If you’re interested in joining the Abnormal engineering team, we’re hiring! Check out our open roles and apply on our Careers page.

Software Design Patterns for a Distributed Codebase

See Abnormal in Action

Get a Demo

Get the Latest Email Security Insights

Subscribe to our newsletter to receive updates on the latest attacks and new trends in the email threat landscape.

Get AI Protection for Your Human Interactions

Protect your organization from socially-engineered email attacks that target human behavior.
Request a Demo
Request a Demo

Related Posts

B MKT628 Cyber Savvy Social Images
Discover key insights from seasoned cybersecurity professional Nicholas Schopperth, CISO at Dayton Children’s Hospital.
Read More
B Podcast Blog
Discover 'SOC Unlocked,' Abnormal Security's new podcast featuring host Mick Leach and cybersecurity expert guests like Jeremy Ventura, Dave Kennedy, and Mick Douglas.
Read More
B 07 22 24 MKT624 Images for Paris Olympics Blog
Threat actors are targeting French businesses ahead of the Paris 2024 Olympics. Learn how they're capitalizing on the event and how to protect your organization.
Read More
B Cross Platform ATO
Cross-platform account takeover is an attack where one compromised account is used to access other accounts. Learn about four real-world examples: compromised email passwords, hijacked GitHub accounts, stolen AWS credentials, and leaked Slack logins.
Read More
B Why MFA Alone Will No Longer Suffice
Explore why account takeover attacks pose a major threat to enterprises and why multi-factor authentication (MFA) alone isn't enough to prevent them.
Read More
B NLP
Learn how Abnormal uses natural language processing or NLP to protect organizations from phishing, account takeovers, and more.
Read More