Software Design Patterns for a Distributed Codebase

November 3, 2021

As Abnormal grows, we have to maintain a scalable codebase across all areas of engineering to prevent issues around testing, maintainability, and documentation. When organizations scale, a common problem is that the codebase becomes cluttered, with multiple teams writing different code that accomplishes the same task. While Abnormal is in this phase of rapid growth, we want to avoid this problem, so designing code that is extendable and generalizable has taken on a higher priority.

Ideally, our codebase contains minimal unused or deprecated modules, leverages common and reusable modules as opposed to one-off solutions, and is written entirely in the same voice. This is challenging because there are varying levels of engineers contributing to the codebase at any point in time, and we want to avoid complexity in order to move with velocity.

One way we’ve tackled this challenge is through implementing design patterns, which are standardized systems that present a way of architecting a solution based on the usage and constraints of the problem being solved. In this post, we present examples of a few such design patterns we follow at Abnormal.

Strategy Design Pattern

The strategy design pattern bundles together several different implementations of the same task. This is particularly useful when there are many different strategies for the same task given different preconditions, where only one strategy should be executed at runtime.

Our attachment signals builder uses this design pattern pretty extensively. At the core, we have several different builders for different types of attachments, such as HTMLAttachmentSignalsBuilder for HTML attachments and DocxAttachmentSignalBuilder for Microsoft Word documents. We must have multiple because we don’t know which SignalsBuilder to use until we know what type of attachment a message contains.

In this diagram, the attachment signals builder acts as the context class, storing the different SignalsBuilders that will be used to extract signals from an attachment.

The AbstractAttachmentSignalsBuilder specifies the base functionality of any signals builder but does not provide any implementations. Each subclass provides the implementation of the base class. This design pattern allows ML engineers to extend our attachment signals build capability to existing and new attachment types without having to change the AttachmentSignalsBuilder.

Template Design Pattern

This design pattern is a pretty common one in our codebase. It defines a template class with some boilerplate functionality, while also specifying other functions that have to be extended by subclasses. It doesn’t rely on a context class or an interface to specify which subclass to use, unlike the strategy design pattern from before.

There are several components of our detection system which rely on deterministic rules for making decisions on a message. Though each component implements different business logic, they all require some shared functions.

The framework class RulesBasedModel defines the reusable components, as well as hook functions that are not implemented in the framework class but are implemented in the subclasses. The developer builds new subclasses using the shared boilerplate code from the parent class.

The template design pattern is useful in cases like these, where two or more classes demonstrate similar functionality, but not enough to use a single interface for both.

Adapter Design Pattern

The adapter design pattern communicates between two interfaces that would otherwise not be able to communicate with one another. It has a pretty straightforward use case: you use this design pattern to build adapters to translate between two otherwise incompatible interfaces.

Our detection system uses several types of models developed on third-party libraries including SKLearn, Keras, and others. Each detector uses its own library-specific model during prediction, which means the way that data feeds into the model varies from detector to detector. Therefore, we created adapters for each library’s model. Below is an example of the adapter for SKLearn models.

In this example, SKLearnModelWrapper is an adapter between the model wrapper and the third-party SKLearn model object. The ModelWrapper can now be used for any SKLearn model through this adapter. Developers can also easily extend the system to any ML library by creating an adapter for that third-party model object.

Composite Design Pattern

Composite objects are objects that are composed of other objects. This pattern is useful in cases where we have hierarchical structures within composite objects, such as tree-like structures.

For one of our detectors, user attributes are extracted through nested logical expressions. These expressions form a hierarchy given the nature of the nesting, and a composite design pattern can be used to specify a single structure regardless of the size of the expression.

The example below uses the following classes to illustrate this:

ProcessedEventPredicate: The base class of an operator. When this class is called, it runs check(), which executes the operation.
ProcessedEventPredicate subclasses: Each operation extends the ProcessedEventPredicate class by implementing the logic in check(). Each subclass also contains a predicates list for the predicates it performs the operation on. This list can also contain either boolean values or other ProcessedEventPredicate subclasses.

When the nested logical expression is run in an extractor, the composite design pattern ensures that all of the operators and operands are compatible with the extraction method. This makes it easier to use operators and operands as abstractions, especially in a nested structure.

Key Takeaways

As engineers, we are constantly solving complex problems. To maintain a sense of consistency across teams working on the same codebase, we need some high-order values to abide by. Here are some of the general principles we like to follow:

Single Responsibility: Every module, class, or function in a computer program should have responsibility over a single part of that program's functionality, which it should encapsulate. All of these services should be narrowly aligned with that responsibility.
Open/Close: Software entities should be open for extension, but closed for modification. We want to build our codebase such that we can extend its functionality without having to change its source code.
Liskov Substitution: Objects of a superclass should be replaceable with objects of their subclasses without breaking the application. This requires the objects of the subclasses to behave in the same way as the objects of the superclass. Any subclass must be able to replace the usage of a superclass in the codebase.
Interface Segregation Principle: Clients should not be forced to depend on interfaces that they don’t use. Interfaces should be developed to suit a very specific client need—we never want a client to rely on an interface with functions that they will not use.
Dependency Inversion Principle: High-level modules should not depend on low-level modules and abstractions should not depend on details. Details should depend on abstractions.
Composition over Inheritance: Too many levels of inheritance in a codebase can cause difficulties in testing, maintenance, and code extensibility. To reduce these sorts of problems, composition is generally recommended.
Pattern as a Last Resort: While patterns are great for reducing redundancy in extending existing functionality, code should not be instinctively written in the style of an existing pattern. Rather, after a new design is scoped out, it should only be fitted into a pattern if the use case and interface are identical. This allows us to prevent issues with tight coupling, complexity, and confusion that may arise from retrofitted code.

These principles, along with their corresponding design patterns, are just a few of the more commonly used ones at Abnormal. By following these principles, we ensure that our codebase maintains a sense of cohesion, extensibility, and clarity across teams at the company.

We hope this will give a good framework for anyone interested in using design strategies in their own work! Design patterns used in this blog are based on the principles described in Source Making.

If you’re interested in joining the Abnormal engineering team, we’re hiring! Check out our open roles and apply on our Careers page.

Get AI Protection for Your Human Interactions

Protect your organization from socially-engineered email attacks that target human behavior.

Request a Demo

Data & Trends

Mission Interrupted: Nonprofits Face a Rising Wave of Email Attacks

Advanced email attacks on nonprofits surged 35% year-over-year. Learn why cybercriminals are targeting the sector and how to stay protected.

B PDF Annotations Mask Malicious QR Codes Blog

Attack Stories

Hiding in Plain Sight: How Attackers Use PDF Annotations to Mask Malicious QR Codes

Attackers are exploiting PDF annotations to disguise phishing QR codes, bypassing security and deceiving users. Learn how this sophisticated threat works.

Credential Phishing

The Most Common Types of Phishing Attacks and Their Impact

Discover the most common types of phishing attacks and their impacts. Learn how cybercriminals exploit deception to compromise security and steal sensitive information.

Product

Fueling Stronger Security: How Abnormal Filled Gaps Left by Proofpoint for a Leading Fuel and Convenience Retailer

Learn how a trusted fuel and convenience retailer blocked 2,300+ attacks missed by Proofpoint and reclaimed 300+ employee hours per month by adding Abnormal.

Business Email Compromise

BEC in the Age of AI: The Growing Threat

Business email compromise (BEC) has seen growth due to criminals adopting AI tools. See the trends and discover how to protect your business from cybercriminals.

Account Takeover

Account Compromise Arms Race: How Threat Actors Evade Phish-Resistant Security Tools

Discover how cybercriminals are adapting to phish-resistant authentication, using session hijacking, info-stealer malware, and consent phishing to bypass security controls.