Exploring the Archipelago Architecture

Author
 Ido Shamun

The Archipelago Architecture is daily.dev's innovative approach, merging monolithic and microservices architectures. It features multiple deployment units within a single service for independent scaling and efficient data access. This architecture offers scalability and reduced network load, with complexities in management and risk of over-segmentation. Key highlights include its deployment unit strategy, independence from a monorepo setup, and the use of Infrastructure as Code with Pulumi.

We're all familiar with the terms "monolith" and "microservices" in the context of software architecture, but what the hell is Archipelago? And more importantly, how do you pronounce it correctly? While I'll leave the pronunciation exercises to the linguists, I'm here to guide you through our unique approach to system design at daily.dev.

Monolith

In a monolithic architecture, the entire codebase is consolidated into one project with a single deployable unit. This model boasts efficient internal network communication, eliminating the need for remote data fetching. It's an excellent starting point for many projects due to its simplicity and can be effective in the early stages. However, sticking to a monolith can become burdensome as the project expands and the team grows. Code navigation turns complex, a faulty deployment can bring down the entire system, and the lack of separation of concerns becomes evident. Furthermore, inter-team dependencies can create synchronization headaches when multiple teams contribute to the same codebase. This is precisely how daily.dev started, as a monolith project.

Microservices

Microservices architecture sits at the opposite end of the spectrum, championing highly specialized services that perform singular tasks. Typically, these services are succinct (often just a few hundred lines of code) and encapsulate a minimal portion of the business logic. While fostering separation of concerns, this approach relies heavily on network interactions and can suffer from latency issues. This reliance can also drive up costs, a point that has been covered a lot lately (An example). Microservices pair well with serverless computing, riding the same wave of popularity. However, challenges abound. Organizations sometimes lose track of their microservices and end up building services that already exist. This leads to redundant work and adds to the operational overhead.

Archipelago

The Archipelago (a group of islands) approach strikes a harmonious balance between monolithic and microservices architectures. At daily.dev, each service is responsible for a substantial domain. Take, for instance, our feed service, content pipeline, search function, and LLM gateway, to name a few. These services are far from just a few hundred lines of code – they are robust. Yet, the distinctive feature of Archipelago architecture lies in the multiple deployment units contained within a single service. This multiplicity allows for the independent scaling of each unit according to distinct metrics. These units are part of the same codebase and are deployed synchronously — enabling direct data access without the need, in most cases, for synchronous inter-service communication. Like microservices, every service owns its data and is responsible for its infrastructure.
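To make the "one codebase, several deployment units" idea concrete, here is a minimal sketch. It assumes each unit is started by passing a command name to a shared entry point; the unit names, the `runUnit` function, and the in-memory `posts` store are hypothetical and only stand in for real domain code.

```typescript
// Hypothetical sketch: one codebase, several deployment units selected by a command name.
type UnitName = "api" | "background" | "cron";

// Shared domain code. Every unit calls the same data-access function directly,
// so no network hop between units is needed to read this data.
const posts = new Map<string, string>([
  ["p1", "Exploring the Archipelago Architecture"],
]);

function getPostTitle(id: string): string | undefined {
  return posts.get(id);
}

// Each unit is a separate entry point into the same codebase.
const units: Record<UnitName, () => string> = {
  api: () => `api: serving "${getPostTitle("p1")}"`,
  background: () => `background: indexing "${getPostTitle("p1")}"`,
  cron: () => `cron: scheduled cleanup of ${posts.size} post(s)`,
};

function isUnitName(value: string): value is UnitName {
  // Own-property check so inherited keys such as "toString" are rejected.
  return Object.prototype.hasOwnProperty.call(units, value);
}

// The container command decides which unit this process becomes.
function runUnit(command: string): string {
  if (!isUnitName(command)) {
    throw new Error(`Unknown deployment unit: ${command}`);
  }
  return units[command]();
}
```

In a setup like this, every unit ships in the same build and differs only in the command it is started with, which is what lets each one be scaled separately.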

Deployment unit

A deployment unit is an entity that is deployed (duh 😅) and operates autonomously in relation to other units. In Kubernetes terms, this translates to either a 'deployment' or a 'job'. It's typical for a project to feature an API unit for external communication and a background worker to handle event processing, and often there are several of each. Additionally, some systems incorporate cron jobs, which execute tasks on a schedule independent of other units. Each unit has its own auto-scaling rules, resource allocations (like CPU and memory), and health check parameters. This setup gives us exactly the level of flexibility our operations demand.
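As an illustration of what "each unit has its own rules" can look like in data, the sketch below models a unit as a plain object. The `DeploymentUnit` shape, its field names, and the example values are assumptions made for this example, not daily.dev's actual schema. Mapping scheduled units to a Kubernetes `CronJob` is likewise an assumption about how cron units would be expressed.

```typescript
// Hypothetical description of a deployment unit; field names are illustrative only.
type UnitKind = "api" | "background" | "cron";

interface DeploymentUnit {
  name: string;
  kind: UnitKind;
  resources: { cpu: string; memory: string };
  autoscaling?: {
    metric: "cpu" | "queueSize";
    target: number;
    minReplicas: number;
    maxReplicas: number;
  };
  healthCheck?: { path: string; periodSeconds: number };
  schedule?: string; // cron expression, only meaningful for cron units
}

// Long-running units map to a Deployment, scheduled ones to a CronJob.
function workloadKind(unit: DeploymentUnit): "Deployment" | "CronJob" {
  return unit.kind === "cron" ? "CronJob" : "Deployment";
}

// Returns a list of human-readable problems; an empty list means the unit looks sane.
function validateUnit(unit: DeploymentUnit): string[] {
  const problems: string[] = [];
  if (unit.kind === "cron" && !unit.schedule) {
    problems.push(`${unit.name}: cron units need a schedule`);
  }
  if (unit.kind !== "cron" && unit.schedule) {
    problems.push(`${unit.name}: only cron units take a schedule`);
  }
  const scaling = unit.autoscaling;
  if (scaling && scaling.minReplicas > scaling.maxReplicas) {
    problems.push(`${unit.name}: minReplicas exceeds maxReplicas`);
  }
  return problems;
}

// Three units of one service, each with different scaling and resource settings.
const exampleUnits: DeploymentUnit[] = [
  {
    name: "api",
    kind: "api",
    resources: { cpu: "500m", memory: "512Mi" },
    autoscaling: { metric: "cpu", target: 70, minReplicas: 3, maxReplicas: 20 },
    healthCheck: { path: "/health", periodSeconds: 10 },
  },
  {
    name: "worker",
    kind: "background",
    resources: { cpu: "250m", memory: "256Mi" },
    autoscaling: { metric: "queueSize", target: 100, minReplicas: 1, maxReplicas: 10 },
  },
  {
    name: "cleanup",
    kind: "cron",
    resources: { cpu: "100m", memory: "128Mi" },
    schedule: "0 3 * * *",
  },
];
```

Keeping the per-unit settings in one typed structure makes the differences between units easy to review side by side.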

Not a monorepo

You might be tempted to compare our approach to a monorepo because of the shared codebase concept. However, it's important to clarify that a monorepo refers to a version control strategy, not an architectural pattern. Indeed, one could manage an entire fleet of microservices within a single monorepo. At daily.dev, we take a different route, managing each service in its own repository. This choice is grounded in our preference for simplicity and manageability.

Infrastructure as Code

At daily.dev, our infrastructure is powered by Pulumi, which lets us define it using TypeScript. Yes, you read that correctly, TypeScript! The intricacies of this process are enough to fill another blog post, but for now, understand that Infrastructure as Code (IaC) is a pivotal element of our Archipelago architecture. We've developed a shared library that equips our developers with the essential infrastructure building blocks, simplifying the integration of new deployment units. To add a unit, developers append a new object to an array, specify the command, select the unit type (API, background, cron, etc.), set additional parameters, and voilà. This paradigm shift means that at daily.dev, developers own and manage the infrastructure of their respective projects.
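The "append an object to an array" workflow can be sketched without any Pulumi code at all. The example below is not daily.dev's shared library and does not use the Pulumi SDK. It is a plain TypeScript stand-in, with a hypothetical `UnitSpec` type and `buildManifests` helper, that turns a list of units into simplified Kubernetes-shaped objects. The service name, image, and commands are placeholders.

```typescript
// Hypothetical unit description; not the API of daily.dev's internal library.
type UnitSpec =
  | { name: string; type: "api" | "background"; command: string[]; replicas: number }
  | { name: string; type: "cron"; command: string[]; schedule: string };

interface Manifest {
  apiVersion: string;
  kind: "Deployment" | "CronJob";
  metadata: { name: string; labels: Record<string, string> };
  spec: Record<string, unknown>;
}

// Turns each unit into a simplified manifest. All units share one image,
// because they are built from the same codebase.
function buildManifests(service: string, image: string, units: UnitSpec[]): Manifest[] {
  return units.map((unit): Manifest => {
    const name = `${service}-${unit.name}`;
    const labels = { app: service, unit: unit.name };
    const podSpec = {
      containers: [{ name: unit.name, image, command: unit.command }],
    };
    if (unit.type === "cron") {
      return {
        apiVersion: "batch/v1",
        kind: "CronJob",
        metadata: { name, labels },
        spec: {
          schedule: unit.schedule,
          jobTemplate: {
            spec: { template: { spec: { ...podSpec, restartPolicy: "OnFailure" } } },
          },
        },
      };
    }
    return {
      apiVersion: "apps/v1",
      kind: "Deployment",
      metadata: { name, labels },
      spec: {
        replicas: unit.replicas,
        selector: { matchLabels: labels },
        template: { metadata: { labels }, spec: podSpec },
      },
    };
  });
}

const units: UnitSpec[] = [
  { name: "api", type: "api", command: ["node", "bin/api"], replicas: 3 },
  { name: "worker", type: "background", command: ["node", "bin/worker"], replicas: 2 },
];

// Adding a deployment unit is one more array entry.
units.push({ name: "cleanup", type: "cron", command: ["node", "bin/cleanup"], schedule: "0 3 * * *" });

const manifests = buildManifests("example-service", "example/image:latest", units);
```

In a real Pulumi program the builder would create resources rather than return plain objects, but the developer-facing step stays the same: describe the unit and let the shared building blocks do the rest.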

Benefits

The Archipelago architecture, in our view, artfully blends the best of both monolithic and microservices architectures. It offers the ability to scale various workloads independently, using distinct metrics, something that's not possible in a traditional monolithic setup. This architecture also maintains a separation of concerns, avoiding the monolithic pitfall of tightly interwoven components. Furthermore, it reduces network load, as there’s less reliance on remote data fetching compared to typical microservices. It enhances the developer experience by eliminating the need to set up a new project for every new service, thus streamlining the development process. Additionally, it aids in maintaining a clear overview of all services, a common challenge in sprawling microservice environments.

Drawbacks

While the Archipelago architecture offers numerous benefits, it's not without its challenges. One potential drawback is the complexity of managing multiple deployment units within the same codebase. It requires a robust continuous integration and deployment pipeline to ensure smooth operations. Additionally, while each unit can scale independently, this also means that monitoring and resource management become more complex, as developers must oversee multiple units with potentially different scaling behaviors and resource needs.

Over-segmentation is another risk. If not carefully designed, the architecture might lead to unnecessary fragmentation, creating an overhead of coordination and increasing the cognitive load on developers who need to understand multiple deployment units and their interactions.

Despite these considerations, the flexibility and scalability benefits of the Archipelago architecture often outweigh the drawbacks, especially for dynamic and growing products like daily.dev. However, it's essential to be aware of these potential issues and address them proactively in the system design phase.

Archipelago in action

A prime example within our Archipelago architecture is our application API service, a crucial component that manages user profiles, posts, engagement, and other key application data. Initially, this service was structured into three primary deployment units:

  • Public API: This unit serves as the interface for our applications to access and manipulate data. Its primary requirement is to remain operational and responsive at all times, which is critical for our user-facing operations.
  • Background Worker: Operating independently, this unit reacts to change data capture (CDC) messages and events. Its auto-scaling is tied to the queue size, enabling efficient handling of usage spikes and unusual activity without incurring unnecessary costs.
  • Real-Time API: This unit provides real-time updates to clients. Due to its nature of managing WebSocket connections, it demands more resources, particularly in terms of memory and network capacity.
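Scaling a worker on queue size, as described for the Background Worker, boils down to a small calculation. The sketch below is a simplified version under assumed numbers. The `QueueScalingRule` type, the `desiredReplicas` function, and the thresholds are hypothetical, and a production autoscaler would also apply tolerances and cooldown periods.

```typescript
// Hypothetical scaling rule for a queue-driven worker unit.
interface QueueScalingRule {
  targetMessagesPerReplica: number; // backlog one replica is expected to absorb
  minReplicas: number;
  maxReplicas: number;
}

// Replica count needed for the current backlog, clamped to the allowed range.
function desiredReplicas(queueSize: number, rule: QueueScalingRule): number {
  const needed = Math.ceil(queueSize / rule.targetMessagesPerReplica);
  return Math.min(rule.maxReplicas, Math.max(rule.minReplicas, needed));
}

const workerRule: QueueScalingRule = {
  targetMessagesPerReplica: 100,
  minReplicas: 1,
  maxReplicas: 20,
};

const idle = desiredReplicas(0, workerRule);      // 1: never below the minimum
const busy = desiredReplicas(950, workerRule);    // 10: ceil(950 / 100)
const spike = desiredReplicas(10000, workerRule); // 20: capped at the maximum
```

Because the rule belongs to the worker unit alone, a backlog spike adds worker replicas without touching the replica count of the Public API or the Real-Time API.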

The rationale for this division is clear: to ensure that the Public API remains unaffected by the resource-intensive tasks of the Background Worker and the Real-Time API. This separation is vital to maintain uninterrupted service, especially as WebSocket connections and background processing can strain memory and network resources significantly.

The addition of our weekly digest feature highlighted the architecture's adaptability. This feature generates over 350,000 personalized emails each week—a self-imposed DDoS, if you will. We introduced an additional deployment unit specifically for the digest to safeguard our production workload and existing background tasks. This new unit auto-scales based on the size of the digest queue, ensuring that the heightened demand does not disrupt other services.

Incorporating this new workload was a seamless process, thanks to our Infrastructure as Code (IaC) approach and the multi-deployment building blocks we have in place. It was as easy as creating a new file in the project.

Conclusion

The Archipelago architecture represents our exploration into balancing scalability, manageability, and productivity. It's an ongoing journey, one that comes with its own set of challenges and learning opportunities. We share this concept in the spirit of collaboration, hoping it might inspire others or invite constructive feedback to refine this approach further.
