By restructuring our React monorepo CI pipeline and optimizing the ESLint configuration, we reduced our CI time from 9 minutes to 2.7 minutes, a 70% decrease. This not only improves the developer experience but also streamlines our workflow, allowing for faster feedback and quicker iterations.
Continuous integration (CI) is essential to every large-scale application. It runs the automated test suite and linting on every commit, ensuring a change doesn't break anything (well, of course, things still break, but it helps 😬). Our React monorepo contains three primary projects: our shared components library, the web application, and the browser extension. Each project needs to be linted and tested on every commit, and the pipeline is a required status check for PRs. You can't merge anything before the CI turns green 🟢.
Like every project, the more we build, the longer our CI runs. It reached the point where the pipeline took 9 minutes to complete. On every change, we had to wait 9 minutes to know if the tests pass and the PR is ready for review or to be merged (depending on the state of the PR). 9 minutes means a long feedback loop, context switches (lots of them), frustration, and an overall degraded developer experience. Also, when deploying a hotfix, you must wait 9 minutes to validate your change. We decided to optimize the pipeline to make our suffering more tolerable.
Initial state
First, let's look at the CI pipeline. Hang tight, it's a complex one:
We have two steps that run in parallel. One runs the tests, and the other builds our extension. You can see the CircleCI config here. It's already evident where we should dive to optimize. Running the tests must be improved!
First iteration
The easiest win we could think of right off the bat is to run each project concurrently. As mentioned, we have three projects, so we can run them separately. It should surely be a quick win. So that's what we did (see config):
We also added a step that installs all dependencies once, to avoid installing them in each step. Unfortunately, with this change in place, we still didn't have the impact we were looking for. We managed to cut the run time by only 1 minute. Looking at the stats again, we realized the shared project looked interesting and required a deep dive to understand why it takes 7 minutes to run.
Diving deeper
For every project, we run lint and tests, so to understand what's going on under the hood, we split the shared project into two steps, one for the lint and the other for the tests (see config):
Ok, we have some progress! We managed to cut 3.5 minutes, but why the hell does linting take almost 5 minutes? It definitely didn't make any sense. So, we dove even deeper to understand why linting takes so much time. If you run eslint with TIMING=1, it measures the time each rule takes. Exactly what we needed, so we ran it, and the bottleneck was clear:
The import/no-cycle rule takes 76% of the run time. Something is clearly wrong. We searched on Google to see if it was a known issue, and voila, it is. Surprisingly, this rule's max depth is unlimited by default, which, as you can see, is very inefficient. So, setting the max depth to 3 should fix the issue and still cover most cases. And here's the new pipeline:
We're getting there! The total time is now 3.3 minutes. It's already a significant improvement, but we decided to see if there's room for more.
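For reference, capping the cycle-detection depth as described above could look like the following in an ESLint config file. This is a minimal sketch rather than our actual config: it assumes a CommonJS .eslintrc.js file, that eslint-plugin-import is installed, and an "error" severity. Only the rule name and the max depth of 3 come from the change described here.

```javascript
// .eslintrc.js (file name and format are assumptions)
const config = {
  // Assumes eslint-plugin-import is installed; it provides import/no-cycle.
  plugins: ["import"],
  rules: {
    // maxDepth is unlimited by default, so every import chain is followed to the end.
    // Capping it at 3 bounds the traversal; cycles longer than that go undetected.
    // The "error" severity is an assumption.
    "import/no-cycle": ["error", { maxDepth: 3 }],
  },
};

module.exports = config;
```

The trade-off is coverage for speed: a lower depth lints faster but can miss cycles that span more files.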
Fine-tuning and future-proofing
We wanted to see if we could squeeze the time even more, so we decided to try better machines for some of the steps. Not only that, CircleCI has a feature that allows the actual tests to run in parallel. Given the current size of our test suite, we don't need it, but it would be nice to have it there so we can quickly increase parallelism when the time comes. To apply it, we first need to list all our tests. Easily done using jest (our test runner):
Then, we need to use the CircleCI CLI tool to split the tests across different machines and execute them:
There are different strategies for splitting the tests; we decided to split by time, which tries to balance the time each worker runs. Finally, we define how many machines we want to use with the parallelism keyword. As mentioned, we set it to 1 for now, but we can increase it in the future.
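To give a feel for what splitting by time means, here is a small sketch that assigns test files to workers so that their total run times end up roughly equal. It only illustrates the idea and is not how CircleCI implements it; the function name, file names, and durations are all hypothetical.

```javascript
// Greedy "slowest first" balancing: a simplified illustration of splitting by
// time. This is NOT CircleCI's actual implementation.
function splitByTimings(timings, workerCount) {
  const workers = Array.from({ length: workerCount }, () => ({ files: [], total: 0 }));

  // Place the slowest files first so the small ones can even out the totals.
  const slowestFirst = Object.entries(timings).sort((a, b) => b[1] - a[1]);

  for (const [file, seconds] of slowestFirst) {
    // Give the file to whichever worker currently has the least work.
    const lightest = workers.reduce((min, w) => (w.total < min.total ? w : min));
    lightest.files.push(file);
    lightest.total += seconds;
  }
  return workers;
}

// Hypothetical test files and durations in seconds, for illustration only.
const timings = {
  "Button.test.js": 40,
  "Modal.test.js": 30,
  "Table.test.js": 20,
  "utils.test.js": 10,
};

console.log(splitByTimings(timings, 2).map((w) => w.total)); // [ 50, 50 ]
console.log(splitByTimings(timings, 1).map((w) => w.total)); // [ 100 ]
```

With a single worker, as in our current setup, everything lands on one machine; raising the worker count spreads the same files across more machines without changing anything else.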
This is the final flow (see config):
Conclusion
The optimization above took a single engineer one day and greatly improved our developer experience. It reduced CI time from almost 9 minutes to 2.7 minutes and laid the groundwork for the future. Such tasks are often deprioritized because they don't lead to a direct business impact and everyone has already gotten used to the slow pipeline. Developer experience is essential, and even with a tiny bit of time, we can work wonders.