Managing Version Dependencies Between Microservices

If deploying a new version of a service requires deploying new versions of other services at the same time, you’re probably doing microservices wrong.

Let’s say that to ship a feature, we need to deploy service A from v1.0 to v1.1, which requires updating service B from v1.5 to v1.6, which in turn requires updating service C from v2.1 to v2.2. In scenarios like these, you’re forced to perform lock-step deployments.

A lock-step deployment is when multiple service versions have to ship at once, under careful coordination, usually because of breaking changes in their interactions or because a feature’s functionality is scattered across many services. You might’ve heard of teams assembling in a “war room” once a week to collectively ship their changes to production; that’s lock-step deployments gone really bad.

Lock-step deployments completely undermine independent deployability, the main thing microservices are supposed to buy you: every release requires cross-team coordination, and rollbacks become risky, all-or-nothing affairs.

Here are some common causes of lock-step deployments.

Backward compatibility: Services you depend on

Here’s a scenario: at a food delivery company, your team manages the Delivery Routing service and plans to deploy version 1.1, which includes an optimization of delivery assignments based on driver availability and location.

However, this improvement relies on an upcoming feature promised by the Delivery Driver service in v1.8, which adds availability and location fields to the driver API response.

Should you wait for the Delivery Driver team to deploy their new version before you deploy yours? Perhaps, but it’s definitely not ideal: your release is now tied to another team’s timeline, their version could slip or get rolled back after shipping, and your finished change sits on the shelf in the meantime.

The solution here is to deploy a backward-compatible version of your service that works with both the current and the upcoming version of the dependency. Going back to the example: implement your Delivery Routing service so that the optimized assignment algorithm only kicks in when the Delivery Driver response includes the new fields, implying that its new version made it to prod. If the fields aren’t there, it simply falls back to the unoptimized behavior.
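Here’s a rough sketch of what that fallback could look like. All of the names below (DriverResponse, assignOptimized, assignBasic, the availability and location fields) are hypothetical stand-ins for illustration, not real APIs from the example services:

```typescript
// Hypothetical shape of a Delivery Driver API response.
interface DriverResponse {
  id: string;
  // Fields that only exist once the dependency's v1.8 is live:
  availability?: "available" | "busy" | "offline";
  location?: { lat: number; lng: number };
}

function assignOptimized(drivers: DriverResponse[]): string {
  // New behaviour: prefer an available driver (details omitted).
  const available = drivers.find((d) => d.availability === "available");
  return (available ?? drivers[0]).id;
}

function assignBasic(drivers: DriverResponse[]): string {
  // Old behaviour: first driver wins.
  return drivers[0].id;
}

function assignDriver(drivers: DriverResponse[]): string {
  // Use the optimization only when the new fields are actually present,
  // i.e. the dependency's new version has reached production.
  const hasNewFields = drivers.every(
    (d) => d.availability !== undefined && d.location !== undefined
  );
  return hasNewFields ? assignOptimized(drivers) : assignBasic(drivers);
}
```

Once the dependency’s new version is rolled out everywhere and considered stable, you can delete the fallback path.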

Feature flags are another effective way to control this kind of release. Wrap the code that depends on the new version of the dependency service in a feature flag, and activate it once you’ve confirmed the required version has rolled out successfully. Keep in mind that if the dependency service rolls back to an incompatible version, you’ll need to deactivate the flag accordingly.
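Reusing the hypothetical helpers from the sketch above, the flag-gated version might look something like this; the flags store and isEnabled helper are stand-ins for whatever feature-flagging system you actually use:

```typescript
// Simple in-memory flag store standing in for a real flagging system.
const flags: Record<string, boolean> = {
  // Flip to true once Delivery Driver v1.8 is confirmed in production;
  // flip back if that version ever gets rolled back.
  "optimized-routing": false,
};

function isEnabled(flag: string): boolean {
  return flags[flag] ?? false;
}

function assignDriverFlagged(drivers: DriverResponse[]): string {
  // The new code path only runs when the flag is on.
  return isEnabled("optimized-routing")
    ? assignOptimized(drivers)
    : assignBasic(drivers);
}
```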

Backward compatibility: Services that depend on you

When deploying a new version of your service, always keep in mind that consumer services already depend on your exposed API contract, so you shouldn’t make breaking changes to the API.

Keep your changes additive. Think twice before removing or even modifying an existing field in an API response, because you’ll likely be breaking something downstream.
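To make the distinction concrete, here’s a small, hypothetical illustration of an additive change versus a breaking one to a response shape (all field names are made up):

```typescript
// The contract consumers already rely on:
interface OrderResponseV2 {
  id: string;
  status: string;
  deliveryFee: number;
}

// Additive and backward compatible: a new optional field, so existing
// consumers keep parsing responses exactly as before.
interface OrderResponseV2Additive extends OrderResponseV2 {
  estimatedDeliveryMinutes?: number;
}

// Breaking: renaming or removing a field forces every consumer to change
// (and deploy) in lock-step with you.
interface OrderResponseV3 {
  id: string;
  status: string;
  fee: number; // was deliveryFee
}
```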

Deprecate features properly if you really want to phase out a resource that you’ve already exposed. Allow teams to gradually move over to the new preferred way, and have patience. A REST API deprecation example: exposing your new API with breaking changes under a new version URL, e.g. /v3/.
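As a rough sketch of that approach, assuming an Express-based service and made-up response shapes, the old and new contracts can be served side by side while consumers migrate:

```typescript
import express from "express";

const app = express();

// The existing contract stays in place, so current consumers keep working.
app.get("/v2/orders/:id", (req, res) => {
  res.setHeader("Deprecation", "true"); // hint that /v2 is being phased out
  res.json({ id: req.params.id, deliveryFee: 499 });
});

// Breaking changes live under /v3/; consumers move over at their own pace.
app.get("/v3/orders/:id", (req, res) => {
  res.json({ id: req.params.id, fee: { amount: 499, currency: "USD" } });
});

app.listen(3000);
```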

Always think of your service from the perspective of the consumer, even if your team is responsible for both the producer and the consumer. This way of thinking really gets you into the proper microservices mindset, allowing you to reap its benefits and avoid distributed monoliths. This has worked really well for me.

Functionality scattered throughout services

Arguably the most common reason: you have to deploy in lock-step because your feature’s functionality is scattered across many services. This indicates that your service responsibilities are not properly delineated.

I’ve found that a common symptom of this is a lack of clear service ownership. Teams tend to be separated naturally by business domain, so services that reflect those boundaries tend to work in harmony. You’ve heard the saying: microservices don’t solve technical problems, they solve organizational problems. They work best when they have proper API and data boundaries, and that’s easier to achieve when each service is owned by a single team. Mere ownership of a service introduces a healthy amount of friction: teams have to coordinate with each other to achieve their goals, and that slows them down. In time, these pain points reveal themselves, and with them, a proper separation of services that you can plan and work towards.

I wrote this post after recently reading a Reddit thread that brought up this topic and having run into this problem myself. Hit me up with any other topics you think I should write about!

