Once More on a Common Problem of Microservice Teams

Andy Imperfect
Oct 18, 2023


Hello.

In this article, I want to revisit a problem that many teams run into when they move to a microservice architecture.

This is my first publication on Medium, so please don't judge it too harshly; consider it a trial run for a personal blog.

Put briefly, the problem is a loss of responsibility, specifically responsibility for the functional result.

Typically, when microservices are separated, development teams are separated too. Even when IT teams are split along large business domains (as is common in approaches such as Data Mesh), effective cross-domain automation becomes a significant challenge.

With microservices the situation is even more complicated, because the split is much finer-grained:

  • ideally: each microservice is responsible for a separate, small business subdomain
  • in an acceptable version of the theory: alongside the microservices that own business subdomains, there are microservices that own “technological” subdomains (authorization, master data management, data warehousing, BI, API gateways, etc.)
  • in practice: the isolation is often unstable, and the boundaries of each microservice's responsibilities keep shifting over time.

At the same time, microservice teams are often split on the assumption that they will be “self-managing”: whenever a cross-team task comes up, they will get together and work out the best solution. In practice, the result is often strongly demotivating:

  • baseline communication costs are very high (even a technically simple task may require meetings with at least 5–6 people)
  • the list of meeting participants often has to grow as the decision is worked out, because analyzing the trade-offs reveals more and more affected areas of responsibility, which pushes communication costs even higher
  • since “everyone communicates with everyone” on many tasks at once, company management can no longer see which team spends how much time on which business task. Instead of making the organizational structure easier to manage, the business actually loses control over its IT teams.
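To get a rough feel for how these costs grow, here is a minimal sketch. It rests on a simplifying assumption that is mine, not a measured result: communication cost is proportional to the number of distinct pairs of people who have to align with each other. The function name `pairwise_channels` is made up for this illustration.

```python
def pairwise_channels(participants: int) -> int:
    """Number of distinct pairs among the participants: n * (n - 1) / 2."""
    return participants * (participants - 1) // 2


# A meeting that starts small and grows as new areas of
# responsibility turn out to be affected.
for n in (3, 6, 9):
    print(f"{n} participants -> {pairwise_channels(n)} pairs to keep aligned")
```

Under this assumption, going from 3 to 6 participants raises the number of pairs from 3 to 15, and going to 9 raises it to 36, so the cost grows much faster than the headcount.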

Regarding the loss of transparency, it is worth clarifying what happens when the IT landscape is divided. A team is assigned to a specific microservice or set of microservices and begins to focus on it. Working on tasks outside those services then runs counter to the team's operational objectives: de facto, it means spending time not doing the assigned job.

In such cases, teams may stop searching for the best solution for the business (which may turn out to be complex) and instead propose narrow solutions that are understandable and convenient for them. The opposite also happens: a team takes part in designing the optimal solution, perhaps even leads the effort, but ends up with a very small role in implementing it. After several such episodes, the team becomes frustrated with working on other teams' problems and loses motivation to take part in such work in the future.

A different approach is to place a coordinating role, a kind of tech supervisor, above the microservice teams. This role works out cross-service decisions and hands them to the teams for execution. It can also take over most of the communication with business customers on behalf of all tech teams, which reduces communication costs to some extent. But this produces the opposite effect: once teams are no longer self-managing but “orchestrated” by tech supervisors, the split into microservices isolates them even further. Teams drift away from the business and from decision-making and become mere “code writers” (or doc writers, test automation engineers, and so on). Over time, feedback from developers to the business starts to break down, which is a drawback in its own right.

In all of these cases, the problem shows up as a decline in developers' responsibility for the company's overall result.

To conclude, here are some symptoms that point to this problem:

  • The “chain of calls” anti-pattern, where a single business task passes through a sequence of services owned by different teams.
  • Atomization of testing. It can reach the point where each team in a chain of calls writes its own very small piece of code, and that team's testers cover only this piece with isolated tests. Instead of first testing the whole chain responsible for the business task and then drilling down into individual nodes, we first get tests for each node and then try to assemble the overall picture from them.
  • Nobody can transparently see the execution status of the business task as a whole; instead there are many micro tickets scattered across task trackers.
  • When working on a business problem, the director estimates the cost of communication between teams to be so high that they prefer to hand the problem straight to one of the technical teams for a narrow solution within that team. As a result, there is no sufficiently broad analysis of trade-offs across alternative solutions.
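The testing point is easier to see on a toy example. The sketch below is purely hypothetical: the three "services" are plain functions, and their names, the prices, and the cents-versus-currency-units mismatch are all invented for illustration. Each team's isolated check passes, yet the chain as a whole produces a wrong business result that only a chain-level check would catch.

```python
def price_service(item_id: str) -> int:
    """Team A (hypothetical): returns the price in cents."""
    prices = {"book": 1250}
    return prices[item_id]


def discount_service(price: float, percent: int) -> float:
    """Team B (hypothetical): applies a percentage discount.
    Silently assumes the price is in whole currency units."""
    return price * (100 - percent) / 100


def invoice_service(amount: float) -> str:
    """Team C (hypothetical): formats an amount given in currency units."""
    return f"Total: {amount:.2f} EUR"


def place_order(item_id: str, percent: int) -> str:
    """The business task: the whole chain of calls."""
    return invoice_service(discount_service(price_service(item_id), percent))


# Isolated per-team checks: every node looks correct on its own.
assert price_service("book") == 1250
assert discount_service(100, 10) == 90.0
assert invoice_service(90.0) == "Total: 90.00 EUR"

# Chain-level check: the business expects 11.25 EUR for a 12.50 EUR book
# with a 10% discount, but the chain disagrees about units.
expected = "Total: 11.25 EUR"
actual = place_order("book", 10)
print("chain OK" if actual == expected else f"chain broken: {actual}")
```

Starting from the chain-level check and only then drilling into individual nodes would surface this kind of mismatch first, which is the order argued for above.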

Interestingly, the combined effect of everything described above is that the technical architecture begins to constrain the design of business solutions.
