Planning a Migration to a Microservice Architecture - Why Might You Choose Microservices?

Антон Меринов

Author of this article. Interests and skills: professional Oracle Database administration, web development, IT-World.

 
 
 

Why Might You Choose Microservices?

I can’t define the goals you may have for your company. You know best your company’s aspirations and the challenges you face. What I can outline are the reasons most often given for why companies all over the world are adopting microservices. In the spirit of honesty, I’ll also outline other ways you could potentially achieve these same outcomes using different approaches.

Improve Team Autonomy

Whatever industry you operate in, it is all about your people, and catching them doing things right, and providing them with the confidence, the motivation, the freedom and desire to achieve their true potential.

John Timpson

Many organizations have shown the benefits of creating autonomous teams. Keeping organizational groups small, allowing them to build close bonds and work effectively together without too much bureaucracy, has helped many organizations grow and scale more effectively than their peers. Gore has found great success by making sure none of its business units ever grows beyond 150 people, so that everyone knows everyone else. For these smaller business units to work, they have to be given power and responsibility to operate as autonomous units.

Timpsons, a highly successful UK retailer, has achieved massive scale by empowering its workforce, reducing the need for central functions, and allowing local stores to make decisions for themselves - such as how much to refund an unhappy customer. John Timpson, now chairman of the company, was famous for scrapping internal rules and replacing them with just two:

  • Look the part and put the money in the till.
  • You can do anything else to best serve customers.

Autonomy works at a smaller scale too, and most modern organizations I work with are looking to create more autonomous teams, often trying to copy models from other companies, such as Amazon’s two-pizza teams or the Spotify model.

If done right, team autonomy can empower people, help them step up and grow, and get the job done faster. When teams own microservices, and have full control over those services, they increase the amount of autonomy they can have within a larger organization.

How else could you do this?

Autonomy - the distribution of responsibility - can play out in many ways, and working out how to push more responsibility into the teams doesn’t require a shift in architecture. Essentially, it’s a process of identifying which responsibilities can be pushed into the teams. Giving different teams ownership of parts of the codebase is one answer (a modular monolith could still benefit you here); another is to identify people empowered to make decisions for parts of the codebase on functional grounds (e.g., Ryan knows display ads best, so he’s responsible for that; Jane knows the most about tuning our query performance, so run anything in that area past her first).

Improving autonomy can also be as simple as not having to wait for other people to do things for you, so adopting self-service approaches to provisioning machines or environments can be a huge enabler, avoiding the need for central operations teams to field tickets for day-to-day activities.

Reduce Time to Market

By being able to make and deploy changes to individual microservices, and deploy these changes without having to wait for coordinated releases, we have the potential to release functionality to our customers more quickly. Being able to bring more people to bear on a problem is also a factor - we’ll cover that shortly.

How else could you do this?

Well, where do we start? There are so many variables that come into play when considering how to ship software more quickly. I always suggest you carry out some sort of path-to-production modeling exercise, as it may show that the biggest blocker isn’t what you think.

I remember on one project many years ago at a large investment bank, we had been brought in to help speed up delivery of software. “The developers take too long to get things into production!” we were told. One of my colleagues, Kief Morris, took the time to map out all the stages involved in delivering software, looking at the process from the moment a product owner comes up with an idea to the point where that idea actually gets into production.

He quickly identified that it took around six weeks on average from when a developer started on a piece of work to it being deployed into a production environment. We felt we could shave a couple of weeks off this process through suitable automation, as manual processes were involved. But Kief found a much bigger problem in the path to production: it often took over 40 weeks for an idea to get from the product owner to the point where developers could even start on the work. By focusing our efforts on improving that part of the process, we’d do far more to improve the client’s time to market for new functionality.

So think through all the steps involved in shipping software. Look at the duration of each step - both elapsed time and busy time - and highlight the pain points along the way. After all of that, you may well find that microservices are part of the solution, but you’ll probably find many other things you could try in parallel.
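As a minimal sketch of such a modeling exercise - the step names and durations below are invented, loosely echoing the investment-bank story above - comparing elapsed time with busy time quickly shows where the waiting lives:

```python
# A toy path-to-production model. Step names and durations are hypothetical;
# "elapsed" is calendar days, "busy" is hands-on days for that step.
steps = [
    ("Idea approved and prioritized", 280, 5),
    ("Work scheduled for a developer", 10, 1),
    ("Development and code review", 15, 10),
    ("Manual regression testing", 10, 4),
    ("Release and deployment", 7, 1),
]

total_elapsed = sum(elapsed for _, elapsed, _ in steps)
for name, elapsed, busy in steps:
    share = 100 * elapsed / total_elapsed
    print(f"{name:<32} elapsed {elapsed:>3}d  waiting {elapsed - busy:>3}d  ({share:4.1f}% of lead time)")

# The biggest contributor to lead time is usually waiting, not working.
slowest = max(steps, key=lambda step: step[1] - step[2])
print(f"Largest source of waiting: {slowest[0]}")
```

In this made-up example, automating the delivery steps attacks only a small slice of the lead time; the queue in front of development dominates, which is exactly the kind of insight the exercise is meant to surface.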

Scale Cost-Effectively for Load

By breaking our processing into individual microservices, we can scale each of those processes independently. This means we can hopefully scale cost-effectively: we need to scale up only those parts of our processing that are currently constraining our ability to handle load. It also follows that we can scale down those microservices that are under less load, perhaps even turning them off when not required. This is in part why so many companies that build SaaS products adopt a microservice architecture - it gives them more control over operational costs.
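As a back-of-the-envelope sketch of that independent scaling arithmetic - the service names, loads, and capacities below are entirely made up - each service gets only the instances its own load requires:

```python
import math

# Hypothetical per-service load (requests/sec), per-instance capacity,
# and the number of instances currently running.
services = {
    "catalog":   {"load": 900, "per_instance": 250, "replicas": 2},
    "checkout":  {"load": 120, "per_instance": 200, "replicas": 4},
    "invoicing": {"load": 0,   "per_instance": 100, "replicas": 2},
}

for name, svc in services.items():
    # Scale each service to its own load; the others are left untouched.
    desired = math.ceil(svc["load"] / svc["per_instance"]) if svc["load"] else 0
    if desired > svc["replicas"]:
        action = "scale up"
    elif desired < svc["replicas"]:
        action = "scale down (or off)"
    else:
        action = "keep"
    print(f"{name}: {svc['replicas']} -> {desired} instances ({action})")
```

With a monolith, the equivalent decision is all-or-nothing: the whole process scales together, even if only one part of it is under load.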

How else could you do this?

Here we have a huge number of alternatives to consider, most of which are easier to experiment with before committing to a microservices-oriented approach. We could just get a bigger box for a start - if you’re on a public cloud or other type of virtualized platform, you could simply provision bigger machines to run your process on. This “vertical” scaling obviously has its limitations, but for a quick short-term improvement, it shouldn’t be dismissed outright.

Traditional horizontal scaling of the existing monolith - basically running multiple copies - could prove to be highly effective. Running multiple copies of your monolith behind a load-distribution mechanism like a load balancer or a queue could allow for simple handling of more load, although this may not help if the bottleneck is in the database, which, depending on the technology, may not support such a scaling mechanism. Horizontal scaling is an easy thing to try, and you really should give it a go before considering microservices, as it will likely be quick to assess its suitability, and has far fewer downsides than a full-blown microservice architecture.
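As a sketch of that load-distribution idea - the instance addresses are invented, and in production something like nginx, HAProxy, or a cloud load balancer would play this role rather than application code - round-robin over identical copies looks like this:

```python
from itertools import cycle

# Invented addresses for identical copies of the monolith behind one entry point.
instances = cycle([
    "http://monolith-1.internal:8080",
    "http://monolith-2.internal:8080",
    "http://monolith-3.internal:8080",
])

# Round-robin: each incoming request goes to the next copy in turn.
for request_id in range(6):
    print(f"request {request_id} -> {next(instances)}")
```

Note that all three copies still share one database, which is why the bottleneck caveat above matters.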

You could also replace technology being used with alternatives that can handle load better. This is typically not a trivial undertaking, however - consider the work to port an existing program over to a new type of database or programming language. A shift to microservices may actually make changing technology easier, as you can change the technology being used only inside the microservices that require it, leaving the rest untouched and thereby reducing the impact of the change.

Improve Robustness

The move from single-tenant software to multitenant SaaS applications means the impact of system outages can be significantly more widespread. Our customers’ expectations that their software will be available, as well as the importance of software in their lives, have increased. By breaking our application into individual, independently deployable processes, we open up a host of mechanisms to improve the robustness of our applications.

By using microservices, we are able to implement a more robust architecture because functionality is decomposed - that is, an impact on one area of functionality need not bring down the whole system. We also get to focus our time and energy on those parts of the application that most require robustness - ensuring critical parts of our system remain operational.

Resilience Versus Robustness

When we want to improve a system’s ability to avoid outages, handle failures gracefully when they occur, and recover quickly from problems, we talk about resilience. Much work has been done in the field now known as resilience engineering, which looks at the topic as it applies to all fields, not just computing. The model of resilience pioneered by David Woods takes a broader view of the concept and points out that being resilient isn’t as simple as we might first think - separating out, among other things, our ability to deal with known and unknown sources of failure.

John Allspaw, a colleague of David Woods, helps distinguish between the concepts of robustness and resilience. Robustness is the ability of a system to react to expected variations. Resilience is having an organization capable of adapting to things that haven’t been thought of, which could well include creating a culture of experimentation through things like chaos engineering. For example, we are aware that a specific machine could die, so we might bring redundancy into our system by running multiple instances behind a load balancer. That is an example of addressing robustness. Resilience is the process of an organization preparing itself for the fact that it cannot anticipate all potential problems.

An important consideration here is that microservices do not necessarily give you robustness for free. Rather, they open up opportunities to design a system in such a way that it can better tolerate network partitions, service outages, and the like. Just spreading your functionality across multiple separate processes and separate machines does not guarantee improved robustness; quite the contrary - it may just increase your surface area of failure.
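To make that concrete, here is a minimal sketch - the instance addresses and health flags are invented - of the kind of robustness mechanism that distribution merely enables: traffic goes only to instances that pass a health check, so the expected death of one machine doesn’t take the system down.

```python
import random

# Hypothetical instances with a health flag; in reality a load balancer
# learns this state from periodic health-check probes.
instances = {
    "http://svc-1.internal:8080": True,
    "http://svc-2.internal:8080": False,  # this machine has died
    "http://svc-3.internal:8080": True,
}

def route_request() -> str:
    """Send traffic only to healthy instances; fail loudly if none remain."""
    healthy = [url for url, ok in instances.items() if ok]
    if not healthy:
        raise RuntimeError("no healthy instances - redundancy exhausted")
    return random.choice(healthy)

print(route_request())  # never picks the dead instance
```

None of this comes for free with microservices; someone still has to build the health checks, the redundancy, and the failover behavior.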

How else could you do this?

Running multiple copies of our monolith, perhaps behind a load balancer or another load-distribution mechanism such as a queue, adds redundancy to our system. We can further improve robustness by distributing instances of our monolith across multiple failure planes (e.g., don’t put all the machines in the same rack or the same data center).

Investment in more reliable hardware and software could likewise yield benefits, as could a thorough investigation of existing causes of system outages. I’ve seen more than a few production issues caused by an overreliance on manual processes, for example, or people “not following protocol,” which often means an innocent mistake by an individual can have significant impacts. British Airways suffered a massive outage in 2017, causing all of its flights into and out of London Heathrow and Gatwick to be canceled. This problem was apparently inadvertently triggered by a power surge resulting from the actions of one individual. If the robustness of your application relies on human beings never making a mistake, you’re in for a rocky ride.

Scale the Number of Developers

We’ve probably all seen the problem of throwing developers at a project to try to make it go faster - it so often backfires. But some problems do need more people to get them done. As Frederick Brooks outlines in his seminal book The Mythical Man-Month, adding more people improves how quickly you can deliver only if the work itself can be partitioned into separate pieces with limited interactions between them. He gives the example of harvesting crops in a field - it’s a simple task to have multiple people working in parallel, as the work being done by each harvester doesn’t require interaction with the others. Software rarely works like this, as the work being done isn’t all the same, and often the output from one piece of work is needed as the input for another.

With clearly identified boundaries, and an architecture focused on ensuring that our microservices limit their coupling with one another, we end up with pieces of code that can be worked on independently. We can therefore hope to scale the number of developers by reducing delivery contention.

Successfully scaling the number of developers you bring to bear on a problem requires a good degree of autonomy between the teams themselves. Just having microservices isn’t going to be good enough. You’ll have to think about how the teams align to service ownership, and what coordination between teams is required. You’ll also need to break up work in such a way that changes don’t need to be coordinated across too many services.

How else could you do this?

Microservices work well for larger teams as the microservices themselves become decoupled pieces of functionality that can be worked on independently. An alternative approach could be to consider implementing a modular monolith: different teams own each module, and as long as the interface with other modules remains stable, they could continue to make changes in isolation.

This approach is somewhat limited, though. We still have some form of contention between the different teams, as the software is still all packaged together, so the act of deployment still requires coordination between the appropriate parties.
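Here is a minimal sketch of that modular monolith shape - the module and domain names are invented. Each team owns a module and can change its internals freely, as long as the small interface other modules depend on stays stable:

```python
# One deployable unit containing several team-owned modules.
# Each module hides its internals behind a small, stable interface.

class WarehouseModule:
    """Owned by the Warehouse team; internals can change without coordination."""

    def stock_level(self, sku: str) -> int:
        # Stable interface method; everything below is a private detail.
        return self._query_inventory(sku)

    def _query_inventory(self, sku: str) -> int:
        return 42  # stand-in for the team's own data-access code

class CustomerModule:
    """Owned by the Customer team; depends only on the stable interface."""

    def __init__(self, warehouse: WarehouseModule):
        self.warehouse = warehouse

    def can_promise_delivery(self, sku: str) -> bool:
        return self.warehouse.stock_level(sku) > 0

print(CustomerModule(WarehouseModule()).can_promise_delivery("CD-123"))
```

The teams work independently day to day, but as noted above, everything still ships as one artifact, so releases remain a coordination point.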

Embrace New Technology

Monoliths typically limit our technology choices. We normally have one programming language on the backend, making use of one programming idiom. We’re fixed to one deployment platform, one operating system, one type of database. With a microservice architecture, we get the option to vary these choices for each service.

By isolating the technology change in one service boundary, we can understand the benefits of the new technology in isolation, and limit the impact if the technology turns out to have issues.

In my experience, while mature microservice organizations often limit how many technology stacks they support, they are rarely homogeneous in the technologies in use. The flexibility in being able to try new technology in a safe way can give them competitive advantage, both in terms of delivering better results for customers and in helping keep their developers happy as they get to master new skills.

How else could you do this?

If we continue to ship our software as a single process, we do have limits on which technologies we can bring in. We could safely adopt new languages on the same runtime, of course - the JVM, for example, can happily host code written in multiple languages within the same running process. New types of databases are more problematic, though, as adopting one incrementally implies decomposing a previously monolithic data model - unless you opt for a complete, immediate switchover to the new database technology, which is complicated and risky.

If the current technology stack is considered a “burning platform,” you may have no choice other than to replace it with a newer, better-supported technology stack. Of course, there is nothing to stop you from incrementally replacing your existing monolith with a new one - incremental migration patterns can work well for that.

Reuse?

Reuse is one of the most oft-stated goals for microservice migration, and in my opinion is a poor goal in the first place. Fundamentally, reuse is not a direct outcome people want. Reuse is something people hope will lead to other benefits. We hope that through reuse, we may be able to ship features more quickly, or perhaps reduce costs, but if those things are your goals, track those things instead, or you may end up optimizing the wrong thing.

To explain what I mean, let’s take a deeper look at one of the usual reasons reuse is chosen as an objective: we want to ship features more quickly. We think that by optimizing our development process around reusing existing code, we won’t have to write as much code - and with less work to do, we can get our software out the door more quickly, right? But let’s take a simple example. The Customer Services team in Music Corp needs to generate a PDF in order to provide customer invoices. Another part of the system already handles PDF generation: we produce PDFs in the warehouse for printing packing slips for orders shipped to customers and for sending order requests to suppliers.

Following the goal of reuse, our team may be directed to use the existing PDF generation capability. But that functionality is currently managed by a different team, in a different part of the organization. So now we have to coordinate with them to make the required changes to support our features. This may mean we have to ask them to do the work for us, or perhaps we have to make the changes ourselves and submit a pull request (assuming our company works like that). Either way, we have to coordinate with another part of the organization to make the change.

We could spend the time to coordinate with other people and get the changes made, all so we could roll out our change. But we work out that we could actually just write our own implementation much faster and ship the feature to the customer more quickly than if we spend the time to adapt the existing code. If your actual goal is faster time to market, this may be the right choice. But if you optimize for reuse hoping you get faster time to market, you may end up doing things that slow you down.
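To make the trade-off concrete, here is a sketch of how small a “write it ourselves” invoice generator might be. It assumes the third-party reportlab package (pip install reportlab), and the invoice fields and layout are invented; the point is that coordinating with another team can easily cost more than writing something of this size.

```python
from reportlab.lib.pagesizes import A4
from reportlab.pdfgen import canvas

def write_invoice(path: str, customer: str, items: list[tuple[str, float]]) -> None:
    """Render a deliberately minimal invoice PDF for one customer."""
    pdf = canvas.Canvas(path, pagesize=A4)
    y = 800
    pdf.drawString(72, y, f"Music Corp invoice for {customer}")
    for description, amount in items:
        y -= 20
        pdf.drawString(72, y, f"{description}: {amount:.2f}")
    pdf.drawString(72, y - 30, f"Total: {sum(a for _, a in items):.2f}")
    pdf.save()

write_invoice("invoice.pdf", "Jo Bloggs", [("Deluxe box set", 49.99), ("Shipping", 4.99)])
```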

Measuring reuse in complex systems is difficult, and as I’ve outlined, it is typically something we’re doing to achieve something else. Spend your time focusing on the actual objective instead, and recognize that reuse may not always be the right answer.
