BPMN and Microservices Orchestration, Part 2 of 2: Graphical Models, Simplified Sagas, and Cross-functional Collaboration

Written by Mike Winters in the Inside Zeebe category.

This is part 2 in a 2-part series about BPMN and how it’s being applied to new use cases. You can find part 1 here. A sincere thanks to Bernd Rücker for his feedback during the writing of both blog posts.

Welcome back to our discussion of BPMN (Business Process Model and Notation) and its role in emerging use cases such as microservices orchestration. You don’t have to read the posts in order to be able to follow along, but if you’re new to BPMN, you might find it helpful to start with part 1.

To recap, the first post covered:

  • An introduction to BPMN
  • Why a well-established standard that thrived in the past can thrive in the future, too
  • Common orchestration patterns supported by BPMN
  • The current state and future plans of BPMN in Zeebe

In this part 2, we’ll:

  • Look at examples where using a graphical model instead of a code-based model simplifies workflow definition
  • Dive into tooling for building graphical models in BPMN (and other ways to define workflows)
  • Reassure you that BPMN’s graphical models are nothing to be afraid of–even if you’ve had a bad experience with graphical models in the past

The Saga Pattern Simplified

The simplicity of BPMN’s graphical models sometimes obscures how difficult it would be to implement a flow using a proprietary DSL or other code-based solution instead of BPMN. So let’s start with an example where we compare a BPMN workflow with a workflow defined in another flow language.

We’ll look at how to implement the saga pattern, which our co-founder Bernd has written about previously. For those new to the topic, the saga pattern is an approach for solving distributed transactions without using two-phase commit. First discussed in a research paper published in 1987, sagas are becoming increasingly popular as a way to handle consistency as organizations adopt distributed and microservices-based architectures.

To revisit a classic saga example from Caitie McCaffrey, we’ll refer to a trip booking where the hotel, car, and flight that make up the trip are each handled by a different microservice. All three individual bookings need to succeed for the overall trip booking to be valid.

What happens in a case where we book our hotel and car successfully, but the flight booking can’t be completed because, for example, there are no more flights with seats during our selected dates? We can’t move forward and complete the trip, but we also need to deal with the car and hotel that we already booked.

The saga pattern trip booking example

BPMN provides a straightforward solution in the form of compensation, which was designed for rolling back tasks that have already been completed in scenarios just like this one. A simple “trip” workflow implementing the saga pattern with the compensation event can be modeled in BPMN like so:

Compensation event and saga pattern in BPMN

If you prefer not to use a graphical model to define the saga, you can go the API route instead. With Camunda’s Model API, you can define the same model with the following lines of code (a GitHub repository with the complete example is here):

// - flow of activities and compensating actions

flow.startEvent()
    .serviceTask("car").name("Reserve car").camundaClass(ReserveCarAdapter.class)
      .boundaryEvent().compensateEventDefinition().compensateEventDefinitionDone()
      .compensationStart().serviceTask("CancelCar").camundaClass(CancelCarAdapter.class)
      .compensationDone()
    .serviceTask("hotel").name("Book hotel").camundaClass(BookHotelAdapter.class)
      .boundaryEvent().compensateEventDefinition().compensateEventDefinitionDone()
      .compensationStart().serviceTask("CancelHotel").camundaClass(CancelHotelAdapter.class)
      .compensationDone()
    .serviceTask("flight").name("Book flight").camundaClass(BookFlightAdapter.class)
      .boundaryEvent().compensateEventDefinition().compensateEventDefinitionDone()
      .compensationStart().serviceTask("CancelFlight").camundaClass(CancelFlightAdapter.class)
      .compensationDone()
    .endEvent();

And defining the flow is just one part of the equation. In the saga pattern, it’s also necessary to keep track of tasks that have already been completed and to be able to refer to this state. Many workflow engines that execute BPMN models (including both Camunda and Zeebe) are built for such stateful operations, meaning that it’s easy to know what to roll back when executing a saga.
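
To make that bookkeeping concrete, here is a minimal sketch in plain Java of what a saga has to do under the hood: remember which steps have completed and, when a later step fails, run their compensations in reverse order. It’s an illustration only, not how Camunda or Zeebe are implemented. The `TripSagaSketch` class, the `Step` interface, and the `step` helper are hypothetical names made up for this example, and the "bookings" simply append to a log instead of calling real services.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class TripSagaSketch {

    /** One saga step: a forward action plus the action that undoes it. */
    interface Step {
        String name();
        void execute() throws Exception;
        void compensate();
    }

    /** Hypothetical stand-in for a call to a booking microservice. */
    static Step step(String name, boolean succeeds, List<String> log) {
        return new Step() {
            public String name() { return name; }
            public void execute() throws Exception {
                if (!succeeds) {
                    throw new Exception(name + " failed");
                }
                log.add("book " + name);
            }
            public void compensate() { log.add("cancel " + name); }
        };
    }

    /** Runs the steps in order; on failure, compensates completed steps in reverse. */
    static boolean run(List<Step> steps, List<String> log) {
        Deque<Step> completed = new ArrayDeque<>(); // the state a saga must track
        for (Step s : steps) {
            try {
                s.execute();
                completed.push(s);
            } catch (Exception e) {
                log.add("failed " + s.name());
                while (!completed.isEmpty()) {
                    completed.pop().compensate(); // most recent step first
                }
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        boolean ok = run(List.of(
                step("car", true, log),
                step("hotel", true, log),
                step("flight", false, log)), log); // no seats left
        System.out.println(ok + " " + log);
    }
}
```

Running `main` prints `false [book car, book hotel, failed flight, cancel hotel, cancel car]`. Even in this toy version, the list of completed steps is state that has to live somewhere, which is exactly the part a stateful workflow engine takes off your hands.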

Of course, we’re not here to say that BPMN combined with a workflow engine is the only flow language that supports the saga pattern. But we do believe BPMN allows you to implement the saga pattern much more simply than the alternatives.

This post from Yan Cui walks through a saga example using Amazon States Language in AWS Step Functions. In Amazon States Language, building the trip booking and cancellation requires writing more than 80 lines of JSON. And the visual representation generated by the Step Functions web app based on the code isn’t particularly intuitive when compared to the BPMN model.

AWS Step Functions workflow diagram

In a real-world application, where the flow becomes necessarily more complex (with retry logic, decision points, parallel paths, customer notifications based on different outcomes, and more), both the States Language code required to define the flow and the generated visual model become harder to manage and to follow. In BPMN, changes like these remain relatively simple to understand and implement.

A quick note: we want to be sure to say that Yan’s post is nicely written and we found his example very helpful in understanding how to implement a saga in AWS. Our comparison of BPMN and Amazon States Language / Step Functions is in no way meant as a critique of his work!

Complex BPMN Flows in the Real World

To further understand how BPMN simplifies complex flow logic, let’s review a real-world BPMN model.

The model below was featured in a presentation about microservices orchestration with Camunda BPM from Lieven Vandegaer of MEDIAGENIX, a Belgium-based company that helps its global customer base of media firms manage the lifecycle of broadcaster content. In the talk, MEDIAGENIX describes how they built a microservices-based platform to provide a video-on-demand solution for a telecom company, handling transcoding, encryption, and publication of media files.

One of their workflow models is shown below. Every task in this model is carried out by a different microservice, and the Camunda BPM engine is responsible for orchestrating the services.

Mediagenix BPMN workflow for microservices orchestration

(slide source)

A sequence containing “or” and “and/or” gateways, two embedded subprocesses, and more than 15 different tasks (some of which are multi-instance tasks) is expressed concisely in BPMN–and in such a way that a technical or non-technical human can understand. There’s an enormous amount of complexity made manageable in this graphical model.

Which brings us to an important point: the developers who are responsible for writing flow logic aren’t the only group who benefits from BPMN’s graphical models. BPMN also serves as a foundation for cross-functional collaboration.

Graphical models as a collaboration tool (or BizDevOps)

Defining complex flows with graphical models can save developers time and make it easier to ensure that flow logic is expressed accurately.

Just as importantly, the use of graphics extends the group of people who can interpret a workflow model, contribute to the design process, and provide troubleshooting and analysis once the process is running in production. We refer to this as BizDevOps (Business + Development + Operations), and Bernd has written about the importance of this idea, too.

We already noted that a non-technical stakeholder who wouldn’t necessarily be able to read raw flow logic code can easily follow along with a visual model in BPMN to ensure it aligns with business requirements.

BPMN models also provide a basis for business analysts who need to monitor workflows, understand process performance, and identify how to make improvements. At Camunda, this takes the form of Optimize, an analytics tool built for the people responsible for a process. Optimize provides process heat maps overlaid on models, ad hoc reports, raw data exports, alerts, and more–all of which are based on workflow data generated by the original graphical model.

Camunda Optimize dashboard for business process analysis

A BPMN workflow engine can also provide technical operations teams with the context necessary to understand and resolve problems in running processes. With the appropriate tooling, operators can intervene in real time, fixing failed workflow instances or correcting corrupt data, keeping the system healthy.

Camunda offers tooling for operators in the form of Cockpit, a dashboard that shows all running BPMN models and highlights technical incidents. Zeebe, too, will have its own tool for technical operators.

Camunda Cockpit for technical operators

This “workflow lifecycle” enabled by BPMN and the surrounding toolkit, where stakeholders beyond the development team can be hands-on with model->analyze->improve iterations, is extremely powerful but often overlooked. And it’s largely made possible by the fact that BPMN is based on graphical models–a natural way for technical and non-technical teammates to collaborate.

We’d go as far as to say that the cross-team collaboration made possible by BPMN is so valuable that it outweighs any potential downside.

“Are you sure about that? There has to be a catch, right? Aren’t graphical models ridden with lurking dangers and property panel gotchas?”

Au contraire! We at Camunda believe that…

Developers can love graphical modeling tools, too (when they’re done right)

It’s our experience that most developers have had at least one bad experience with graphical models. There are a handful of issues that we hear about, including but not limited to:

  • Being forced to use subpar modeling tools–this could mean a really bad user experience with the modeling itself, or something more specific such as running into problems with diff and merge when using a modeler
  • Anxiety about hidden complexity lurking behind a nice-looking model–the so-called “death by properties panel”

These are valid concerns, but the good news is that they’re addressable–and therefore no reason to abandon BPMN and its benefits.

In Zeebe, we’re taking on these issues by:

  • Offering a lightweight modeling tool with a great user experience. You can download it and try it out if you’d like. It’s based on bpmn.io, an open-source JavaScript library (and a successful project in its own right) that you can embed in your own application.

The basic order workflow

  • Providing graphical diff tooling that works–try out a demo here. We can say that in real-world projects, we never hear complaints about merging BPMN and XML on a file level.

The basic order workflow

  • Supporting code-based tools for model creation, too. Zeebe currently offers experimental YAML workflow definition and will soon also offer its own version of Camunda’s Model API, the Java DSL for workflow definition that we referenced during the saga example. Another example of what’s possible is Camunda’s Fluent Builder API, which creates basic processes in just a few lines of code. With the Fluent Builder API, the following code…
BpmnModelInstance modelInstance = Bpmn.createExecutableProcess()
  .startEvent()
  .userTask()
  .parallelGateway()
    .scriptTask()
    .endEvent()
  .moveToLastGateway()
    .serviceTask()
    .endEvent()
  .done();

…will generate the following graphical model:

A model generated by Camunda's Fluent Builder API

All of that said, we do believe in the power of graphics for defining complex flows. For simple sequences, it might be efficient to hack away with the Model API. But most real-world use cases are more complicated, with error handling, timeouts, compensation, message correlation, and more to take into account.

In these complex flows, graphics can save you time, make it possible for technical and non-technical teams to collaborate, and provide a way to validate model logic before going into production.

There’s a middle ground, too. You can start building a model using the Model API, export the model to an XML file that can be used by a graphical modeling tool, and then continue building and collaborating using a graphical model. The API-based and graphics-based approaches work in tandem as a model evolves, step by step.
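
Because a BPMN model is stored as XML in which each element carries an `id`, two versions of a model can be compared element by element rather than line by line. The sketch below shows that idea using only the JDK’s XML parser. It is not how the diff tooling mentioned earlier is implemented, and the `BpmnElementDiffSketch` class, its `model` and `task` helpers, and the pared-down sample XML (service tasks only, no events or sequence flows) are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class BpmnElementDiffSketch {

    static final String BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL";

    /** Wraps task elements in a pared-down BPMN process (sample data only). */
    static String model(String tasks) {
        return "<bpmn:definitions xmlns:bpmn=\"" + BPMN_NS + "\">"
             + "<bpmn:process id=\"trip\" isExecutable=\"true\">" + tasks
             + "</bpmn:process></bpmn:definitions>";
    }

    static String task(String id, String name) {
        return "<bpmn:serviceTask id=\"" + id + "\" name=\"" + name + "\"/>";
    }

    /** Maps every element id in the document to its type and name. */
    static Map<String, String> elementsById(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> result = new TreeMap<>();
            NodeList all = doc.getElementsByTagName("*");
            for (int i = 0; i < all.getLength(); i++) {
                Element e = (Element) all.item(i);
                if (e.hasAttribute("id")) {
                    result.put(e.getAttribute("id"),
                            e.getLocalName() + ":" + e.getAttribute("name"));
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("not parseable as XML", e);
        }
    }

    /** Reports elements that were added, removed, or changed between versions. */
    static Set<String> diff(String oldXml, String newXml) {
        Map<String, String> before = elementsById(oldXml);
        Map<String, String> after = elementsById(newXml);
        Set<String> changes = new TreeSet<>();
        for (Map.Entry<String, String> entry : after.entrySet()) {
            String id = entry.getKey();
            if (!before.containsKey(id)) {
                changes.add("added " + id);
            } else if (!before.get(id).equals(entry.getValue())) {
                changes.add("changed " + id);
            }
        }
        for (String id : before.keySet()) {
            if (!after.containsKey(id)) {
                changes.add("removed " + id);
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        String v1 = model(task("car", "Reserve car") + task("hotel", "Book hotel"));
        String v2 = model(task("hotel", "Reserve hotel") + task("flight", "Book flight"));
        System.out.println(diff(v1, v2));
    }
}
```

Running `main` prints `[added flight, changed hotel, removed car]`. A real diff tool would also compare sequence flows, the remaining attributes, and diagram layout; the point here is only that the XML behind a graphical model is structured enough to reason about changes at the level of tasks rather than lines of text.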

Wrapping Up

We took the time to write these posts because we haven’t yet come across another flow language that can solve microservices orchestration (and other emerging use cases) as effectively as BPMN. The standard is noteworthy in being feature complete from a flow logic perspective and in serving as a point of collaboration across many different types of users throughout the workflow lifecycle.

In part 2, we hope you came away convinced that:

  • Graphical models in BPMN can make it much simpler to define and validate complex logical flows
  • Graphical models are a powerful way to bridge the gap between technical and non-technical teams
  • Graphical models are nothing to be afraid of if they’re done right, and Zeebe provides you with user-friendly modeling tools that save you from death by properties panel. But you can always define workflows with code if you prefer.

More broadly, we hope you, too, now have a complete view of why BPMN can (and already does) play a key role in modern architectures.

As Zeebe evolves, we’ll provide more resources on BPMN and how to apply it to the microservices orchestration use case. As always, we’d love to hear what you would find most helpful, so get in touch if you have feedback or ideas to share with us.

To keep up with all things Zeebe, you can: