Skip to main content

Process Managers + Sagas (orchestrating long-running processes)

When business logic spans multiple aggregates or services, you need coordination.

Two patterns dominate:

  • Saga: a sequence of local transactions with compensating actions on failure
  • Process Manager: a stateful coordinator that reacts to events and issues commands

They're often used interchangeably, but technically:

  • Saga = the pattern (compensate on failure)
  • Process Manager = the implementation (state machine that drives the saga)

When you need this

  • Order fulfillment (payment → inventory → shipping)
  • User onboarding (create account → send email → provision resources)
  • Money transfer (debit source → credit destination)
  • Any multi-step business process that can partially fail

The problem: distributed transactions don't scale

You can't do:

// DON'T: This doesn't work across services/aggregates
using var tx = BeginDistributedTransaction();
await _orderService.CreateOrder(orderId);
await _paymentService.ChargeCard(orderId, amount);
await _inventoryService.Reserve(orderId, items);
tx.Commit();

Instead, you coordinate through events and commands.

Choreography vs Orchestration

Choreography (decentralized)

Each service listens for events and reacts:

OrderPlaced → PaymentService listens → PaymentReceived
PaymentReceived → InventoryService listens → InventoryReserved
InventoryReserved → ShippingService listens → OrderShipped

Pros: loose coupling, simple services Cons: hard to see the full flow, harder to debug

Orchestration (centralized)

A process manager coordinates:

ProcessManager receives OrderPlaced
→ sends ReservePayment command
ProcessManager receives PaymentReserved
→ sends ReserveInventory command
ProcessManager receives InventoryReserved
→ sends ShipOrder command

Pros: flow is explicit, easier to debug Cons: coordinator can become a bottleneck

Process Manager implementation

The state machine

public enum OrderFulfillmentState
{
NotStarted,
AwaitingPayment,
AwaitingInventory,
AwaitingShipment,
Completed,
Failed,
Compensating
}

public sealed record OrderFulfillmentData(
Guid OrderId,
Guid CustomerId,
decimal TotalAmount,
List<OrderItem> Items,
OrderFulfillmentState State,
string? FailureReason,
int RetryCount
);

Process Manager as an event-sourced aggregate

public interface IProcessManagerEvent : IDomainEvent;

public sealed record OrderFulfillmentStarted(
Guid ProcessId,
Guid OrderId,
Guid CustomerId,
decimal TotalAmount,
List<OrderItem> Items
) : IProcessManagerEvent;

public sealed record PaymentRequested(Guid ProcessId, Guid PaymentId) : IProcessManagerEvent;
public sealed record PaymentSucceeded(Guid ProcessId, Guid PaymentId) : IProcessManagerEvent;
public sealed record PaymentFailed(Guid ProcessId, string Reason) : IProcessManagerEvent;

public sealed record InventoryReserveRequested(Guid ProcessId) : IProcessManagerEvent;
public sealed record InventoryReserved(Guid ProcessId) : IProcessManagerEvent;
public sealed record InventoryReserveFailed(Guid ProcessId, string Reason) : IProcessManagerEvent;

public sealed record ShipmentRequested(Guid ProcessId) : IProcessManagerEvent;
public sealed record ShipmentDispatched(Guid ProcessId, string TrackingNumber) : IProcessManagerEvent;

public sealed record OrderFulfillmentCompleted(Guid ProcessId) : IProcessManagerEvent;
public sealed record OrderFulfillmentFailed(Guid ProcessId, string Reason) : IProcessManagerEvent;

// Compensation events
public sealed record PaymentRefundRequested(Guid ProcessId) : IProcessManagerEvent;
public sealed record PaymentRefunded(Guid ProcessId) : IProcessManagerEvent;
public sealed record InventoryReleaseRequested(Guid ProcessId) : IProcessManagerEvent;
public sealed record InventoryReleased(Guid ProcessId) : IProcessManagerEvent;

The Process Manager aggregate

public sealed class OrderFulfillmentProcess : AggregateRoot
{
public Guid OrderId { get; private set; }
public Guid CustomerId { get; private set; }
public decimal TotalAmount { get; private set; }
public List<OrderItem> Items { get; private set; } = new();
public OrderFulfillmentState State { get; private set; }
public string? FailureReason { get; private set; }
public Guid? PaymentId { get; private set; }
public string? TrackingNumber { get; private set; }

public static OrderFulfillmentProcess Start(
Guid processId,
Guid orderId,
Guid customerId,
decimal totalAmount,
List<OrderItem> items)
{
var process = new OrderFulfillmentProcess();
process.Raise(new OrderFulfillmentStarted(processId, orderId, customerId, totalAmount, items));
return process;
}

// React to external events and decide next action
public ICommand? HandleExternalEvent(object externalEvent)
{
return externalEvent switch
{
// Payment service responded
ExternalPaymentSucceeded e when State == OrderFulfillmentState.AwaitingPayment =>
HandlePaymentSuccess(e),

ExternalPaymentFailed e when State == OrderFulfillmentState.AwaitingPayment =>
HandlePaymentFailure(e),

// Inventory service responded
ExternalInventoryReserved e when State == OrderFulfillmentState.AwaitingInventory =>
HandleInventoryReserved(e),

ExternalInventoryFailed e when State == OrderFulfillmentState.AwaitingInventory =>
HandleInventoryFailure(e),

// Shipping service responded
ExternalShipmentDispatched e when State == OrderFulfillmentState.AwaitingShipment =>
HandleShipmentDispatched(e),

_ => null // ignore events we don't care about in current state
};
}

private ICommand? HandlePaymentSuccess(ExternalPaymentSucceeded e)
{
Raise(new PaymentSucceeded(Id, e.PaymentId));
Raise(new InventoryReserveRequested(Id));
return new ReserveInventoryCommand(OrderId, Items);
}

private ICommand? HandlePaymentFailure(ExternalPaymentFailed e)
{
Raise(new PaymentFailed(Id, e.Reason));
Raise(new OrderFulfillmentFailed(Id, $"Payment failed: {e.Reason}"));
return null; // no compensation needed yet
}

private ICommand? HandleInventoryReserved(ExternalInventoryReserved e)
{
Raise(new InventoryReserved(Id));
Raise(new ShipmentRequested(Id));
return new ShipOrderCommand(OrderId, CustomerId);
}

private ICommand? HandleInventoryFailure(ExternalInventoryFailed e)
{
Raise(new InventoryReserveFailed(Id, e.Reason));
// Start compensation: refund payment
Raise(new PaymentRefundRequested(Id));
return new RefundPaymentCommand(PaymentId!.Value);
}

private ICommand? HandleShipmentDispatched(ExternalShipmentDispatched e)
{
Raise(new ShipmentDispatched(Id, e.TrackingNumber));
Raise(new OrderFulfillmentCompleted(Id));
return null; // done!
}

// Get the initial command to kick off the process
public ICommand GetInitialCommand()
{
if (State != OrderFulfillmentState.NotStarted)
throw new InvalidOperationException("Process already started");

Raise(new PaymentRequested(Id, Guid.NewGuid()));
return new ChargePaymentCommand(PaymentId!.Value, CustomerId, TotalAmount);
}

protected override void Apply(IDomainEvent @event)
{
switch (@event)
{
case OrderFulfillmentStarted e:
Id = e.ProcessId;
OrderId = e.OrderId;
CustomerId = e.CustomerId;
TotalAmount = e.TotalAmount;
Items = e.Items;
State = OrderFulfillmentState.NotStarted;
break;

case PaymentRequested e:
PaymentId = e.PaymentId;
State = OrderFulfillmentState.AwaitingPayment;
break;

case PaymentSucceeded:
break; // state transitions in next event

case PaymentFailed e:
FailureReason = e.Reason;
State = OrderFulfillmentState.Failed;
break;

case InventoryReserveRequested:
State = OrderFulfillmentState.AwaitingInventory;
break;

case InventoryReserved:
break;

case InventoryReserveFailed e:
FailureReason = e.Reason;
State = OrderFulfillmentState.Compensating;
break;

case ShipmentRequested:
State = OrderFulfillmentState.AwaitingShipment;
break;

case ShipmentDispatched e:
TrackingNumber = e.TrackingNumber;
break;

case OrderFulfillmentCompleted:
State = OrderFulfillmentState.Completed;
break;

case OrderFulfillmentFailed e:
FailureReason = e.Reason;
State = OrderFulfillmentState.Failed;
break;

case PaymentRefundRequested:
State = OrderFulfillmentState.Compensating;
break;

case PaymentRefunded:
State = OrderFulfillmentState.Failed; // compensation complete
break;
}
}
}

Command and event routing

public interface ICommand
{
Guid CorrelationId { get; }
}

public sealed record ChargePaymentCommand(
Guid PaymentId,
Guid CustomerId,
decimal Amount
) : ICommand
{
public Guid CorrelationId => PaymentId;
}

public sealed record ReserveInventoryCommand(
Guid OrderId,
List<OrderItem> Items
) : ICommand
{
public Guid CorrelationId => OrderId;
}

public sealed record ShipOrderCommand(
Guid OrderId,
Guid CustomerId
) : ICommand
{
public Guid CorrelationId => OrderId;
}

public sealed record RefundPaymentCommand(Guid PaymentId) : ICommand
{
public Guid CorrelationId => PaymentId;
}

The coordinator service

public sealed class OrderFulfillmentCoordinator
{
private readonly IEventStore _eventStore;
private readonly ICommandBus _commandBus;
private readonly ILogger<OrderFulfillmentCoordinator> _logger;

public OrderFulfillmentCoordinator(
IEventStore eventStore,
ICommandBus commandBus,
ILogger<OrderFulfillmentCoordinator> logger)
{
_eventStore = eventStore;
_commandBus = commandBus;
_logger = logger;
}

// Called when a new order is placed
public async Task StartFulfillment(
Guid orderId,
Guid customerId,
decimal totalAmount,
List<OrderItem> items,
CancellationToken ct)
{
var processId = Guid.NewGuid();
var process = OrderFulfillmentProcess.Start(
processId, orderId, customerId, totalAmount, items);

var initialCommand = process.GetInitialCommand();

await SaveProcess(process, ct);
await _commandBus.Send(initialCommand, ct);

_logger.LogInformation(
"Started fulfillment process {ProcessId} for order {OrderId}",
processId, orderId);
}

// Called when external services publish events
public async Task HandleExternalEvent(
Guid processId,
object externalEvent,
CancellationToken ct)
{
var process = await LoadProcess(processId, ct);
if (process is null)
{
_logger.LogWarning("Process {ProcessId} not found", processId);
return;
}

var nextCommand = process.HandleExternalEvent(externalEvent);

await SaveProcess(process, ct);

if (nextCommand is not null)
{
await _commandBus.Send(nextCommand, ct);
}

_logger.LogInformation(
"Process {ProcessId} transitioned to {State}",
processId, process.State);
}

private async Task<OrderFulfillmentProcess?> LoadProcess(Guid processId, CancellationToken ct)
{
var streamId = $"order-fulfillment-{processId}";
var events = new List<IDomainEvent>();

await foreach (var se in _eventStore.ReadStream(streamId, 0, ct))
{
events.Add(DeserializeEvent(se));
}

if (events.Count == 0) return null;

var process = new OrderFulfillmentProcess();
process.LoadFromHistory(events);
return process;
}

private async Task SaveProcess(OrderFulfillmentProcess process, CancellationToken ct)
{
var streamId = $"order-fulfillment-{process.Id}";
var newEvents = process.DequeueUncommittedEvents();

var stored = newEvents.Select(e => new StoredEvent(
EventId: Guid.NewGuid(),
EventType: e.GetType().Name,
SchemaVersion: 1,
OccurredAt: DateTimeOffset.UtcNow,
Data: SerializeEvent(e),
Metadata: null
)).ToList();

await _eventStore.AppendToStream(streamId, process.Version, stored, ct);
}

private IDomainEvent DeserializeEvent(StoredEvent e) => /* ... */;
private byte[] SerializeEvent(IDomainEvent e) => /* ... */;
}

Compensation patterns

Forward recovery vs backward recovery

Forward recovery: retry until success

public async Task<bool> TryWithRetry(Func<Task> action, int maxRetries)
{
for (int i = 0; i < maxRetries; i++)
{
try
{
await action();
return true;
}
catch (TransientException)
{
await Task.Delay(TimeSpan.FromSeconds(Math.Pow(2, i)));
}
}
return false;
}

Backward recovery: undo completed steps

// Compensation actions for each step
public static class CompensationActions
{
public static readonly Dictionary<Type, Type> CompensationMap = new()
{
[typeof(PaymentCharged)] = typeof(RefundPaymentCommand),
[typeof(InventoryReserved)] = typeof(ReleaseInventoryCommand),
[typeof(ShipmentCreated)] = typeof(CancelShipmentCommand),
};
}

Semantic compensation

Sometimes you can't truly "undo":

  • Payment charged → Refund (not "uncharged")
  • Email sent → Send "sorry, ignore that" email
  • Inventory shipped → Initiate return

Design compensating actions that make business sense.

Timeout handling

Processes can get stuck waiting for events that never come:

public sealed class ProcessTimeoutChecker : BackgroundService
{
private readonly IProcessRepository _processes;
private readonly ICommandBus _commandBus;

protected override async Task ExecuteAsync(CancellationToken ct)
{
while (!ct.IsCancellationRequested)
{
var stuckProcesses = await _processes.GetStuckProcesses(
timeout: TimeSpan.FromMinutes(30),
ct);

foreach (var process in stuckProcesses)
{
// Either retry or compensate
if (process.RetryCount < 3)
{
await RetryCurrentStep(process, ct);
}
else
{
await StartCompensation(process, ct);
}
}

await Task.Delay(TimeSpan.FromMinutes(1), ct);
}
}
}

Testing sagas

public sealed class OrderFulfillmentProcessTests
{
[Fact]
public void Happy_path_completes_successfully()
{
// Given: process started
var process = OrderFulfillmentProcess.Start(
Guid.NewGuid(),
Guid.NewGuid(),
Guid.NewGuid(),
100m,
new List<OrderItem> { new("SKU-1", 2) });

process.GetInitialCommand(); // transitions to AwaitingPayment

// When: payment succeeds
var cmd1 = process.HandleExternalEvent(
new ExternalPaymentSucceeded(process.PaymentId!.Value));

Assert.IsType<ReserveInventoryCommand>(cmd1);
Assert.Equal(OrderFulfillmentState.AwaitingInventory, process.State);

// When: inventory reserved
var cmd2 = process.HandleExternalEvent(
new ExternalInventoryReserved(process.OrderId));

Assert.IsType<ShipOrderCommand>(cmd2);
Assert.Equal(OrderFulfillmentState.AwaitingShipment, process.State);

// When: shipped
var cmd3 = process.HandleExternalEvent(
new ExternalShipmentDispatched(process.OrderId, "TRACK-123"));

Assert.Null(cmd3); // no more commands
Assert.Equal(OrderFulfillmentState.Completed, process.State);
}

[Fact]
public void Inventory_failure_triggers_payment_refund()
{
var process = OrderFulfillmentProcess.Start(
Guid.NewGuid(),
Guid.NewGuid(),
Guid.NewGuid(),
100m,
new List<OrderItem> { new("SKU-1", 2) });

process.GetInitialCommand();
process.HandleExternalEvent(
new ExternalPaymentSucceeded(process.PaymentId!.Value));

// When: inventory fails
var cmd = process.HandleExternalEvent(
new ExternalInventoryFailed(process.OrderId, "Out of stock"));

// Then: refund is requested
Assert.IsType<RefundPaymentCommand>(cmd);
Assert.Equal(OrderFulfillmentState.Compensating, process.State);
}
}

Choreography example (for comparison)

When orchestration is overkill, use choreography:

// PaymentService
public class PaymentEventHandler
{
public async Task Handle(OrderPlaced e, CancellationToken ct)
{
var result = await _paymentGateway.Charge(e.CustomerId, e.TotalAmount);

if (result.Success)
await _eventBus.Publish(new PaymentSucceeded(e.OrderId, result.PaymentId));
else
await _eventBus.Publish(new PaymentFailed(e.OrderId, result.Error));
}
}

// InventoryService
public class InventoryEventHandler
{
public async Task Handle(PaymentSucceeded e, CancellationToken ct)
{
var order = await _orderService.Get(e.OrderId);
var result = await _inventory.Reserve(order.Items);

if (result.Success)
await _eventBus.Publish(new InventoryReserved(e.OrderId));
else
await _eventBus.Publish(new InventoryFailed(e.OrderId, result.Error));
}

// Compensation handler
public async Task Handle(PaymentFailed e, CancellationToken ct)
{
// Nothing to compensate - payment never succeeded
}
}

Next

Temporal queries let you answer "what was the state at time T?"

Next: Temporal queries

Sources

  • https://learn.microsoft.com/en-us/azure/architecture/reference-architectures/saga/saga
  • https://microservices.io/patterns/data/saga.html
  • https://www.enterpriseintegrationpatterns.com/patterns/messaging/ProcessManager.html
  • https://learn.microsoft.com/en-us/dotnet/architecture/microservices/architect-microservice-container-applications/asynchronous-message-based-communication