Actor Model — Senior Level¶
Topic: Actor Model Focus: distributed actors, persistence, sharding, performance
Table of Contents¶
- Introduction
- Prerequisites
- Glossary
- Core Concepts
- Real-World Analogies
- Mental Models
- Code Examples
- Pros & Cons
- Use Cases
- Coding Patterns
- Clean Code
- Best Practices
- Edge Cases & Pitfalls
- Common Mistakes
- Tricky Points
- Test Yourself
- Tricky Questions
- Cheat Sheet
- Summary
- What You Can Build
- Further Reading
- Related Topics
- Diagrams & Visual Aids
Introduction¶
At the junior and middle levels, the actor model lived inside a single process. Mailboxes were in-memory queues. Supervision trees were local. Addresses were tagged pointers. That mental picture is enough to understand the model, but it is not enough to run it at scale. A real production actor system is distributed across dozens of nodes, survives crashes without losing state, rebalances when machines come and go, recovers after network partitions, and emits enough telemetry to debug a problem at three in the morning.
The senior view of the actor model is the view from the operator's chair. You stop asking "how do actors send messages" and start asking the questions that actually matter when the model meets reality:
- How does an actor address survive when the process holding it disappears?
- Where does a stateful actor live in a cluster of N nodes, and what happens when N changes?
- How do you persist actor state without serializing every message?
- How do you trace a request that bounces through twenty actors on five hosts?
- How do you keep mailboxes from becoming unbounded backlog stores during a traffic spike?
- And, critically: when is the actor model the wrong tool, and what should you reach for instead?
This document walks through the patterns that production-grade actor systems use to answer those questions. The reference points are Erlang/OTP, Akka and Pekko on the JVM, and Microsoft Orleans on .NET. They differ in syntax, but the core problems and solutions are universal. We will look at distributed addressing, cluster sharding, persistence and event sourcing, cluster singletons and leader election, split-brain handling with CRDTs, tracing, backpressure with Reactive Streams, throughput tuning, testing strategies, and the anti-patterns that signal you picked the wrong model.
The code examples are deliberately concrete. We model a sharded user-session service, a saga implemented as a persistent actor, and a backpressure-aware ingestion pipeline. Every snippet is meant to compile and run after you wire it into a project skeleton — these are not pseudocode sketches.
Prerequisites¶
Before you read this document you should be comfortable with:
- The basics of the actor model: mailboxes, message passing, "let it crash", supervision (covered in the junior and middle documents).
- One actor library at the programming level — Erlang/OTP
gen_server, Akka Typed, Pekko, Orleans grains, or Elixir'sGenServer. - Concurrency primitives: futures/promises, executors, blocking versus non-blocking operations.
- Distributed systems vocabulary: failure detectors, gossip, CAP, eventual consistency, quorum.
- Serialization: Protobuf, Avro, or a binary format other than JSON. JSON is fine for HTTP, not for inter-actor traffic at scale.
- Event sourcing fundamentals: events as the source of truth, replay, snapshots.
- Reactive Streams or some other backpressure protocol.
- Tracing systems: OpenTelemetry, Jaeger, Zipkin, or vendor equivalents.
If any of those are blurry, pause and shore them up. Distributed actor systems amplify weaknesses in those areas — you do not want to debug a sharding rebalance bug while also Googling what a quorum is.
Glossary¶
| Term | Meaning |
|---|---|
| Cluster | A set of nodes (JVMs, BEAM VMs, .NET processes) that recognize each other as one logical actor system. |
| Node | One process in the cluster, identified by host plus port. |
| Shard | A subset of actors grouped under a logical name. The unit of placement and movement. |
| Shard region | An actor on each node that routes messages to local shards and coordinates hand-off. |
| Entity | An individual actor inside a shard, identified by an entity id. |
| Hand-off | The act of moving a shard from one node to another. |
| Rebalance | A coordinator decision that triggers many hand-offs to even out shard distribution. |
| Coordinator | The cluster-wide singleton that decides which node owns which shard. |
| Virtual actor | Orleans terminology — an actor that always exists logically and is materialized on demand. |
| Persistent actor | An actor whose state is recovered by replaying its journal of past events. |
| Journal | The append-only log of events emitted by persistent actors. |
| Snapshot | A periodic dump of an actor's state to shorten future replays. |
| Cluster singleton | An actor that exists on exactly one node at a time across the entire cluster. |
| Split-brain | Network partition that causes two halves of a cluster to each believe the other is dead. |
| ddata (Distributed Data) | Akka's CRDT-backed replicated key-value store. |
| Backpressure | Downstream telling upstream to slow down, signaled by demand. |
| Mailbox | The queue of unprocessed messages held by an actor. |
| Dispatcher | The thread pool that runs actor message-handling. |
| Stash | A temporary holding area for messages an actor cannot handle yet. |
| TestKit | Framework support for testing actors with controllable timing and synthetic probes. |
Core Concepts¶
Distributed Actor Systems¶
A single-node actor system gives you message passing and supervision. A distributed actor system adds two properties that change everything: location transparency and failure as a first-class signal.
Location transparency means a reference to an actor does not encode where that actor is. Sending a message uses the same API whether the recipient is in your process or on a host across the data center. The runtime is responsible for serialization, network transport, and routing. Erlang has had this since the 1980s — a Pid can be local or remote and you do not know the difference at the call site. Akka and Pekko replicate the pattern with ActorRef. Orleans goes further: grain references are virtual, pointing to a logical identity that the runtime will materialize wherever it sees fit.
Failure as a signal means processes can monitor each other across the network. When a remote node dies, every actor that was watching an actor on that node receives a Terminated (or equivalent) message. Supervision trees still work across the wire — although you usually do not let parent and child sit on different nodes; supervision is local and cluster sharding is the cross-node analogue.
The runtime parts you need to know:
- Membership and gossip. Nodes discover each other and exchange heartbeats. Akka uses a gossip protocol derived from Amazon Dynamo. Erlang uses a fully connected mesh by default, which limits cluster size and is now sometimes replaced with
partisanfor large clusters. Orleans uses a membership table backed by Azure storage or an external store. - Failure detectors. A heartbeat alone is not enough — you need to decide when a missed heartbeat counts as a failure. Akka uses the Phi Accrual Failure Detector, which produces a continuous suspicion level rather than a binary up/down decision.
- Serializers. Inter-node messages must be serialized. JSON works for prototypes; you want Protobuf, Avro, or a custom binary format for production. Allocations per message at scale are the difference between 200k msg/s and 1.5M msg/s.
Cluster Sharding¶
Sharding is the technique that lets you store millions of actors across a cluster without manually pinning each one to a node. You define a logical type — UserSession, Order, Device — and the cluster assigns each entity to a shard, and each shard to a node. Sending a message to entity user-42 goes through a local shard region actor, which knows the location of shard 42 % N and forwards the message.
Three operations matter:
- Placement. When the first message arrives for entity
user-42, the coordinator decides which shard it belongs to (typicallyhash(id) % shardCount) and which node owns that shard. The destination node spawns the actor on demand. - Hand-off. When the coordinator decides to move a shard, it sends a stop signal to the source. The source region tells every entity in the shard to checkpoint, processes any remaining mailbox items, then signals completion. The coordinator then redirects routing to the new node, which lazily re-materializes entities as messages arrive.
- Rebalancing. When a node joins or leaves, the coordinator may decide to redistribute shards to even out load. Rebalances are throttled — moving too many shards at once causes a stampede.
A few principles save pain:
- Pick a shard count once and never change it. It must be much larger than your node count (a 10x rule of thumb) so individual shards stay small but smaller than your entity count by orders of magnitude.
- Make entity ids stable across deployments. The hash of an id determines its shard; changing id format reshuffles everything.
- Treat hand-off as expected, not exceptional. If your entities cannot lose their in-memory caches without massive re-warm cost, sharding is going to hurt.
Persistent Actors and Event Sourcing¶
A persistent actor is one whose state is the fold of an event log. On startup or after a crash, the framework reads events from a journal, replays them through a handler, and the actor's state is reconstructed. The actor never writes its current state directly — it emits events.
In Akka Persistence (and Pekko's port), a EventSourcedBehavior has three pieces:
- State. The in-memory representation. Immutable, replaced after every event.
- Command handler. Decides what events to emit in response to a command.
- Event handler. Updates the state given an event. Pure function.
The lifecycle is: receive command, validate, emit events, persist events, update in-memory state, optionally reply. The reply is only allowed after persistence succeeds — replying earlier means a crash could expose a write that was not durable.
Two performance levers:
- Snapshots. Periodically dump current state. On recovery, load the latest snapshot and only replay events that came after. Without snapshots, recovery time grows linearly with history.
- Journal compaction. Many journals (Cassandra, JDBC, EventStore) support deleting events older than a snapshot. This trades audit-completeness for recovery speed. Decide explicitly which you need.
A common mistake is treating the journal as a relational database. It is not. You cannot query events across entities efficiently. If you need to project events into a queryable read model, that is a separate component — usually a CQRS read-side built with Akka Projection, ksqlDB, or hand-rolled Kafka consumers.
Cluster Singleton and Leader Election¶
Some responsibilities should run on exactly one node at a time. The classic examples are scheduling, batch coordinators, and idempotency gatekeepers. A cluster singleton is an actor that the cluster guarantees exists on exactly one node — usually the oldest member.
The dance is more delicate than it sounds:
- The cluster elects a leader using a deterministic rule (oldest member, lowest hash address, etc.).
- The leader hosts the singleton.
- When the leader leaves, the next leader spawns the singleton.
- During the gap, all messages must be buffered or rejected.
ClusterSingletonManager does the hosting and ClusterSingletonProxy does the routing on each node. The proxy buffers messages while no singleton exists. Buffer size is bounded — if it overflows, you start dropping messages, which is exactly the failure you want to detect with monitoring.
Leader election is not the same as a consensus algorithm. Akka does not use Raft for singleton election; it uses the gossip view of who is in the cluster. That works in normal operation but breaks during a split-brain — both halves think they are the cluster and each elects its own singleton. The fix is downing strategies, covered next.
Split-Brain and CRDT-Backed State¶
A network partition splits the cluster into two or more disconnected halves. Each half sees the others as unreachable. If both halves keep running independently, you get two singletons, duplicate work, and ultimately conflicting writes — split-brain.
The defenses, in increasing order of safety:
- Auto-downing. Mark unreachable nodes as down after a timeout. Simple, dangerous — both halves do it, both halves keep running, you get split-brain anyway.
- Static quorum. Configure a known quorum size. If a half cannot see a quorum, it shuts itself down. Works only if cluster size is fixed.
- Keep-majority. Each half checks if it has the majority of last-known members. The minority shuts down. Safer than static quorum because it adapts.
- Lease-based. Acquire a lease from an external store (Kubernetes API, Consul, Etcd). Only the lease holder operates. Strongest guarantee, adds external dependency.
For state that must remain available during partitions, use CRDTs. Akka's Distributed Data (ddata) provides replicated counters, sets, maps, and registers that converge automatically after a partition heals. The trade-off: CRDTs guarantee eventual convergence but not linearizability. You cannot use a GCounter to enforce "at most 100 active users." You can use it to count "approximate active users with eventual correctness."
Tracing Across Actors¶
When a request bounces through ten actors on five hosts, a stack trace tells you nothing. You need correlation.
OpenTelemetry actor instrumentation works on three primitives:
- Trace context propagation. Every message carries a trace id and parent span id. The sending actor injects the context; the receiving actor extracts it.
- Span around message handling. Wrap the body of the message handler in a span that records the actor path and message type.
- Async linking. When an actor sends a message and continues, the new actor's work is linked to the originating span but does not block it.
Practical considerations:
- Tagging spans with mailbox depth and processing duration helps you correlate latency with backlog.
- Sampling decisions must be coherent across the trace. Use parent-based sampling so a sampled trace stays sampled through every actor.
- Pekko has a
pekko-instrumentationmodule; Akka has commercial Lightbend Telemetry. Roll your own if those do not fit, but understand the wire format you need to maintain.
Backpressure with Reactive Streams¶
Mailboxes have a fatal default: they are unbounded. A slow consumer plus a fast producer plus no backpressure equals out-of-memory.
Reactive Streams is a protocol — implemented by Akka Streams, Pekko Streams, RxJava, and Project Reactor — where consumers signal demand and producers respect it. The contract is:
- Subscriber subscribes to publisher.
- Subscriber sends
request(n)upstream. - Publisher emits up to
nitems. - Subscriber requests more when ready.
- Cancellation is explicit.
Akka Streams compiles into actors at the bottom but exposes a typed builder API. You assemble graphs of Source, Flow, and Sink, attach buffers and strategies (drop-head, drop-tail, backpressure), and run them on a materializer. Inside, the streams runtime uses windowed demand to amortize per-element overhead.
When you need actor-level backpressure that is not streamable, implement the work-pulling pattern: workers send Pull to a master when ready; the master sends one work item per pull. The mailbox never grows beyond worker count.
Performance¶
Actor systems advertise impressive numbers — Akka claims fifty million messages per second on a single JVM. Those numbers come with caveats.
The throughput-driving factors are:
- Allocations per message. Boxing a primitive, wrapping in
Option, building a case class — each allocation is GC pressure. Hot paths must avoid them. - Dispatcher choice. The default
ForkJoinPoolis good for mixed workloads but terrible for blocking I/O. Use a dedicated dispatcher with aThreadPoolExecutorfor blocking work. - Throughput configuration. The
throughputsetting controls how many messages an actor processes before the dispatcher moves to the next actor. Higher values reduce context-switch overhead but can starve other actors. - Mailbox type. The default unbounded mailbox is a linked list. A bounded array-backed mailbox can be faster but introduces drops or backpressure.
- Serialization cost. Inter-node messages pay a serializer hit on every send. JSON is one to two orders of magnitude slower than Protobuf.
- Network MTU. Akka Cluster batches messages into network frames. The batch size and TCP buffer sizes interact with throughput.
A practical tuning order: profile first, identify the bottleneck (mailbox depth, CPU, network, GC), change one variable, measure, repeat. Premature dispatcher tuning is responsible for a surprising amount of production damage.
Testing Actors¶
Actor code is harder to test than synchronous code because timing matters. Pekko TestKit and Akka TestKit-Typed give you:
- Test probes. Actors that record received messages and let you assert on them.
- Synthetic time. A test scheduler that advances on command, so timeouts are deterministic.
- Behavior testing. You can replace child actor spawning with probes via a behavior factory pattern.
- Sharding TestKit. Lets you test sharding logic without a full cluster.
- Persistence TestKit. In-memory journal and snapshot store with deterministic recovery.
The discipline: every actor that takes a collaborator should accept it as a constructor parameter, not look it up from the actor system. That lets tests inject a probe. The behavior factory pattern is just dependency injection adapted to typed actors.
When the Actor Model Is Wrong¶
The actor model is general but not universal. It is the wrong tool when:
- CPU-bound numerical work. Actors add per-message overhead. A dense matrix multiplication wants SIMD and tight loops, not mailboxes.
- Simple stateless RPC. If your service is genuinely stateless and you only need horizontal scaling, gRPC + Kubernetes is simpler and operationally cheaper.
- Latency-critical hard real-time. Mailbox jitter and GC pauses (on JVM and BEAM) make tail latency uneven. Trading floor matching engines, low-latency proxies, and tick-to-trade paths usually pick lock-free queues and pinned threads.
- Small monoliths. A single Postgres and a synchronous web framework will outpace an actor system for a startup with 10k users.
- Workflow engines. If your problem is "long-running multi-step workflow with human approval," a dedicated workflow engine (Temporal, Cadence) is purpose-built and easier to operate.
Choosing the actor model when the problem does not need it is one of the most common over-engineering patterns in the industry.
Real-World Analogies¶
| Concept | Analogy |
|---|---|
| Distributed actor system | A global mail service — you address a person, the system finds them. |
| Cluster sharding | A library where books are grouped by call number ranges; each branch holds some ranges. |
| Hand-off | Moving an entire shelf from one branch to another while keeping the catalog accurate. |
| Persistent actor | A bank ledger — current balance is the sum of all entries; you can rebuild it any time. |
| Snapshot | The balance carry-forward at month-end, so audit does not start from the company's founding date. |
| Cluster singleton | The night-shift security guard — exactly one per building, but the post is always filled. |
| Split-brain | Two department heads, each unaware the other exists, both giving orders. |
| ddata (CRDT) | A shared shopping list that everyone edits offline; on reconnect, the items merge sensibly. |
| Backpressure | A waiter signaling the kitchen to slow down because the dining room is full. |
| Mailbox bottleneck | A clerk's inbox piling up while everyone keeps dropping forms in. |
| Tracing | A package tracking number that follows the parcel through every depot. |
| Work-pulling | Day laborers at a hiring hall — they walk up when ready, no one queues them. |
Mental Models¶
The runtime is the operating system. In a distributed actor system, the actor runtime takes over responsibilities that the OS handles for processes — scheduling, isolation, addressing, lifecycle. You stop thinking in JVM-or-BEAM-or-CLR terms and start thinking "Akka cluster" or "BEAM cluster" as the platform.
Addressing is logical, placement is physical. An actor reference says who; the cluster decides where. You should never have to know which node holds an actor for your code to work. If you do, sharding is misconfigured.
State lives where the events live. A persistent actor's in-memory representation is a cache of its event log. If you treat it as the source of truth, you will be surprised when a crash erases your unpersisted changes.
Failure is a message, not an exception. A remote actor dying is not something you catch — it is something you receive as a Terminated signal. You handle it like any other message, in your normal mailbox loop.
Backpressure is the absence of growth. When a system is healthy, mailboxes stay shallow. Growth is the symptom of demand exceeding capacity. The fix is to slow upstream, not to widen the queue.
Tracing is the only debugger. Stack traces tell you what one actor was doing at one moment. A distributed trace tells you what twenty actors did over a second. You will rely on the latter far more than the former.
Code Examples¶
The examples use Pekko (the Apache fork of Akka) with the typed API. They are written in Scala because Pekko's typed DSL is most expressive there; the same patterns work in Java with more verbosity.
Example 1: Sharded User-Session Service¶
A user-session service stores per-user state — current basket, last-seen page, A/B test bucket. It must scale to millions of concurrent users without pinning all of them to a single node.
import org.apache.pekko.actor.typed._
import org.apache.pekko.actor.typed.scaladsl._
import org.apache.pekko.cluster.sharding.typed.scaladsl._
object UserSession {
sealed trait Command
final case class AddToBasket(itemId: String, replyTo: ActorRef[Reply]) extends Command
final case class GetBasket(replyTo: ActorRef[Reply]) extends Command
private case object Passivate extends Command
sealed trait Reply
final case class Basket(items: Vector[String]) extends Reply
case object Done extends Reply
val TypeKey: EntityTypeKey[Command] = EntityTypeKey[Command]("UserSession")
def apply(entityId: String): Behavior[Command] =
Behaviors.setup { ctx =>
ctx.setReceiveTimeout(scala.concurrent.duration.DurationInt(5).minutes, Passivate)
session(entityId, Vector.empty)
}
private def session(id: String, items: Vector[String]): Behavior[Command] =
Behaviors.receiveMessage {
case AddToBasket(itemId, replyTo) =>
replyTo ! Done
session(id, items :+ itemId)
case GetBasket(replyTo) =>
replyTo ! Basket(items)
Behaviors.same
case Passivate =>
Behaviors.stopped
}
}
object UserSessionService {
def init(system: ActorSystem[_]): ActorRef[ShardingEnvelope[UserSession.Command]] = {
val sharding = ClusterSharding(system)
sharding.init(
Entity(UserSession.TypeKey)(ctx => UserSession(ctx.entityId))
.withStopMessage(UserSession.Passivate)
)
}
}
A few senior-level points to notice:
setReceiveTimeouttriggers passivation after five minutes of inactivity. This is how you keep memory bounded — idle sessions are stopped and their state is forgotten. (Or, if the actor were persistent, the next message would re-hydrate from journal.)withStopMessage(Passivate)integrates the passivation signal with the sharding hand-off. When the shard region wants to move the entity, it sendsPassivate, the actor stops cleanly, and the next message recreates it on the new node.- The entity id is passed as a constructor parameter. The framework computes
shard = hash(id) % shardCountand routes accordingly. - Sending a message looks like this:
val service = UserSessionService.init(system)
service ! ShardingEnvelope("user-42", UserSession.AddToBasket("sku-1", replyProbe.ref))
You do not specify a node. The cluster routes.
Example 2: Saga as a Persistent Actor¶
A saga coordinates a multi-step distributed transaction with compensations. We model an order placement saga: reserve inventory, charge payment, schedule shipping; compensate on failure.
import org.apache.pekko.actor.typed._
import org.apache.pekko.persistence.typed.PersistenceId
import org.apache.pekko.persistence.typed.scaladsl._
object OrderSaga {
sealed trait Command
final case class Start(orderId: String, items: Vector[String], replyTo: ActorRef[Result]) extends Command
private final case class InventoryReserved(token: String) extends Command
private final case class InventoryFailed(reason: String) extends Command
private final case class PaymentCharged(receipt: String) extends Command
private final case class PaymentFailed(reason: String) extends Command
private final case class ShippingScheduled(trackingId: String) extends Command
sealed trait Event
final case class Started(orderId: String, items: Vector[String]) extends Event
final case class InventoryStepDone(token: String) extends Event
final case class PaymentStepDone(receipt: String) extends Event
final case class ShippingStepDone(trackingId: String) extends Event
final case class Failed(reason: String) extends Event
final case class Compensated(stage: String) extends Event
sealed trait Result
final case class Succeeded(trackingId: String) extends Result
final case class Aborted(reason: String) extends Result
sealed trait State
case object Idle extends State
final case class ReservingInventory(orderId: String, items: Vector[String]) extends State
final case class ChargingPayment(orderId: String, token: String) extends State
final case class Scheduling(orderId: String, token: String, receipt: String) extends State
final case class Done(trackingId: String) extends State
final case class Aborting(reason: String, stagesToUndo: List[String]) extends State
def apply(persistenceId: PersistenceId, replyTo: ActorRef[Result]): Behavior[Command] =
EventSourcedBehavior[Command, Event, State](
persistenceId = persistenceId,
emptyState = Idle,
commandHandler = (state, cmd) => commandHandler(state, cmd, replyTo),
eventHandler = eventHandler
).snapshotWhen((_, ev, _) => ev.isInstanceOf[ShippingStepDone] || ev.isInstanceOf[Failed])
private def commandHandler(state: State, cmd: Command, replyTo: ActorRef[Result]): Effect[Event, State] = {
(state, cmd) match {
case (Idle, Start(orderId, items, _)) =>
Effect.persist(Started(orderId, items))
case (ReservingInventory(_, _), InventoryReserved(token)) =>
Effect.persist(InventoryStepDone(token))
case (ChargingPayment(_, _), PaymentCharged(receipt)) =>
Effect.persist(PaymentStepDone(receipt))
case (Scheduling(_, _, _), ShippingScheduled(tracking)) =>
Effect.persist(ShippingStepDone(tracking))
.thenReply(replyTo)(_ => Succeeded(tracking))
case (_, InventoryFailed(reason)) =>
Effect.persist(Failed(reason)).thenReply(replyTo)(_ => Aborted(reason))
case (_, PaymentFailed(reason)) =>
Effect.persist(Failed(reason), Compensated("inventory"))
.thenReply(replyTo)(_ => Aborted(reason))
case _ =>
Effect.unhandled
}
}
private def eventHandler(state: State, event: Event): State =
(state, event) match {
case (_, Started(id, items)) => ReservingInventory(id, items)
case (ReservingInventory(id, _), InventoryStepDone(token)) => ChargingPayment(id, token)
case (ChargingPayment(id, token), PaymentStepDone(receipt)) => Scheduling(id, token, receipt)
case (_, ShippingStepDone(tracking)) => Done(tracking)
case (_, Failed(reason)) => Aborting(reason, Nil)
case (s, Compensated(_)) => s
case (s, _) => s
}
}
Key senior observations:
- The saga state machine is encoded as states and transitions. On recovery, every prior event replays and the actor lands in the correct state without any external coordination.
snapshotWhentakes a predicate that picks meaningful events to snapshot at. Snapshotting after every event is wasteful; after terminal events is enough for fast recovery.- The
Compensatedevent records the compensation step. The actor is the single source of truth for what has and has not been undone, which removes the need for external coordination. - The saga should be hosted under cluster sharding so that the same saga id always routes to the same node and crashes recover from the journal.
Example 3: Backpressure-Aware Ingestion Pipeline¶
An ingestion pipeline reads from a fast source (Kafka), enriches each message by calling a slower service, and writes to a database. Without backpressure, a Kafka burst will OOM the JVM.
import org.apache.pekko.NotUsed
import org.apache.pekko.stream._
import org.apache.pekko.stream.scaladsl._
import org.apache.pekko.actor.typed.ActorSystem
import scala.concurrent.Future
import scala.concurrent.duration._
final case class RawEvent(id: String, payload: String)
final case class Enriched(id: String, payload: String, tags: List[String])
class EnrichmentService {
def enrich(e: RawEvent): Future[Enriched] = Future.successful(
Enriched(e.id, e.payload, List("auto"))
)
}
class EventWriter {
def write(e: Enriched): Future[Done.type] = Future.successful(Done)
}
object IngestionPipeline {
def run(system: ActorSystem[_], kafkaSource: Source[RawEvent, NotUsed]): Future[Done] = {
implicit val sys: ActorSystem[_] = system
val enrich = new EnrichmentService
val writer = new EventWriter
kafkaSource
.buffer(size = 256, OverflowStrategy.backpressure)
.mapAsync(parallelism = 16)(enrich.enrich)
.groupedWithin(100, 200.millis)
.mapAsync(parallelism = 4)(batch =>
Future.sequence(batch.map(writer.write)).map(_ => Done)
)
.toMat(Sink.ignore)(Keep.right)
.run()
}
}
Senior-level analysis:
buffer(256, OverflowStrategy.backpressure)makes the buffer bounded and propagates pressure upstream when full. The Kafka source then stops requesting messages, which slows the broker rather than crashing the JVM.mapAsync(parallelism = 16)runs sixteen enrichment calls in parallel but preserves order. If order does not matter, usemapAsyncUnorderedfor higher throughput.groupedWithin(100, 200.millis)accumulates up to one hundred enriched events or two hundred milliseconds, whichever comes first. The downstream writes batched, which usually outperforms one-at-a-time writes by ten times or more.- The pipeline gracefully degrades — if enrichment is slow, the buffer fills, Kafka backs off, the consumer lag grows, and your alerts fire. No memory blow-up, no message loss, no message duplication.
For pipelines where Kafka offsets must be committed only after the write succeeds, use the Alpakka Kafka connector's committableSource with explicit commit at the end of the flow. Otherwise you risk message loss on restart.
Pros & Cons¶
Pros
- Location transparency lets you scale a logical design horizontally without rewriting business logic.
- Cluster sharding turns "millions of concurrent stateful objects" into a tractable engineering problem.
- Event sourcing through persistent actors provides natural audit, replay, and time-travel debugging.
- Supervision hierarchies plus monitoring make failure a normal control flow signal.
- Reactive Streams composition gives backpressure-correct pipelines with declarative code.
- Mature toolchains exist for Erlang/OTP, Akka/Pekko, and Orleans — operational patterns are well-understood.
Cons
- Operational complexity is high. You need to understand serialization, gossip, failure detection, and downing strategies.
- Debugging distributed actor flows requires tracing infrastructure from day one.
- Performance tuning has many knobs (dispatcher, mailbox, throughput, serializer) and bad defaults bite.
- Versioning persistent events is non-trivial — once an event is in the journal, you cannot change it.
- Akka 2.7+ moved to a non-OSS license; Pekko, the Apache fork, is the open-source path forward but still maturing.
- The model amplifies subtle bugs — race conditions in untyped APIs, leaked actor refs, wrong dispatcher choices.
Use Cases¶
- IoT device shadows. Millions of devices, each modeled as a sharded entity holding latest telemetry and desired-vs-reported state. Orleans is the canonical platform here.
- Multiplayer game backends. Each match or room is a sharded actor; player connections route to the right entity.
- Real-time bidding. Each campaign or advertiser is an actor; sharding keeps state local and fast.
- Chat and messaging. Each conversation is a sharded persistent actor with event-sourced history.
- Workflow orchestration at scale. Sagas as persistent actors for long-running cross-service flows.
- Telecom switches and signaling. Erlang's original domain — soft real-time, hot upgrades, fault tolerance.
- Financial reconciliation. Each account or position is a persistent actor; events are the immutable ledger.
Coding Patterns¶
- Behavior factory. Wrap actor creation in a function that takes collaborators as parameters. Enables test injection.
- Stash-on-init. Persistent actors stash incoming commands during recovery and unstash when ready.
- Reply protocol with
replyTo: ActorRef[Reply]. Typed request/response without ad-hoc patterns. - Sealed trait protocol. Every command and event sealed under a trait — the compiler enforces exhaustive handling.
- Idempotent commands. Every external command carries an id so duplicate delivery is detectable and rejectable.
- Adapter actors. Convert protocol of an external system to typed commands for the actor that consumes them.
- Work-pulling. Workers signal readiness; coordinator dispatches one job at a time. Built-in backpressure.
Clean Code¶
Actor code accumulates noise quickly. A few rules keep it readable.
- Keep the protocol (sealed trait) at the top of the actor file. The protocol is the actor's API.
- Separate command-handling from state-transition logic. Persistent actors enforce this; pure actors should follow voluntarily.
- Name behaviors after the state, not the action:
idle,loading,ready,closing. Reading the message-handler match should feel like reading a state machine, not a switch. - One actor, one concern. If an actor handles connection management and business logic, split it.
- Use timeouts at the message-protocol level, not as ambient configuration.
askpatterns must always have explicit timeouts. - Log with structured fields, not formatted strings. Include actor path, message type, and correlation id on every log line.
Best Practices¶
- Pin shard counts on day one. Document the value, the rationale, and warn about resharding cost.
- Always passivate idle sharded entities. Unbounded entity counts cause silent memory growth.
- Snapshot persistent actors before journal sizes hit replay-time budgets. A snapshot every thousand events is a common starting point.
- Choose a binary serializer (Protobuf, Avro) for cluster traffic. JSON does not scale.
- Configure a split-brain resolver before going to production. Default auto-down is not safe.
- Treat the cluster singleton's buffer overflow as a hard alert. It means the system is making no progress.
- Use dedicated dispatchers for blocking work. The default fork-join pool starves under blocking I/O.
- Instrument with OpenTelemetry from the start. Adding tracing after a production incident is too late.
- Test recovery on every persistent actor change. A schema change can break replay silently.
- Run sharding-rebalance tests in staging. Real production traffic during a rebalance is not the place to discover bugs.
Edge Cases & Pitfalls¶
- Message loss during hand-off. Sharding hand-off is best-effort; a few in-flight messages can be lost if the source crashes mid-handoff. Treat sharded messaging as at-most-once unless paired with persistent retries.
- Snapshot serialization drift. A field added to the state class breaks deserialization of older snapshots. Version your state types and write upgraders.
- Journal corruption. A bad event handler that throws on certain events causes infinite recovery loops. Always validate events before persisting.
- Cluster singleton flapping. During leader changes, the singleton can flap between nodes. Use a debounce or a lease.
- Split-brain under partition. Without a resolver, both halves continue; with auto-down, both halves think the other died. Pick a resolver and test it with a chaos tool.
- Actor leaks via watch. Watching an actor that never dies and never gets unwatched keeps a reference alive. Always unwatch when no longer interested.
- Backpressure breakage from
tell. Sending withtelldoes not respect downstream demand. Mixing raw actors with stream stages bypasses backpressure and corrupts the model. - Unbounded ask buffers.
askallocates a temporary actor per call; if the target is overloaded, those build up. Use bounded ask viaAskTimeoutExceptionhandling and explicit retry caps.
Common Mistakes¶
- Treating the in-memory state of a persistent actor as durable. Only events on disk are durable.
- Using sharding for a single-instance global counter. That is a singleton, not a sharded entity.
- Forgetting to register custom serializers, then discovering production uses Java serialization for everything. Disable Java serialization explicitly.
- Pinning shard count to node count. The cluster will eventually scale; shard count should be a multiple, not equal.
- Building a saga with timers and external state instead of a persistent actor. The saga loses progress on every restart.
- Calling blocking I/O from an actor on the default dispatcher. One blocked thread is fine; a hundred blocked threads starves the entire system.
- Spawning a child per request. Spawning is cheap but not free; a request-per-second sustained spawn rate generates GC pressure.
- Catching exceptions inside the actor and continuing. Supervision exists for a reason. Let the actor crash and supervisor decide.
Tricky Points¶
askis sugar over a hidden actor. Everyaskcreates a temporary actor that waits for the reply. High-frequency asks are a performance trap.- Stash size is bounded. Default stash capacity is 100. A persistent actor with a long recovery and a flood of incoming commands will lose messages when stash overflows.
- Dispatcher starvation by long messages. A handler that takes a long time blocks the dispatcher's slot from other actors. The throughput setting controls fairness; long handlers should yield by sending a message to themselves.
- Recovery from a corrupted snapshot. If snapshot replay throws, the actor cannot start. The recovery path is to delete the snapshot and replay from journal. Have a runbook for this.
- CRDT convergence is not consistency. Two halves of a partition each updating a
GCounterproduce a sum, not a strict ordering. Do not use CRDTs where you need real consensus. - Cluster sharding and roles. Shards can be restricted to nodes with a given role. Forgetting this on a heterogeneous cluster causes hand-off to bounce.
- Pekko vs Akka API drift. Pekko froze on Akka 2.6.x's open-source surface. Newer Akka APIs do not exist in Pekko. Mind your imports.
Test Yourself¶
- Explain why a persistent actor's reply must happen after
Effect.persistsucceeds, not before. - Why is a shard count of 1000 typical even for a 10-node cluster?
- What happens to messages addressed to a sharded entity during a hand-off?
- Describe two split-brain resolution strategies and the trade-offs between them.
- What is the difference between
mapAsyncandmapAsyncUnorderedin a stream? - Why does the cluster singleton proxy buffer messages, and what does buffer overflow mean operationally?
- How does work-pulling provide backpressure without using Reactive Streams?
- What is the cost of snapshotting after every event? After never snapshotting?
- Why is the default Java serializer dangerous in production?
- Name three workloads where the actor model is the wrong choice and explain why.
Tricky Questions¶
- Your sharded entity has a five-minute receive timeout. A burst of one million unique entity ids arrives in ten seconds. What does your heap look like at minute four?
- A persistent actor's event class gains a new required field. You deploy. Recovery fails for old entities. What do you do?
- Two nodes lose the network link between them but stay connected to the rest of the cluster. With keep-majority downing, what happens to each?
- You see
asktimeouts spike from 1% to 30% under load. The target actor's CPU is at 40%. What is the most likely cause? - A cluster singleton is responsible for scheduling jobs. After a deploy, jobs are firing twice. What went wrong?
- Your enrichment service has a p99 of 2 seconds, p50 of 50 ms. Your
mapAsync(parallelism = 8)stream has lower throughput than expected. What changes do you try? - A persistent actor crashes during recovery because one event in its journal makes the event handler throw. The actor is in production. What is your recovery sequence?
- You add OpenTelemetry but spans for downstream actors do not appear in the trace. What did you forget to propagate?
- Your
GCounterfor active sessions reads 12,453 on node A and 12,447 on node B. After convergence it reads 12,460. Why? - You picked the actor model for a CPU-bound numerical service and throughput is 10x worse than the previous thread-pool design. How do you explain the result, and what is your migration plan?
Cheat Sheet¶
ADDRESSING
Local ref -> in-process pointer
Remote ref -> cluster-routed via gossip view
Sharded entity -> EntityRef(id) -> shard region -> node
SHARDING
shardId = hash(entityId) % shardCount
passivate idle entities to bound memory
shardCount >> nodeCount, fixed forever
PERSISTENCE
Command -> validate -> Effect.persist(Event) -> update state -> reply
Snapshot when (state, event, seqNr) predicate true
Recovery: load snapshot, replay events after snapNr
CLUSTER SINGLETON
Manager on every node, one actually hosts
Proxy buffers messages while no singleton exists
Use lease for safety across split-brain
SPLIT-BRAIN
Auto-down -> dangerous
Static quorum -> fixed size
Keep-majority -> adapts
Lease (k8s, consul) -> strongest
BACKPRESSURE
Reactive Streams: request(n) downstream -> upstream emits up to n
Work-pulling: worker -> Pull; master -> one work item per Pull
DISPATCHERS
Default fork-join -> mixed CPU
Dedicated TPE -> blocking I/O
Pinned dispatcher -> one thread per actor (low-latency)
TESTING
TestProbe.expectMessage / fishForMessage
ManualScheduler for time control
PersistenceTestKit for journal/snapshot
ShardingTestKit for routing tests
Summary¶
The senior view of the actor model is about distribution, durability, and operability. Local actors with mailboxes and supervision are the easy part. The hard part is making millions of stateful entities survive node failures, network partitions, deploys, and traffic spikes without losing data or producing duplicates.
Cluster sharding gives you horizontal placement. Persistent actors give you durability through event sourcing. Cluster singletons give you exactly-once role hosting. CRDTs give you available-during-partition state. Reactive Streams give you backpressure-correct pipelines. Tracing gives you observability across the whole system. Every one of these is a technique with trade-offs, and choosing well requires understanding both the mechanism and the failure modes.
The model is powerful but not universal. CPU-bound numerical work, simple stateless RPC, and hard real-time pipelines should look elsewhere. When the problem genuinely involves many independent stateful entities communicating asynchronously, the actor model gives you a framework that handles concerns most other models leave to the application — at the cost of operational complexity that you must take seriously from day one.
What You Can Build¶
- Sharded session service with passivation, persistence, and OpenTelemetry tracing across a five-node cluster.
- Saga orchestrator for cross-service order placement with persistent state, snapshots, and compensations.
- IoT device shadow service holding desired-vs-reported state for one million simulated devices with eventual consistency.
- Real-time chat backend with per-room sharded persistent actors and a CQRS read-side projection to a search index.
- Job scheduler built on a cluster singleton with a backup lease for split-brain safety.
- Backpressure-aware ingestion pipeline from Kafka through an enrichment microservice to PostgreSQL with end-to-end commit-on-success semantics.
- Multiplayer game session service with sharded match actors, presence tracking via CRDTs, and lobby allocation via leader election.
Further Reading¶
- Reactive Messaging Patterns with the Actor Model by Vaughn Vernon — production patterns with Akka.
- Designing Data-Intensive Applications by Martin Kleppmann — distributed systems fundamentals that underpin everything here.
- Akka and Pekko official documentation, especially the cluster, sharding, and persistence sections.
- Designing for Scalability with Erlang/OTP by Cesarini and Vinoski.
- Microsoft Orleans documentation — virtual actor model and grain placement strategies.
- Roland Kuhn's talks on Reactive Streams design.
- Lightbend's split-brain resolver guide.
- The CRDT papers by Marc Shapiro et al.
- OpenTelemetry documentation for asynchronous propagation.
- Pat Helland's Life Beyond Distributed Transactions — the conceptual foundation for sagas.
Related Topics¶
- Actor Model — Junior
- Actor Model — Middle
- Actor Model — Optimization
- Concurrency Models — Overview
- Event Sourcing and CQRS
- Saga Pattern
- Distributed Tracing
- Reactive Streams
- CRDTs
- Failure Detectors
Diagrams & Visual Aids¶
Cluster sharding routing
Client
|
| service ! ShardingEnvelope("user-42", msg)
v
+---------------------+ +---------------------+
| Node A | | Node B |
| ShardRegion --------|------>| ShardRegion |
| (no shard 7) | | Shard 7 |
| | | | |
| | | v |
| | | Entity user-42 |
+---------------------+ +---------------------+
Persistent actor recovery
start
|
v
load snapshot (seqNr = S)
|
v
replay events (S+1 ... latest)
|
v
state ready, unstash deferred commands
|
v
receive new commands
Cluster singleton over time
Node A (oldest) hosts singleton
Node A leaves cluster
|
v
Node B (now oldest) starts singleton
proxy on every node redirects future messages to Node B
Split-brain with keep-majority
Partition: {A, B, C} | {D, E}
Last-known cluster size: 5
Half {A,B,C} has 3 of 5 -> majority -> keeps running
Half {D,E} has 2 of 5 -> minority -> shuts itself down
Reactive Streams demand flow
Sink <- request(64) -- Flow <- request(64) -- Source
Sink -- onNext(x1) -> Flow -- onNext(x1) -> Source
...
buffer fills -> downstream stops requesting -> upstream stops emitting
Work-pulling pattern
Master ---- Job ----> Worker (only after worker sent Pull)
Worker -- Pull ----> Master
No queue grows; mailbox depth = number of pending pulls (bounded)
Saga state machine
Idle
| Start
v
ReservingInventory --(InventoryReserved)--> ChargingPayment
| InventoryFailed | PaymentFailed
v v
Aborting <-(Compensated)---- Aborting (undo inventory)
|
v
Scheduling --(ShippingScheduled)--> Done
Trace propagation through actors
HTTP request (trace id T, span S0)
|
v
Actor A: span S1 (parent S0) --send msg with T,S1-->
|
v
Actor B: span S2 (parent S1)
|
v
Actor C: span S3 (parent S2)
These diagrams together describe the routing, durability, leader, partition, backpressure, control, and observability stories that this document covers. Internalize them and you can read most production actor-system code and reason about it correctly.