Repository Concept — Specification

What? This file states the formal contract of a Repository as Eric Evans laid it out in Domain-Driven Design (2003) and Vaughn Vernon refined it in Implementing Domain-Driven Design (2013): the behaviours every implementation must honour, regardless of whether the backing store is JPA, jOOQ, MongoDB, or a HashMap. It also formalises the Specification pattern, the standard extension point when callers need to fetch aggregates by criteria more complex than identity.

How? Treat the four properties (collection semantics, aggregate-scoped fetching, identity-based access, and persistence transparency) as the acceptance criteria for any new repository implementation. Treat the Specification pattern as the canonical alternative to a forest of findByXAndYAndZ methods.


1. The four properties of a Repository

A class satisfies the Repository contract if and only if it exhibits the following four properties. Together they distinguish a repository from a DAO, a query service, or a generic CRUD wrapper.

1.1 Collection semantics

The repository presents itself to callers as if it were an in-memory collection of aggregates that happens to persist between application runs.

  • Membership — an aggregate is either in the repository or not; there is no half-state.
  • Identity uniqueness — at most one aggregate per identity is in the collection.
  • Add/remove vocabulary — the public methods read like operations on a Set: add/save to include, remove/delete to exclude, findById to look up.
  • Iteration is optional — unlike a Collection<T>, a repository may legitimately refuse to enumerate all members (a billion orders shouldn't fit in memory).

public interface OrderRepository {
    Optional<Order> findById(OrderId id);   // look up
    void save(Order order);                  // include / update
    void delete(Order order);                // exclude
    OrderId nextIdentity();                  // mint a new identity
}
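
The collection semantics above can be sketched with a plain HashMap behind the interface. This is only an illustration under stated assumptions: InMemoryOrderRepository, its size() helper, and the stripped-down Order and OrderId types are hypothetical and not part of the contract.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

// Hypothetical, stripped-down domain types used only for this sketch.
record OrderId(UUID value) {}

final class Order {
    private final OrderId id;
    Order(OrderId id) { this.id = id; }
    OrderId id() { return id; }
}

interface OrderRepository {
    Optional<Order> findById(OrderId id);
    void save(Order order);
    void delete(Order order);
    OrderId nextIdentity();
}

// A map keyed by identity gives membership and identity uniqueness for free.
final class InMemoryOrderRepository implements OrderRepository {
    private final Map<OrderId, Order> members = new HashMap<>();

    @Override public Optional<Order> findById(OrderId id) {
        return Optional.ofNullable(members.get(id));
    }
    @Override public void save(Order order) { members.put(order.id(), order); }
    @Override public void delete(Order order) { members.remove(order.id()); }
    @Override public OrderId nextIdentity() { return new OrderId(UUID.randomUUID()); }

    // Exposed for the demo only; enumeration and counting are not part of the contract.
    int size() { return members.size(); }
}

final class CollectionSemanticsDemo {
    public static void main(String[] args) {
        InMemoryOrderRepository orders = new InMemoryOrderRepository();
        Order order = new Order(orders.nextIdentity());

        orders.save(order);
        orders.save(order); // saving the same aggregate twice still leaves one member
        System.out.println(orders.size());                            // 1
        System.out.println(orders.findById(order.id()).isPresent());  // true

        orders.delete(order);
        System.out.println(orders.findById(order.id()).isPresent());  // false
    }
}
```

The sketch is single-threaded; section 5 covers what changes when several threads share the repository.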

1.2 Aggregate-scoped fetching

A repository returns or stores whole aggregates, never internal entities or partial slices. The unit of fetch is the aggregate root and everything inside its consistency boundary.

  • OrderRepository.findById(id) returns an Order with its OrderLines and ShippingAddress already inside.
  • It does not return an OrderLine directly — that would let the caller mutate a part of the aggregate without going through the root, breaking invariants.
  • It does not return a flat DTO for screen display — that's a query service's job (see senior.md).

The corollary: one repository per aggregate root, not per entity, not per table.
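
To see why handing out an OrderLine directly would be harmful, consider a sketch in which the root guards an invariant. The 1000.00 limit, the addLine method, and the shape of OrderLine are assumptions chosen for illustration.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Hypothetical aggregate: Order is the root, OrderLine lives inside its boundary.
record OrderLine(String sku, int quantity, BigDecimal unitPrice) {
    BigDecimal subtotal() { return unitPrice.multiply(BigDecimal.valueOf(quantity)); }
}

final class Order {
    private static final BigDecimal LIMIT = new BigDecimal("1000.00"); // assumed invariant
    private final List<OrderLine> lines = new ArrayList<>();

    // Every change goes through the root, so the invariant is checked in one place.
    void addLine(OrderLine line) {
        if (total().add(line.subtotal()).compareTo(LIMIT) > 0) {
            throw new IllegalStateException("order total would exceed " + LIMIT);
        }
        lines.add(line);
    }

    BigDecimal total() {
        return lines.stream().map(OrderLine::subtotal).reduce(BigDecimal.ZERO, BigDecimal::add);
    }

    // Callers receive an unmodifiable copy; they cannot add lines behind the root's back.
    List<OrderLine> lines() { return List.copyOf(lines); }
}

final class AggregateBoundaryDemo {
    public static void main(String[] args) {
        Order order = new Order();
        order.addLine(new OrderLine("SKU-1", 2, new BigDecimal("400.00")));
        System.out.println(order.total()); // 800.00
        try {
            order.addLine(new OrderLine("SKU-2", 1, new BigDecimal("300.00")));
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A repository that returned OrderLine instances on their own would let callers bypass addLine, and the limit check with it.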

1.3 Identity-based access

The repository's primary lookup is by the aggregate's identity. Every aggregate root has a stable identity value object (OrderId, CustomerId, …), and findById is the canonical fetch.

OrderId id = new OrderId(UUID.fromString("..."));
Order o = orders.findById(id).orElseThrow();

Other lookups exist (by domain-meaningful keys, by Specification, by query service), but identity is the privileged path — it's what makes the repository a collection rather than a query engine.

1.4 Persistence transparency

The repository's public interface must not expose persistence-mechanism details. Callers must not learn whether the backing store is SQL, NoSQL, in-memory, or remote.

  • No Connection, EntityManager, Session, MongoCollection on method signatures or return types.
  • No SQL strings, JPQL strings, or Mongo query documents in the contract.
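  • No persistence-mapped classes in return types (for example Optional<EntityClass>, where EntityClass is a JPA @Entity rather than the aggregate); return types name domain types only.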
  • Exceptions thrown from the contract are domain exceptions (OrderNotFoundException) or generic infrastructure exceptions wrapped to be substitutable across implementations.

This is what makes substitution possible: swapping JpaOrderRepository for InMemoryOrderRepository doesn't ripple through callers.
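
One way to keep exceptions substitutable is to translate mechanism-specific failures at the implementation boundary. The sketch below assumes a hypothetical RemoteStore client and a hypothetical RepositoryUnavailableException; neither name comes from a real library.

```java
import java.io.IOException;
import java.util.Optional;

record OrderId(String value) {}
record Order(OrderId id, String status) {} // record for brevity; not a full aggregate

// Hypothetical mechanism-neutral exception: callers never see IOException or SQLException.
final class RepositoryUnavailableException extends RuntimeException {
    RepositoryUnavailableException(String message, Throwable cause) { super(message, cause); }
}

interface OrderRepository {
    Optional<Order> findById(OrderId id);
}

// Stand-in for a store-specific client; returns the stored status or null when absent.
interface RemoteStore {
    String fetch(String key) throws IOException;
}

final class RemoteOrderRepository implements OrderRepository {
    private final RemoteStore store;
    RemoteOrderRepository(RemoteStore store) { this.store = store; }

    @Override public Optional<Order> findById(OrderId id) {
        try {
            String status = store.fetch(id.value());
            return status == null ? Optional.empty() : Optional.of(new Order(id, status));
        } catch (IOException e) {
            // Translate at the boundary so the contract stays the same for every implementation.
            throw new RepositoryUnavailableException("could not load order " + id.value(), e);
        }
    }
}

final class TransparencyDemo {
    public static void main(String[] args) {
        OrderRepository orders = new RemoteOrderRepository(key -> {
            throw new IOException("connection reset");
        });
        try {
            orders.findById(new OrderId("o-1"));
        } catch (RepositoryUnavailableException e) {
            System.out.println(e.getMessage()); // could not load order o-1
        }
    }
}
```

The caller depends only on OrderRepository and the neutral exception, so replacing the remote implementation with an in-memory one requires no change on the calling side.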


2. Required method semantics

A conforming implementation must honour these behavioural contracts. They form the LSP test (see ../../03-design-principles/01-solid-principles/): any implementation passing the same test suite as the reference can substitute for it.

2.1 findById(Id) : Optional<T>

  • Returns the unique aggregate with the given identity, fully hydrated, ready to enforce its invariants.
  • Returns Optional.empty() if no such aggregate exists.
  • Never returns a partial aggregate.
  • Read-your-writes: changes the caller made earlier in the same transaction are visible to this lookup.

2.2 save(T) / add(T)

  • Includes the aggregate in the collection (insert if absent, update if present, under persistence-oriented semantics).
  • Under collection-oriented semantics, add is required only the first time; subsequent mutations are tracked.
  • After a successful return, findById(aggregate.id()) returns an equivalent aggregate.
  • Idempotent under identical state: calling save(o) twice with the same o is observationally equivalent to calling it once.

2.3 delete(T) / remove(T)

  • Excludes the aggregate from the collection.
  • After a successful return, findById(aggregate.id()) returns Optional.empty().
  • Deleting a non-member is implementation-defined: either a silent no-op or a domain exception, but never silent corruption.

2.4 nextIdentity() : Id

  • Returns an identity that is not currently in use.
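  • Does not store anything: calling it adds no aggregate to the collection, although a sequence-backed implementation may consume a sequence value.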
  • Independent of the database: the implementation may use UUIDs, a pre-allocated sequence, or a snowflake-style generator, but the contract returns a usable identity before any aggregate has been stored under it.
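
The method semantics in this section can be turned into a reusable conformance check that every implementation runs. The following is a minimal sketch, assuming a hypothetical RepositoryContract helper and a map-backed reference implementation; a real suite would more likely be an abstract test class in the project's test framework.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

record OrderId(UUID value) {}

final class Order {
    private final OrderId id;
    private final String status;
    Order(OrderId id, String status) { this.id = id; this.status = status; }
    OrderId id() { return id; }
    String status() { return status; }
}

interface OrderRepository {
    Optional<Order> findById(OrderId id);
    void save(Order order);
    void delete(Order order);
    OrderId nextIdentity();
}

// Hypothetical conformance check: any implementation passed in must survive verify().
final class RepositoryContract {
    static void verify(OrderRepository repo) {
        OrderId id = repo.nextIdentity();
        check(repo.findById(id).isEmpty(), "nextIdentity must not store anything");

        Order order = new Order(id, "OPEN");
        repo.save(order);
        repo.save(order); // a second save with identical state must change nothing observable
        Order found = repo.findById(id)
                .orElseThrow(() -> new AssertionError("saved aggregate not found"));
        check(found.id().equals(id) && found.status().equals("OPEN"),
                "findById must return an equivalent aggregate");

        repo.delete(order);
        check(repo.findById(id).isEmpty(), "deleted aggregate must no longer be found");
    }

    private static void check(boolean ok, String message) {
        if (!ok) throw new AssertionError(message);
    }
}

// Reference implementation used to run the check in this sketch.
final class MapOrderRepository implements OrderRepository {
    private final Map<OrderId, Order> members = new HashMap<>();
    @Override public Optional<Order> findById(OrderId id) { return Optional.ofNullable(members.get(id)); }
    @Override public void save(Order order) { members.put(order.id(), order); }
    @Override public void delete(Order order) { members.remove(order.id()); }
    @Override public OrderId nextIdentity() { return new OrderId(UUID.randomUUID()); }
}

final class ContractDemo {
    public static void main(String[] args) {
        RepositoryContract.verify(new MapOrderRepository());
        System.out.println("contract holds");
    }
}
```

Running the same verify method against a JPA-backed and an in-memory implementation is what makes the substitution claim testable rather than aspirational.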

3. The Specification pattern — the standard extension

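Eric Evans presents the Specification pattern in Domain-Driven Design (Chapter 9, "Making Implicit Concepts Explicit") and mentions specification-based queries in the Repositories section of Chapter 6. For repositories it is the formal answer to the question "how do we let callers describe a subset of aggregates without growing the repository interface forever?"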

A Specification is a value object that encapsulates a single yes/no test on a candidate aggregate.

public interface Specification<T> {
    boolean isSatisfiedBy(T candidate);
}

It composes via Boolean operations:

public interface Specification<T> {
    boolean isSatisfiedBy(T candidate);
    default Specification<T> and(Specification<T> other) {
        return c -> isSatisfiedBy(c) && other.isSatisfiedBy(c);
    }
    default Specification<T> or(Specification<T> other) {
        return c -> isSatisfiedBy(c) || other.isSatisfiedBy(c);
    }
    default Specification<T> negate() {
        return c -> !isSatisfiedBy(c);
    }
}

Concrete specifications live in the domain and use domain vocabulary:

public final class OpenOrders implements Specification<Order> {
    @Override public boolean isSatisfiedBy(Order o) {
        return o.status() == OrderStatus.OPEN;
    }
}

public final class ForCustomer implements Specification<Order> {
    private final CustomerId customer;
    public ForCustomer(CustomerId c) { this.customer = c; }
    @Override public boolean isSatisfiedBy(Order o) { return o.customer().equals(customer); }
}

public final class PlacedAfter implements Specification<Order> {
    private final Instant cutoff;
    public PlacedAfter(Instant cutoff) { this.cutoff = cutoff; }
    @Override public boolean isSatisfiedBy(Order o) { return o.placedAt().isAfter(cutoff); }
}

Callers compose them in the application service:

Specification<Order> spec = new OpenOrders()
    .and(new ForCustomer(customerId))
    .and(new PlacedAfter(yesterday));
List<Order> result = orders.findSatisfying(spec);
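
For an in-memory implementation, findSatisfying can be a plain filter over the members. The sketch below is runnable on its own; the OrderSpecs factory methods, the record-shaped Order, and InMemoryOrders are assumptions made to keep it short, and the lambdas stand in for the named specification classes shown above.

```java
import java.time.Instant;
import java.util.List;

interface Specification<T> {
    boolean isSatisfiedBy(T candidate);
    default Specification<T> and(Specification<T> other) {
        return c -> isSatisfiedBy(c) && other.isSatisfiedBy(c);
    }
}

enum OrderStatus { OPEN, SHIPPED }

// Record for brevity; a real aggregate would be an entity with id-based equality.
record Order(String id, String customerId, OrderStatus status, Instant placedAt) {}

// Hypothetical factories; lambdas lack the value equality a named class would have.
final class OrderSpecs {
    static Specification<Order> open() { return o -> o.status() == OrderStatus.OPEN; }
    static Specification<Order> forCustomer(String customerId) {
        return o -> o.customerId().equals(customerId);
    }
    static Specification<Order> placedAfter(Instant cutoff) {
        return o -> o.placedAt().isAfter(cutoff);
    }
}

final class InMemoryOrders {
    private final List<Order> members;
    InMemoryOrders(List<Order> members) { this.members = List.copyOf(members); }

    // In-memory fallback: test every member against the specification.
    List<Order> findSatisfying(Specification<Order> spec) {
        return members.stream().filter(spec::isSatisfiedBy).toList();
    }
}

final class SpecificationDemo {
    public static void main(String[] args) {
        Instant cutoff = Instant.parse("2024-01-01T00:00:00Z");
        InMemoryOrders orders = new InMemoryOrders(List.of(
                new Order("o-1", "c-1", OrderStatus.OPEN, cutoff.plusSeconds(60)),
                new Order("o-2", "c-1", OrderStatus.SHIPPED, cutoff.plusSeconds(60)),
                new Order("o-3", "c-2", OrderStatus.OPEN, cutoff.plusSeconds(60)),
                new Order("o-4", "c-1", OrderStatus.OPEN, cutoff.minusSeconds(60))));

        Specification<Order> spec = OrderSpecs.open()
                .and(OrderSpecs.forCustomer("c-1"))
                .and(OrderSpecs.placedAfter(cutoff));

        orders.findSatisfying(spec).forEach(o -> System.out.println(o.id())); // o-1
    }
}
```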

The repository's surface stays small:

public interface OrderRepository {
    Optional<Order> findById(OrderId id);
    List<Order> findSatisfying(Specification<Order> spec);
    void save(Order order);
    void delete(Order order);
    OrderId nextIdentity();
}

3.1 Persistence translation

A Specification that only runs in-memory (isSatisfiedBy) forces the repository to load every aggregate before filtering — fatal at scale. The pragmatic version exposes a translation hook:

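// Predicate, Root and CriteriaBuilder come from jakarta.persistence.criteria (javax.persistence.criteria before JPA 3.0).
public interface JpaSpecification<T> extends Specification<T> {
    Predicate toPredicate(Root<T> root, CriteriaBuilder cb);
}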

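The JPA repository implementation calls toPredicate and builds a where clause; the in-memory implementation falls back to isSatisfiedBy. The contract from the domain's point of view is unchanged. Note that the default and/or/negate combinators shown earlier return plain lambdas, which carry no toPredicate; a translatable specification needs combinators that combine the predicates as well.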

Spring Data offers an idiomatic version with org.springframework.data.jpa.domain.Specification<T> and JpaSpecificationExecutor<T> — covered in professional.md.
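
The translation idea can be sketched without a JPA provider by rendering to a parameterised SQL fragment instead of a Criteria Predicate. This swaps the JPA Criteria API for plain strings so the example runs on the standard library alone; SqlSpecification, SqlFragment, and the column names status and customer_id are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

interface Specification<T> {
    boolean isSatisfiedBy(T candidate);
}

// A WHERE fragment plus its bind parameters, so values are never concatenated into SQL.
record SqlFragment(String clause, List<Object> params) {}

// Hypothetical translation hook, playing the role toPredicate plays for JPA.
interface SqlSpecification<T> extends Specification<T> {
    SqlFragment toSql();

    // Combinator that preserves both the in-memory test and the translation.
    default SqlSpecification<T> and(SqlSpecification<T> other) {
        SqlSpecification<T> self = this;
        return new SqlSpecification<T>() {
            @Override public boolean isSatisfiedBy(T c) {
                return self.isSatisfiedBy(c) && other.isSatisfiedBy(c);
            }
            @Override public SqlFragment toSql() {
                SqlFragment left = self.toSql();
                SqlFragment right = other.toSql();
                List<Object> params = new ArrayList<>(left.params());
                params.addAll(right.params());
                return new SqlFragment("(" + left.clause() + " AND " + right.clause() + ")", params);
            }
        };
    }
}

record Order(String customerId, String status) {} // record for brevity

record OpenOrders() implements SqlSpecification<Order> {
    @Override public boolean isSatisfiedBy(Order o) { return o.status().equals("OPEN"); }
    @Override public SqlFragment toSql() { return new SqlFragment("status = ?", List.of("OPEN")); }
}

record ForCustomer(String customerId) implements SqlSpecification<Order> {
    @Override public boolean isSatisfiedBy(Order o) { return o.customerId().equals(customerId); }
    @Override public SqlFragment toSql() {
        return new SqlFragment("customer_id = ?", List.of(customerId));
    }
}

final class TranslationDemo {
    public static void main(String[] args) {
        SqlFragment sql = new OpenOrders().and(new ForCustomer("c-1")).toSql();
        System.out.println(sql.clause()); // (status = ? AND customer_id = ?)
        System.out.println(sql.params()); // [OPEN, c-1]
    }
}
```

Each specification keeps one definition of the rule in two forms, which is the price of pushing the filter into the store; a contract test that runs both forms against the same data helps keep them in agreement.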

3.2 When Specification is not the right tool

  • Sorting and pagination — Specifications are predicate-shaped, not ordering-shaped. Sort + page belong on separate parameters.
  • Aggregations — sums, averages, counts are projections, not selections. Put them on a query service.
  • Cross-aggregate joins — if you need Order joined with Customer.name, that's a read model. Don't twist a Specification into producing it.
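
For the sorting and pagination point, one possible shape is to pass ordering and paging alongside the specification rather than inside it. PageRequest and the three-argument findSatisfying below are hypothetical names for this sketch, not the Spring Data types of similar name.

```java
import java.util.Comparator;
import java.util.List;

interface Specification<T> {
    boolean isSatisfiedBy(T candidate);
}

// Hypothetical paging parameter: zero-based page index and page size.
record PageRequest(int page, int size) {
    PageRequest {
        if (page < 0 || size < 1) {
            throw new IllegalArgumentException("page must be >= 0 and size >= 1");
        }
    }
}

record Order(String id, int totalCents) {} // record for brevity

final class InMemoryOrders {
    private final List<Order> members;
    InMemoryOrders(List<Order> members) { this.members = List.copyOf(members); }

    // Selection (spec), ordering (sort) and slicing (page) stay three separate concerns.
    List<Order> findSatisfying(Specification<Order> spec, Comparator<Order> sort, PageRequest page) {
        return members.stream()
                .filter(spec::isSatisfiedBy)
                .sorted(sort)
                .skip((long) page.page() * page.size())
                .limit(page.size())
                .toList();
    }
}

final class PagingDemo {
    public static void main(String[] args) {
        InMemoryOrders orders = new InMemoryOrders(List.of(
                new Order("a", 1000), new Order("b", 5000),
                new Order("c", 3000), new Order("d", 4000)));

        Specification<Order> over20 = o -> o.totalCents() > 2000;
        Comparator<Order> largestFirst = Comparator.comparingInt(Order::totalCents).reversed();

        System.out.println(orders.findSatisfying(over20, largestFirst, new PageRequest(0, 2)));
        System.out.println(orders.findSatisfying(over20, largestFirst, new PageRequest(1, 2)));
    }
}
```

With the data above, the first page holds orders b and d and the second page holds order c.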

4. Equality and identity for repositories

A repository behaves like a stateless service rather than an entity (Evans treats Repositories as a building block in their own right, alongside Services), so it has no identity of its own and need not implement equals/hashCode. The aggregates it holds, however, are entities: they have identity, and equality is by id, never by field values.

public class Order {
    private final OrderId id;
    // ...
    @Override public boolean equals(Object o) {
        return o instanceof Order other && other.id.equals(this.id);
    }
    @Override public int hashCode() { return id.hashCode(); }
}

This matters for repositories because two Order references with the same id must be treated as the same aggregate even if a field differs in memory (one is stale). Implementations that compare by content silently break this.
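
A short demonstration of the stale-copy case, assuming a simplified Order with a String status field: two instances with the same id but different state compare equal and collapse to one element in a HashSet.

```java
import java.util.HashSet;
import java.util.Set;

record OrderId(String value) {}

final class Order {
    private final OrderId id;
    private final String status; // hypothetical field that may differ between copies
    Order(OrderId id, String status) { this.id = id; this.status = status; }
    String status() { return status; }

    @Override public boolean equals(Object o) {
        return o instanceof Order other && other.id.equals(this.id);
    }
    @Override public int hashCode() { return id.hashCode(); }
}

final class IdentityEqualityDemo {
    public static void main(String[] args) {
        OrderId id = new OrderId("o-1");
        Order stale = new Order(id, "OPEN");
        Order fresh = new Order(id, "SHIPPED");

        System.out.println(stale.equals(fresh)); // true

        Set<Order> seen = new HashSet<>();
        seen.add(stale);
        seen.add(fresh); // same identity, so the set keeps the element it already has
        System.out.println(seen.size()); // 1
    }
}
```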


5. Concurrency contract

A repository may be called from multiple threads (Spring services are singletons by default). The contract:

  • Implementations must be thread-safe for read methods.
  • Write methods are serialised by the surrounding transaction, not by the repository.
  • Optimistic locking is the aggregate's concern, not the repository's — the aggregate carries the version field; the repository merely participates.

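A ConcurrentHashMap-based in-memory repository should implement save with the map's atomic operations (put for a plain upsert, compute when the write depends on the stored value, putIfAbsent for an insert-only add) to honour this without explicit locks. computeIfAbsent alone is not enough for save, because it never replaces an existing entry.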
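
A minimal sketch of such a save, assuming the aggregate snapshot carries a version field and that a hypothetical StaleOrderException signals a lost update. The version check and the write happen inside one compute call, which ConcurrentHashMap performs atomically for the key.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

record OrderId(String value) {}

// Immutable snapshot carrying its version; a record is used for brevity only.
record Order(OrderId id, String status, long version) {}

// Hypothetical exception for a write based on an outdated version.
final class StaleOrderException extends RuntimeException {
    StaleOrderException(String message) { super(message); }
}

final class InMemoryOrderRepository {
    private final ConcurrentHashMap<OrderId, Order> members = new ConcurrentHashMap<>();

    Optional<Order> findById(OrderId id) {
        return Optional.ofNullable(members.get(id));
    }

    void save(Order incoming) {
        // compute runs atomically per key, so check-and-write cannot interleave with another save.
        members.compute(incoming.id(), (key, stored) -> {
            long storedVersion = stored == null ? 0L : stored.version();
            if (incoming.version() != storedVersion) {
                // Throwing here leaves the current mapping unchanged.
                throw new StaleOrderException("order " + key.value() + " was changed by another writer");
            }
            return new Order(key, incoming.status(), storedVersion + 1);
        });
    }
}

final class OptimisticSaveDemo {
    public static void main(String[] args) {
        InMemoryOrderRepository orders = new InMemoryOrderRepository();
        OrderId id = new OrderId("o-1");
        orders.save(new Order(id, "OPEN", 0)); // new aggregates start at version 0

        Order mine = orders.findById(id).orElseThrow();
        Order theirs = orders.findById(id).orElseThrow();

        orders.save(new Order(id, "PAID", mine.version()));
        try {
            orders.save(new Order(id, "CANCELLED", theirs.version()));
        } catch (StaleOrderException e) {
            System.out.println("second writer rejected");
        }
        System.out.println(orders.findById(id).orElseThrow().status()); // PAID
    }
}
```

The repository only compares and advances the version; deciding what a conflict means for the use case (retry, merge, report) stays outside it.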


6. Lifecycle contract

A repository holds no per-caller or per-request state and lives for the lifetime of the application. Specifically:

  • Constructed once (typically via Spring DI), shared everywhere.
  • No init / close semantics on the contract — the implementation's resources (pool, session factory) are managed by the framework.
  • The contract does not pin the caller to any particular transactional or session lifecycle. Use cases own transactions.

This is what keeps the repository a collection abstraction rather than a connection abstraction.
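
"Use cases own transactions" could look like the sketch below, where the application service opens the boundary and the repository never does. UnitOfWork is a hypothetical abstraction invented for this example; in a Spring application, @Transactional on the use case typically plays that role.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

record OrderId(String value) {}

final class Order {
    private final OrderId id;
    private String status = "OPEN";
    Order(OrderId id) { this.id = id; }
    OrderId id() { return id; }
    String status() { return status; }
    void close() { status = "CLOSED"; }
}

interface OrderRepository {
    Optional<Order> findById(OrderId id);
    void save(Order order);
}

// Hypothetical transaction boundary supplied by the infrastructure layer.
interface UnitOfWork {
    void inTransaction(Runnable work);
}

// The use case decides where the transaction starts and ends; the repository just takes part.
final class CloseOrder {
    private final OrderRepository orders;
    private final UnitOfWork unitOfWork;

    CloseOrder(OrderRepository orders, UnitOfWork unitOfWork) {
        this.orders = orders;
        this.unitOfWork = unitOfWork;
    }

    void handle(OrderId id) {
        unitOfWork.inTransaction(() -> {
            Order order = orders.findById(id)
                    .orElseThrow(() -> new IllegalArgumentException("unknown order " + id.value()));
            order.close();
            orders.save(order);
        });
    }
}

final class CloseOrderDemo {
    public static void main(String[] args) {
        Map<OrderId, Order> store = new HashMap<>();
        OrderId id = new OrderId("o-1");
        store.put(id, new Order(id));

        OrderRepository orders = new OrderRepository() {
            @Override public Optional<Order> findById(OrderId key) { return Optional.ofNullable(store.get(key)); }
            @Override public void save(Order order) { store.put(order.id(), order); }
        };
        UnitOfWork logging = work -> {
            System.out.println("begin");
            work.run();
            System.out.println("commit");
        };

        new CloseOrder(orders, logging).handle(id);
        System.out.println(store.get(id).status()); // CLOSED
    }
}
```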


7. Conformance checklist

Use this as a yes/no review on any new or refactored repository:

  • Interface lives in the domain layer, not infrastructure.
  • Method signatures mention only domain types (or Optional/List of them).
  • No EntityManager, Connection, Session, Page<EntityClass> on the contract.
  • One repository per aggregate root (count roots, count repositories).
  • findById returns a complete aggregate when present.
  • nextIdentity is independent of the database.
  • Complex queries arrive via Specification or a query service, not as ten findBy… methods.
  • Implementation is thread-safe for reads; writes rely on the surrounding transaction.

8. What's next

Topic | File
10 bug scenarios with diagnosis and fix | find-bug.md
Fetch joins, projections, second-level cache | optimize.md
8 hands-on exercises with worked solutions | tasks.md
20 numbered interview Q&A | interview.md
Aggregates this contract is built around | ../03-aggregates/
Entities that live inside aggregates | ../02-entities/
Domain services for cross-aggregate logic | ../05-domain-services/

Memorize this: Four properties define a Repository — collection semantics, aggregate-scoped fetching, identity-based access, persistence transparency. The Specification pattern is the canonical extension when callers need criteria beyond identity. Anything that fails the four-property test is either a DAO, a query service, or a misnamed CRUD bag.