Object Identity vs Equality — Specification Reading Guide

The identity-vs-equality distinction is spec-defined in three places: JLS §15.21 (equality operators), JLS §5.1.7 (boxing conversion and the integer cache), and java.lang.Object (equals / hashCode contracts plus System.identityHashCode). Add java.lang.String.intern, java.lang.Enum for the enum-singleton guarantee, and the bytecode-level if_acmpeq / if_acmpne instructions in JVMS §6.5, and you have the complete spec footprint.


1. Where to find the canonical text

Concept | Authoritative source
Equality operators == and != | JLS §15.21, Equality Operators
Numerical equality == on primitives | JLS §15.21.1
Boolean equality | JLS §15.21.2
Reference equality == on objects | JLS §15.21.3
Boxing conversion and the cached range | JLS §5.1.7, Boxing Conversion
Unboxing conversion | JLS §5.1.8
Object.equals(Object) contract | java.lang.Object Javadoc
Object.hashCode() contract | java.lang.Object Javadoc
System.identityHashCode(Object) semantics | java.lang.System Javadoc
String.intern() and the string pool | java.lang.String.intern Javadoc
String literal interning | JLS §3.10.5, String Literals
Enum singleton guarantee | JLS §8.9, Enum Classes; java.lang.Enum
if_acmpeq / if_acmpne bytecodes | JVMS §6.5.if_acmpeq
Records and structural equality | JLS §8.10

The JLS is the binding text for the source-level semantics; the JVMS gives the bytecode-level mechanics. The JDK class library Javadoc binds the platform-level contracts (e.g., String.intern).


2. JLS §15.21 — the equality operators

§15.21 splits the == / != operators into three cases based on operand types.

  • §15.21.1, Numerical Equality Operators: applies when both operands are of (primitive) numeric type, or one is of numeric type and the other is convertible to numeric type (for example int == Integer). Binary numeric promotion unboxes the wrapper, and the result follows IEEE 754 for floating point and ordinary integer comparison for integral types. Two wrapper operands (Integer == Integer) do not land here; they fall under §15.21.3.
  • §15.21.2, Boolean Equality Operators: applies when both operands are boolean, or one is boolean and the other is Boolean (which is then unboxed). The result is the boolean comparison.
  • §15.21.3, Reference Equality Operators: applies when both operands are of reference type or the null type. The result is true if both refer to the same object or both are null, false otherwise. No method call. No unboxing.

The third case is the one that bites everyone:

"At run time, the result of == is true if the operand values are both null or both refer to the same object or array; otherwise, the result is false."

That single sentence defines identity comparison. There is no fallback to equals, no notion of "logical equality" — == is purely same instance. The compiler will accept s1 == s2 for any two String references and emit if_acmpeq (JVMS §6.5), which performs a pointer comparison.

The boxing/unboxing twist:

Integer a = 200, b = 200;
boolean sameObject = (a == b);    // §15.21.3: both operands are references; identity comparison; false with HotSpot's default cache
boolean sameValue  = (a == 200);  // §15.21.1: RHS is primitive; a is unboxed; value comparison; true

The choice of which subsection applies is made by the compiler at compile time based on the operand types. If both sides are reference types, you get identity; if either side is primitive, the other is unboxed and you get value comparison. This is the rule that makes == so dangerous on wrappers — the answer depends on the static types of the operands, not their content.
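The compile-time choice can be made visible with a value for which identity and numeric comparison disagree on every JVM. The following is a minimal sketch, not taken from the spec: the class and method names are hypothetical, and Double.NaN is chosen only because its result does not depend on any boxing cache.

```java
// Hypothetical demo class; the names are illustrative only.
class OperandTypeDemo {

    // Both operands have reference type: JLS §15.21.3, identity comparison.
    static boolean referenceCompare(Double x, Double y) {
        return x == y;
    }

    // One operand is primitive: JLS §15.21.1, the wrapper is unboxed first.
    static boolean numericCompare(Double x, double y) {
        return x == y;
    }

    public static void main(String[] args) {
        Double nan = Double.NaN; // one boxed object, passed twice below

        System.out.println(referenceCompare(nan, nan)); // true: same object
        System.out.println(numericCompare(nan, nan));   // false: NaN != NaN under IEEE 754
        System.out.println(nan.equals(nan));            // true: Double.equals compares bit patterns
    }
}
```

The same object gives three different answers depending only on the static types at the comparison site, which is the point of the paragraph above.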


3. JLS §5.1.7 — Boxing Conversion and the Integer cache

Boxing conversion (§5.1.7) is what happens when a primitive int becomes an Integer. The spec mandates a minimum cache:

"If the value p being boxed is true, false, a byte, or a char in the range \u0000 to \u007f, or an int or short number between -128 and 127 (inclusive), then let r1 and r2 be the results of any two boxing conversions of p. It is always the case that r1 == r2."

So the spec guarantees, for these ranges:

  • Boolean.TRUE and Boolean.FALSE are the only two Boolean instances ever produced by autoboxing.
  • Byte, -128..127 (the whole range).
  • Character, 0..127.
  • Short, -128..127.
  • Integer, -128..127.
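  • Long, -128..127 (not named in the wording quoted above; guaranteed by the Long.valueOf Javadoc and listed in §5.1.7 in recent JLS editions).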

Outside these ranges the spec permits caching but does not require it. An implementation may cache more values; HotSpot's Integer cache upper bound can be raised with -XX:AutoBoxCacheMax=N. The lower bound -128 is fixed. Float and Double have no mandated cache, and the JDK's current Float.valueOf and Double.valueOf create a new object on every call, so boxing them allocates unless the JIT eliminates the allocation.

Why does the spec mandate this? Performance. Without a cache, every Integer i = 0; would allocate. The cache makes autoboxing affordable for the common case of small integer values (loop counters, status codes). The cost: the == boundary at 127/128 becomes a behavioural cliff the spec officially blesses.

Practical implication: code that depends on == between boxed values is non-portable. The JVM may extend the cache, so you cannot rely on Integer.valueOf(1000) != Integer.valueOf(1000). Relying on the reverse (== always succeeding inside -128..127) is not sensible either: the guarantee covers boxing conversion and valueOf, not every Integer reference your code may receive, and the first value that drifts outside the range silently changes the result.
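A short sketch of the cache-independent alternatives. The class and method names are hypothetical; Objects.equals, Integer.intValue and boxing via Integer.valueOf are standard JDK behaviour. No expected output is given for the out-of-range identity comparison because the spec leaves it open.

```java
import java.util.Objects;

// Hypothetical demo class; the names are illustrative only.
class BoxedCompareDemo {

    // Value comparison that does not depend on any cache; also null-safe.
    static boolean sameValue(Integer x, Integer y) {
        return Objects.equals(x, y);
    }

    // Identity comparison: only meaningful if "same object" is really the question.
    static boolean sameObject(Integer x, Integer y) {
        return x == y;
    }

    public static void main(String[] args) {
        Integer small1 = 127, small2 = 127;   // inside the mandated range
        Integer large1 = 1000, large2 = 1000; // outside it

        System.out.println(sameObject(small1, small2)); // true, guaranteed by JLS §5.1.7
        System.out.println(sameObject(large1, large2)); // unspecified: depends on the JVM's cache size
        System.out.println(sameValue(large1, large2));  // true on every JVM
        System.out.println(large1.intValue() == large2.intValue()); // true: primitive comparison
    }
}
```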


4. JLS §3.10.5 — String literals interned

§3.10.5 defines string literals and binds them to the pool:

"A string literal is a reference to an instance of class String. ... A string literal always refers to the same instance of class String. This is because string literals — or, more generally, strings that are the values of constant expressions — are interned so as to share unique instances, using the method String.intern."

The mechanism:

  1. The compiler emits ldc instructions referencing string literals stored in the class file's constant pool (a JVM-internal table per class, JVMS §4.4).
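  2. When the constant is resolved (JVMS §5.1; HotSpot does this lazily, the first time the ldc executes, rather than eagerly at class load), the JVM derives the String as if by calling String.intern() on the constant-pool entry, yielding the canonical shared instance.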
  3. Subsequent loads of the same literal return the same String reference.

So "abc" == "abc" is reliably true because both literals share the canonical pooled instance — even across class files, the pool is JVM-wide.

The string pool is not the same thing as the constant pool of a class file. The string pool is a runtime table shared by all classes loaded into the JVM (the Javadoc describes it as maintained privately by class String; HotSpot implements it as a native table). Two .class files with "abc" literals both resolve to the same runtime pool entry.

What does not go into the pool:

  • new String("abc") — explicitly creates a fresh, unpooled String. The single use case for new String(...) is when you need a different object identity (rare).
  • String concatenation results computed at run time, such as prefix + "b" where prefix is an ordinary variable. If both operands are compile-time constants, as in "a" + "b", the compiler folds the concatenation and the result is a constant that does go in the pool (Constant Expressions: JLS §15.28 in Java SE 8, §15.29 in recent editions).
  • Strings built via StringBuilder, String.format, etc.
  • Strings parsed from input.

Use s.intern() to push a runtime string into the pool. The method returns the canonical pooled instance, adding this string to the pool if no equal string is already there:

intern() — "When intern is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned."
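The pool-membership rules above can be checked with a few reference comparisons. This is a minimal sketch with hypothetical class and method names; the helper takes a parameter so that the concatenation is not a constant expression and therefore happens at run time.

```java
// Hypothetical demo class; the names are illustrative only.
class StringPoolDemo {

    // The parameter is not a compile-time constant, so this concatenation
    // runs at run time and yields a newly created String (JLS §15.18.1).
    static String concatAtRuntime(String prefix) {
        return prefix + "c";
    }

    public static void main(String[] args) {
        String literal = "abc";
        String folded  = "ab" + "c";            // constant expression, folded by the compiler
        String copy    = new String("abc");     // fresh object, deliberately not the pooled one
        String runtime = concatAtRuntime("ab"); // built at run time

        System.out.println(literal == folded);           // true: same pooled instance
        System.out.println(literal == copy);             // false
        System.out.println(literal == runtime);          // false
        System.out.println(literal == runtime.intern()); // true: intern() returns the pooled instance
        System.out.println(literal.equals(runtime));     // true: content comparison
    }
}
```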

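The pool was historically stored in the JVM's PermGen (pre-Java 7), which had a hard size limit and turned intern() into a known memory hazard. From Java 7 onward the pool lives on the regular heap, but interning unbounded input still wastes memory and slows lookups. In HotSpot, -XX:StringTableSize=N sets the number of hash buckets in the table (it is not a cap on the number of entries); the default is 60013 in Java 8 and 65536 in recent releases. -XX:+PrintStringTableStatistics reports the table's occupancy at exit.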


5. Object.equals and hashCode — the contract pair

The contract is in java.lang.Object's Javadoc. The key clauses:

For equals(Object obj):

  • Reflexive: x.equals(x) is true for any non-null x.
  • Symmetric: x.equals(y) implies y.equals(x).
  • Transitive: x.equals(y) and y.equals(z) imply x.equals(z).
  • Consistent: repeated calls return the same result if the objects haven't changed.
  • Null-handling: x.equals(null) is always false.

The default Object.equals(Object obj) is identity:

"The equals method for class Object implements the most discriminating possible equivalence relation on objects; that is, for any non-null reference values x and y, this method returns true if and only if x and y refer to the same object (x == y has the value true)."

So a class that doesn't override equals uses identity: someInstance.equals(other) is equivalent to someInstance == other. Two separately allocated Person objects with no overridden equals are never equal to each other. This is the default that records, value-like JDK classes (String, the wrappers, the java.time types), and most JDK collections override.
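A minimal sketch of the inherited default. Person here is a hypothetical class with no equals override, used only to show that Object.equals behaves like ==.

```java
// Hypothetical demo class; the names are illustrative only.
class DefaultEqualsDemo {

    // No equals override: inherits Object.equals, which is identity.
    static final class Person {
        final String name;

        Person(String name) {
            this.name = name;
        }
    }

    public static void main(String[] args) {
        Person p1 = new Person("Ada");
        Person p2 = new Person("Ada"); // same field values, different object

        System.out.println(p1.equals(p2));   // false: identity, not field comparison
        System.out.println(p1.equals(p1));   // true: reflexive
        System.out.println(p1.equals(null)); // false
    }
}
```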

For hashCode():

  • Consistent: the value must not change for an object as long as equals would return the same results.
  • Equal-implies-equal-hash: if x.equals(y) then x.hashCode() == y.hashCode().
  • Unequal-objects-not-required-to-differ: equal hash codes for unequal objects are allowed (and inevitable, since int has only 2³² values).

The default Object.hashCode() is identity-based — see §6.

The contracts are paired: a class that overrides equals must also override hashCode. Failure to do so breaks every hash-based collection (HashMap, HashSet, LinkedHashMap, ConcurrentHashMap). See ../01-equals-hashcode-tostring-contracts/ for full treatment.
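One common way to keep the pair consistent is to derive both methods from the same fields. The sketch below uses a hypothetical Point class; Objects.hash is one convenient choice among several, not something the contract requires.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical demo class; the names are illustrative only.
class EqualsHashCodeDemo {

    static final class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;              // identity fast path
            if (!(o instanceof Point)) return false; // also rejects null
            Point other = (Point) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y); // derived from the same fields as equals
        }
    }

    public static void main(String[] args) {
        Set<Point> seen = new HashSet<>();
        seen.add(new Point(1, 2));

        System.out.println(seen.contains(new Point(1, 2))); // true: equal and same hash bucket
        System.out.println(new Point(1, 2).hashCode() == new Point(1, 2).hashCode()); // true
    }
}
```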


6. Object.hashCode and System.identityHashCode

Object.hashCode()'s default implementation is the identity hash. From the Javadoc:

"As much as is reasonably practical, the hashCode method defined by class Object returns distinct integers for distinct objects."

The spec is deliberately weak — it doesn't say "always distinct", because collisions are unavoidable in a 32-bit hash. HotSpot's identity hash is computed lazily on first call and stored in the object's mark word (see senior.md §5 for header internals).

System.identityHashCode(Object x):

"Returns the same hash code for the given object as would be returned by the default method hashCode(), whether or not the given object's class overrides hashCode(). The hash code for the null reference is zero."

So System.identityHashCode(x) is "what x.hashCode() would have returned if x's class hadn't overridden it". For a class that doesn't override, the two return the same value; for String or a record, x.hashCode() returns the content hash and System.identityHashCode(x) returns the identity hash. The identity hash is stable for the object's lifetime and does not depend on whether or how the class overrides hashCode; a content hash is stable only as long as the content is.

identityHashCode is what IdentityHashMap uses for hashing. IdentityHashMap also uses == (not .equals) for key comparison.
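The difference shows up as soon as two equal but distinct keys go into each kind of map. A minimal sketch with hypothetical class and method names:

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical demo class; the names are illustrative only.
class IdentityHashDemo {

    // Inserts both keys and reports how many distinct keys the map ended up with.
    static int distinctKeys(Map<String, Integer> map, String k1, String k2) {
        map.put(k1, 1);
        map.put(k2, 2);
        return map.size();
    }

    public static void main(String[] args) {
        String k1 = new String("key");
        String k2 = new String("key"); // equal content, different object

        System.out.println(k1.hashCode() == k2.hashCode());                // true: content hash
        System.out.println(distinctKeys(new HashMap<>(), k1, k2));         // 1: keys compared with equals
        System.out.println(distinctKeys(new IdentityHashMap<>(), k1, k2)); // 2: keys compared with ==
        System.out.println(System.identityHashCode(null));                 // 0

        Object plain = new Object();
        System.out.println(plain.hashCode() == System.identityHashCode(plain)); // true: no override
    }
}
```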


7. JLS §8.9 — Enum constants are singletons

§8.9 (and java.lang.Enum's Javadoc) make the singleton guarantee explicit:

"An enum class has no instances other than those defined by its enum constants."

The platform enforces this in three ways:

  1. Explicit instantiation is rejected. new EnumType(...) is a compile-time error everywhere, and reflective construction is blocked as well: Constructor.newInstance throws IllegalArgumentException for enum classes.
  2. Deserialisation is special. ObjectInputStream resolves enum constants by their name via Enum.valueOf(Class, String), returning the canonical instance. There's no allocation. (This is why enum singletons are safer than hand-rolled readResolve-based singletons; see Item 3 of Effective Java.)
  3. Cloning is blocked. Enum declares clone() as final and it always throws CloneNotSupportedException; Enum does not implement Cloneable.

So for any enum:

public enum OrderStatus { OPEN, CLOSED, CANCELLED }

OrderStatus a = OrderStatus.OPEN;
OrderStatus b = OrderStatus.OPEN;
System.out.println(a == b);          // true, by spec

This guarantee survives serialisation (assuming the same classloader on both sides). It does not survive across classloaders — OrderStatus.OPEN loaded by l1 is a different object from OrderStatus.OPEN loaded by l2, because they are different OrderStatus classes.

Comparisons of enum constants with == are the one kind of object-to-object reference comparison that reviewers should reflexively not flag.
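The serialisation guarantee can be exercised with an in-memory round trip. This is a sketch under the same-classloader assumption stated above; the class name and the roundTrip helper are hypothetical, and checked exceptions are wrapped only to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical demo class; the names are illustrative only.
class EnumIdentityDemo {

    enum OrderStatus { OPEN, CLOSED, CANCELLED }

    // Serialises the value to a byte array and reads it back in the same JVM.
    static Object roundTrip(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialisation round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(OrderStatus.valueOf("OPEN") == OrderStatus.OPEN); // true: lookup by name
        System.out.println(roundTrip(OrderStatus.OPEN) == OrderStatus.OPEN); // true: canonical instance
    }
}
```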


8. JVMS §6.5 — if_acmpeq and if_acmpne

At the bytecode level, == between reference types compiles to if_acmpeq (or its inverse if_acmpne). JVMS §6.5 (if_acmp<cond>) specifies the behaviour; in paraphrase:

Both value1 and value2 must be of type reference. They are popped from the operand stack and compared. if_acmpeq succeeds if and only if value1 and value2 are the same reference, in which case execution continues at the branch offset encoded in the instruction. Otherwise execution proceeds at the instruction following the if_acmpeq.

It's a single bytecode that performs a pointer comparison on the JVM's operand stack. There is no virtual dispatch, no method invocation, no allocation. On a modern x86, this becomes one cmp instruction followed by a conditional jump — a handful of CPU cycles, branch-predictable.

By contrast, .equals(...) compiles to invokevirtual java/lang/Object.equals (or invokeinterface if dispatched through an interface), which performs a vtable/itable lookup, a method call, and the body's instructions. The JIT can inline it at monomorphic call sites, but in general it is still more work than if_acmpeq.

This is the bytecode-level reason == is faster than .equals for the cases where both produce the right answer (enum constants, identity comparisons). It's also the bytecode-level reason == is wrong for value equality — it doesn't run the type's equality logic at all.
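To see the two shapes side by side, compile a pair of one-line methods and disassemble them with javap -c. The class below is a hypothetical example; the instruction names in the comments describe what javac typically emits, and the exact sequence can vary between compiler versions.

```java
// Hypothetical demo class; inspect the result with: javap -c AcmpDemo
class AcmpDemo {

    // Typically compiled to aload_0, aload_1 and an if_acmpne branch that
    // selects between iconst_1 and iconst_0: no method call is involved.
    static boolean sameReference(Object x, Object y) {
        return x == y;
    }

    // Typically compiled to aload_0, aload_1 and
    // invokevirtual java/lang/Object.equals:(Ljava/lang/Object;)Z,
    // dispatched on the runtime class of x.
    static boolean sameValue(Object x, Object y) {
        return x.equals(y);
    }

    public static void main(String[] args) {
        String s1 = new String("x");
        String s2 = new String("x"); // equal content, different object

        System.out.println(sameReference(s1, s2)); // false: different objects
        System.out.println(sameValue(s1, s2));     // true: String.equals compares content
    }
}
```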


9. Records (JLS §8.10) — structural equality by construction

Records (final in Java 16, JEP 395) auto-generate equals, hashCode, and toString based on the component list. JLS §8.10.3 and the java.lang.Record Javadoc define the implicit equals; in paraphrase:

The implicitly declared equals(Object) returns true if and only if the argument is an instance of the same record class and each component of the argument is equal to the corresponding component of this record. Reference-typed components are compared as if by Objects.equals. Primitive components are compared as if by the compare method of the corresponding wrapper class (for example Double.compare), which differs from == for NaN and for 0.0 versus -0.0.

So the generated equals cannot miss a field or be asymmetric or intransitive on its own account: the compiler derives it from the component list. The ways to get surprising record equality are to declare an explicit equals that replaces the generated one (legal, and worth a static-analysis warning), or to use components whose own equals is identity-based, such as arrays.

Records are also not meant to be identity-based: the Record contract expects equals to reflect the component values. If you want an identity-based "record-like" data carrier, don't use a record. Write a final class with no equals override (so equality falls back to Object.equals, which is identity).

The records spec puts SRP-style "value carriers" on the right side of the identity-vs-equality line by default. This is one of the design wins of records over Java-7-era POJOs.
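A small sketch of record equality next to an identity-based carrier. It assumes Java 16 or later; all type names are hypothetical. The Reading and Samples records are there to show the two edge cases mentioned in the Record Javadoc: primitive components are compared with the wrapper's compare method, and array components fall back to the array's identity-based equals.

```java
// Hypothetical demo class (requires Java 16+); the names are illustrative only.
class RecordEqualityDemo {

    record Point(int x, int y) {}
    record Reading(double value) {}
    record Samples(int[] values) {}

    // Identity-based carrier: a final class that inherits Object.equals.
    static final class Handle {
        final String label;

        Handle(String label) {
            this.label = label;
        }
    }

    public static void main(String[] args) {
        System.out.println(new Point(1, 2).equals(new Point(1, 2))); // true: component-wise
        System.out.println(new Point(1, 2) == new Point(1, 2));      // false: two objects

        // Compared as if by Double.compare, so NaN equals NaN here.
        System.out.println(new Reading(Double.NaN).equals(new Reading(Double.NaN))); // true

        // int[] does not override equals, so the component comparison is identity.
        System.out.println(new Samples(new int[] {1}).equals(new Samples(new int[] {1}))); // false

        System.out.println(new Handle("a").equals(new Handle("a"))); // false: identity by default
    }
}
```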


10. JEP references and identity-vs-equality

JEP | Feature | Effect on identity-vs-equality
JEP 395 | Records | Records auto-generate value equality. Record instances still have identity, but equals ignores it.
JEP 401 (preview) | Value classes (Project Valhalla) | Value objects have no identity. == on them is defined to compare field values, and identity-dependent operations such as synchronization fail.
JEP 390 | Warnings for value-based classes (Optional, etc.) | "Do not use ==" is a contract documented on these types; future JVMs may merge instances. JEP 390 adds warnings for misuse such as synchronizing on them.
JEP 280 | invokedynamic for String concat | Changes how run-time concatenation is compiled; results still don't go in the string pool.
JEP 305 / 394 | Pattern matching for instanceof | No change to == or equals semantics; it shortens the instanceof-and-cast step in hand-written equals methods.

The future direction is clear: identity-bearing objects will be the exception, value-equal objects the rule. Project Valhalla collapses the identity question for value-class types into "always equality, never identity, and the JVM is free to optimise by merging".
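For today's value-based classes the practical rule is to compare with equals and never to rely on ==. A minimal sketch with hypothetical class and method names; LocalDate and Optional are both documented as value-based in the JDK.

```java
import java.time.LocalDate;
import java.util.Optional;

// Hypothetical demo class; the names are illustrative only.
class ValueBasedDemo {

    // equals is the only comparison whose result is specified for value-based classes.
    static boolean sameDate(LocalDate a, LocalDate b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        LocalDate d1 = LocalDate.of(2024, 1, 15);
        LocalDate d2 = LocalDate.parse("2024-01-15");

        System.out.println(sameDate(d1, d2));                           // true
        System.out.println(Optional.of("x").equals(Optional.of("x")));  // true
        // d1 == d2 is deliberately not printed: its result is unspecified
        // for value-based classes and may change between JVM versions.
    }
}
```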


11. The String.intern Javadoc — verbatim

The spec text for String.intern():

"Returns a canonical representation for the string object.

A pool of strings, initially empty, is maintained privately by the class String.

When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.

It follows that for any two strings s and t, s.intern() == t.intern() is true if and only if s.equals(t) is true.

All literal strings and string-valued constant expressions are interned. String literals are defined in section 3.10.5 of The Java Language Specification."

Two consequences worth knowing:

  1. The pool uses .equals to deduplicate — so s.intern() == t.intern() requires s.equals(t). The pool can't dedupe by identity (identity is what it's trying to create).
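  2. The spec defines no eviction and no size limit. HotSpot holds interned strings weakly, so one that is no longer referenced anywhere can be collected; strings referenced from loaded classes (literals) stay until the class is unloaded, which for most applications means "forever". Interning unbounded input still bloats the table for as long as those strings are reachable. Don't do it.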
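When deduplication is wanted for input of unknown size, one option is an application-level pool with an explicit bound instead of String.intern(). The sketch below is a hypothetical helper, not a JDK class; the bound is approximate under concurrent use, and once the pool is full it simply stops deduplicating. The main method also checks the "if and only if" property quoted above.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper, not part of the JDK: a bounded, droppable alternative to intern().
final class ScopedInterner {

    private final ConcurrentMap<String, String> pool = new ConcurrentHashMap<>();
    private final int maxEntries;

    ScopedInterner(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    // Returns the canonical instance for s, or s itself once the pool is full.
    String canonical(String s) {
        String existing = pool.get(s);
        if (existing != null) {
            return existing;
        }
        if (pool.size() >= maxEntries) {
            return s; // bound reached (approximately, under concurrency): no deduplication
        }
        String raced = pool.putIfAbsent(s, s);
        return raced != null ? raced : s;
    }

    int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        String a = new String("alpha");
        String b = new String("alpha"); // equal content, different object

        // The JVM-wide pool: s.intern() == t.intern() exactly when s.equals(t).
        System.out.println(a.intern() == b.intern());                   // true
        System.out.println(a.intern() == new String("beta").intern()); // false

        // The scoped pool gives the same canonicalisation, but with a cap.
        ScopedInterner interner = new ScopedInterner(1_000);
        System.out.println(interner.canonical(a) == interner.canonical(b)); // true
    }
}
```

Dropping the ScopedInterner instance releases every string it holds, which is the main reason to prefer it for request-scoped or batch-scoped data.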

12. Reading list

  1. JLS §15.21 — Equality operators. The defining spec for == and !=. Read all three subsections.
  2. JLS §5.1.7 — Boxing Conversion. The integer cache range is mandated here.
  3. JLS §3.10.5 — String literals and the pool.
  4. JLS §8.9 — Enum classes and the singleton guarantee.
  5. JLS §8.10 — Records and their auto-generated equals.
  6. java.lang.Object Javadoc: equals and hashCode contracts.
  7. java.lang.System.identityHashCode — the identity-hash semantics.
  8. java.lang.String.intern — the pool's API contract.
  9. JVMS §6.5.if_acmpeq — the bytecode behind ==.
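  10. JEP 390, "Warnings for Value-Based Classes", together with the "Value-based Classes" page of the Java SE API documentation. The latter defines the warning: do not use == on Optional, LocalDate, etc.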
  11. JEP 401 — preview, value classes. Future direction.
  12. Joshua Bloch, Effective Java, 3rd edition. Item 3 ("Enforce the singleton property with a private constructor or an enum type"), Item 10 ("Obey the general contract when overriding equals"), Item 11 (hashCode), Item 12 (toString). The canonical guidance.
  13. Barbara Liskov and Jeannette Wing, A Behavioral Notion of Subtyping: gives the formal "behavioural equality" framing that underlies how .equals ought to work.

Memorize this: == is spec-defined as identity for reference types (JLS §15.21.3); .equals is spec-defined by the Object.equals contract: identity by default, overridden by all sensible value-bearing types. The Integer cache (JLS §5.1.7) is a minimum spec guarantee for -128..127 only and must not be programmed against. The string pool (JLS §3.10.5) holds every literal and string-valued constant expression; String.intern() extends it explicitly. Enum constants (JLS §8.9) are singletons within their class loader, with == as the idiomatic comparison. if_acmpeq (JVMS §6.5) is the single-instruction bytecode behind ==; .equals is a virtual call that costs more, but only .equals runs the type's equality logic. The spec gives you the tools; the judgement of when to use which is still yours.