Constant Time O(1) -- Professional Level¶
Table of Contents¶
- Formal Definition of O(1)
- Proving Operations Are Constant Time
- Array Access Proof
- Hash Table Expected O(1) Proof
- Amortized O(1) Proof via Potential Method
- The Word-RAM Model
- Perfect Hashing -- O(1) Worst Case
- Static Perfect Hashing
- FKS Hashing Scheme
- Cuckoo Hashing
- Theoretical Limits
- Key Takeaways
Formal Definition of O(1)¶
Definition (Big-O notation for constant functions):
A function T(n) is O(1) if and only if there exist positive constants c and n0 such that for all n >= n0:
Equivalently: T(n) = O(1) if lim sup_{n -> infinity} T(n) < infinity.
Important distinctions:
O(1)describes an upper bound. IfT(n) = O(1), the function is bounded above by a constant.Theta(1)means the function is bounded both above AND below by constants. For most practical algorithms, when we say O(1) we implicitly mean Theta(1).Omega(1)means the function is at least constant -- trivially true for any algorithm that does at least one operation.
What O(1) is NOT:
- It does not mean the function equals 1.
- It does not mean the function is "fast" in absolute terms.
- It does not describe cache behavior, branch prediction, or hardware effects.
T(n) = 10^100is technically O(1) -- the constant may be enormous.
Proving Operations Are Constant Time¶
Array Access Proof¶
Claim: Accessing element A[i] in an array of size n is O(1) in the word-RAM model.
Proof:
In the word-RAM model, we assume: 1. Memory is a flat array of words. 2. Each word has w >= log n bits. 3. Arithmetic on word-sized operands takes O(1) time. 4. Memory access by address takes O(1) time.
Given array A with base address b and element size s:
This requires: - One multiplication: i * s -- O(1) in word-RAM. - One addition: b + (i * s) -- O(1) in word-RAM. - One memory read at the computed address -- O(1) in word-RAM.
Total: 3 operations, all O(1). Therefore T(n) <= 3 for all n. QED.
Hash Table Expected O(1) Proof¶
Claim: Under the Simple Uniform Hashing Assumption (SUHA), lookup in a hash table with chaining is O(1) expected time when alpha = n/m = O(1).
Proof sketch:
Under SUHA, each key is equally likely to hash to any of the m slots, independently of other keys.
- Expected chain length at any slot =
n/m = alpha. - Expected time for unsuccessful search =
Theta(1 + alpha). - Expected time for successful search =
Theta(1 + alpha/2).
If we maintain alpha = O(1) (e.g., alpha <= 0.75 by rehashing), then both are Theta(1 + O(1)) = O(1). QED.
Caveat: SUHA is an idealization. Real hash functions may not satisfy it. Universal hashing provides a rigorous alternative.
Amortized O(1) Proof via Potential Method¶
Claim: Appending to a dynamic array with doubling strategy has amortized O(1) cost.
Proof using the potential method:
Define the potential function:
where D_i is the state of the data structure after the i-th operation.
Case 1: No resize. size < capacity. - Actual cost: c_i = 1 (place element). - Phi(D_i) - Phi(D_{i-1}) = 2(s+1) - cap - (2s - cap) = 2. - Amortized cost: c_hat_i = c_i + Phi(D_i) - Phi(D_{i-1}) = 1 + 2 = 3.
Case 2: Resize. size = capacity = k. New capacity = 2k. - Actual cost: c_i = k + 1 (copy k elements + place new element). - Before: Phi = 2k - k = k. - After: Phi = 2(k+1) - 2k = 2. - Phi(D_i) - Phi(D_{i-1}) = 2 - k. - Amortized cost: c_hat_i = (k + 1) + (2 - k) = 3.
In both cases, the amortized cost is exactly 3 = O(1). QED.
The Word-RAM Model¶
The word-RAM model is the standard computational model for analyzing algorithms in practice. Key assumptions:
-
Word size
w: The machine operates on words ofwbits, wherew >= log n(enough bits to address the input). -
O(1) operations on words:
- Arithmetic: addition, subtraction, multiplication, division.
- Bitwise: AND, OR, XOR, NOT, shifts.
- Comparison: equality, less-than.
-
Memory: read/write a word at any address.
-
Memory: Unlimited flat address space with O(1) random access.
Why this matters for O(1):
Many "O(1)" claims depend on the word-RAM model. For example: - Array access is O(1) because address arithmetic is O(1) on word-sized operands. - Hash computation is O(1) when keys fit in O(1) words. - Integer comparison is O(1) for word-sized integers.
When the model breaks: - Arbitrary-precision integers: Addition of k-bit numbers is O(k/w), not O(1). - Python's integers are arbitrary precision, so a + b on very large integers is NOT O(1). - Strings of length k: Hashing is O(k), not O(1), unless the string fits in O(1) words.
Perfect Hashing -- O(1) Worst Case¶
Standard hash tables are O(1) expected (average case) but O(n) worst case. Perfect hashing achieves O(1) worst case for a static set of keys.
Static Perfect Hashing¶
A perfect hash function for a set S of n keys is a hash function h: S -> {0, ..., m-1} with no collisions: h(x) != h(y) for all x != y in S.
A minimal perfect hash function achieves m = n (table size equals number of keys).
FKS Hashing Scheme¶
The Fredman-Komloss-Szemeredi (FKS) scheme (1984) achieves: - O(1) worst-case lookup time. - O(n) space. - O(n) expected construction time.
Two-level structure:
Level 1: Hash n keys into m = n buckets using a universal hash function h1. Let b_j = number of keys hashing to bucket j.
Level 2: For each bucket j, create a secondary hash table of size b_j^2 with its own universal hash function h_j. By the birthday paradox, a random function into a table of size k^2 is collision-free with probability >= 1/2 for k keys.
Space analysis:
By universality of h1, E[sum(b_j^2)] = O(n). If sum(b_j^2) > cn for some constant c, re-choose h1. Expected number of trials is O(1).
Total space: O(n) for the primary table + O(sum(b_j^2)) = O(n) for secondary tables.
Lookup: 1. Compute j = h1(key) -- O(1). 2. Compute k = h_j(key) -- O(1). 3. Check if table[j][k] == key -- O(1).
Total: O(1) worst case.
Go¶
package main
import (
"fmt"
"math/rand"
)
// Simplified FKS-style perfect hash for demonstration.
// In production, use a proper universal hash family.
type FKSHashTable struct {
primary [][]entry
hashA int
hashB int
tableSize int
secondaryA []int
secondaryB []int
secondaryS []int
}
type entry struct {
key int
value string
used bool
}
func universalHash(key, a, b, m int) int {
// h(k) = ((a*k + b) mod p) mod m where p is a large prime
p := 1000000007
return ((a*key+b)%p + p) % p % m
}
func BuildFKS(keys []int, values []string) *FKSHashTable {
n := len(keys)
m := n
ht := &FKSHashTable{tableSize: m}
// Level 1: find a hash function that gives sum(b_j^2) <= 4n
for {
ht.hashA = rand.Intn(999999) + 1
ht.hashB = rand.Intn(999999)
buckets := make([][]int, m)
bucketVals := make([][]string, m)
for i, k := range keys {
j := universalHash(k, ht.hashA, ht.hashB, m)
buckets[j] = append(buckets[j], k)
bucketVals[j] = append(bucketVals[j], values[i])
}
sumSquares := 0
for _, b := range buckets {
sumSquares += len(b) * len(b)
}
if sumSquares <= 4*n {
// Build level 2
ht.primary = make([][]entry, m)
ht.secondaryA = make([]int, m)
ht.secondaryB = make([]int, m)
ht.secondaryS = make([]int, m)
for j := 0; j < m; j++ {
bj := len(buckets[j])
if bj == 0 {
continue
}
s := bj * bj
ht.secondaryS[j] = s
// Find collision-free secondary hash
for {
a2 := rand.Intn(999999) + 1
b2 := rand.Intn(999999)
table := make([]entry, s)
collision := false
for idx, k := range buckets[j] {
pos := universalHash(k, a2, b2, s)
if table[pos].used {
collision = true
break
}
table[pos] = entry{key: k, value: bucketVals[j][idx], used: true}
}
if !collision {
ht.primary[j] = table
ht.secondaryA[j] = a2
ht.secondaryB[j] = b2
break
}
}
}
return ht
}
}
}
// Lookup is O(1) WORST CASE -- two hash computations and one comparison
func (ht *FKSHashTable) Lookup(key int) (string, bool) {
j := universalHash(key, ht.hashA, ht.hashB, ht.tableSize)
if ht.primary[j] == nil {
return "", false
}
s := ht.secondaryS[j]
pos := universalHash(key, ht.secondaryA[j], ht.secondaryB[j], s)
e := ht.primary[j][pos]
if e.used && e.key == key {
return e.value, true
}
return "", false
}
func main() {
keys := []int{10, 22, 37, 40, 52, 60, 70, 82}
vals := []string{"ten", "twenty-two", "thirty-seven", "forty",
"fifty-two", "sixty", "seventy", "eighty-two"}
ht := BuildFKS(keys, vals)
for _, k := range keys {
v, ok := ht.Lookup(k)
fmt.Printf("Lookup(%d) = %q, found=%v\n", k, v, ok)
}
// Lookup for a key not in the set
v, ok := ht.Lookup(99)
fmt.Printf("Lookup(99) = %q, found=%v\n", v, ok)
}
Java¶
import java.util.*;
public class FKSHashTable {
private static final int PRIME = 1000000007;
private int hashA, hashB, tableSize;
private int[] secondaryA, secondaryB, secondaryS;
private int[][] keyTable;
private String[][] valTable;
private boolean[][] usedTable;
static int universalHash(int key, int a, int b, int m) {
long h = ((long) a * key + b) % PRIME;
if (h < 0) h += PRIME;
return (int) (h % m);
}
public FKSHashTable(int[] keys, String[] values) {
Random rng = new Random();
int n = keys.length;
tableSize = Math.max(n, 1);
while (true) {
hashA = rng.nextInt(999999) + 1;
hashB = rng.nextInt(999999);
List<List<Integer>> buckets = new ArrayList<>();
List<List<String>> bucketVals = new ArrayList<>();
for (int i = 0; i < tableSize; i++) {
buckets.add(new ArrayList<>());
bucketVals.add(new ArrayList<>());
}
for (int i = 0; i < n; i++) {
int j = universalHash(keys[i], hashA, hashB, tableSize);
buckets.get(j).add(keys[i]);
bucketVals.get(j).add(values[i]);
}
int sumSq = 0;
for (var b : buckets) sumSq += b.size() * b.size();
if (sumSq > 4 * n) continue;
secondaryA = new int[tableSize];
secondaryB = new int[tableSize];
secondaryS = new int[tableSize];
keyTable = new int[tableSize][];
valTable = new String[tableSize][];
usedTable = new boolean[tableSize][];
for (int j = 0; j < tableSize; j++) {
int bj = buckets.get(j).size();
if (bj == 0) continue;
int s = bj * bj;
secondaryS[j] = s;
while (true) {
int a2 = rng.nextInt(999999) + 1;
int b2 = rng.nextInt(999999);
int[] kt = new int[s];
String[] vt = new String[s];
boolean[] ut = new boolean[s];
boolean collision = false;
for (int idx = 0; idx < bj; idx++) {
int pos = universalHash(buckets.get(j).get(idx), a2, b2, s);
if (ut[pos]) { collision = true; break; }
kt[pos] = buckets.get(j).get(idx);
vt[pos] = bucketVals.get(j).get(idx);
ut[pos] = true;
}
if (!collision) {
secondaryA[j] = a2;
secondaryB[j] = b2;
keyTable[j] = kt;
valTable[j] = vt;
usedTable[j] = ut;
break;
}
}
}
return;
}
}
// O(1) worst-case lookup
public String lookup(int key) {
int j = universalHash(key, hashA, hashB, tableSize);
if (keyTable[j] == null) return null;
int pos = universalHash(key, secondaryA[j], secondaryB[j], secondaryS[j]);
if (usedTable[j][pos] && keyTable[j][pos] == key) return valTable[j][pos];
return null;
}
public static void main(String[] args) {
int[] keys = {10, 22, 37, 40, 52, 60, 70, 82};
String[] vals = {"ten", "twenty-two", "thirty-seven", "forty",
"fifty-two", "sixty", "seventy", "eighty-two"};
FKSHashTable ht = new FKSHashTable(keys, vals);
for (int k : keys) {
System.out.printf("Lookup(%d) = %s%n", k, ht.lookup(k));
}
System.out.printf("Lookup(99) = %s%n", ht.lookup(99));
}
}
Python¶
import random
PRIME = 1000000007
def universal_hash(key, a, b, m):
return ((a * key + b) % PRIME) % m
def build_fks(keys, values):
"""Build an FKS perfect hash table. O(n) expected time."""
n = len(keys)
m = max(n, 1)
while True:
a1 = random.randint(1, 999999)
b1 = random.randint(0, 999999)
buckets = [[] for _ in range(m)]
bucket_vals = [[] for _ in range(m)]
for i, k in enumerate(keys):
j = universal_hash(k, a1, b1, m)
buckets[j].append(k)
bucket_vals[j].append(values[i])
sum_sq = sum(len(b) ** 2 for b in buckets)
if sum_sq > 4 * n:
continue
secondary = [None] * m
for j in range(m):
bj = len(buckets[j])
if bj == 0:
continue
s = bj * bj
while True:
a2 = random.randint(1, 999999)
b2 = random.randint(0, 999999)
table = [None] * s
collision = False
for idx, k in enumerate(buckets[j]):
pos = universal_hash(k, a2, b2, s)
if table[pos] is not None:
collision = True
break
table[pos] = (k, bucket_vals[j][idx])
if not collision:
secondary[j] = (a2, b2, s, table)
break
return (a1, b1, m, secondary)
def fks_lookup(ht, key):
"""O(1) worst-case lookup."""
a1, b1, m, secondary = ht
j = universal_hash(key, a1, b1, m)
if secondary[j] is None:
return None
a2, b2, s, table = secondary[j]
pos = universal_hash(key, a2, b2, s)
if table[pos] is not None and table[pos][0] == key:
return table[pos][1]
return None
keys = [10, 22, 37, 40, 52, 60, 70, 82]
vals = ["ten", "twenty-two", "thirty-seven", "forty",
"fifty-two", "sixty", "seventy", "eighty-two"]
ht = build_fks(keys, vals)
for k in keys:
print(f"Lookup({k}) = {fks_lookup(ht, k)}")
print(f"Lookup(99) = {fks_lookup(ht, 99)}")
Cuckoo Hashing¶
Cuckoo hashing is another approach to O(1) worst-case lookups:
- Uses two hash functions
h1andh2. - Each key is in exactly one of two possible positions:
h1(key)orh2(key). - Lookup: O(1) worst case -- check exactly two positions.
- Insert: O(1) amortized -- may trigger a chain of evictions, but amortized O(1).
- Space: O(n) with load factor up to ~50%.
Theoretical Limits¶
Lower bound: Any data structure that supports membership queries on a set of n elements from a universe of size u requires Omega(n log(u/n)) bits of space. This is the information-theoretic lower bound.
Cell-probe lower bound: In the cell-probe model, any data structure that answers membership queries in t probes requires Omega(n log(u/n) / t) bits.
Practical implication: O(1) worst-case lookup is achievable with O(n) space for static sets (FKS, cuckoo hashing). For dynamic sets, expected O(1) with O(n) space (standard hash tables) or O(1) worst-case with O(n) space (dynamic perfect hashing, de la Brandais, Dietzfelbinger et al.).
Key Takeaways¶
-
The formal definition of O(1) is simple:
T(n) <= cfor some constantcand all sufficiently largen. -
The word-RAM model is the standard framework for analyzing O(1) claims. It assumes word-sized operations take constant time.
-
The potential method provides rigorous amortized O(1) proofs for dynamic arrays, hash table resizing, and other amortized data structures.
-
FKS hashing achieves O(1) worst-case lookup with O(n) space for static key sets, using a two-level hashing scheme.
-
Cuckoo hashing provides O(1) worst-case lookups for dynamic sets with O(1) amortized insertions.
-
Perfect hashing is not just theoretical -- it is used in compilers (keyword lookup), databases (static indices), and networking (packet classification).
-
Python's arbitrary-precision integers break the word-RAM assumption. Operations on very large integers are NOT O(1).