Time vs Space Complexity — Mathematical Foundations and Complexity Theory¶
Table of Contents¶
- Formal Definition
- Correctness Proof — Loop Invariants
- Amortized Analysis
- NP-Completeness and Polynomial Reductions
- Randomized Algorithm Probability Bounds
- Cache-Oblivious Analysis
- Comparison with Alternatives
- Summary
Formal Definition¶
Definition: Time Complexity
Let A be an algorithm and I be an input of size n.
T_A(n) = max { t(A, I) : |I| = n }
where t(A, I) is the number of basic operations A performs on input I.
We say A runs in time O(f(n)) if there exist constants c > 0 and n₀ ≥ 0
such that T_A(n) ≤ c · f(n) for all n ≥ n₀.
Definition: Space Complexity
S_A(n) = max { s(A, I) : |I| = n }
where s(A, I) is the maximum number of memory cells A uses on input I
(beyond the input itself — auxiliary space).
We say A uses space O(g(n)) if there exist constants c > 0 and n₀ ≥ 0
such that S_A(n) ≤ c · g(n) for all n ≥ n₀.
Definition: Time-Space Trade-Off (Formal)
For a computational problem P, define:
T(P, S) = min { T_A(n) : S_A(n) ≤ S }
This is the minimum time achievable by any algorithm using at most S space.
A time-space trade-off exists for P if T(P, S) is strictly decreasing in S
for some range of S values.
The Computation Model: RAM Machine¶
The RAM (Random Access Machine) model:
- Memory: infinite array M[0], M[1], M[2], ...
- Each cell holds an integer of O(log n) bits (word-RAM model)
- Operations: read M[i], write M[i], arithmetic (+, -, ×, ÷), comparison — each O(1)
- Program counter: sequential execution with conditional jumps
Time = number of operations executed
Space = number of distinct memory cells accessed (auxiliary)
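To make these definitions concrete, here is a small Python sketch (illustrative only; the function and counter names are ours) that counts basic operations and auxiliary cells for a maximum-finding scan, mirroring how T_A(n) and S_A(n) are charged in the word-RAM model.

def max_with_counters(arr):
    """Find the maximum while counting RAM operations and auxiliary cells."""
    ops = 0                 # instrumentation only, not part of the algorithm's cost
    best = arr[0]           # one auxiliary cell
    ops += 1
    for i in range(1, len(arr)):
        ops += 1            # comparison arr[i] > best
        if arr[i] > best:
            best = arr[i]
            ops += 1        # assignment
    aux_cells = 2           # 'best' and the index 'i'
    return best, ops, aux_cells

# For input size n: at most 2n counted operations (T(n) = O(n))
# and a constant number of auxiliary cells (S(n) = O(1)).
print(max_with_counters(list(range(10))))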
Correctness Proof — Loop Invariants¶
Proof: In-Place Array Reversal is Correct¶
Algorithm: reverseInPlace(A, n)
    left ← 0, right ← n - 1
    while left < right:
        swap A[left] and A[right]
        left ← left + 1
        right ← right - 1
Claim: After termination, A[i] = A_orig[n-1-i] for all 0 ≤ i < n.
Loop Invariant I(k): After k iterations:
(1) A[i] = A_orig[n-1-i] for 0 ≤ i < k (left portion swapped)
(2) A[i] = A_orig[n-1-i] for n-k ≤ i < n (right portion swapped)
(3) A[i] = A_orig[i] for k ≤ i < n-k (middle unchanged)
(4) left = k, right = n-1-k
Base case (k=0):
No swaps performed. A[i] = A_orig[i] for all i.
left = 0, right = n-1. ✓
Inductive step: Assume I(k) holds. Show I(k+1) holds.
Guard: left < right → k < n-1-k → k < (n-1)/2.
Swap A[k] and A[n-1-k]:
A[k] = A_orig[n-1-k] and A[n-1-k] = A_orig[k].
Now:
Swapped left portion: 0 ≤ i ≤ k → A[i] = A_orig[n-1-i] ✓
Swapped right portion: n-1-k ≤ i < n → A[i] = A_orig[n-1-i] ✓
Middle: k+1 ≤ i < n-1-k → A[i] = A_orig[i] ✓
left = k+1, right = n-2-k ✓
I(k+1) holds.
Termination:
left increases by 1 and right decreases by 1 each iteration.
Gap = right - left decreases by 2 each step.
Terminates when left ≥ right, i.e., after ⌊n/2⌋ iterations.
Postcondition:
At termination, k = ⌊n/2⌋. The middle portion (3) is empty or a single
element (which equals its own reverse). Combined with (1) and (2),
A[i] = A_orig[n-1-i] for all i. QED
Complexity:
Time: O(n/2) = O(n) — one swap per iteration, ⌊n/2⌋ iterations.
Space: O(1) — only left, right, and one temp variable for swap.
Go¶
// Time: O(n), Space: O(1) — proven correct above
func reverseInPlace(arr []int) {
    left, right := 0, len(arr)-1
    for left < right {
        arr[left], arr[right] = arr[right], arr[left]
        left++
        right--
    }
}
Java¶
public static void reverseInPlace(int[] arr) {
    int left = 0, right = arr.length - 1;
    while (left < right) {
        int temp = arr[left];
        arr[left] = arr[right];
        arr[right] = temp;
        left++;
        right--;
    }
}
Python¶
def reverse_in_place(arr):
    left, right = 0, len(arr) - 1
    while left < right:
        arr[left], arr[right] = arr[right], arr[left]
        left += 1
        right -= 1
Amortized Analysis¶
Dynamic Array: Three Methods¶
A dynamic array doubles its capacity when full. A single push can therefore cost O(n) (when a resize occurs), but the amortized cost per push is O(1).
Aggregate Method¶
Consider n push operations starting from an empty array with capacity 1.
Resizes occur at pushes 2, 3, 5, 9, 17, ..., i.e., at push i where
i - 1 is a power of 2. A resize at capacity c costs c (copy all c existing elements).
Total cost = n (one write per push) + Σ 2^k for k=0 to ⌊log₂(n-1)⌋
= n + (2^(⌊log₂(n-1)⌋+1) - 1)
< n + 2n = 3n
Amortized cost per push = 3n / n = 3 = O(1). QED
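The aggregate bound can be checked directly with a short Python simulation (a sketch under the same cost model as the proof: one write per push plus c copies whenever a resize occurs at capacity c).

def total_push_cost(n):
    """Total actual cost of n pushes into a doubling array that starts at capacity 1."""
    capacity, size, cost = 1, 0, 0
    for _ in range(n):
        if size == capacity:
            cost += capacity    # copy all existing elements
            capacity *= 2
        cost += 1               # write the new element
        size += 1
    return cost

for n in (10, 1_000, 1_000_000):
    cost = total_push_cost(n)
    print(n, cost, cost / n)    # cost / n stays below 3, as the aggregate argument predicts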
Potential Method¶
Define potential function:
Φ(D) = 2 · size(D) - capacity(D)
Properties:
- After resize: size = capacity/2, so Φ = 2(cap/2) - cap = 0
- Just before resize: size = capacity, so Φ = 2·cap - cap = cap
- Φ(D₀) = -1 (initial empty array of capacity 1) and Φ(Dᵢ) ≥ 0 for all i ≥ 1, so Φ(Dₙ) ≥ Φ(D₀) and the summed amortized costs upper-bound the actual total cost ✓
Amortized cost of push (no resize):
â = actual cost + Φ(Dᵢ) - Φ(Dᵢ₋₁) = 1 + 2 = 3
(size grows by 1 and capacity is unchanged, so Φ increases by 2)
Amortized cost of push (with resize from capacity c to 2c):
actual cost = 1 (push) + c (copy)
Φ before = 2c - c = c
Φ after = 2(c+1) - 2c = 2 (size after = c+1, capacity after = 2c)
â = (1 + c) + (2 - c) = 3
Every push has amortized cost 3 = O(1). QED
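The same doubling model can be used to verify the potential argument numerically; the sketch below (assuming Φ = 2·size − capacity as defined above) asserts that actual cost plus the change in potential is exactly 3 on every push.

def check_potential(n):
    """Assert that actual_cost + ΔΦ == 3 for every push, with Φ = 2*size - capacity."""
    capacity, size = 1, 0
    phi = 2 * size - capacity
    for _ in range(n):
        actual = 1
        if size == capacity:
            actual += capacity          # cost of copying on resize
            capacity *= 2
        size += 1
        new_phi = 2 * size - capacity
        assert actual + (new_phi - phi) == 3
        phi = new_phi
    return True

print(check_potential(1_000_000))       # True: every push has amortized cost exactly 3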
Accounting Method¶
Charge each push $3:
- $1 to pay for the push itself
- $2 saved as credit on the pushed element
When a resize occurs (capacity doubles from c to 2c):
- c/2 elements entered since last resize, each with $2 credit
- Total credit: c/2 × $2 = $c
- Cost of copying c elements: $c
- Credit exactly covers the resize cost ✓
Since each push pays $3 and total credit never goes negative,
amortized cost per push = O(1). QED
Go¶
package main

import "fmt"

type DynamicArray struct {
    data     []int
    size     int
    capacity int
    resizes  int
}

func NewDynamicArray() *DynamicArray {
    return &DynamicArray{data: make([]int, 1), capacity: 1}
}

func (da *DynamicArray) Push(val int) {
    if da.size == da.capacity {
        da.capacity *= 2
        newData := make([]int, da.capacity)
        copy(newData, da.data)
        da.data = newData
        da.resizes++
    }
    da.data[da.size] = val
    da.size++
}

func main() {
    da := NewDynamicArray()
    for i := 0; i < 1000000; i++ {
        da.Push(i)
    }
    fmt.Printf("Size: %d, Capacity: %d, Resizes: %d\n",
        da.size, da.capacity, da.resizes)
    // Resizes ≈ 20 (log₂(1,000,000)), confirming amortized O(1)
}
Java¶
public class DynamicArray {
    private int[] data;
    private int size = 0;
    private int capacity;
    private int resizes = 0;

    public DynamicArray() {
        capacity = 1;
        data = new int[capacity];
    }

    public void push(int val) {
        if (size == capacity) {
            capacity *= 2;
            int[] newData = new int[capacity];
            System.arraycopy(data, 0, newData, 0, size);
            data = newData;
            resizes++;
        }
        data[size++] = val;
    }

    public static void main(String[] args) {
        DynamicArray da = new DynamicArray();
        for (int i = 0; i < 1_000_000; i++) da.push(i);
        System.out.printf("Size: %d, Capacity: %d, Resizes: %d%n",
                da.size, da.capacity, da.resizes);
    }
}
Python¶
class DynamicArray:
    def __init__(self):
        self._data = [None]
        self._size = 0
        self._capacity = 1
        self._resizes = 0

    def push(self, val):
        if self._size == self._capacity:
            self._capacity *= 2
            new_data = [None] * self._capacity
            for i in range(self._size):
                new_data[i] = self._data[i]
            self._data = new_data
            self._resizes += 1
        self._data[self._size] = val
        self._size += 1


da = DynamicArray()
for i in range(1_000_000):
    da.push(i)
print(f"Size: {da._size}, Capacity: {da._capacity}, Resizes: {da._resizes}")
NP-Completeness and Polynomial Reductions¶
Time-space trade-offs connect to complexity theory through the following:
Theorem (Savitch, 1970):
NSPACE(f(n)) ⊆ DSPACE(f(n)²)   (for space-constructible f(n) ≥ log n)
Any problem solvable by a nondeterministic Turing machine using f(n) space
can be solved deterministically using O(f(n)²) space.
Implication: NL ⊆ DSPACE(log²n) — nondeterministic log-space problems
can be solved in deterministic log²(n) space. This is a fundamental
time-space trade-off at the complexity class level.
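Savitch's construction is a recursive reachability procedure: t is reachable from s in at most 2^k steps iff some midpoint w is reachable from s in 2^(k-1) steps and t is reachable from w in 2^(k-1) steps, and the two subproblems can reuse the same space. The Python sketch below illustrates that recursion on an explicit graph (illustrative only; the theorem applies it to Turing machine configuration graphs, which are never built explicitly).

def path(adj, u, v, k):
    """Is v reachable from u in at most 2**k steps? Recursion depth is O(log n),
    each frame stores O(log n) bits, so working space is O(log^2 n) beyond the read-only input."""
    if k == 0:
        return u == v or adj[u][v]
    # Try every possible midpoint; both recursive calls reuse the same space.
    return any(path(adj, u, w, k - 1) and path(adj, w, v, k - 1)
               for w in range(len(adj)))

def reachable(adj, s, t):
    n = len(adj)
    k = max(1, n.bit_length())          # 2**k >= n, so any simple path fits
    return path(adj, s, t, k)

# Tiny example: edges 0 -> 1 -> 2, nothing pointing back to 0.
adj = [[False, True, False],
       [False, False, True],
       [False, False, False]]
print(reachable(adj, 0, 2), reachable(adj, 2, 0))   # True False

Note the trade-off runs in the other direction too: this recursion is space-efficient but takes n^{O(log n)} time, far slower than a breadth-first search that would use O(n) space.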
Theorem (Time-Space Trade-Off for Sorting):
In models where the input is read-only, any comparison-based sorting algorithm
using S(n) working space requires T(n) = Ω(n² / S(n)) time in the worst case
(Borodin-Cook, 1982).
Corollaries (read-only input):
S = O(1) → T = Ω(n²)
S = O(n) → T = Ω(n) (but merge sort achieves O(n log n))
S = O(√n) → T = Ω(n^{3/2})
The bound does not constrain algorithms that may overwrite the input array:
heapsort sorts in place in O(n log n) time with O(1) auxiliary space, while
insertion sort and selection sort also use O(1) auxiliary space but take Θ(n²) time.
Merge sort uses O(n) auxiliary space to achieve O(n log n) time;
in-place merge sort variants exist but with worse constant factors.
Problem: Subset Sum
Input: Set S = {s₁, ..., sₙ} of positive integers, target T
Question: Is there a subset of S that sums to exactly T?
Time-space trade-off:
- Brute force: O(2ⁿ) time, O(n) space (enumerate all subsets)
- DP: O(nT) time, O(T) space (pseudo-polynomial)
- Meet-in-the-middle: O(2^{n/2}) time, O(2^{n/2}) space
The meet-in-the-middle approach splits S into two halves, enumerates all
subset sums for each half (2^{n/2} each), then uses a hash set to check
if any pair sums to T. This is a direct time-space trade-off:
using O(2^{n/2}) space reduces time from O(2ⁿ) to O(2^{n/2}).
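A minimal Python sketch of the meet-in-the-middle idea (it stores all subset sums of one half in a set, so O(2^{n/2}) space buys a single pass of O(2^{n/2}) membership lookups over the other half's sums):

def subset_sums(items):
    """All achievable subset sums of items (at most 2**len(items) values)."""
    sums = {0}
    for x in items:
        sums |= {s + x for s in sums}
    return sums

def subset_sum_mitm(values, target):
    half = len(values) // 2
    left_sums = subset_sums(values[:half])       # O(2^{n/2}) space
    right_sums = subset_sums(values[half:])
    # For each right-half sum, look up the complementary left-half sum.
    return any(target - r in left_sums for r in right_sums)

print(subset_sum_mitm([3, 34, 4, 12, 5, 2], 9))    # True  (4 + 5)
print(subset_sum_mitm([3, 34, 4, 12, 5, 2], 30))   # False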
Randomized Algorithm Probability Bounds¶
Theorem: Expected time of randomized quicksort is O(n log n).
Proof sketch:
Let X = total number of comparisons.
Let z₁ ≤ z₂ ≤ ... ≤ zₙ be the input elements in sorted order, and let
X_ij be the indicator variable with X_ij = 1 iff z_i and z_j are ever compared.
By linearity of expectation:
E[X] = Σᵢ<ⱼ E[X_ij] = Σᵢ<ⱼ Pr[z_i and z_j are compared]
Elements z_i and z_j are compared iff one of them is chosen as pivot before
any element between them. Pr[z_i or z_j is first pivot in {z_i,...,z_j}] = 2/(j-i+1).
E[X] = Σᵢ₌₁ⁿ⁻¹ Σⱼ₌ᵢ₊₁ⁿ 2/(j-i+1)
≤ Σᵢ₌₁ⁿ Σₖ₌₂ⁿ 2/k (substituting k = j-i+1; the inner sum runs only up to k = n-i+1 ≤ n)
≤ 2n · Hₙ = 2n · O(ln n) = O(n log n). QED
High-probability bound:
Pr[X ≥ c · n ln n] ≤ 1/n² for a suitable constant c.
(The X_ij are not independent, so this follows from a Chernoff-style bound on each
element's recursion depth rather than from a Chernoff bound applied directly to X.)
With probability ≥ 1 - 1/n², quicksort runs in O(n log n) time.
Space: O(log n) expected stack depth (O(n) worst case).
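The expected bound is easy to observe empirically. The sketch below is an in-place randomized quicksort instrumented to count comparisons (instrumentation and names are ours); the count comes out of the same order as 2n·ln n.

import math
import random

def quicksort_comparisons(arr):
    """Sort arr in place with random pivots (Lomuto partition); return the comparison count."""
    comparisons = 0

    def sort(lo, hi):                            # sorts arr[lo:hi]
        nonlocal comparisons
        if hi - lo <= 1:
            return
        p = random.randrange(lo, hi)             # uniformly random pivot
        arr[p], arr[hi - 1] = arr[hi - 1], arr[p]
        pivot = arr[hi - 1]
        i = lo
        for j in range(lo, hi - 1):
            comparisons += 1
            if arr[j] < pivot:
                arr[i], arr[j] = arr[j], arr[i]
                i += 1
        arr[i], arr[hi - 1] = arr[hi - 1], arr[i]
        sort(lo, i)
        sort(i + 1, hi)

    sort(0, len(arr))
    return comparisons

n = 100_000
data = list(range(n))
random.shuffle(data)
print(quicksort_comparisons(data), round(2 * n * math.log(n)))   # observed vs. the 2n·ln n estimate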
Theorem: Hash table with universal hashing.
For n elements in a table of size m = O(n):
Expected chain length: O(1)
Expected time per operation: O(1)
Space: O(n)
Using Chernoff bounds, with high probability no chain exceeds O(log n / log log n).
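A small sketch of chaining with the classic universal family h(x) = ((a·x + b) mod p) mod m (class and parameter names are illustrative); it reports the load factor and the longest chain, which stays small as predicted.

import random

class UniversalHashTable:
    """Chaining hash table with a randomly drawn universal hash function."""
    def __init__(self, m, p=(1 << 61) - 1):     # p: a prime larger than any key
        self.m, self.p = m, p
        self.a = random.randrange(1, p)
        self.b = random.randrange(0, p)
        self.buckets = [[] for _ in range(m)]

    def _h(self, x):
        return ((self.a * x + self.b) % self.p) % self.m

    def insert(self, x):
        self.buckets[self._h(x)].append(x)

    def contains(self, x):
        return x in self.buckets[self._h(x)]    # expected O(1) chain scan

n = 100_000
table = UniversalHashTable(m=n)                 # m = O(n) slots
for key in random.sample(range(10**9), n):
    table.insert(key)
print(n / table.m, max(len(b) for b in table.buckets))   # load factor 1.0; longest chain stays small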
Cache-Oblivious Analysis¶
Memory hierarchy model:
- CPU registers: ~1 ns access
- L1 cache: ~1 ns, 32-64 KB
- L2 cache: ~4 ns, 256 KB - 1 MB
- L3 cache: ~10 ns, 2-32 MB
- Main memory: ~100 ns, 4-128 GB
- SSD: ~100 μs
- HDD: ~10 ms
The external memory (I/O) model:
- Memory size: M words
- Block/cache line size: B words
- Cost metric: number of block transfers (I/Os)
Array scan: O(N/B) I/Os — optimal due to spatial locality
Linked list scan: O(N) I/Os — one cache miss per node in worst case
Binary search: O(log₂(N/B)) I/Os — each probe loads one block
B-tree search: O(log_B(N)) I/Os — each node fits in one block
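The gap between scanning and pointer-chasing is easy to see with a tiny I/O-model simulator (illustrative; it models a single cached block of B words and charges one transfer whenever an access falls outside the most recently loaded block).

import random

def block_transfers(access_pattern, B):
    """Count block transfers for a sequence of word addresses with one cached block of size B."""
    transfers, cached_block = 0, None
    for addr in access_pattern:
        block = addr // B
        if block != cached_block:
            transfers += 1
            cached_block = block
    return transfers

N, B = 1_000_000, 64
sequential = range(N)                        # array scan: consecutive addresses
scattered = random.sample(range(N), N)       # linked-list-style traversal: scattered addresses
print(block_transfers(sequential, B))        # ≈ N/B  (15,625)
print(block_transfers(scattered, B))         # ≈ N    (nearly every access is a miss)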
Cache-oblivious algorithms achieve optimal I/O complexity without
knowing M or B:
Array scan: O(N/B) — automatic, sequential access
Van Emde Boas layout for binary tree:
Store tree recursively: top half subtree, then bottom half subtrees.
Search cost: O(log_B(N)) I/Os — matches B-tree without knowing B.
Sorting in the I/O model:
Standard two-way merge sort: O((N/B) · log₂(N/M)) I/Os — each merge level scans its
arrays sequentially (O(N/B) I/Os), and subproblems that fit in memory cost no further I/Os.
The comparison-based sorting lower bound in the I/O model is Ω((N/B) · log_{M/B}(N/B));
it is matched by (M/B)-way merge sort (cache-aware) and, cache-obliviously, by funnelsort.
Comparison with Alternatives¶
| Problem | Algorithm | Time | Space | I/O (Cache) | Notes |
|---|---|---|---|---|---|
| Search (sorted) | Binary search | O(log n) | O(1) | O(log(n/B)) | Cache-unfriendly jumps |
| Search (sorted) | B-tree search | O(log_B n) | O(n) | O(log_B n) | Cache-optimal |
| Sort | Merge sort (multiway) | O(n log n) | O(n) | O((n/B)log_{M/B}(n/B)) | I/O optimal |
| Sort | Quicksort | O(n log n) exp. | O(log n) | O((n/B)log₂(n/M)) exp. | Less space than merge sort |
| Sort (in-place) | Heap sort | O(n log n) | O(1) | O(n log n) | Cache-unfriendly |
| Duplicate detection | Brute force | O(n²) | O(1) | O(n²/B) | No extra space |
| Duplicate detection | Hash set | O(n) | O(n) | O(n/B) exp. | Space for time |
| Range sum query | Prefix sum | O(1) query | O(n) | O(1) | Precompute O(n) |
| Range sum query | Naive | O(n) query | O(1) | O(n/B) | No precomputation |
Summary¶
At the professional level, time-space complexity is analyzed through formal mathematical frameworks. We prove algorithm correctness with loop invariants, establish amortized bounds through aggregate, potential, and accounting methods, and connect to complexity theory through results like Savitch's theorem and time-space trade-offs for sorting. Randomized analysis provides expected-case guarantees via linearity of expectation and Chernoff bounds. Cache-oblivious analysis bridges the gap between asymptotic theory and hardware reality, showing that the memory hierarchy fundamentally changes which algorithms are practical. The interplay between time, space, and I/O complexity is the foundation of modern algorithm engineering.