Prefix Sums & Difference Arrays — Junior Level¶
Table of Contents¶
- Introduction
- The Running-Total Intuition
- What is a Prefix Sum?
- Range-Sum Query in O(1)
- Off-by-One: The Indexing Convention
- Visual Diagrams
- Code Examples — 1D Prefix Sum
- Go
- Java
- Python
- Pattern 1 — Subarray Sum Equals K
- What is a Difference Array?
- Pattern 2 — Range Updates with Difference Arrays
- Big-O Summary
- Real-World Analogies
- Pros and Cons
- Common Mistakes
- Cheat Sheet
- Visual Animation
- Summary
Introduction¶
A prefix sum is one of those tiny techniques that feels obvious once you see it, then turns out to solve dozens of problems. It is a precomputed running total of an array. After spending O(n) once to build it, you can answer any "what is the sum of elements between index l and r?" question in O(1) — no loop required.
Its mirror twin is the difference array. Where prefix sums let you read a range fast, difference arrays let you write to a range fast. Add +5 to every element from index 3 to index 9? One operation, O(1), no loop. Then a single pass reconstructs the final array.
Together, these two techniques are the foundation for many array problems and a stepping stone toward more powerful structures like Fenwick trees and segment trees.
The Running-Total Intuition¶
Imagine you are tracking the temperature each day for a week:
Someone asks: "What was the total temperature from Tuesday to Friday?" You add 23 + 19 + 25 + 28 = 95. Four additions. Easy.
But what if a thousand such questions arrive? You would repeat similar work over and over. A smarter approach: build a running total once.
Now Tue..Fri is just 116 - 21 = 95. One subtraction. The running total turned every future query into constant-time arithmetic.
That running total is a prefix sum.
What is a Prefix Sum?¶
Formally: given an array a of length n, the prefix-sum array prefix has length n + 1 and is defined as:
So prefix[i] is the sum of the first i elements (the first i elements before index i). Note that prefix[0] = 0 — this is a deliberate sentinel that makes the math symmetric and avoids special-casing the empty prefix.
Example with a = [3, 1, 4, 1, 5, 9, 2, 6]:
i: 0 1 2 3 4 5 6 7 8
prefix: 0 3 4 8 9 14 23 25 31
^ ^
| |
prefix[3] = 3+1+4 = 8
prefix[8] = total sum = 31
Range-Sum Query in O(1)¶
The whole point: once prefix is built, the sum of any subarray a[l..r] (inclusive on both ends) is:
Why? prefix[r+1] is everything from a[0] through a[r]. Subtracting prefix[l] removes a[0] through a[l-1], leaving exactly a[l] + a[l+1] + ... + a[r].
Example: sum(a[2..5]) for a = [3, 1, 4, 1, 5, 9, 2, 6]:
The build costs O(n) and uses O(n) extra memory. After that, every range query is O(1) regardless of how wide the range is.
Off-by-One: The Indexing Convention¶
The single most common bug is mixing up prefix[i] vs prefix[i+1]. Two conventions are common in textbooks:
Convention A (used here): prefix has length n + 1, prefix[0] = 0, and sum(a[l..r]) = prefix[r+1] - prefix[l].
Convention B: prefix has length n, prefix[i] = a[0] + ... + a[i], and sum(a[l..r]) = prefix[r] - prefix[l-1] (with a special case for l == 0).
Convention A is strictly better because it eliminates the special case. Always favor it. The mantra: prefix has one extra slot.
Visual Diagrams¶
The shaded region below visualizes the subtraction:
Index: 0 1 2 3 4 5 6 7
Array a: [ 3 ][ 1 ][ 4 ][ 1 ][ 5 ][ 9 ][ 2 ][ 6 ]
Want sum of a[2..5]:
|<------- this region ------>|
[ 4 ][ 1 ][ 5 ][ 9 ]
prefix[6] covers: [ 3 ][ 1 ][ 4 ][ 1 ][ 5 ][ 9 ] = 23
prefix[2] covers: [ 3 ][ 1 ] = 4
prefix[6] - prefix[2] removes the leading [3][1] = 19
Code Examples — 1D Prefix Sum¶
Go Implementation¶
package prefix
// BuildPrefix returns prefix[] such that prefix[i] = a[0]+...+a[i-1].
// Length is len(a) + 1 with prefix[0] = 0.
// Time: O(n). Space: O(n).
func BuildPrefix(a []int) []int {
n := len(a)
prefix := make([]int, n+1) // prefix[0] = 0 by default
for i := 0; i < n; i++ {
prefix[i+1] = prefix[i] + a[i]
}
return prefix
}
// RangeSum returns sum of a[l..r] inclusive in O(1).
// Precondition: 0 <= l <= r < len(a).
func RangeSum(prefix []int, l, r int) int {
return prefix[r+1] - prefix[l]
}
// Usage:
// a := []int{3, 1, 4, 1, 5, 9, 2, 6}
// p := BuildPrefix(a)
// fmt.Println(RangeSum(p, 2, 5)) // 19
Java Implementation¶
public class PrefixSum {
/**
* Builds prefix[] of length n+1 where prefix[i] = a[0] + ... + a[i-1].
* Uses long to guard against overflow on large sums.
* Time: O(n). Space: O(n).
*/
public static long[] buildPrefix(int[] a) {
int n = a.length;
long[] prefix = new long[n + 1]; // prefix[0] = 0 by default
for (int i = 0; i < n; i++) {
prefix[i + 1] = prefix[i] + a[i];
}
return prefix;
}
/**
* Returns sum of a[l..r] inclusive in O(1).
* Precondition: 0 <= l <= r < a.length
*/
public static long rangeSum(long[] prefix, int l, int r) {
return prefix[r + 1] - prefix[l];
}
public static void main(String[] args) {
int[] a = {3, 1, 4, 1, 5, 9, 2, 6};
long[] p = buildPrefix(a);
System.out.println(rangeSum(p, 2, 5)); // 19
}
}
Python Implementation¶
"""1D prefix sums with O(1) range queries."""
from typing import List
def build_prefix(a: List[int]) -> List[int]:
"""Return prefix[] of length n+1 with prefix[i] = a[0] + ... + a[i-1].
Time: O(n). Space: O(n).
"""
n = len(a)
prefix = [0] * (n + 1)
for i in range(n):
prefix[i + 1] = prefix[i] + a[i]
return prefix
def range_sum(prefix: List[int], l: int, r: int) -> int:
"""Return sum of a[l..r] inclusive in O(1).
Precondition: 0 <= l <= r < len(a)
"""
return prefix[r + 1] - prefix[l]
if __name__ == "__main__":
a = [3, 1, 4, 1, 5, 9, 2, 6]
p = build_prefix(a)
print(range_sum(p, 2, 5)) # 19
What it does: Builds the running-total array in a single pass, then answers any range-sum query with one subtraction.
Pattern 1 — Subarray Sum Equals K¶
A classic interview problem: given an array of integers and a target k, count how many contiguous subarrays sum to k.
Brute force: try every (l, r) pair — O(n^2). Better: prefix sums plus a hash map of counts.
Key insight: the subarray a[l..r] sums to k exactly when prefix[r+1] - prefix[l] == k, i.e. prefix[l] == prefix[r+1] - k. While scanning r, count how many previous prefix values equal prefix[r+1] - k.
Go¶
package prefix
// SubarraySumK returns the number of contiguous subarrays summing to k.
// Time: O(n). Space: O(n).
func SubarraySumK(a []int, k int) int {
count := 0
running := 0
seen := map[int]int{0: 1} // prefix=0 has been seen once (empty prefix)
for _, v := range a {
running += v
if c, ok := seen[running-k]; ok {
count += c
}
seen[running]++
}
return count
}
Java¶
import java.util.HashMap;
import java.util.Map;
public class SubarraySum {
/** Counts contiguous subarrays summing to k. Time: O(n). */
public static int subarraySumK(int[] a, int k) {
int count = 0, running = 0;
Map<Integer, Integer> seen = new HashMap<>();
seen.put(0, 1); // empty prefix
for (int v : a) {
running += v;
count += seen.getOrDefault(running - k, 0);
seen.merge(running, 1, Integer::sum);
}
return count;
}
}
Python¶
from collections import defaultdict
from typing import List
def subarray_sum_k(a: List[int], k: int) -> int:
"""Count contiguous subarrays summing to k. Time: O(n)."""
count = 0
running = 0
seen = defaultdict(int)
seen[0] = 1 # empty prefix
for v in a:
running += v
count += seen[running - k]
seen[running] += 1
return count
Why a hash map and not sliding window? Sliding window requires the running sum to grow monotonically when you extend the right end and shrink it when you advance the left. With negative numbers (or zero target), that monotonicity breaks. The hash map of prefix counts handles all cases uniformly.
What is a Difference Array?¶
A difference array is the inverse of a prefix sum. Given a, define:
So diff stores how each element differs from its predecessor. Reconstructing a is the prefix sum of diff:
The magic: adding a value v to every element of a[l..r] can be done by two updates on diff.
Why? After the change, the difference between a[l] and a[l-1] increases by v (so diff[l] += v), and the difference between a[r+1] and a[r] decreases by v (so diff[r+1] -= v). Every element inside [l..r] is shifted up by v exactly once when we reconstruct.
Pattern 2 — Range Updates with Difference Arrays¶
When you face many range-update operations followed by a single read of the final state (offline updates), difference arrays shine. Each update is O(1); the final reconstruction is O(n).
Go¶
package prefix
// ApplyRangeUpdates takes initial array a and a list of (l, r, v) updates,
// each adding v to every a[l..r]. Returns the final array.
// Time: O(n + u) where u = number of updates.
func ApplyRangeUpdates(a []int, updates [][3]int) []int {
n := len(a)
diff := make([]int, n+1) // extra slot so diff[r+1] is always valid
for _, u := range updates {
l, r, v := u[0], u[1], u[2]
diff[l] += v
diff[r+1] -= v
}
// Reconstruct: a[i] += running prefix of diff[0..i].
running := 0
result := make([]int, n)
for i := 0; i < n; i++ {
running += diff[i]
result[i] = a[i] + running
}
return result
}
Java¶
public class DifferenceArray {
/**
* Applies many range +v updates to a, returns final array.
* Each update is O(1); reconstruction is O(n).
* updates[k] = {l, r, v}.
*/
public static int[] applyRangeUpdates(int[] a, int[][] updates) {
int n = a.length;
int[] diff = new int[n + 1]; // sentinel slot at index n
for (int[] u : updates) {
int l = u[0], r = u[1], v = u[2];
diff[l] += v;
diff[r + 1] -= v;
}
int[] result = new int[n];
int running = 0;
for (int i = 0; i < n; i++) {
running += diff[i];
result[i] = a[i] + running;
}
return result;
}
}
Python¶
from typing import List, Tuple
def apply_range_updates(a: List[int], updates: List[Tuple[int, int, int]]) -> List[int]:
"""Apply many (l, r, v) range +v updates, return final array.
Each update: O(1). Reconstruction: O(n).
"""
n = len(a)
diff = [0] * (n + 1) # sentinel slot avoids bounds check
for l, r, v in updates:
diff[l] += v
diff[r + 1] -= v
result = [0] * n
running = 0
for i in range(n):
running += diff[i]
result[i] = a[i] + running
return result
Typical use case: a list of concert ticket sales each booking a range of seats. Recording each booking is a single O(1) diff update; after all bookings, one pass produces the seat-occupancy array.
Big-O Summary¶
| Operation | Prefix Sum | Difference Array |
|---|---|---|
| Build / Reconstruct | O(n) | O(n) |
| Range sum query | O(1) | O(n) (must reconstruct) |
Range update (+v to a[l..r]) | O(n) | O(1) |
| Point query | O(1) | O(n) until reconstructed |
| Space | O(n) | O(n) |
The structures are dual: prefix sums are read-fast, write-slow; difference arrays are write-fast, read-slow. Pick the one matching your access pattern.
Real-World Analogies¶
| Concept | Analogy |
|---|---|
| Prefix sum | A running bank balance — to know how much you spent between two dates, subtract the older balance from the newer one |
| Difference array | Daily deposits and withdrawals — easy to record each transaction, but to know today's balance you must replay them all |
| Build cost | One-time setup, like indexing a book — expensive once, cheap forever after |
| Range query | Looking up the index of a topic — instant, because someone built the index for you |
Where analogies break: bank balances are 1D; real array indices may be 2D (matrix queries) or even 3D (voxel sums). The principle generalizes; the analogy does not.
Pros and Cons¶
| Pros | Cons |
|---|---|
| O(1) range-sum queries after O(n) precompute | Static — does not handle frequent updates (use Fenwick or segment tree) |
| Tiny, cache-friendly code | O(n) extra memory for the prefix array |
| Works in 1D, 2D, 3D (and higher) | Off-by-one errors are common |
| Composes with hash maps for subarray problems | Only works with operations forming a group (sum, XOR) — not with min/max |
| Difference array gives O(1) range updates | Difference array is single-shot — interleaved updates and queries are slow |
When to use: - Many range queries on a static array. - Many range updates followed by a final read.
When NOT to use: - Frequent interleaved updates and queries — reach for a Fenwick tree (see 09-trees/fenwick-tree) or segment tree (see 09-trees/segment-tree). - Range min/max queries — use a sparse table (see 09-trees/sparse-table-rmq) for static input. - The data does not fit in memory — see streaming approximations in senior.md.
Common Mistakes¶
| Mistake | Why It Is Wrong | Fix |
|---|---|---|
Using prefix[r] - prefix[l] instead of prefix[r+1] - prefix[l] | Off-by-one — drops the last element from the sum | Stick to the length n+1 convention with prefix[0] = 0 |
Forgetting the sentinel prefix[0] = 0 | Special-case for l == 0 makes code ugly and bug-prone | Always allocate one extra slot |
Allocating diff of length n instead of n+1 | diff[r+1] overflows the buffer when r == n-1 | Always allocate one extra slot for the right sentinel |
| Storing 32-bit sums for big inputs | Integer overflow corrupts the running total silently | Use int64 / long / Python big ints |
| Using prefix sums with frequent updates | Every update rebuilds the prefix in O(n) — total cost O(nq) | Use Fenwick / segment tree instead |
| Trying prefix min/max | Min has no inverse — you cannot "subtract" the prefix to isolate a range | Use a sparse table (static) or segment tree (dynamic) |
| Building a difference array, then reading mid-stream | Each read costs O(n) until you reconstruct | Either commit to offline updates or use a Fenwick tree |
| Misplacing the 2D diff corners | Forgetting the +v at one corner or -v at another | Memorize the four-corner pattern — see middle.md |
Cheat Sheet¶
| Pattern | Formula | Complexity |
|---|---|---|
| Build prefix | prefix[i+1] = prefix[i] + a[i] | O(n) once |
| Range sum | prefix[r+1] - prefix[l] | O(1) per query |
| Range XOR | prefix_xor[r+1] ^ prefix_xor[l] | O(1) per query |
| Range update +v | diff[l] += v; diff[r+1] -= v | O(1) per update |
| Reconstruct after diff | a[i] += sum(diff[0..i]) | O(n) once |
| Subarray sum == k | hash map of prefix counts | O(n) |
| Count subarrays with sum % k == 0 | hash map of prefix % k counts | O(n) |
Visual Animation¶
See
animation.htmlfor an interactive visualization.The animation demonstrates: - The top row shows the input array
a[]. - The bottom row shows the prefix arrayprefix[]. - Click any two indiceslandrto highlightprefix[r+1] - prefix[l]and the corresponding subarray. - Toggle to Difference Array mode to see range updates propagate throughdiff[]and the reconstructeda[]. - Stats panel tracks operations performed and big-O badges.
Summary¶
Prefix sums turn O(n) range-sum queries into O(1) at the cost of a one-time O(n) precompute. Difference arrays do the reverse: O(1) range updates at the cost of an O(n) final reconstruction. Both rely on the same algebraic property: addition has an inverse (subtraction). The technique extends to XOR (its own inverse) and any operation forming a group, but breaks for min/max because they have no inverse — those need a sparse table or segment tree instead.
Master the off-by-one convention prefix[r+1] - prefix[l] with prefix[0] = 0, internalize the four-line diff[l] += v; diff[r+1] -= v update, and you have unlocked a surprising number of array problems that look hard at first glance.
Further Reading¶
- Introduction to Algorithms (CLRS) — sections on dynamic programming setup often introduce prefix sums in passing.
- Competitive Programming Handbook (Antti Laaksonen), chapters on range queries.
- LeetCode tag
prefix-sum— hundreds of practice problems. - See
middle.mdfor 2D prefix sums and the composability constraint. - See
09-trees/fenwick-treeand09-trees/segment-treefor the dynamic alternatives.