Linear Search — Junior Level¶
Table of Contents¶
- Introduction
- Prerequisites
- Glossary
- Core Concepts
- Big-O Summary
- Real-World Analogies
- Pros & Cons
- Step-by-Step Walkthrough
- Code Examples
- Coding Patterns
- Error Handling
- Performance Tips
- Best Practices
- Edge Cases & Pitfalls
- Common Mistakes
- Cheat Sheet
- Visual Animation
- Summary
- Further Reading
Introduction¶
Focus: "What is Linear Search?" and "When is it the right choice?"
Linear Search (also called sequential search) is the simplest search algorithm: walk through the collection from one end to the other, comparing each element to the target. The first match wins; if you reach the end without a match, the target is absent.
It is the default search you reach for when you know nothing about the data. It does not require the array to be sorted, indexed, hashed, or otherwise pre-processed. You can run it on a linked list, a stream, a generator, a file you read line-by-line — anything you can iterate.
The trade-off is speed. For an array of n elements, linear search makes up to n comparisons. That is O(n) time. Compare this with binary search at O(log n) — but binary search demands a sorted array. Compare it with hash table lookup at O(1) average — but a hash table needs to be built and maintained.
Linear search is the foundation of more advanced search algorithms. Even when you build an inverted index for a search engine, the per-bucket scan inside that index is a linear search. Understanding linear search is understanding the cost of "find me X in a pile of stuff".
In production code, linear search shows up everywhere: - Array.prototype.indexOf in JavaScript - list.index(x) in Python - slices.Contains in Go (since 1.21) - Collection.contains in Java (for non-Set collections) - strchr, strstr in C - grep walking lines of a file
It is fast enough for small n (typically n < 100), and it is the only choice when the data has no exploitable structure.
Prerequisites¶
- Required: Loops (
for,while), arrays/lists, conditionals - Required: Function return values, sentinel values (returning
-1for "not found") - Helpful: Big-O notation basics
- Helpful: Understanding of how comparison works on integers, strings, and custom types
Glossary¶
| Term | Definition |
|---|---|
| Sequential search | Synonym for linear search — scan elements one by one |
| Target (key) | The value you are looking for |
| Found / Hit | The target equals the current element being inspected |
| Miss | The element is not the target — keep going |
| Absent / Not found | The full collection was scanned and no match was found |
| Sentinel | A special value (e.g. -1) returned to signal "not found" |
| Linear time | Running time proportional to the input size — O(n) |
| Early exit | Returning the moment a match is found, without continuing the scan |
| Index | The position (zero-based) of a found element in the array |
| Predicate | A boolean function used as the matching criterion (e.g. x -> x.id == 42) |
Core Concepts¶
Concept 1: Iterate One by One Until Found¶
Walk the array from left to right. At each position i, compare arr[i] to target. If they match, return i. If you reach the end with no match, return -1.
arr = [4, 7, 2, 9, 1, 5, 8]
target = 9
i=0: arr[0]=4, 4 != 9 → continue
i=1: arr[1]=7, 7 != 9 → continue
i=2: arr[2]=2, 2 != 9 → continue
i=3: arr[3]=9, 9 == 9 → return 3 ✓
The algorithm is completely deterministic: given the same array and target, it always inspects the same positions in the same order.
Concept 2: Return the Index, or a Sentinel¶
The conventional contract is: - Found → return the index (0-based) of the first occurrence. - Not found → return -1 (or None in Python, Optional.empty() in Java, an (int, bool) pair in Go).
Returning the index (not just true/false) is more useful because callers can then read or modify the element. If they only care about presence, they wrap the call in result != -1.
Concept 3: No Preconditions on the Data¶
This is the killer feature. Binary search requires a sorted array. Hash-table lookup requires a hash table. Linear search requires nothing — just the ability to iterate.
You can run linear search on: - An unsorted array ([5, 2, 8, 1, 9]) - A sorted array ([1, 2, 5, 8, 9]) — though binary search is faster here - A linked list (no random access — binary search is impossible) - A generator / iterator / stream (data arrives one piece at a time) - A file (read line-by-line; you can't "binary search" a text file without an index)
Concept 4: Early Exit on First Match¶
Once you find the target, stop scanning. Do not continue to the end. - Find first: stop at the first match (the standard contract). - Find all: collect all indices, then keep scanning to the end. - Find last: scan from right to left, or scan everything and remember the last hit.
Concept 5: Comparison Cost Matters¶
Each comparison is one operation. For integers, comparison is one CPU cycle. For strings, comparison is O(k) where k is the string length — so a linear search through n strings of length k is actually O(n × k) time. For deep object equality, comparison can be even more expensive.
Big-O Summary¶
| Case | Time | Notes |
|---|---|---|
| Best | O(1) | Target is at index 0 — found on first comparison |
| Average | O(n) | Target uniformly random in array → ~n/2 comparisons |
| Worst | O(n) | Target at the end, or target not present → n comparisons |
| Space | O(1) | Only a loop counter is needed |
Compared to other search algorithms:
| Algorithm | Time | Requires |
|---|---|---|
| Linear search | O(n) | Nothing |
| Binary search | O(log n) | Sorted array |
| Hash lookup | O(1) avg, O(n) worst | Hash table built |
| BST search (balanced) | O(log n) | BST built |
| Trie lookup | O(k) (k = key length) | Trie built |
Cache behavior: Linear search has near-perfect cache behavior. The CPU prefetcher recognizes the sequential access pattern and pulls subsequent cache lines before they're needed. This makes linear search on small arrays shockingly fast — often beating binary search for n ≤ ~20.
Real-World Analogies¶
| Scenario | How Linear Search Maps |
|---|---|
| Looking for your friend Sara in a movie theater seat-by-seat | You walk down each row, glance at each face. Stop when you find her. |
| Reading every email in your inbox from top to bottom looking for one from "Bank" | Each email is one comparison. Found → click. Reach the end → "no email from Bank." |
| Searching a stack of unsorted papers for a specific receipt | You go through one by one. There's no "binary search" for unsorted papers. |
| Looking up a contact in a phone book (sorted) — wrong fit | This is binary search territory; you don't read every name. |
| Looking up a contact in a recently used call list (unsorted) | This is linear search — you scroll through one by one. |
| Searching for a typo in a 200-line essay | You read every line until you find it. |
The key insight from these analogies: linear search is the natural human instinct for "find this thing" when you have no organizing principle.
Pros & Cons¶
Pros¶
| Pro | Why it Matters |
|---|---|
| Works on any iterable | No sorting, no indexing, no hashing required |
| Trivial to implement | 5 lines of code in any language |
| Zero memory overhead | O(1) extra space — just a loop counter |
| Cache-friendly | Sequential memory access is the fastest pattern for modern CPUs |
| Beats binary search for small n | n < ~20: linear is faster due to lower constant factor |
| Works on unsorted data | Inserting, deleting → no need to re-sort |
| Online / streaming | Can search a stream as data arrives |
| Easy to parallelize | Split the array into chunks, search each in a separate thread |
Cons¶
| Con | Why it Matters |
|---|---|
| O(n) is slow for large n | n = 1,000,000 → up to 1 million comparisons per query |
| Worst case scans everything | Especially bad when the target is absent (no early exit) |
| Repeated queries waste work | If you search the same array 1000 times, you do 1000 × O(n) = O(n × queries) work. A hash table would cost O(n + queries). |
| No fault tolerance | A typo in the target → "not found." No fuzzy matching built-in. |
Rule of thumb: - n < 100, query is rare → linear search. - n large, single query, data already sorted → binary search. - n large, many queries → build a hash set / index, then O(1) lookups.
Step-by-Step Walkthrough¶
Let's trace linearSearch([10, 25, 3, 47, 8], 47):
Initial: arr = [10, 25, 3, 47, 8], target = 47
^
i = 0
Step 1: i=0, arr[0]=10, 10 != 47 → i++
Step 2: i=1, arr[1]=25, 25 != 47 → i++
Step 3: i=2, arr[2]=3, 3 != 47 → i++
Step 4: i=3, arr[3]=47, 47 == 47 → return 3 ✓
Found at index 3. Total comparisons: 4 (out of 5 possible).
Now trace linearSearch([10, 25, 3, 47, 8], 99):
Initial: arr = [10, 25, 3, 47, 8], target = 99
Step 1: i=0, arr[0]=10, 10 != 99 → i++
Step 2: i=1, arr[1]=25, 25 != 99 → i++
Step 3: i=2, arr[2]=3, 3 != 99 → i++
Step 4: i=3, arr[3]=47, 47 != 99 → i++
Step 5: i=4, arr[4]=8, 8 != 99 → i++
Step 6: i=5, i >= n=5 → exit loop, return -1
Not found. Total comparisons: 5 (the worst case).
Code Examples¶
Go — Basic Linear Search¶
package linsearch
// LinearSearch returns the index of the first occurrence of target in arr,
// or -1 if target is not present.
func LinearSearch(arr []int, target int) int {
for i, v := range arr {
if v == target {
return i
}
}
return -1
}
Go — Find All Occurrences¶
func FindAll(arr []int, target int) []int {
var indices []int
for i, v := range arr {
if v == target {
indices = append(indices, i)
}
}
return indices
}
Go — Find First and Find Last¶
func FindFirst(arr []int, target int) int {
for i, v := range arr {
if v == target {
return i // early exit on first match
}
}
return -1
}
func FindLast(arr []int, target int) int {
// Scan right-to-left → first match IS the last occurrence.
for i := len(arr) - 1; i >= 0; i-- {
if arr[i] == target {
return i
}
}
return -1
}
Java — Basic Linear Search¶
public class LinearSearch {
public static int search(int[] arr, int target) {
for (int i = 0; i < arr.length; i++) {
if (arr[i] == target) {
return i;
}
}
return -1;
}
// Generic version using equals() — works for any type.
public static <T> int search(T[] arr, T target) {
for (int i = 0; i < arr.length; i++) {
if (arr[i] == null ? target == null : arr[i].equals(target)) {
return i;
}
}
return -1;
}
}
Java — Find All Occurrences¶
import java.util.ArrayList;
import java.util.List;
public static List<Integer> findAll(int[] arr, int target) {
List<Integer> indices = new ArrayList<>();
for (int i = 0; i < arr.length; i++) {
if (arr[i] == target) {
indices.add(i);
}
}
return indices;
}
Python — Basic Linear Search¶
def linear_search(arr: list[int], target: int) -> int:
"""Return the index of target in arr, or -1 if not present."""
for i, value in enumerate(arr):
if value == target:
return i
return -1
Python — Find All Occurrences¶
def find_all(arr: list[int], target: int) -> list[int]:
"""Return all indices where target appears."""
return [i for i, v in enumerate(arr) if v == target]
Python — First and Last¶
def find_first(arr, target):
for i, v in enumerate(arr):
if v == target:
return i
return -1
def find_last(arr, target):
for i in range(len(arr) - 1, -1, -1):
if arr[i] == target:
return i
return -1
Coding Patterns¶
Pattern 1: Early Return¶
def linear_search(arr, target):
for i, v in enumerate(arr):
if v == target:
return i # ← Early return: no need to continue
return -1
Why: Avoids unnecessary work after the first match. Cuts average runtime in half on uniformly distributed targets.
Pattern 2: Sentinel (Place Target at End)¶
def sentinel_search(arr, target):
arr_copy = arr + [target] # append target as sentinel
i = 0
while arr_copy[i] != target:
i += 1
if i < len(arr):
return i
return -1
Why: The inner while loop has only one condition to check (arr[i] != target) instead of two (i < n AND arr[i] != target). Halves the per-iteration work. Used to be a big win in the 1980s; modern CPUs make the savings small but still measurable in tight loops.
Note: The sentinel pattern modifies (or copies) the array. In a real system you'd want to operate on a buffer that already has space for the sentinel, to avoid the copy.
Pattern 3: Predicate Search (Find First Matching)¶
def find_first_matching(arr, predicate):
"""Find first element where predicate(x) is True."""
for i, v in enumerate(arr):
if predicate(v):
return i
return -1
# Usage
users = [{"id": 1, "age": 22}, {"id": 2, "age": 35}, {"id": 3, "age": 41}]
idx = find_first_matching(users, lambda u: u["age"] > 30) # → 1
Why: Generalizes from "equality match" to "any condition." This is what Stream.filter().findFirst() (Java), arr.find(...) (JS), next(... for ... in ... if ...) (Python) all do under the hood.
Pattern 4: Use the Built-In¶
In every modern language, there's a built-in linear-search function. Use it in production code — it's been tuned by language maintainers.
# Python
if target in arr: # uses C-level linear scan
idx = arr.index(target) # raises ValueError if absent
# Java
boolean present = list.contains(target);
int idx = list.indexOf(target);
// Go (1.21+)
import "slices"
present := slices.Contains(arr, target)
idx := slices.Index(arr, target)
Error Handling¶
Empty Array¶
Linear search on an empty array should return -1 (or None). The loop body never executes; the function falls through to the return -1.
def linear_search(arr, target):
for i, v in enumerate(arr): # if arr is [], loop body never runs
if v == target:
return i
return -1 # ← reached immediately for empty arr
assert linear_search([], 5) == -1
Null / None Array¶
Decide early: do you accept None and treat it as "empty," or do you raise?
public static int search(int[] arr, int target) {
if (arr == null) {
throw new IllegalArgumentException("arr must not be null");
}
...
}
Comparing Floats¶
Direct equality on floats is dangerous due to rounding (0.1 + 0.2 != 0.3). Use a tolerance:
def find_close(arr, target, eps=1e-9):
for i, v in enumerate(arr):
if abs(v - target) < eps:
return i
return -1
Comparing NaN¶
NaN != NaN is True in IEEE-754. A naive linear search will never find NaN:
import math
arr = [1.0, 2.0, math.nan, 4.0]
arr.index(math.nan) # ValueError: nan is not in list (in some versions)
Use math.isnan explicitly when searching for NaN.
Performance Tips¶
- Early exit on first match — never scan past it (unless you need all occurrences).
- Place common targets first if you know the access distribution. Frequently-searched values at the front of the array → average comparisons drop.
- Use the language built-in (
in,contains,indexOf). The interpreter / JIT often vectorizes these to SIMD instructions. - For repeated queries on the same data, switch to a hash set — O(1) per lookup after a one-time O(n) build.
- For sorted data, switch to binary search — O(log n) per query.
- For small arrays (n < ~20), don't bother optimizing — even O(n²) algorithms are faster than the overhead of cleverer approaches.
- Avoid expensive comparisons inside the loop — pre-compute hash codes or pre-extract a key.
- Iterate in cache-friendly order — for 2D arrays, iterate row-by-row (row-major) not column-by-column.
Best Practices¶
- Return an index, not a boolean — callers can derive the boolean from
idx >= 0, but they can't derive an index fromtrue. - Document the "not found" sentinel —
-1,None,Optional.empty(), or(0, false)in Go. - Prefer
Optional<T>in Java /option[T]in Rust for type safety where idiomatic. - Keep the loop body small — push complex logic into helper functions to keep the search tight.
- Use
for-each/rangeloops when you don't need the index of non-matching elements — cleaner and harder to off-by-one. - Don't reinvent
contains— use the built-in. Only write your own loop when you need the index, all matches, or a custom predicate. - Test edge cases explicitly — empty array, target at index 0, target at index n-1, target not present, duplicates.
Edge Cases & Pitfalls¶
| Edge Case | Behavior |
|---|---|
| Empty array | Return -1 immediately. No comparisons performed. |
| Single element, match | Return 0. One comparison. |
| Single element, no match | Return -1. One comparison. |
| Target at index 0 | Return 0. One comparison. (Best case) |
| Target at index n-1 | Return n-1. n comparisons. |
| Target not present | Return -1. n comparisons. (Worst case) |
| Duplicates | Return first occurrence (or as documented — first vs last vs all). |
| All elements same as target | Return 0. One comparison. |
| All elements same, target different | Return -1. n comparisons. |
| Array length = 0 vs nil/null | Handle both — len(nil) == 0 in Go is safe; null.length is a NPE in Java. |
Common Mistakes¶
Mistake 1: Off-by-one in the Loop Bound¶
# BUG: range(len(arr) - 1) misses the last element
for i in range(len(arr) - 1): # ❌ should be range(len(arr))
if arr[i] == target:
return i
Fix: Use range(len(arr)) — or better, enumerate(arr).
Mistake 2: Returning True Instead of the Index¶
# BUG: callers can't use this for arr[idx] = new_value
def linear_search(arr, target):
for v in arr:
if v == target:
return True
return False
Fix: Return the index. If callers want a boolean, they write idx >= 0.
Mistake 3: Forgetting the "Not Found" Case¶
// BUG: no return after the loop — compile error in Go.
// In dynamic languages it returns None/undefined silently.
func search(arr []int, target int) int {
for i, v := range arr {
if v == target {
return i
}
}
// missing return -1
}
Fix: Always have a return -1 (or equivalent) after the loop.
Mistake 4: Continuing After a Found Match¶
# BUG: returns the LAST match, not the first.
def linear_search(arr, target):
found = -1
for i, v in enumerate(arr):
if v == target:
found = i
# missing break / early return
return found
Fix: return i immediately on match, or break out of the loop.
Mistake 5: Modifying the Array While Iterating¶
# BUG: removing elements shifts indices — you skip the next element.
for i, v in enumerate(arr):
if v == target:
arr.pop(i) # ❌ corrupts iteration
Fix: Build a new list, iterate over a copy, or iterate in reverse.
Mistake 6: Comparing Mutable Objects by Identity¶
# BUG: `is` checks object identity, not equality.
for v in arr:
if v is target: # ❌ should be ==
return ...
Fix: Use == (or .equals() in Java) for value equality.
Cheat Sheet¶
┌─────────────────────────┐
│ LINEAR SEARCH │
├─────────────────────────┤
│ Best: O(1) │
│ Average: O(n) │
│ Worst: O(n) │
│ Space: O(1) │
│ Stable: N/A (search) │
│ Sort?: not required │
└─────────────────────────┘
PSEUDOCODE
----------
function linear_search(arr, target):
for i from 0 to length(arr) - 1:
if arr[i] == target:
return i
return -1
WHEN TO USE
-----------
✓ Data is unsorted
✓ n is small (< 100)
✓ Single query
✓ Iterator / streaming source (no random access)
✓ Custom predicate (not just equality)
WHEN NOT TO USE
---------------
✗ n is large AND data is sorted → binary search
✗ Many lookups on the same data → hash set / hash map
✗ Range queries ("all values 5..10") → BST or sorted array + binary search
✗ Prefix lookup on strings → trie
Visual Animation¶
See animation.html for an interactive visualization. Watch the scan position move left-to-right, with each cell colored:
- Blue → not yet visited
- Yellow → currently being inspected
- Red → mismatch, moving on
- Green → match found, search ends
Use the speed slider to step through one comparison at a time. Try the preset "target at end" to see the worst case.
Summary¶
Linear search is the plain, honest, no-prerequisites search algorithm. It scans every element until it finds the target, taking O(n) time and O(1) space. It is the only choice when data is unsorted and you have no index. It is the fastest choice for tiny arrays (n < ~20) due to cache effects and low constant factors.
Key facts:
- O(n) time, O(1) space.
- No preconditions — works on anything iterable.
- Early exit on first match keeps the average case at ~n/2 comparisons.
- Used everywhere —
in,contains,indexOf,findare all linear search. - Not the fastest for large n — but the simplest, and often "fast enough."
Master linear search and you'll understand the base case all other search algorithms try to improve upon.
Further Reading¶
- Cormen, Leiserson, Rivest, Stein — Introduction to Algorithms, Chapter 2.1 (Linear Search)
- Sedgewick & Wayne — Algorithms (4th ed.), Section 1.4 (Sequential search)
- Knuth — The Art of Computer Programming, Vol. 3, Section 6.1 (Sequential Searching)
- Go standard library:
slices.Index,slices.Contains— pkg.go.dev/slices - Java standard library:
Collection.contains,List.indexOf,Arrays.asList(...).indexOf(...) - Python standard library:
list.index,inoperator - Linus Torvalds on cache effects: "arrays are simpler than linked lists"
- Continue to
middle.mdfor sentinel search, bidirectional, parallel, and SIMD variants.