Generating Permutations, Subsets, and Combinations — Middle Level¶

Focus: Getting duplicate handling exactly right (sort + skip equal siblings for subsets-II and permutations-II), the two flavors of combination sum (with and without repetition), swap vs used-array permutations and their trade-offs, lexicographic next_permutation, Gray code, and choosing the right tool by comparing them head to head.

Introduction¶

At junior level the message was: backtracking is choose → recurse → un-choose, and the tree shape decides whether you get subsets, permutations, or combinations. At middle level the questions become sharper and more practical:

The input has repeated elements. How do I emit each distinct answer exactly once — without deduping with a hash set afterward (which is slow and memory-hungry)?
Combination sum comes in two flavors: each number can be used unlimited times, or each at most once. The single-line difference between them is a classic interview discriminator.
Should I write permutations with a used[] array or with in-place swaps? They have different memory profiles and different ease of handling duplicates.
How does the standard-library next_permutation actually work, and why is it O(n) per step with no recursion?
What is Gray code, and why would I ever want subsets ordered so consecutive ones differ by a single element?

These are the questions that separate "I can write subsets from memory" from "I can pick the correct variant and prove it emits each answer once."

Deeper Concepts¶

Levels of the tree vs depth of the tree¶

Two different "directions" matter in a backtracking tree, and confusing them is the root of most duplicate-handling bugs:

Depth (vertical) — how many choices you have committed so far. Reusing the same element deeper is forbidden by passing i+1 (subsets/combinations) or by used[] (permutations).
Level / siblings (horizontal) — the alternative choices at one node. Picking equal-valued siblings here is what creates duplicate answers when the input has repeats.

The duplicate rule in one sentence: at a single tree level (the for loop of one call), never start two branches with the same value. That is "skip equal siblings."

Why a hash-set dedup is the wrong instinct¶

The tempting fix for duplicates is: generate everything, throw it in a Set, done. It works but it is wasteful — you still do the exponential redundant work and then pay extra memory hashing canonical forms. The sort+skip approach never generates the duplicates in the first place, so it is both faster and lighter. Always prefer pruning over post-filtering.

Counting distinct outputs¶

With repeats, the number of distinct results shrinks. For a multiset with element multiplicities m₁, m₂, …:

Distinct permutations = n! / (m₁! · m₂! · …) (the multinomial coefficient). E.g. [1,2,2] has 3!/2! = 3.
Distinct subsets = (m₁+1)(m₂+1)… (each distinct value contributes 0..mᵢ copies). E.g. [2,2,2] has 3+1 = 4 subsets.

Your generator must hit exactly these counts — a great sanity test.

Comparison with Alternatives¶

Approach	Output	Extra space	Duplicate-friendly	Notes
Recursive include/exclude (subsets)	leaves only	O(n) stack	yes (sort+skip)	clean binary tree
Recursive start-index (subsets/combinations)	every node / size-k	O(n) stack	yes (sort+skip)	natural pruning
Bit enumeration (subsets)	iterative	O(1)	awkward (needs canonical mask)	best for small `n`
Used-array permutations	leaves	O(n)	yes (sort + `!used[i-1]`)	does not mutate input
Swap permutations	leaves	O(1)	trickier (needs seen-set per level)	mutates input, fastest
`next_permutation` loop	one at a time	O(1)	naturally skips dups	streaming, lexicographic
Gray code	iterative	O(1)	n/a	single-element-change ordering
Generate-then-dedup with a Set	all + filter	O(output)	yes but wasteful	avoid; does redundant work

Rule of thumb: for plain subsets and small n, bit enumeration. For combinations, start-index with pruning. For permutations you must extend to duplicates, used-array (the !used[i-1] skip is the cleanest). For streaming one permutation at a time (memory-bounded), next_permutation.

Duplicate Handling Done Right¶

Subsets-II (input may repeat)¶

Sort, then in each call's loop skip a value equal to its immediate left sibling:

sort(nums)
subsets(start, current):
    emit copy(current)
    for i in start..n-1:
        if i > start and nums[i] == nums[i-1]:
            continue                 # equal sibling already explored at this level
        current.append(nums[i])
        subsets(i+1, current)
        current.pop()

i > start is critical: the first occurrence at a level (where i == start) is allowed; later equal occurrences are skipped. This lets you pick [2,2] (going deeper) but never two different sibling 2s that would yield the same [2].

Permutations-II (input may repeat)¶

Sort, then add a second condition to the used-array loop: only use the first unused copy of each value at a given depth.

sort(nums)
permute(current):
    if len(current)==n: emit copy(current); return
    for i in 0..n-1:
        if used[i]: continue
        if i > 0 and nums[i]==nums[i-1] and not used[i-1]:
            continue                 # the previous equal copy wasn't placed yet → skip
        used[i]=true; current.append(nums[i])
        permute(current)
        current.pop(); used[i]=false

The condition nums[i]==nums[i-1] && !used[i-1] enforces a canonical order among equal copies: copy i may be placed only after copy i-1 has already been placed in this branch. That breaks the symmetry between identical elements so each distinct ordering is produced exactly once.

Why !used[i-1] and not used[i-1]? Picture two identical 2s. We force them to always be placed in left-to-right index order. If i-1 is not yet used, then placing i first would later allow placing i-1, producing the same multiset arrangement reached the other way — so we skip. If i-1 is used, we are correctly placing the second copy after the first. (Both used[i-1] and !used[i-1] variants can be made correct with the matching loop order; !used[i-1] is the standard, well-tested form.)

Combination Sum Variants¶

Given candidates and a target, list all combinations summing to target.

Without repetition (each candidate used at most once) — "Combination Sum II"¶

Sort; recurse with i+1; skip equal siblings (same rule as subsets-II); prune when the running sum exceeds target.

sort(candidates)
solve(start, remaining, current):
    if remaining == 0: emit copy(current); return
    for i in start..n-1:
        if i > start and candidates[i]==candidates[i-1]: continue   # skip dup sibling
        if candidates[i] > remaining: break                          # sorted → no further fits
        current.append(candidates[i])
        solve(i+1, remaining-candidates[i], current)                # i+1: use once
        current.pop()

With repetition (each candidate reusable) — "Combination Sum"¶

Recurse with i (not i+1) so the same candidate can be chosen again; no dup-skip needed if candidates are distinct.

sort(candidates)               # distinct candidates
solve(start, remaining, current):
    if remaining == 0: emit copy(current); return
    for i in start..n-1:
        if candidates[i] > remaining: break        # prune
        current.append(candidates[i])
        solve(i, remaining-candidates[i], current) # i: reuse allowed
        current.pop()

The single discriminator: recurse with i+1 to use each candidate once, or with i to allow reuse. The break on candidates[i] > remaining (valid only because we sorted) is the key pruning that keeps both fast.

Swap vs Used-Array Permutations¶

Used-array: does not mutate the input, easy to extend to duplicates, costs O(n) for the boolean array.

Swap-based: mutates the input array in place, O(1) extra space, slightly faster. To handle duplicates with swaps you need a per-level "seen value" set (you cannot use the !used[i-1] trick because swapping destroys the sorted adjacency):

permuteSwap(start):
    if start==n: emit copy(nums); return
    seen = empty set
    for i in start..n-1:
        if nums[i] in seen: continue   # skip duplicate value at this level
        seen.add(nums[i])
        swap(nums, start, i)
        permuteSwap(start+1)
        swap(nums, start, i)           # swap back

Aspect	Used-array	Swap
Extra space	O(n)	O(1) (O(n) per level for dup-set)
Mutates input	no	yes
Lexicographic order	yes (if input sorted)	no (swaps scramble order)
Duplicate handling	`!used[i-1]` (clean)	per-level seen-set
Typical choice	interviews, clarity	tight memory, no order needed

If the problem demands lexicographic output, prefer used-array (with sorted input) or next_permutation; swap-based produces an arbitrary order.

Lexicographic next_permutation¶

next_permutation transforms an array into the next permutation in lexicographic (dictionary) order, in place, in O(n). It is how you stream permutations one at a time without recursion or storing them all. The algorithm (the same one in C++'s STL and Python via itertools logic):

next_permutation(a):
    # 1. find the largest i with a[i] < a[i+1] (the "pivot")
    i = n - 2
    while i >= 0 and a[i] >= a[i+1]: i -= 1
    if i < 0:                       # whole array descending → last permutation
        reverse(a); return false    # wrap around to the first (ascending)
    # 2. find the largest j > i with a[j] > a[i]
    j = n - 1
    while a[j] <= a[i]: j -= 1
    # 3. swap pivot with that successor
    swap(a, i, j)
    # 4. reverse the suffix after i (it was descending → make it ascending = smallest)
    reverse(a, i+1, n-1)
    return true

Starting from the sorted array and calling next_permutation until it returns false enumerates all permutations in lex order, and it naturally skips duplicates (because equal elements never produce a strictly-larger arrangement at the swap step). The correctness argument lives in professional.md; the intuition: you bump the rightmost position you can increase, then make everything after it as small as possible.

Gray Code¶

A Gray code lists all 2^n bit patterns so that consecutive patterns differ in exactly one bit — i.e. consecutive subsets differ by adding or removing a single element. The standard binary-reflected Gray code has a one-line formula:

gray(i) = i XOR (i >> 1)        # the i-th Gray code, for i in 0..2^n-1

So the sequence for n=2 is 00, 01, 11, 10 (decimal 0,1,3,2), each adjacent pair differing in one bit. Uses: minimizing state changes (mechanical encoders, Karnaugh maps), and generating subsets where each step is a cheap single toggle (useful when evaluating a function incrementally over subsets). To recover the integer from a Gray code, XOR-fold:

fromGray(g):
    n = g
    while g >>= 1: n ^= g
    return n

Code Examples¶

Subsets-II, Permutations-II, and both Combination Sums¶

Go¶

package main

import (
    "fmt"
    "sort"
)

func subsetsWithDup(nums []int) [][]int {
    sort.Ints(nums)
    var res [][]int
    var cur []int
    var bt func(start int)
    bt = func(start int) {
        res = append(res, append([]int(nil), cur...))
        for i := start; i < len(nums); i++ {
            if i > start && nums[i] == nums[i-1] {
                continue
            }
            cur = append(cur, nums[i])
            bt(i + 1)
            cur = cur[:len(cur)-1]
        }
    }
    bt(0)
    return res
}

func permuteUnique(nums []int) [][]int {
    sort.Ints(nums)
    var res [][]int
    used := make([]bool, len(nums))
    var cur []int
    var bt func()
    bt = func() {
        if len(cur) == len(nums) {
            res = append(res, append([]int(nil), cur...))
            return
        }
        for i := 0; i < len(nums); i++ {
            if used[i] {
                continue
            }
            if i > 0 && nums[i] == nums[i-1] && !used[i-1] {
                continue
            }
            used[i] = true
            cur = append(cur, nums[i])
            bt()
            cur = cur[:len(cur)-1]
            used[i] = false
        }
    }
    bt()
    return res
}

func combinationSum(cand []int, target int) [][]int {
    sort.Ints(cand)
    var res [][]int
    var cur []int
    var bt func(start, rem int)
    bt = func(start, rem int) {
        if rem == 0 {
            res = append(res, append([]int(nil), cur...))
            return
        }
        for i := start; i < len(cand); i++ {
            if cand[i] > rem {
                break
            }
            cur = append(cur, cand[i])
            bt(i, rem-cand[i]) // i: reuse allowed
            cur = cur[:len(cur)-1]
        }
    }
    bt(0, target)
    return res
}

func combinationSum2(cand []int, target int) [][]int {
    sort.Ints(cand)
    var res [][]int
    var cur []int
    var bt func(start, rem int)
    bt = func(start, rem int) {
        if rem == 0 {
            res = append(res, append([]int(nil), cur...))
            return
        }
        for i := start; i < len(cand); i++ {
            if i > start && cand[i] == cand[i-1] {
                continue
            }
            if cand[i] > rem {
                break
            }
            cur = append(cur, cand[i])
            bt(i+1, rem-cand[i]) // i+1: use once
            cur = cur[:len(cur)-1]
        }
    }
    bt(0, target)
    return res
}

func main() {
    fmt.Println(subsetsWithDup([]int{1, 2, 2}))
    fmt.Println(permuteUnique([]int{1, 1, 2}))
    fmt.Println(combinationSum([]int{2, 3, 6, 7}, 7))
    fmt.Println(combinationSum2([]int{10, 1, 2, 7, 6, 1, 5}, 8))
}

Java¶

import java.util.*;

public class Combinatorics {
    static List<List<Integer>> subsetsWithDup(int[] a) {
        Arrays.sort(a);
        List<List<Integer>> res = new ArrayList<>();
        sub(a, 0, new ArrayList<>(), res);
        return res;
    }
    static void sub(int[] a, int s, List<Integer> cur, List<List<Integer>> res) {
        res.add(new ArrayList<>(cur));
        for (int i = s; i < a.length; i++) {
            if (i > s && a[i] == a[i - 1]) continue;
            cur.add(a[i]); sub(a, i + 1, cur, res); cur.remove(cur.size() - 1);
        }
    }

    static List<List<Integer>> permuteUnique(int[] a) {
        Arrays.sort(a);
        List<List<Integer>> res = new ArrayList<>();
        perm(a, new boolean[a.length], new ArrayList<>(), res);
        return res;
    }
    static void perm(int[] a, boolean[] used, List<Integer> cur, List<List<Integer>> res) {
        if (cur.size() == a.length) { res.add(new ArrayList<>(cur)); return; }
        for (int i = 0; i < a.length; i++) {
            if (used[i]) continue;
            if (i > 0 && a[i] == a[i - 1] && !used[i - 1]) continue;
            used[i] = true; cur.add(a[i]);
            perm(a, used, cur, res);
            cur.remove(cur.size() - 1); used[i] = false;
        }
    }

    static List<List<Integer>> combinationSum(int[] c, int target) {
        Arrays.sort(c);
        List<List<Integer>> res = new ArrayList<>();
        cs(c, 0, target, new ArrayList<>(), res, false);
        return res;
    }
    static List<List<Integer>> combinationSum2(int[] c, int target) {
        Arrays.sort(c);
        List<List<Integer>> res = new ArrayList<>();
        cs(c, 0, target, new ArrayList<>(), res, true);
        return res;
    }
    static void cs(int[] c, int s, int rem, List<Integer> cur,
                   List<List<Integer>> res, boolean once) {
        if (rem == 0) { res.add(new ArrayList<>(cur)); return; }
        for (int i = s; i < c.length; i++) {
            if (once && i > s && c[i] == c[i - 1]) continue;
            if (c[i] > rem) break;
            cur.add(c[i]);
            cs(c, once ? i + 1 : i, rem - c[i], cur, res, once);
            cur.remove(cur.size() - 1);
        }
    }

    public static void main(String[] args) {
        System.out.println(subsetsWithDup(new int[]{1, 2, 2}));
        System.out.println(permuteUnique(new int[]{1, 1, 2}));
        System.out.println(combinationSum(new int[]{2, 3, 6, 7}, 7));
        System.out.println(combinationSum2(new int[]{10, 1, 2, 7, 6, 1, 5}, 8));
    }
}

Python¶

def subsets_with_dup(nums):
    nums.sort()
    res, cur = [], []

    def bt(start):
        res.append(cur[:])
        for i in range(start, len(nums)):
            if i > start and nums[i] == nums[i - 1]:
                continue
            cur.append(nums[i]); bt(i + 1); cur.pop()

    bt(0)
    return res


def permute_unique(nums):
    nums.sort()
    res, cur = [], []
    used = [False] * len(nums)

    def bt():
        if len(cur) == len(nums):
            res.append(cur[:]); return
        for i in range(len(nums)):
            if used[i]:
                continue
            if i > 0 and nums[i] == nums[i - 1] and not used[i - 1]:
                continue
            used[i] = True; cur.append(nums[i])
            bt()
            cur.pop(); used[i] = False

    bt()
    return res


def combination_sum(cand, target, once=False):
    cand.sort()
    res, cur = [], []

    def bt(start, rem):
        if rem == 0:
            res.append(cur[:]); return
        for i in range(start, len(cand)):
            if once and i > start and cand[i] == cand[i - 1]:
                continue
            if cand[i] > rem:
                break
            cur.append(cand[i])
            bt(i + 1 if once else i, rem - cand[i])
            cur.pop()

    bt(0, target)
    return res


if __name__ == "__main__":
    print(subsets_with_dup([1, 2, 2]))
    print(permute_unique([1, 1, 2]))
    print(combination_sum([2, 3, 6, 7], 7))
    print(combination_sum([10, 1, 2, 7, 6, 1, 5], 8, once=True))

next_permutation and Gray code¶

Python¶

def next_permutation(a):
    n = len(a)
    i = n - 2
    while i >= 0 and a[i] >= a[i + 1]:
        i -= 1
    if i < 0:
        a.reverse()
        return False
    j = n - 1
    while a[j] <= a[i]:
        j -= 1
    a[i], a[j] = a[j], a[i]
    a[i + 1:] = reversed(a[i + 1:])
    return True


def all_permutations_lex(a):
    a = sorted(a)
    res = [a[:]]
    while next_permutation(a):
        res.append(a[:])
    return res


def gray_code(n):
    return [i ^ (i >> 1) for i in range(1 << n)]


if __name__ == "__main__":
    print(all_permutations_lex([1, 2, 2]))   # 3 distinct, lex order
    print(gray_code(2))                       # [0, 1, 3, 2]

Run: go run main.go | javac Combinatorics.java && java Combinatorics | python combos.py

Error Handling¶

Error	Cause	Fix
Duplicate answers still appear	Forgot to sort before the skip rule.	Sort first; equal elements must be adjacent.
Subsets-II drops valid subsets	Used `i > 0` instead of `i > start`.	Skip only later siblings: `i > start`.
Permutations-II wrong count	Used `used[i-1]` instead of `!used[i-1]`.	Use the canonical `!used[i-1]` form.
Combination sum infinite loop	A candidate equals 0 (reuse case).	Forbid/strip zeros for the reuse variant.
Combination Sum II reuses an element	Recursed with `i` instead of `i+1`.	Use `i+1` for the use-once variant.
next_permutation never terminates loop	Did not handle the descending wrap-around.	Return false when no pivot found.
Wrong Gray code adjacency	Used plain binary instead of `i ^ (i>>1)`.	Apply the reflected-Gray formula.

Performance Analysis¶

Sort cost is O(n log n), negligible against exponential output; always pay it once for dup-handling and pruning.
Pruning by break in combination sum (sorted candidates) prunes whole sorted tails — often turns intractable inputs feasible.
Skip-equal-siblings reduces work to the distinct-output count, strictly better than generate-then-dedup.
next_permutation amortizes to O(1) per call over a full enumeration (a known result), so streaming all n! permutations is O(n · n!) total — optimal.
Gray code is O(1) per element via the XOR formula, with O(1) extra space.
Swap permutations avoid the used[] allocation but lose lexicographic order; only choose them when order is irrelevant.

Best Practices¶

Memorize the two skip rules verbatim: subsets/comb-sum-II use i > start && a[i]==a[i-1]; permutations use a[i]==a[i-1] && !used[i-1].
Always sort first when duplicates are possible or when prefix-sum pruning helps.
Encode the use-once vs reuse decision as the single argument i+1 vs i; comment it explicitly.
Prune combination sum with a sorted break on candidate > remaining — do not test every candidate.
Validate generators by comparing output counts to the multinomial / product formulas on small inputs.
For memory-bounded streaming, prefer next_permutation over collecting all permutations.

Visual Animation¶

See animation.html for an interactive visualization.

The animation demonstrates: - The decision tree for subsets and permutations, with each choice made and un-made - How skipping equal siblings would prune duplicate branches - The partial solution growing and shrinking as the search backtracks - Completed answers being emitted, with a mode toggle and step controls

Summary¶

The middle-level skills are all about precision. Duplicate handling is "sort, then skip equal siblings," with the exact condition depending on whether you are building subsets (i > start) or permutations (!used[i-1]). Combination sum is one template with a single switch: recurse i to reuse a candidate, i+1 to use it once — both sped up by a sorted break. Permutations come in used-array (clear, order-preserving, easy duplicates) and swap (O(1) space, no order, per-level seen-set for duplicates) flavors. For streaming, next_permutation walks lexicographic order in O(n) per step and skips duplicates for free; Gray code orders subsets by single-element changes via i ^ (i>>1). Knowing which variant to reach for — and proving it emits each answer exactly once — is the mark of fluency.