Profile DP / Broken Profile DP (Tiling a Grid with a Frontier Bitmask) — Practice Tasks¶
All tasks must be solved in Go, Java, and Python. Each task ships with starter code or a precise spec in all three languages. Implement the profile-DP logic and validate against the evaluation criteria. Always test against a brute-force backtracking oracle on small grids before trusting the profile-DP result. Anchor checks:
2 × n= FibonacciF(n+1),8 × 8= 12988816, odd area = 0.
Beginner Tasks (5)¶
Task 1 — Single broken-profile step (one cell transition)¶
Problem. Implement step(dp, r, c, n, m) that, given the current per-mask counts dp (length 2^m) at cell (r, c), returns the next-cell counts ndp by applying the three transitions: filled→advance, free→vertical, free→horizontal. Reduce mod 10^9 + 7.
Input / Output spec. - Read n, m, r, c, then 2^m integers for dp. - Print ndp as 2^m space-separated integers.
Constraints. - 1 ≤ m ≤ 12, 0 ≤ r < n, 0 ≤ c < m, entries in [0, p).
Hint. bit = 1<<c. If mask & bit: ndp[mask ^ bit] += dp[mask]. Else: vertical ndp[mask|bit] if r+1<n; horizontal ndp[mask|(1<<(c+1))] if c+1<m and right cell free.
Starter — Go.
package main
import "fmt"
const MOD = 1_000_000_007
func step(dp []int64, r, c, n, m int) []int64 {
// TODO: apply filled/vertical/horizontal transitions, mod MOD
return nil
}
func main() {
var n, m, r, c int
fmt.Scan(&n, &m, &r, &c)
full := 1 << m
dp := make([]int64, full)
for i := range dp {
fmt.Scan(&dp[i])
}
ndp := step(dp, r, c, n, m)
for i, v := range ndp {
if i > 0 {
fmt.Print(" ")
}
fmt.Print(v)
}
fmt.Println()
}
Starter — Java.
import java.util.*;
public class Task1 {
static final long MOD = 1_000_000_007L;
static long[] step(long[] dp, int r, int c, int n, int m) {
// TODO
return null;
}
public static void main(String[] args) {
Scanner sc = new Scanner(System.in);
int n = sc.nextInt(), m = sc.nextInt(), r = sc.nextInt(), c = sc.nextInt();
int full = 1 << m;
long[] dp = new long[full];
for (int i = 0; i < full; i++) dp[i] = sc.nextLong();
long[] ndp = step(dp, r, c, n, m);
StringBuilder sb = new StringBuilder();
for (int i = 0; i < full; i++) {
if (i > 0) sb.append(' ');
sb.append(ndp[i]);
}
System.out.println(sb);
}
}
Starter — Python.
import sys
MOD = 1_000_000_007
def step(dp, r, c, n, m):
# TODO
return []
def main():
data = iter(sys.stdin.read().split())
n, m, r, c = (int(next(data)) for _ in range(4))
full = 1 << m
dp = [int(next(data)) for _ in range(full)]
ndp = step(dp, r, c, n, m)
print(" ".join(map(str, ndp)))
if __name__ == "__main__":
main()
Evaluation criteria. - Correct three-case transition, verified on a 2×3 first cell (dp[0]=1, rest 0 → two outputs set). - No overflow (reduce after each add). - Bounds guards r+1<n and c+1<m respected.
Task 2 — Count domino tilings of n × m¶
Problem. Count domino tilings of an n × m grid, mod 10^9 + 7, using the broken-profile sweep. Rotate so m = min(n, m) and reject odd area.
Input / Output spec. - Read n, m. Print the count mod p.
Constraints. 1 ≤ n ≤ 10^5, 1 ≤ min(n,m) ≤ 16.
Hint. Roll a 2^m array cell by cell; answer is dp[0] after all cells.
Reference oracle (Python) — use this to validate.
def brute_tilings(n, m):
filled = [[False] * m for _ in range(n)]
def first_free():
for r in range(n):
for c in range(m):
if not filled[r][c]:
return r, c
return None
def rec():
cell = first_free()
if cell is None:
return 1
r, c = cell
total = 0
if r + 1 < n and not filled[r + 1][c]:
filled[r][c] = filled[r + 1][c] = True
total += rec()
filled[r][c] = filled[r + 1][c] = False
if c + 1 < m and not filled[r][c + 1]:
filled[r][c] = filled[r][c + 1] = True
total += rec()
filled[r][c] = filled[r][c + 1] = False
return total
return rec()
Evaluation criteria. - 2×3 → 3, 2×4 → 5, 8×8 → 12988816. - Matches brute_tilings for n, m ≤ 6. - O(n·m·2^m); rotates so m is the narrow dimension.
Task 3 — 2 × n equals Fibonacci¶
Problem. Verify that your profile DP for a 2 × n board returns Fibonacci(n+1) (with F(1)=1, F(2)=1). Read n, print both the profile-DP count and F(n+1) and confirm they match.
Input / Output spec. - Read n. Print count(2, n) and F(n+1) mod p (they must be equal).
Constraints. 1 ≤ n ≤ 10^6.
Hint. Run count_tilings(2, n) and a simple iterative Fibonacci; assert equality.
Worked check. n=1→1, n=2→2, n=3→3, n=4→5, n=5→8 (Fibonacci shape).
Evaluation criteria. - The two values are equal for all tested n. - Fibonacci computed iteratively mod p. - Demonstrates the transfer-matrix T = [[1,1],[1,0]] intuition.
Task 4 — Reject impossible boards¶
Problem. Given n, m, return the tiling count, but first short-circuit the impossible cases: odd area returns 0. Print a clear 0 for those and the real count otherwise.
Input / Output spec. - Read n, m. Print the count (0 for odd area).
Constraints. 1 ≤ n, m ≤ 10^4 with min(n,m) ≤ 14.
Hint. if (n*m) % 2 != 0: return 0 before allocating 2^m.
Evaluation criteria. - 1×1 → 0, 3×3 → 0, 1×2 → 1, 1×3 → 0. - Odd-area check happens before allocation. - Matches brute force for small grids.
Task 5 — Tilings with blocked cells¶
Problem. Given an n × m grid where some cells are blocked (cannot be covered), count domino tilings of the free cells, mod 10^9 + 7. A tiling must cover every free cell exactly once and never touch a blocked cell.
Input / Output spec. - Read n, m, then n rows of m chars (. free, # blocked). Print the count.
Constraints. 1 ≤ n ≤ 1000, 1 ≤ min(n,m) ≤ 14.
Hint. Treat a blocked cell as "already covered": when the sweep reaches it, advance (clear its bit) and never place a domino onto it. A vertical/horizontal landing on a blocked cell is illegal.
Reference oracle (Python).
def brute_blocked(grid):
n, m = len(grid), len(grid[0])
filled = [[grid[r][c] == '#' for c in range(m)] for r in range(n)]
def first_free():
for r in range(n):
for c in range(m):
if not filled[r][c]:
return r, c
return None
def rec():
cell = first_free()
if cell is None:
return 1
r, c = cell
total = 0
if r + 1 < n and not filled[r + 1][c]:
filled[r][c] = filled[r + 1][c] = True
total += rec()
filled[r][c] = filled[r + 1][c] = False
if c + 1 < m and not filled[r][c + 1]:
filled[r][c] = filled[r][c + 1] = True
total += rec()
filled[r][c] = filled[r][c + 1] = False
return total
return rec()
Evaluation criteria. - Blocked cells are never covered by a domino. - Matches brute_blocked for small grids. - An odd number of free cells yields 0.
Intermediate Tasks (5)¶
Task 6 — Count grid independent sets¶
Problem. Count subsets of cells of an n × m grid with no two orthogonally adjacent cells selected, mod 10^9 + 7. Use profile DP (bit c = "frontier cell in column c selected").
Constraints. 1 ≤ n ≤ 1000, 1 ≤ min(n,m) ≤ 16.
Hint. At cell (r,c): always allow skip (clear bit c); allow select only if up-neighbor (bit c) and left-neighbor (bit c−1) are unset. The answer sums over all final masks (every state is a valid endpoint).
Reference oracle (Python).
def brute_indep(n, m):
cells = [(r, c) for r in range(n) for c in range(m)]
from itertools import product
count = 0
for choice in product([0, 1], repeat=len(cells)):
sel = {cells[i] for i in range(len(cells)) if choice[i]}
ok = True
for (r, c) in sel:
if (r + 1, c) in sel or (r, c + 1) in sel:
ok = False
break
if ok:
count += 1
return count
# brute_indep(2,2) == 7 ; brute_indep(3,3) == 63
Evaluation criteria. - 2×2 → 7, 3×3 → 63. - Sums over all masks (not just mask=0). - Matches brute_indep for n·m ≤ 12.
Task 7 — Maximum-weight independent set on a grid¶
Problem. Given an n × m grid of non-negative weights, select non-adjacent cells maximizing total weight. Use profile DP with max-plus combine.
Constraints. 1 ≤ n ≤ 1000, 1 ≤ min(n,m) ≤ 16, weights in [0, 10^9].
Hint. Same skeleton as Task 6 but track the maximum total weight instead of summing counts; selecting (r,c) adds weight[r][c]. Use a sufficiently negative sentinel for unreachable masks.
Evaluation criteria. - Matches a brute-force max over all independent subsets for small grids. - Correctly forbids selecting adjacent cells. - O(n·m·2^m).
Task 8 — Tilings for large n via transfer matrix¶
Problem. Given a tiny fixed m (m ≤ 5) and a huge n (up to 10^18), count domino tilings of n × m mod 10^9 + 7 using transfer-matrix exponentiation.
Constraints. 1 ≤ m ≤ 5, 1 ≤ n ≤ 10^18.
Hint. Build T[next][mask] via a column recursion, then compute (T^n)[0][0] by binary exponentiation. Validate against the sweep for moderate n.
Reference cross-check. For m=2, Z(2,n) = F(n+1); for any moderate n, the transfer-matrix result must equal the broken-profile sweep.
Evaluation criteria. - count(2, 8) == 34 (F(9)); matches the sweep for n ≤ 100. - O(8^m log n); instant for n = 10^18. - Matrix built once; binary exponentiation correct.
Task 9 — Tilings with L-trominoes¶
Problem. Count tilings of an n × m grid by L-trominoes (all 4 rotations), mod 10^9 + 7. Return 0 if n·m is not divisible by 3.
Constraints. 1 ≤ n ≤ 12, 1 ≤ m ≤ 12, min(n,m) small.
Hint. Use a memoized recursion on (pos, mask) where the mask covers a 2-row band (2m bits), since an L-tromino spans at most 2 rows. At the first free cell, try each rotation that fits; advance and shift the window at row boundaries.
Reference oracle (Python).
def brute_trominoes(n, m):
SHAPES = [
[(0, 0), (0, 1), (1, 0)],
[(0, 0), (0, 1), (1, 1)],
[(0, 0), (1, 0), (1, 1)],
[(0, 0), (1, 0), (1, -1)],
]
filled = [[False] * m for _ in range(n)]
def first_free():
for r in range(n):
for c in range(m):
if not filled[r][c]:
return r, c
return None
def rec():
cell = first_free()
if cell is None:
return 1
r, c = cell
total = 0
for shape in SHAPES:
cells = [(r + dr, c + dc) for dr, dc in shape]
if all(0 <= rr < n and 0 <= cc < m and not filled[rr][cc] for rr, cc in cells):
for rr, cc in cells:
filled[rr][cc] = True
total += rec()
for rr, cc in cells:
filled[rr][cc] = False
return total
if (n * m) % 3 != 0:
return 0
return rec()
# brute_trominoes(2,3) == 2
Evaluation criteria. - 2×3 → 2, 3×2 → 2; n·m not divisible by 3 → 0. - Matches brute_trominoes for small grids. - Frontier mask covers a 2-row band.
Task 10 — Mixed dominoes and L-trominoes¶
Problem. Count tilings of an n × m grid using both dominoes (1×2) and L-trominoes, mod 10^9 + 7. Any cell is covered by exactly one piece.
Constraints. 1 ≤ n ≤ 10, 1 ≤ m ≤ 10.
Hint. Union the transition sets of Task 2 (dominoes) and Task 9 (trominoes) at each free cell. Use the 2-row-band mask so both piece types fit.
Evaluation criteria. - For 2×2, count dominoes-only (2) plus any tromino placements (0, since trominoes can't tile a 4-cell board alone but can mix) — verify against a brute-force oracle that tries both piece types. - Matches a combined brute-force oracle for small grids. - Both piece types correctly enumerated at each free cell.
Advanced Tasks (5)¶
Task 11 — CRT-combined exact tiling count across two primes¶
Problem. Count domino tilings of n × m exactly when the true count may exceed 10^9 + 7. Run the broken-profile sweep under two coprime primes (10^9 + 7 and 998244353) and reconstruct via CRT.
Constraints. 1 ≤ n ≤ 1000, min(n,m) ≤ 12. Guarantee the true answer fits below p₁·p₂.
Hint. Run the identical sweep twice (once per prime), then apply two-prime CRT. The runs are independent and parallelizable.
Two-prime CRT (Python).
Evaluation criteria. - Recovers values larger than a single prime for small grids where the exact count is known (e.g. compare to Python big-int sweep without modulus). - Both modular runs agree with (exact) mod pᵢ. - crt2 output matches the exact integer when it fits below p₁·p₂.
Task 12 — Domino tilings with a per-column coloring constraint¶
Problem. Tile n × m with dominoes, but each horizontal domino must be one of q colors and no two horizontally adjacent (sharing a vertical edge) horizontal dominoes may share a color. Count colored tilings mod 10^9 + 7. (A semiring/weighting extension of the basic count.)
Constraints. 1 ≤ n ≤ 100, min(n,m) ≤ 10, 2 ≤ q ≤ 5.
Hint. Multiply a weight factor into the transition when placing a colored horizontal domino, and extend the frontier to record the color of the most recent horizontal in each column so the adjacency-color constraint can be checked. This generalizes the additive transition to additive-with-multiplicative-weights.
Evaluation criteria. - Reduces to the plain count when q = 1 and the color constraint is dropped. - Matches a brute-force colored-tiling oracle for small grids. - Documents how the frontier was widened to carry color information.
Task 13 — Cached transfer-matrix ladder for many large-n queries¶
Problem. Given a fixed tiny m and Q queries each asking for the tiling count of n_q × m (huge n_q), answer all queries efficiently by precomputing the transfer-matrix squaring ladder T^{2^0}, T^{2^1}, … once, then composing per query.
Constraints. 1 ≤ m ≤ 5, 1 ≤ Q ≤ 10^5, each n_q ≤ 10^18.
Hint. Precompute is O(8^m · 60). Per query: multiply the ladder entries whose bits are set in n_q into a running matrix (start from identity), then read [0][0]. Even better, push the row vector e_0 through the ladder in O(4^m · popcount(n)) per query.
Evaluation criteria. - Ladder built exactly once, reused across all queries. - Per-query cost uses the cached ladder (not a fresh full power per query). - Matches per-query independent transfer-matrix exponentiation.
Task 14 — Profile DP on a board with forbidden cells and large n¶
Problem. Count domino tilings of an n × m grid where a fixed periodic pattern of cells is blocked every row (the same m-bit blocked-mask in every row), for astronomically large n. Build a transfer matrix that respects the per-row block pattern and exponentiate.
Constraints. 1 ≤ m ≤ 6, 1 ≤ n ≤ 10^18, a fixed blocked mask per row.
Hint. In the column recursion that builds T, skip blocked cells (treat them as pre-occupied so no piece covers them) — the per-row map is still fixed, so (T^n)[start][start] with the appropriate start mask gives the count. Define the start/end occupancy carefully given the blocked pattern.
Evaluation criteria. - Reduces to the unblocked count when blocked = 0. - Matches the broken-profile sweep (with the same per-row blocked cells) for moderate n. - Correct handling of the block pattern in the row recursion.
Task 15 — Decide when profile DP is the wrong tool¶
Problem. Given a problem statement, decide whether profile DP applies. Implement classify(rows, cols, query_type, planar3d) returning one of: PROFILE_DP, TRANSFER_MATRIX, KASTELEYN_PFAFFIAN, BRUTE_FORCE, or INTRACTABLE. Justify each.
Constraints / cases to handle. - Small narrow dimension, moderate other, counting tilings → PROFILE_DP. - Tiny narrow dimension, astronomically large other → TRANSFER_MATRIX. - Both dimensions large, planar grid, domino tilings → KASTELEYN_PFAFFIAN. - Tiny board overall → BRUTE_FORCE (oracle). - Perfect matchings on 3D / non-planar graphs → INTRACTABLE (#P-complete).
Hint. Encode the decision rules from middle.md, senior.md, and professional.md. The intractable case is the trap: 3D / non-planar matching counting is #P-complete.
Evaluation criteria. - Correctly classifies all five canonical cases. - The INTRACTABLE branch explicitly cites #P-completeness of 3D/non-planar matching counting. - Justification references the right complexity (O(n·m·2^m), O(8^m log n), O((nm)^3)).
Benchmark Task¶
Task B — Sweep vs transfer-matrix crossover¶
Problem. Empirically find the n at which transfer-matrix exponentiation overtakes the broken-profile sweep for a fixed tiny m. Implement two workloads on m = 4:
- (a) Broken-profile sweep:
O(n·m·2^m). - (b) Transfer-matrix exponentiation: build
Tonce (O(8^m)), then(T^n)[0][0]inO(8^m log n).
Measure wall-clock time for m = 4 across n ∈ {10, 100, 1000, 10^4, 10^6, 10^9, 10^18} (use huge n only for the matrix method). Report a table and identify the crossover n.
Starter — Python.
import time
from typing import List
MOD = 1_000_000_007
def sweep(n: int, m: int) -> int:
# TODO: broken-profile sweep, O(n*m*2^m)
return 0
def build_T(m: int) -> List[List[int]]:
# TODO: column recursion building T[next][mask]
return []
def mat_pow_at00(T: List[List[int]], n: int) -> int:
# TODO: binary exponentiation, return (T^n)[0][0]
return 0
def bench_sweep(n: int, m: int) -> float:
start = time.perf_counter()
sweep(n, m)
return time.perf_counter() - start
def bench_matexp(T: List[List[int]], n: int) -> float:
start = time.perf_counter()
mat_pow_at00(T, n)
return time.perf_counter() - start
def mean_ms(samples: List[float]) -> float:
return sum(samples) / len(samples) * 1000.0
def main() -> None:
m = 4
T = build_T(m)
print("n sweep_ms matexp_ms")
for n in [10, 100, 1000, 10_000, 1_000_000, 1_000_000_000, 10**18]:
rb = [bench_matexp(T, n) for _ in range(3)]
if n <= 1_000_000:
ra = [bench_sweep(n, m) for _ in range(3)]
print(f"{n:<14d} {mean_ms(ra):<17.2f} {mean_ms(rb):<14.4f}")
else:
print(f"{n:<14d} {'(skipped)':<17} {mean_ms(rb):<14.4f}")
if __name__ == "__main__":
main()
Evaluation criteria. - Same m produces the same T across Go, Java, and Python. - Sweep (a) wins for small/moderate n; matrix exponentiation (b) wins for large n. Report the crossover. - The matrix method completes for n = 10^18 in well under a second at m = 4; the sweep is infeasible there. - Report includes the mean across at least 3 runs and does not time matrix construction inside the per-n loop. - Writeup: a short note on the measured crossover n and how it compares to the theoretical crossover (n·m·2^m ≈ 8^m log n).
General Guidance for All Tasks¶
- Always validate against the brute-force backtracking oracle first. Run it on small grids, diff exactly, and only then trust the profile-DP version on large inputs.
- Rotate so
m = min(rows, cols). The cost is exponential in the narrow dimension; this one-line swap is the dominant correctness-of-performance decision. - Reject odd area immediately (
(n*m) % 2 != 0 → 0for dominoes;% 3for trominoes). - Get the final read right. Domino tilings: answer is
dp[mask=0]after all cells. Independent sets: sum over all masks. - Pin the modulus and sentinels as named constants.
MOD = 10^9 + 7; use a sufficiently negative sentinel formax-plus variants. - Mind overflow. The domino transition is purely additive — reduce after each add. For weighted/colored variants, reduce after the multiply.
- Reuse buffers. Ping-pong two
2^marrays; do not allocate per cell. - Use the transfer matrix only for astronomically large
nand tinym. Otherwise the linear sweep is simpler and faster. - Never use profile DP for 3D / non-planar matching counting. That is #P-complete (Task 15).
Each task must be implemented and tested in Go, Java, and Python, with identical outputs across all three on the same inputs.