Variables and Data Types: Interview Questions
Table of Contents
- Junior Level Questions
- Middle Level Questions
- Senior Level Questions
- Professional Level Questions
- Coding Challenges
- System Design Questions
- Behavioral / Scenario Questions
Junior Level Questions
Q1: What are the basic data types in Python?
Answer
Python has these core built-in types:

| Type | Example | Mutable? |
|------|---------|----------|
| `int` | `42`, `-7`, `1_000` | No |
| `float` | `3.14`, `1e10` | No |
| `complex` | `3+4j` | No |
| `str` | `"hello"`, `'world'` | No |
| `bool` | `True`, `False` | No |
| `NoneType` | `None` | No |
| `list` | `[1, 2, 3]` | Yes |
| `dict` | `{"a": 1}` | Yes |
| `tuple` | `(1, 2, 3)` | No |
| `set` | `{1, 2, 3}` | Yes |

**Key point:** All values in Python are objects. Even `int` and `bool` are full objects with methods.

Q2: What is the difference between `is` and `==`?
Answer
- `==` compares **values** (calls `__eq__`)
- `is` compares **identity** (checks if two names point to the same object in memory)

a = [1, 2, 3]
b = [1, 2, 3]
c = a
print(a == b) # True — same value
print(a is b) # False — different objects
print(a is c) # True — same object
# Use 'is' only for singletons:
x = None
print(x is None) # Correct
print(x == None) # Works but bad practice (PEP 8)
flowchart LR
    subgraph Identity["is (identity)"]
        A1["a"] --> O1["[1,2,3]\nid: 0x1234"]
        C1["c"] --> O1
        B1["b"] --> O2["[1,2,3]\nid: 0x5678"]
    end

**Rule of thumb:** Use `==` to compare values and reserve `is` for `None` and other singleton sentinels. For booleans, test truthiness directly (`if flag:`) rather than comparing against `True` or `False`.

Q3: What is the difference between mutable and immutable types?
Answer
**Immutable:** the value cannot be changed after creation. Any "modification" creates a new object.

- `int`, `float`, `str`, `tuple`, `frozenset`, `bool`, `NoneType` (`None`)

**Mutable:** the value can be changed in place. The `id()` stays the same.

- `list`, `dict`, `set`, `bytearray`

**Why it matters for interviews:**

- Mutable defaults in functions are shared across calls (classic bug)
- The built-in mutable containers are unhashable, so they cannot be used as dict keys or set members
- Passing mutable objects to functions allows the function to modify the caller's data

Q4: How does Python handle variable naming? What are the rules?
Answer
**Rules:**

1. Must start with a letter or underscore (`_`); Python 3 also accepts non-ASCII Unicode letters
2. Can contain letters, digits (0-9), and underscores
3. Case-sensitive (`name` != `Name`)
4. Cannot be a Python keyword (`if`, `for`, `class`, etc.)

**Conventions (PEP 8):**

| Type | Convention | Example |
|------|-----------|---------|
| Variable | snake_case | `user_name` |
| Function | snake_case | `get_user()` |
| Constant | UPPER_SNAKE_CASE | `MAX_RETRIES` |
| Class | PascalCase | `UserProfile` |
| Private (by convention) | leading underscore | `_internal` |
| Name-mangled | double leading underscore | `__private` |
| Dunder | double underscore on both sides | `__init__` |

Q5: What does `type()` vs `isinstance()` do? When to use which?
Answer
x = True
# type() returns the exact type
print(type(x)) # <class 'bool'>
print(type(x) == bool) # True
print(type(x) == int) # False — exact match only
# isinstance() checks type AND its superclasses
print(isinstance(x, bool)) # True
print(isinstance(x, int)) # True — bool is subclass of int
print(isinstance(x, (int, str))) # True — checks multiple types
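As for when to use which: `isinstance()` is usually the better default because it respects inheritance, while an exact `type()` comparison helps when subclasses must be rejected. A minimal sketch, assuming two hypothetical helpers (`is_strict_int` and `is_number` are illustrative names, not standard functions):

```python
def is_strict_int(value) -> bool:
    """Return True only for plain ints, rejecting bool (a subclass of int)."""
    return type(value) is int


def is_number(value) -> bool:
    """Return True for ints and floats, including their subclasses."""
    return isinstance(value, (int, float))


print(is_strict_int(5))     # True
print(is_strict_int(True))  # False: the exact type check excludes bool
print(is_number(True))      # True: isinstance() accepts the bool subclass
```

The exact check is the rarer case; it tends to show up in validation code where `True` silently passing as `1` would be a bug.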
Q6: What is `None` and how should you check for it?
Answer
`None` is Python's null value. It is the sole instance of `NoneType`. It is a singleton: there is only one `None` object in the entire interpreter.

# Always use 'is' for None checks (PEP 8)
x = None

# Correct
if x is None:
    print("No value")
if x is not None:
    print("Has value")

# Incorrect (works but bad practice)
if x == None:  # Calls __eq__ which can be overridden
    print("No value")

# Falsy check (different meaning!)
if not x:  # True for None, but also for 0, "", [], False
    print("Falsy")
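The remark that `__eq__` can be overridden is easiest to see with a deliberately misbehaving class. The sketch below uses a hypothetical `AlwaysEqual` class invented for illustration:

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()
print(obj == None)  # True: the overridden __eq__ is consulted
print(obj is None)  # False: identity cannot be overridden
```

Because `is` bypasses user-defined comparison entirely, it gives the reliable answer for `None` checks.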
Q7: What is the output of this code?
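The snippet for this question appears to be missing from this copy. A plausible reconstruction, inferred only from the answer that follows (the names `a` and `b` come from that answer; the exact original code is an assumption):

```python
a = [1, 2, 3]
b = a          # b is an alias for the same list, not a copy
b.append(4)    # mutates the single shared list
print(a)       # [1, 2, 3, 4]
```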
Answer
Output: `[1, 2, 3, 4]`

`b = a` does not copy the list. Both `a` and `b` point to the same list object. When you modify through `b`, `a` sees the change too.

Middle Level Questions
Q8: Explain Python's LEGB scope rule with an example.
Answer
LEGB stands for: **Local -> Enclosing -> Global -> Built-in**

Python resolves variable names by searching these scopes in order:

x = "global"  # Global scope

def outer():
    x = "enclosing"  # Enclosing scope

    def inner():
        x = "local"  # Local scope
        print(x)  # "local" (found in Local)

    def inner2():
        print(x)  # "enclosing" (found in Enclosing)

    inner()
    inner2()

outer()
print(x)  # "global" (found in Global)
print(len)  # <built-in function len> (found in Built-in)
flowchart TD
    A["Name Lookup"] --> B{"Local?"}
    B -->|Found| C["Use it"]
    B -->|Not found| D{"Enclosing?"}
    D -->|Found| C
    D -->|Not found| E{"Global?"}
    E -->|Found| C
    E -->|Not found| F{"Built-in?"}
    F -->|Found| C
    F -->|Not found| G["NameError"]
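The lookup order covers reading a name; assigning to a name inside a function normally creates a new local instead. A minimal sketch of rebinding outer names with `nonlocal` and `global` (the function and variable names are hypothetical, chosen only for illustration):

```python
counter = 0  # module-level (global) name


def make_counter():
    count = 0  # lives in the enclosing scope of increment()

    def increment():
        nonlocal count  # rebind the enclosing variable, not a new local
        count += 1
        return count

    return increment


def bump_global():
    global counter  # rebind the module-level name
    counter += 1


inc = make_counter()
inc()
print(inc())    # 2
bump_global()
print(counter)  # 1
```

Without the `nonlocal` or `global` declarations, each `+=` would raise `UnboundLocalError`, because the assignment makes the name local to the function.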
**Modifying outer scopes:**

- `global x`: declares that `x` refers to the module-level name, so assignments skip the local scope
- `nonlocal x`: declares that `x` refers to the variable in the nearest enclosing function scope

Q9: What is the mutable default argument pitfall?
Answer
Default arguments are evaluated once at function definition time, not at each call. If the default is mutable (list, dict, set), it is shared across all calls.

# BUG: shared mutable default
def add_item(item, items=[]):
    items.append(item)
    return items

print(add_item("a"))  # ['a']
print(add_item("b"))  # ['a', 'b'] (BUG! Expected ['b'])
print(add_item("c"))  # ['a', 'b', 'c']

# FIX: use None sentinel
def add_item(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items

print(add_item("a"))  # ['a']
print(add_item("b"))  # ['b'] (correct!)

Q10: What is the difference between shallow copy and deep copy?
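The shared default can be observed directly, because CPython stores default values on the function object in `__defaults__`. A small sketch (the name `add_item_buggy` is used here only to avoid clashing with the fixed version above):

```python
def add_item_buggy(item, items=[]):
    items.append(item)
    return items


print(add_item_buggy.__defaults__)  # ([],)
add_item_buggy("a")
add_item_buggy("b")
print(add_item_buggy.__defaults__)  # (['a', 'b'],)
```

The list inside `__defaults__` is the very object every call without `items` receives, which is why the `None` sentinel version creates a fresh list inside the function body instead.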
Answer
import copy
original = [[1, 2], [3, 4]]
# Shallow copy: new outer container, same inner objects
shallow = copy.copy(original)
shallow[0].append(99)
print(original) # [[1, 2, 99], [3, 4]] — inner list was shared!
# Reset
original = [[1, 2], [3, 4]]
# Deep copy: new outer AND inner containers
deep = copy.deepcopy(original)
deep[0].append(99)
print(original) # [[1, 2], [3, 4]] — independent!
flowchart LR
    subgraph Shallow["Shallow Copy"]
        S["new list"] --> I1["[1,2] SHARED"]
        S --> I2["[3,4] SHARED"]
        O["original"] --> I1
        O --> I2
    end

| Method | New container? | New nested objects? | Use when |
|--------|:-:|:-:|------|
| Assignment (`b = a`) | No | No | Need an alias |
| `copy.copy()` | Yes | No | Flat structures |
| `copy.deepcopy()` | Yes | Yes | Nested structures |

Q11: How do type hints work in Python? Do they affect runtime?
Answer
Type hints are annotations that describe expected types. They have **no runtime effect** by default: Python stores them but does not enforce them.

def greet(name: str, age: int) -> str:
    return f"Hello, {name}! Age: {age}"

# These "work" at runtime despite wrong types:
result = greet(42, "hello")  # No error! Python ignores hints at runtime
print(result)  # "Hello, 42! Age: hello"

# Type hints are stored in __annotations__:
print(greet.__annotations__)
# {'name': <class 'str'>, 'age': <class 'int'>, 'return': <class 'str'>}
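Since hints are only metadata, any enforcement has to be opted into. The sketch below shows one way to read them with `typing.get_type_hints` and check arguments by hand; `check_call` is a hypothetical helper written for this example, and it deliberately handles only plain classes such as `str` or `int`, not generics like `list[int]`:

```python
import inspect
from typing import get_type_hints


def check_call(func, *args, **kwargs):
    """Hypothetical helper: raise TypeError if an argument violates its hint."""
    hints = get_type_hints(func)
    bound = inspect.signature(func).bind(*args, **kwargs)
    for name, value in bound.arguments.items():
        expected = hints.get(name)
        # Only plain classes are checked; other hint forms are skipped.
        if isinstance(expected, type) and not isinstance(value, expected):
            raise TypeError(
                f"{name} must be {expected.__name__}, got {type(value).__name__}"
            )
    return func(*args, **kwargs)


def greet(name: str, age: int) -> str:
    return f"Hello, {name}! Age: {age}"


print(check_call(greet, "Ada", 36))  # Hello, Ada! Age: 36
try:
    check_call(greet, 42, "hello")
except TypeError as exc:
    print(exc)  # name must be str, got int
```

In practice a static checker usually does this job before the code runs; a runtime check like this is mostly useful at trust boundaries such as parsing external input.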
Q12: What is `id()` and when would you use it?
Answer
`id()` returns an integer that identifies an object for as long as it is alive (in CPython, its memory address). Two names whose objects have the same `id()` at the same time refer to the same object; an id may be reused after an object is freed.

a = [1, 2, 3]
b = a
c = [1, 2, 3]
print(id(a))  # e.g., 140234866357520
print(id(b))  # same as id(a): b is an alias
print(id(c))  # different: c is a different object
print(a is b)  # True (same id)
print(a is c)  # False (different id)

# Practical use: debugging aliasing issues
def debug_refs(*args):
    for i, obj in enumerate(args):
        print(f"  arg[{i}]: id={id(obj)}, value={obj!r}")
Q13: What is the output? Explain why.
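The snippet for this question appears to be missing from this copy. A plausible reconstruction, inferred from the explanation that follows (the function name `f`, the default `b=[]`, and the three calls come from that explanation; the parameter name `a` and the exact body are assumptions):

```python
def f(a, b=[]):
    b.append(a)
    return b


print(f(1))      # [1]
print(f(2, []))  # [2]
print(f(3))      # [1, 3]
```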
Answer
**Explanation:**

1. `f(1)` uses the default `b=[]`, appends 1, and returns `[1]`. The default list now contains `[1]`.
2. `f(2, [])` uses a NEW list `[]`, appends 2, and returns `[2]`. The default list is still `[1]`.
3. `f(3)` uses the default list again (which is `[1]`), appends 3, and returns `[1, 3]`.

The second call does not affect the default because a new list was explicitly passed.

Senior Level Questions
Q14: How does Python's garbage collector handle reference cycles?
Answer
CPython uses two mechanisms:

1. **Reference counting** (primary): each object has `ob_refcnt`. When it reaches 0, the object is immediately freed. It cannot reclaim cycles.
2. **Generational GC** (secondary): detects and collects reference cycles.

import gc

class Node:
    def __init__(self, name):
        self.name = name
        self.ref = None

# Create a cycle
a = Node("A")
b = Node("B")
a.ref = b
b.ref = a  # a -> b -> a (cycle)

# Delete external references
del a, b

# refcount of both Nodes is still 1 (from the cycle)
# Only the GC can clean this up
collected = gc.collect()
print(f"Collected {collected} objects")
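To confirm that the cycle really survives `del` and is only reclaimed by the cycle collector, a weak reference can act as a probe. This is a sketch under the assumption of CPython semantics; automatic collection is paused so the timing is predictable:

```python
import gc
import weakref


class Node:
    def __init__(self, name):
        self.name = name
        self.ref = None


gc.disable()  # pause automatic collection for a deterministic demo
try:
    a = Node("A")
    b = Node("B")
    a.ref, b.ref = b, a       # build the cycle
    probe = weakref.ref(a)    # observes the object without keeping it alive

    del a, b
    alive_before = probe() is not None  # the cycle keeps both nodes alive

    gc.collect()
    alive_after = probe() is not None   # the cycle collector reclaimed them
finally:
    gc.enable()

print(alive_before, alive_after)  # True False (on CPython)
```

Replacing one direction of the cycle with a `weakref.ref` is a common way to avoid relying on the collector at all.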
Q15: Explain `__slots__` and its impact on memory and performance.
Answer
By default, each Python instance has a `__dict__` (a dict) for storing attributes. `__slots__` replaces this with a fixed set of per-instance slots, which typically saves on the order of 100 bytes per instance; the exact figure depends on the Python version and the number of attributes.

import sys

class Regular:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Slotted:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

r = Regular(1.0, 2.0)
s = Slotted(1.0, 2.0)

# sys.getsizeof() does not follow references, so add the __dict__ separately
print(f"Regular: {sys.getsizeof(r)} + {sys.getsizeof(r.__dict__)} = "
      f"{sys.getsizeof(r) + sys.getsizeof(r.__dict__)} bytes")
print(f"Slotted: {sys.getsizeof(s)} bytes")

# Slotted instances cannot have arbitrary attributes:
# s.z = 3  # AttributeError: 'Slotted' object has no attribute 'z'
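Because `sys.getsizeof()` numbers for a single instance vary a lot between Python versions, measuring many instances with `tracemalloc` gives a more honest picture. A rough sketch; `bytes_per_instance` is a hypothetical helper, and its result includes the overhead of the list holding the instances:

```python
import tracemalloc


class Regular:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class Slotted:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


def bytes_per_instance(cls, count=10_000):
    """Rough average of traced bytes per instance (includes list overhead)."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    instances = [cls(1.5, 2.5) for _ in range(count)]
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return (after - before) / len(instances)


print(f"Regular: ~{bytes_per_instance(Regular):.0f} bytes per instance")
print(f"Slotted: ~{bytes_per_instance(Slotted):.0f} bytes per instance")
print(hasattr(Slotted(1, 2), "__dict__"))  # False
```

The absolute numbers are not portable, but the slotted class should come out smaller on any recent CPython.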
Q16: What is the descriptor protocol and how does it relate to types?
Answer
Descriptors are objects that define `__get__`, `__set__`, and/or `__delete__`. They power `property`, `classmethod`, `staticmethod`, and `__slots__`.

class Validated:
    """Descriptor that validates type on assignment."""

    def __init__(self, expected_type):
        self.expected_type = expected_type

    def __set_name__(self, owner, name):
        self.name = name
        self.storage = f"_validated_{name}"

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.storage, None)

    def __set__(self, obj, value):
        if not isinstance(value, self.expected_type):
            raise TypeError(
                f"{self.name} must be {self.expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
        setattr(obj, self.storage, value)

class Product:
    name = Validated(str)
    price = Validated(int)

    def __init__(self, name: str, price: int):
        self.name = name  # Goes through Validated.__set__
        self.price = price

p = Product("Widget", 999)
# p.price = "free"  # TypeError: price must be int, got str
Q17: What is the output? This is a classic interview trap.
t = ([1, 2],)
try:
    t[0] += [3, 4]
except TypeError as e:
    print(f"Error: {e}")
finally:
    print(f"t = {t}")
Answer
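**Why both error AND mutation happen:** `t[0] += [3, 4]` compiles to:

1. `t[0].__iadd__([3, 4])`: this SUCCEEDS, modifying the list in place, and returns the same list.
2. `t[0] = <result>`: this FAILS with `TypeError`, because tuples do not support item assignment.

So the code prints `Error: 'tuple' object does not support item assignment` followed by `t = ([1, 2, 3, 4],)`.

Professional Level Questions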
Q18: How does CPython represent integers internally?
Answer
CPython uses arbitrary-precision arithmetic. Internally, integers are stored as arrays of "digits" in `PyLongObject` (layout shown as in CPython 3.11; later versions reorganize the header but keep the digit array):

// Include/cpython/longintrepr.h
struct _longobject {
    PyObject_VAR_HEAD      // ob_refcnt + ob_type + ob_size
    digit ob_digit[1];     // flexible array of 30-bit digits
};

import sys

# Size grows in steps of 4 bytes (one digit = 4 bytes, 30 bits used)
for n in [0, 2**29, 2**30, 2**59, 2**60, 2**89, 2**90]:
    print(f"  {n:>30}: {sys.getsizeof(n)} bytes")

# Memory layout (64-bit build):
#   ob_refcnt:  8 bytes
#   ob_type:    8 bytes
#   ob_size:    8 bytes (number of digits; its sign carries the sign of the int)
#   ob_digit[]: 4 bytes per digit
# Total: 24-byte header + 4 bytes per digit, so 28 bytes for a one-digit int
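The digit width does not have to be taken on faith: `sys.int_info` reports it for the running interpreter. A small sketch that derives the digit count from `bit_length()` and checks the per-digit step (`digit_count` is a hypothetical helper for this example):

```python
import sys

bits = sys.int_info.bits_per_digit      # 30 on typical 64-bit builds
digit_size = sys.int_info.sizeof_digit  # 4 bytes on those builds


def digit_count(n: int) -> int:
    """Internal digits needed for abs(n), reported as at least 1."""
    return max(1, -(-abs(n).bit_length() // bits))  # ceiling division


for n in (1, 2**bits - 1, 2**bits, 2**(2 * bits)):
    print(f"digits={digit_count(n)} size={sys.getsizeof(n)} bytes")

# Going from two digits to three adds exactly one digit's worth of bytes.
step = sys.getsizeof(2**(2 * bits)) - sys.getsizeof(2**bits)
print(step == digit_size)  # True
```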
Q19: Explain the pymalloc allocator and its arena/pool/block structure.
Answer
CPython has a custom allocator (pymalloc) for small objects (<= 512 bytes). The sizes below are the classic values; they differ between CPython versions, and recent 64-bit builds use larger arenas and pools and 16-byte block alignment:

Level 3: Arenas (256 KB)
├── Pool 0 (4 KB) — size class 16
├── Pool 1 (4 KB) — size class 32
├── Pool 2 (4 KB) — size class 16
└── ...
Level 2: Pools (4 KB = system page)
├── Block 0 (16 bytes)
├── Block 1 (16 bytes)
├── Block 2 (16 bytes) — FREE
└── ...
Level 1: Blocks (8, 16, 24, ..., 512 bytes)
— The actual memory used by PyObject
Q20: How does the GIL interact with reference counting?
Answer
The GIL (Global Interpreter Lock) exists primarily to make reference counting thread-safe:

# Without the GIL, this would be a data race:
# Thread 1: Py_INCREF(obj) -> reads ob_refcnt=1, writes 2
# Thread 2: Py_DECREF(obj) -> reads ob_refcnt=1, writes 0 -> FREES object
# Thread 1: obj is now freed -> USE-AFTER-FREE BUG

# The GIL ensures only one thread executes Python bytecode at a time,
# so refcount updates from different threads never interleave:
# Thread 1: [acquire GIL] Py_INCREF(obj) [release GIL]
# Thread 2: [acquire GIL] Py_DECREF(obj) [release GIL]
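The counter that the GIL protects can be watched from Python with `sys.getrefcount`. A minimal single-threaded sketch, assuming a standard CPython build; the absolute numbers include a temporary reference held by the call itself, so only the differences are meaningful:

```python
import sys

obj = object()
base = sys.getrefcount(obj)   # includes the call's own temporary reference

alias = obj                   # one more name bound to the same object
holder = [obj]                # one more reference from the list
grown = sys.getrefcount(obj)

del alias
holder.clear()
shrunk = sys.getrefcount(obj)

print(grown - base)   # 2
print(shrunk - base)  # 0
```

Every one of these bookkeeping updates is a read-modify-write on `ob_refcnt`, which is exactly the operation that would race without a lock.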
Coding Challenges
Challenge 1: Implement deepcopy for Simple Types
"""
Implement a simplified deep_copy function that handles:
- int, float, str, bool, None (return as-is — immutable)
- list (deep copy each element)
- dict (deep copy each key and value)
- tuple (deep copy each element)
Do NOT use copy.deepcopy.
"""
def deep_copy(obj):
    # Your implementation here
    pass

# Test cases
assert deep_copy(42) == 42
assert deep_copy("hello") == "hello"
original = {"a": [1, 2, [3, 4]], "b": {"c": 5}}
copied = deep_copy(original)
assert copied == original
assert copied is not original
assert copied["a"] is not original["a"]
assert copied["a"][2] is not original["a"][2]
assert copied["b"] is not original["b"]
print("All tests passed!")
Solution
def deep_copy(obj):
    if isinstance(obj, (int, float, str, bool, type(None))):
        return obj  # Immutable, so safe to share
    if isinstance(obj, list):
        return [deep_copy(item) for item in obj]
    if isinstance(obj, tuple):
        return tuple(deep_copy(item) for item in obj)
    if isinstance(obj, dict):
        return {deep_copy(k): deep_copy(v) for k, v in obj.items()}
    if isinstance(obj, set):
        return {deep_copy(item) for item in obj}
    raise TypeError(f"Cannot deep copy {type(obj).__name__}")

Challenge 2: Variable Scope Debugger
"""
Write a function that takes a function and returns information about
its variable scopes: local variables, free variables (closures),
and global references.
"""
def analyze_scope(func) -> dict:
    # Your implementation here
    pass

# Test
x = 10

def outer():
    y = 20

    def inner(a, b):
        z = a + b + y + x
        return z

    return inner

fn = outer()
info = analyze_scope(fn)
print(info)
# Expected output similar to:
# {
# "name": "inner",
# "locals": ["a", "b", "z"],
# "free_vars": ["y"],
# "globals_used": ["x"],
# }
Solution
import builtins
import types

def analyze_scope(func: types.FunctionType) -> dict:
    code = func.__code__
    return {
        "name": code.co_name,
        "locals": list(code.co_varnames),
        "free_vars": list(code.co_freevars),
        # co_names holds global, builtin, and attribute names; drop the builtins
        "globals_used": [
            name for name in code.co_names
            if not hasattr(builtins, name)
        ],
    }

Challenge 3: Type-Safe Registry
"""
Implement a type-safe registry where:
- register(key, value) stores a value with type information
- get(key, expected_type) returns the value only if it matches the expected type
- Raise TypeError if type mismatch
"""
class TypeSafeRegistry:
    # Your implementation here
    pass

# Test
registry = TypeSafeRegistry()
registry.register("name", "Alice")
registry.register("age", 30)
registry.register("scores", [95, 87, 92])
assert registry.get("name", str) == "Alice"
assert registry.get("age", int) == 30
assert registry.get("scores", list) == [95, 87, 92]
try:
    registry.get("name", int)  # Should raise TypeError
    assert False, "Should have raised TypeError"
except TypeError:
    pass
print("All tests passed!")
Solution
from typing import TypeVar, Type

T = TypeVar("T")

class TypeSafeRegistry:
    def __init__(self) -> None:
        self._store: dict[str, object] = {}

    def register(self, key: str, value: object) -> None:
        self._store[key] = value

    def get(self, key: str, expected_type: Type[T]) -> T:
        if key not in self._store:
            raise KeyError(f"Key '{key}' not found")
        value = self._store[key]
        if not isinstance(value, expected_type):
            raise TypeError(
                f"Expected {expected_type.__name__} for key '{key}', "
                f"got {type(value).__name__}"
            )
        return value  # type: ignore[return-value]

    def keys(self) -> list[str]:
        return list(self._store.keys())

System Design Questions
Q21: How would you design a configuration system that is type-safe, immutable, and environment-aware?
Answer
**Requirements:** Type-safe, immutable, reads from env vars with defaults, validates at construction.

import os
from dataclasses import dataclass, field

@dataclass(frozen=True, slots=True)  # slots=True requires Python 3.10+
class DatabaseConfig:
    host: str = "localhost"
    port: int = 5432
    name: str = "mydb"
    pool_size: int = 10

    def __post_init__(self) -> None:
        if not 1 <= self.port <= 65535:
            raise ValueError(f"Invalid port: {self.port}")
        if self.pool_size < 1:
            raise ValueError(f"pool_size must be >= 1: {self.pool_size}")

@dataclass(frozen=True, slots=True)
class AppConfig:
    debug: bool = False
    db: DatabaseConfig = field(default_factory=DatabaseConfig)
    secret_key: str = ""

    def __post_init__(self) -> None:
        if not self.debug and not self.secret_key:
            raise ValueError("secret_key required in production")

def load_config() -> AppConfig:
    """Load config from environment with type validation."""
    return AppConfig(
        debug=os.getenv("DEBUG", "false").lower() == "true",
        db=DatabaseConfig(
            host=os.getenv("DB_HOST", "localhost"),
            port=int(os.getenv("DB_PORT", "5432")),
            name=os.getenv("DB_NAME", "mydb"),
            pool_size=int(os.getenv("DB_POOL_SIZE", "10")),
        ),
        # No fallback key: a missing SECRET_KEY must fail the production check
        secret_key=os.getenv("SECRET_KEY", ""),
    )
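The immutability requirement is worth seeing in action. The sketch below uses a small hypothetical `ServerConfig` class as a stand-in for the config classes above, showing that a frozen dataclass rejects assignment and that `dataclasses.replace` builds a new, re-validated instance instead:

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class ServerConfig:  # hypothetical stand-in for the config classes above
    host: str = "localhost"
    port: int = 8080

    def __post_init__(self) -> None:
        if not 1 <= self.port <= 65535:
            raise ValueError(f"Invalid port: {self.port}")


base = ServerConfig()
try:
    base.port = 9000  # frozen dataclasses reject attribute assignment
except FrozenInstanceError as exc:
    print(f"Rejected: {exc}")

staging = replace(base, port=9000)  # new instance, __post_init__ runs again
print(base.port, staging.port)      # 8080 9000
```

Because `replace` goes through `__init__`, an override such as `port=0` fails validation just like a fresh construction would.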
Behavioral / Scenario Questions
Q22: You inherited a codebase with no type hints. How would you introduce them?
Answer
**Phased approach:**

1. **Start with CI:** Add `mypy` to CI with `--ignore-missing-imports`. Unannotated functions are skipped by default, so the initial run should be quiet.
2. **Annotate new code:** All new functions must have type hints (code review rule).
3. **Annotate critical paths first:** Payment processing, auth, data models. Use `reveal_type()` in mypy for debugging.
4. **Use `monkeytype` or `pytype` for auto-generation** of draft annotations.
5. **Gradual strictness:** Start with `mypy --config-file=mypy.ini` and tighten its settings over time.
6. **Full strict mode** once most code is annotated.

**Timeline:** 6-12 months for a large codebase. Prioritize public APIs and data models.

Q23: A production service is using too much memory. How do you diagnose whether data types are the cause?
Answer
**Step-by-step diagnosis:**

# 1. Check overall memory usage (the resource module is Unix-only)
import resource
# ru_maxrss is in kilobytes on Linux (bytes on macOS), so / 1024 gives MB on Linux
print(f"Peak RSS: {resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024:.1f} MB")

# 2. Use tracemalloc to find top allocators
import tracemalloc
tracemalloc.start()
# ... run the operation ...
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:10]:
    print(stat)

# 3. Count objects by type (only objects tracked by the garbage collector)
import gc
from collections import Counter
type_counts = Counter(type(obj).__name__ for obj in gc.get_objects())
for obj_type, count in type_counts.most_common(10):
    print(f"  {obj_type}: {count:,}")

# 4. Check for obvious waste
import sys
# Are you using dicts where __slots__ would work?
# Are you storing millions of small objects?
# Are you caching too aggressively?
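When the suspicion is growth over time rather than a large steady state, comparing two `tracemalloc` snapshots points at the line responsible. A minimal sketch; the `cache` list is a made-up stand-in for whatever the real service accumulates:

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

# Simulated growth: a hypothetical cache of small dict records
cache = [{"id": i, "payload": "x" * 100} for i in range(10_000)]

after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Entries are sorted by the size of the change, largest first
top = after.compare_to(before, "lineno")[0]
print(top)
```

The top entry reports the file and line plus `size_diff` and `count_diff`, which usually reveals whether the cost comes from a few large objects or from millions of small ones.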