Why Data Structures Matter¶
Choosing the right data structure can mean the difference between code that runs in milliseconds vs minutes. Understanding time complexity is not just academic — it’s what separates a junior dev from a senior one.
Lists¶
Python lists are dynamic arrays. Great for ordered collections.
# O(1) - append, index access
numbers = [1, 2, 3, 4, 5]
numbers.append(6) # O(1) amortized
value = numbers[2] # O(1)
# O(n) - search, insert, delete
numbers.insert(0, 0) # O(n) - shifts everything
numbers.remove(3) # O(n) - linear search
3 in numbers # O(n) - linear search
When to Use Lists¶
- Ordered data that needs index access
- When you mostly append/pop from the end
- Small to medium collections where search isn’t frequent
Dictionaries (Hash Maps)¶
Python’s dict is implemented as a hash table. The most versatile data structure.
# O(1) average for get, set, delete, membership
portfolio = {
"AAPL": 150.25,
"GOOGL": 2800.00,
"MSFT": 300.50,
}
price = portfolio["AAPL"] # O(1)
portfolio["TSLA"] = 900.00 # O(1)
"AAPL" in portfolio # O(1)
del portfolio["GOOGL"] # O(1)
Common Patterns¶
from collections import defaultdict, Counter
# Counting occurrences
words = ["python", "java", "python", "c#", "python"]
counter = Counter(words)
# Counter({'python': 3, 'java': 1, 'c#': 1})
# Grouping items
from collections import defaultdict
trades = [
("AAPL", 100), ("GOOGL", 50),
("AAPL", 200), ("MSFT", 150),
]
by_ticker = defaultdict(list)
for ticker, amount in trades:
by_ticker[ticker].append(amount)
# {'AAPL': [100, 200], 'GOOGL': [50], 'MSFT': [150]}
Sets¶
Unordered collections of unique elements. Perfect for membership testing and deduplication.
# O(1) for add, remove, membership
visited = {"Paris", "New York", "Hong Kong"}
visited.add("Tokyo") # O(1)
"Paris" in visited # O(1)
# Set operations - O(min(len(s), len(t)))
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}
a | b # Union: {1, 2, 3, 4, 5, 6}
a & b # Intersection: {3, 4}
a - b # Difference: {1, 2}
a ^ b # Symmetric difference: {1, 2, 5, 6}
Deque (Double-Ended Queue)¶
From collections.deque. O(1) append and pop from both ends.
from collections import deque
# O(1) for append/pop on both ends
q = deque([1, 2, 3])
q.append(4) # [1, 2, 3, 4]
q.appendleft(0) # [0, 1, 2, 3, 4]
q.pop() # 4, deque is [0, 1, 2, 3]
q.popleft() # 0, deque is [1, 2, 3]
# Bounded deque (sliding window)
window = deque(maxlen=3)
for price in [100, 101, 99, 102, 98]:
window.append(price)
print(f"Last 3 prices: {list(window)}")
Heap (Priority Queue)¶
Python’s heapq implements a min-heap.
import heapq
# O(log n) for push/pop, O(1) for peek
prices = [150, 100, 200, 50, 175]
heapq.heapify(prices) # O(n)
lowest = heapq.heappop(prices) # O(log n) → 50
heapq.heappush(prices, 80) # O(log n)
# Top-K pattern
top_3 = heapq.nlargest(3, prices) # [200, 175, 150]
bottom_3 = heapq.nsmallest(3, prices) # [80, 100, 150]
Complexity Cheat Sheet¶
| Operation | List | Dict | Set | Deque |
|---|---|---|---|---|
| Access by index | O(1) | - | - | O(n) |
| Search | O(n) | O(1) | O(1) | O(n) |
| Insert at end | O(1)* | O(1) | O(1) | O(1) |
| Insert at start | O(n) | - | - | O(1) |
| Delete | O(n) | O(1) | O(1) | O(n) |
| Sort | O(n log n) | - | - | - |
*amortized
Key Takeaways¶
- Default to
dictwhen you need fast lookups by key - Use
setfor membership testing and deduplication - Use
dequewhen you need fast operations on both ends - Use
heapqfor priority queues and top-K problems - Know your complexities — it matters in interviews and production code