Why Graphs Deserve Their Own Deep Dive
If you scan the problems asked at Google, Meta, Amazon, and Microsoft over the past two years, graph problems show up in roughly 25-30% of coding rounds. That's not surprising — graphs model relationships, and most real-world systems are fundamentally about relationships: social networks, road maps, dependency chains, web crawlers, recommendation engines.
Yet graphs intimidate engineers more than almost any other topic. The reason is straightforward: unlike arrays or trees, graphs don't have a single canonical traversal. You need to choose a representation, choose a traversal strategy, handle cycles, and often combine graph algorithms with other patterns like dynamic programming or binary search. That combination of decisions is what makes graph problems feel harder — and it's exactly what interviewers are testing.
This article gives you reusable templates for every major graph algorithm you'll encounter in interviews. Each template is something you can internalize and adapt, not something you need to memorize line by line. We'll cover when to reach for each algorithm, what the verbal walkthrough should sound like, and where candidates commonly trip up.
The single biggest mistake in graph interview problems is jumping to code before clarifying the graph structure. Is it directed or undirected? Weighted or unweighted? Can it have cycles? Can it be disconnected? These four questions determine which algorithm you need, and skipping them leads to solutions that solve the wrong problem.
Graph Representation: Adjacency List vs. Adjacency Matrix
Before you can traverse a graph, you need to represent it. Interviews almost always give you an edge list — something like edges = [[0,1], [1,2], [2,0]] — and you need to convert it into a usable structure. You have two primary options.
Adjacency List
An adjacency list maps each node to a list of its neighbors. In Python, this is typically a defaultdict(list). In Java, a HashMap<Integer, List<Integer>>. In JavaScript, a Map or plain object.
Template (Python):
```python
from collections import defaultdict

def build_adj_list(edges, directed=False):
    graph = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
        if not directed:
            graph[v].append(u)
    return graph
```

For weighted graphs, store tuples: graph[u].append((v, weight)).
Adjacency Matrix
An adjacency matrix is a 2D array where matrix[i][j] = 1 (or the edge weight) if there's an edge from node i to j. It's rarely the right choice in interviews because it uses O(V^2) space regardless of edge count.
When to use a matrix: The graph is dense (edges close to V^2), the problem involves checking if a specific edge exists in O(1), or the input is already given as a grid (like in "Number of Islands" — the grid is the adjacency matrix).
When to use a list: Almost always. Sparse graphs, traversal-heavy problems, and anything where you iterate over neighbors. Interview graphs are sparse in the vast majority of cases.
Quick Reference
| Operation | Adjacency List | Adjacency Matrix |
|---|---|---|
| Space | O(V + E) | O(V^2) |
| Check if edge exists | O(degree) | O(1) |
| Iterate neighbors of v | O(degree of v) | O(V) |
| Add an edge | O(1) | O(1) |
| Best for | Sparse graphs, traversals | Dense graphs, edge lookups |
In practice, default to adjacency list. Tell the interviewer why: "The graph has V nodes and E edges where E is much less than V squared, so an adjacency list is more space-efficient and gives us O(degree) neighbor iteration."
BFS and DFS: The Two Fundamental Traversals
Every graph algorithm is built on one of two traversal strategies. Choosing wrong doesn't just slow you down — it can make the problem unsolvable within the expected complexity. Here's the decision framework.
BFS Template
BFS explores nodes level by level using a queue. It guarantees the shortest path in unweighted graphs. Use it whenever the problem says "minimum," "shortest," "nearest," or "fewest steps."
```python
from collections import deque

def bfs(graph, start):
    visited = set([start])
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return visited
```

For shortest path, track distance: initialize dist = {start: 0}, and set dist[neighbor] = dist[node] + 1 when enqueuing.
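A minimal sketch of that distance-tracking variant (the function name bfs_distances and the example graph are illustrative, not from any particular problem). Note the dist dictionary doubles as the visited set:

```python
from collections import deque, defaultdict

def bfs_distances(graph, start):
    # A node is visited iff it already has a distance entry.
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

# Example: undirected path 2-1-0-3
graph = defaultdict(list)
for u, v in [(0, 1), (1, 2), (0, 3)]:
    graph[u].append(v)
    graph[v].append(u)
print(bfs_distances(graph, 0))  # {0: 0, 1: 1, 3: 1, 2: 2}
```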
Interview problem — Rotting Oranges (LC 994): Multi-source BFS. Start by enqueuing all rotten oranges simultaneously. Each BFS level represents one minute of spreading. The answer is the number of levels processed. The key insight candidates miss: this isn't single-source BFS from one cell — you seed the queue with every rotten orange at time 0.
Verbal walkthrough: "I'll model the grid as an implicit graph where each cell connects to its four neighbors. Since I need the minimum time for all oranges to rot, I need shortest paths from multiple sources — that's multi-source BFS. I'll initialize the queue with all initially rotten cells at distance 0 and expand level by level."
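That walkthrough can be sketched roughly as follows, using LC 994's grid convention (0 empty, 1 fresh, 2 rotten); the function name and the in-place mutation are choices of this sketch, not the only valid implementation:

```python
from collections import deque

def oranges_rotting(grid):
    rows, cols = len(grid), len(grid[0])
    queue = deque()
    fresh = 0
    # Seed the queue with every initially rotten orange at minute 0.
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 2:
                queue.append((r, c, 0))
            elif grid[r][c] == 1:
                fresh += 1
    minutes = 0
    while queue:
        r, c, t = queue.popleft()
        minutes = max(minutes, t)
        for dr, dc in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 1:
                grid[nr][nc] = 2  # rot in place; doubles as the visited marker
                fresh -= 1
                queue.append((nr, nc, t + 1))
    return minutes if fresh == 0 else -1

print(oranges_rotting([[2, 1, 1], [1, 1, 0], [0, 1, 1]]))  # 4
```

If any fresh orange is unreachable, the fresh counter never hits zero and the function returns -1 instead of a time.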
DFS Template
DFS explores as deep as possible before backtracking, using recursion or an explicit stack. Use it for exhaustive search, path finding, connected components, and cycle detection.
```python
def dfs(graph, node, visited):
    visited.add(node)
    for neighbor in graph[node]:
        if neighbor not in visited:
            dfs(graph, neighbor, visited)
    return visited
```

Iterative version (avoids stack overflow on deep graphs):
```python
def dfs_iterative(graph, start):
    visited = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                stack.append(neighbor)
    return visited
```

Interview problem — Number of Islands (LC 200): Treat the grid as a graph. For each unvisited land cell, run DFS to mark all connected land cells. Count how many times you initiate a DFS — that's the island count.
Verbal walkthrough: "I'll iterate through each cell. When I find an unvisited '1', I increment my island count and run DFS to mark all connected land cells as visited. Each DFS call discovers one complete island. Time is O(rows * cols) since each cell is visited at most once."
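One way that walkthrough could look in code — a sketch assuming LC 200's string-based grid ('1' land, '0' water); flipping cells to '0' in place serves as the visited marker, though a separate visited set works too:

```python
def num_islands(grid):
    rows, cols = len(grid), len(grid[0])

    def sink(r, c):
        # Stop at boundaries and water.
        if not (0 <= r < rows and 0 <= c < cols) or grid[r][c] != '1':
            return
        grid[r][c] = '0'  # mark visited by flipping land to water
        sink(r + 1, c)
        sink(r - 1, c)
        sink(r, c + 1)
        sink(r, c - 1)

    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == '1':
                count += 1  # each DFS launch discovers one complete island
                sink(r, c)
    return count

print(num_islands([["1", "1", "0"], ["0", "1", "0"], ["0", "0", "1"]]))  # 2
```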
When to Use BFS vs. DFS
This decision trips up candidates constantly. The rule is simpler than people think:
- BFS when you need shortest path, minimum steps, or level-order processing. BFS is also better when the answer is likely close to the start node — it finds nearby solutions first.
- DFS when you need to explore all paths, detect cycles, find connected components, or the problem involves backtracking. DFS is also the default for tree problems (pre-order, in-order, post-order are all DFS variants).
- Either works for simple reachability ("can I get from A to B?") or visiting all nodes. In these cases, pick whichever you can implement more cleanly.
One nuance worth mentioning: BFS uses O(width) space where width is the maximum number of nodes at any level. DFS uses O(depth) space. For very wide, shallow graphs, DFS is more space-efficient. For deep, narrow graphs, BFS is more space-efficient. In interviews, this trade-off rarely matters — but mentioning it shows depth of understanding.
Grid Problems Are Graph Problems
A pattern that catches many candidates off guard: grid problems are graph problems in disguise. A 2D grid of cells is an implicit graph where each cell is a node and each cell connects to its 4 (or 8) neighbors. You don't build an adjacency list — you compute neighbors on the fly using direction arrays.
```python
from collections import deque

directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def bfs_grid(grid, start_r, start_c):
    rows, cols = len(grid), len(grid[0])
    visited = set()
    visited.add((start_r, start_c))
    queue = deque([(start_r, start_c)])
    while queue:
        r, c = queue.popleft()
        for dr, dc in directions:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in visited and grid[nr][nc] == 1:
                visited.add((nr, nc))
                queue.append((nr, nc))
    return visited
```

The boundary check 0 <= nr < rows and 0 <= nc < cols replaces the adjacency list lookup. Everything else — BFS, DFS, visited set — works identically to explicit graph traversal. Problems like Walls and Gates (LC 286), Shortest Path in Binary Matrix (LC 1091), and Pacific Atlantic Water Flow (LC 417) all follow this pattern. Once you internalize that grids are graphs, these problems become mechanical rather than intimidating.
Interview problem — Word Ladder (LC 127): Find the shortest transformation sequence from a start word to an end word, changing one letter at a time, where each intermediate word must be in a dictionary. This is BFS on an implicit graph where each word is a node and two words are connected if they differ by exactly one character. The "shortest transformation" phrasing is the BFS signal. Build the graph by generating all one-letter variations of each word and checking dictionary membership.
Verbal walkthrough: "I need the shortest path between two words where edges connect words differing by one letter — that's BFS on an implicit graph. For each word, I'll generate all possible one-character substitutions and check if they're in the dictionary. I'll use a set for O(1) lookup and track visited words to avoid cycles. The BFS level equals the transformation distance."
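A sketch of that approach (the function name ladder_length and the choice to store the sequence length alongside each word in the queue are this sketch's conventions; LC 127 counts the number of words in the sequence, including the start word, and expects 0 when no sequence exists):

```python
from collections import deque
import string

def ladder_length(begin_word, end_word, word_list):
    words = set(word_list)  # set for O(1) dictionary membership checks
    if end_word not in words:
        return 0
    queue = deque([(begin_word, 1)])  # (word, words in sequence so far)
    visited = {begin_word}
    while queue:
        word, steps = queue.popleft()
        if word == end_word:
            return steps
        # Generate every one-letter substitution and keep dictionary words.
        for i in range(len(word)):
            for ch in string.ascii_lowercase:
                nxt = word[:i] + ch + word[i + 1:]
                if nxt in words and nxt not in visited:
                    visited.add(nxt)
                    queue.append((nxt, steps + 1))
    return 0

print(ladder_length("hit", "cog", ["hot", "dot", "dog", "lot", "log", "cog"]))  # 5
```

Generating 26 * L substitutions per word beats comparing every word pair when the dictionary is large: O(N * 26L) versus O(N^2 * L) for pairwise comparison.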
Topological Sort: Ordering Dependencies
Topological sort produces a linear ordering of nodes in a directed acyclic graph (DAG) such that for every edge u to v, u appears before v. If the graph has a cycle, no valid topological order exists — which makes topological sort double as a cycle detection method for directed graphs.
Kahn's Algorithm (BFS-based)
This is the version you should use in interviews. It's more intuitive and naturally detects cycles.
```python
from collections import deque, defaultdict

def topological_sort(num_nodes, edges):
    graph = defaultdict(list)
    in_degree = [0] * num_nodes
    for u, v in edges:
        graph[u].append(v)
        in_degree[v] += 1
    queue = deque([i for i in range(num_nodes) if in_degree[i] == 0])
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)
    if len(order) != num_nodes:
        return []  # cycle detected
    return order
```

Interview problem — Course Schedule II (LC 210): Given numCourses and prerequisite pairs, return a valid course order. This is a direct application of Kahn's algorithm. The prerequisite pairs are edges, and the topological order is the course sequence.
Verbal walkthrough: "This is a dependency ordering problem on a directed graph — I need topological sort. I'll use Kahn's algorithm: compute in-degrees, seed a queue with all zero-in-degree nodes, and process them. If I can't process all nodes, there's a circular dependency and no valid order exists. Time is O(V + E)."
Interview problem — Alien Dictionary (LC 269): Given a sorted list of words in an alien language, determine the character ordering. Build a directed graph where an edge from char a to char b means a comes before b. Derive edges by comparing adjacent words character by character. Then run topological sort on the character graph.
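A sketch of that two-phase approach — derive edges, then run Kahn's algorithm on the characters. The function name alien_order and the empty-string return for "no valid order" follow the common convention for this problem; the prefix check (a longer word sorted before its own prefix is contradictory) is an edge case many candidates miss:

```python
from collections import deque, defaultdict

def alien_order(words):
    graph = defaultdict(set)
    in_degree = {c: 0 for w in words for c in w}
    # Compare each adjacent word pair; the first differing character yields one edge.
    for w1, w2 in zip(words, words[1:]):
        for a, b in zip(w1, w2):
            if a != b:
                if b not in graph[a]:
                    graph[a].add(b)
                    in_degree[b] += 1
                break
        else:
            if len(w1) > len(w2):  # e.g. "abc" sorted before "ab" is invalid
                return ""
    # Kahn's algorithm over the character graph.
    queue = deque([c for c in in_degree if in_degree[c] == 0])
    order = []
    while queue:
        c = queue.popleft()
        order.append(c)
        for nxt in graph[c]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                queue.append(nxt)
    return "".join(order) if len(order) == len(in_degree) else ""

print(alien_order(["wrt", "wrf", "er", "ett", "rftt"]))  # "wertf"
```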
For a broader view of how topological sort fits alongside other interview patterns, see our coding interview patterns overview.
Shortest Path Algorithms
BFS gives shortest paths in unweighted graphs. When edges have weights, you need something stronger.
Dijkstra's Algorithm
Dijkstra's finds the shortest path from a source to all other nodes in a graph with non-negative edge weights. It's a greedy algorithm that uses a priority queue (min-heap) to always process the nearest unvisited node.
```python
import heapq
from collections import defaultdict

def dijkstra(graph, start, n):
    dist = [float('inf')] * n
    dist[start] = 0
    heap = [(0, start)]  # (distance, node)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale entry
        for v, weight in graph[u]:
            if dist[u] + weight < dist[v]:
                dist[v] = dist[u] + weight
                heapq.heappush(heap, (dist[v], v))
    return dist
```

The if d > dist[u]: continue line is the lazy deletion optimization. Without it, you process nodes multiple times. With it, Dijkstra's runs in O((V + E) log V) using a binary heap.
Interview problem — Network Delay Time (LC 743): Given a directed weighted graph, find the time for a signal to reach all nodes from a source. Run Dijkstra's from the source. The answer is the maximum distance in the result. If any node is unreachable (distance remains infinity), return -1.
Verbal walkthrough: "This is single-source shortest path with positive weights — Dijkstra's. I'll build an adjacency list with weights, initialize distances to infinity except the source at 0, and use a min-heap to greedily process the nearest node. The answer is the max distance across all nodes, since the signal must reach everyone."
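A sketch of that application (the function name and the (u, v, w) tuple shape for edges are this sketch's conventions; LC 743 numbers nodes 1 through n, so a dictionary of distances is more convenient than an index-based array here):

```python
import heapq
from collections import defaultdict

def network_delay_time(times, n, k):
    # times: directed weighted edges (u, v, w); nodes are 1..n; k is the source.
    graph = defaultdict(list)
    for u, v, w in times:
        graph[u].append((v, w))
    dist = {k: 0}
    heap = [(0, k)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue  # stale entry
        for v, w in graph[u]:
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    # Signal reaches everyone only if all n nodes got a finite distance.
    return max(dist.values()) if len(dist) == n else -1

print(network_delay_time([(2, 1, 1), (2, 3, 1), (3, 4, 1)], 4, 2))  # 2
```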
When You Need Bellman-Ford
Dijkstra's breaks with negative edge weights because its greedy assumption — that the shortest known path to a node is final — no longer holds. Bellman-Ford handles negative weights by relaxing all edges V-1 times. If any distance improves on the V-th iteration, a negative cycle exists.
```python
def bellman_ford(n, edges, start):
    dist = [float('inf')] * n
    dist[start] = 0
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] != float('inf') and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # Check for negative cycles
    for u, v, w in edges:
        if dist[u] != float('inf') and dist[u] + w < dist[v]:
            return None  # negative cycle
    return dist
```

When to use each: Dijkstra's is your default for weighted shortest path. Switch to Bellman-Ford only if the problem explicitly mentions negative weights or asks you to detect negative cycles. In interviews, Dijkstra's covers 90% of weighted shortest path problems. Bellman-Ford is rare but worth knowing — the "Cheapest Flights Within K Stops" problem (LC 787) is a modified Bellman-Ford that limits relaxation to K iterations.
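A sketch of that LC 787 modification: run k + 1 relaxation rounds (at most k stops means at most k + 1 edges), and relax against a snapshot of the previous round's distances so each round extends paths by exactly one edge. The function name and edge tuple shape are this sketch's conventions:

```python
def find_cheapest_price(n, flights, src, dst, k):
    # flights: directed edges (u, v, price); at most k intermediate stops allowed.
    dist = [float('inf')] * n
    dist[src] = 0
    for _ in range(k + 1):
        prev = dist[:]  # snapshot: prevents chaining multiple edges in one round
        for u, v, w in flights:
            if prev[u] + w < dist[v]:
                dist[v] = prev[u] + w
    return dist[dst] if dist[dst] != float('inf') else -1

flights = [(0, 1, 100), (1, 2, 100), (2, 0, 100), (1, 3, 600), (2, 3, 200)]
print(find_cheapest_price(4, flights, 0, 3, 1))  # 700
```

Without the prev snapshot, a single round could relax a whole chain of edges, silently exceeding the stop limit.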
A Common Dijkstra Pitfall
Candidates frequently implement Dijkstra's without the stale entry check (if d > dist[u]: continue) and then wonder why their solution is slow or wrong. Without this check, you process the same node multiple times with outdated distances. The heap in Python doesn't support decrease-key, so stale entries accumulate. The continue check is the standard workaround — it makes the lazy deletion approach work correctly. Always include it, and when the interviewer asks about it, explain that it's compensating for the lack of a decrease-key operation in a standard binary heap.
Another subtlety: Dijkstra's doesn't work with negative weights because once you pop a node from the heap and mark its distance as final, a later path through a negative edge could produce a shorter distance — but you've already committed. This is the fundamental reason Bellman-Ford exists: it relaxes all edges V-1 times, allowing shorter paths discovered later to propagate.
Union-Find for Connected Components
Union-Find (also called Disjoint Set Union or DSU) tracks which nodes belong to the same connected component. It supports two operations: find (which component does this node belong to?) and union (merge two components). With path compression and union by rank, both operations run in nearly O(1) amortized time.
```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
        self.components = n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, x, y):
        px, py = self.find(x), self.find(y)
        if px == py:
            return False  # already connected
        if self.rank[px] < self.rank[py]:
            px, py = py, px
        self.parent[py] = px
        if self.rank[px] == self.rank[py]:
            self.rank[px] += 1
        self.components -= 1
        return True
```

When to use Union-Find vs. BFS/DFS: Both can find connected components. Union-Find shines when edges arrive incrementally (streaming), when you need to efficiently merge components, or when you need to quickly check if two nodes are connected without retraversing the graph. DFS/BFS is simpler when you have the full graph upfront and just need to count components once.
Interview problem — Number of Connected Components (LC 323): Given n nodes and an edge list, find the number of connected components. Initialize Union-Find with n components. For each edge, union the two endpoints. The final components count is your answer.
Interview problem — Redundant Connection (LC 684): Find the edge that, when removed, makes the graph a tree. Process edges one by one. The first edge where union returns False (both endpoints are already connected) is the redundant edge — it creates a cycle.
Verbal walkthrough: "I need to detect the edge that creates a cycle. Union-Find is ideal here — I process edges in order and union each pair. The moment I try to union two nodes that already share a root, that edge is redundant. Time is effectively O(E * alpha(V)) which is nearly O(E)."
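That walkthrough might look like the following sketch. It inlines a minimal find/union (with path halving, a lighter cousin of path compression) so the snippet stands alone, but in an interview you'd reuse the UnionFind class above:

```python
def find_redundant_connection(edges):
    # LC 684 numbers nodes 1..n, so index 0 is unused.
    parent = list(range(len(edges) + 1))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return [u, v]  # first edge whose endpoints already share a root
        parent[ru] = rv
    return []

print(find_redundant_connection([[1, 2], [1, 3], [2, 3]]))  # [2, 3]
```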
Cycle Detection
Detecting cycles is a different problem in directed vs. undirected graphs, and using the wrong approach is a common interview mistake.
Undirected Graphs
Two approaches work. Union-Find: process each edge; if both endpoints have the same root, you found a cycle. DFS: track the parent of each node; if you visit a neighbor that's already visited and isn't the parent, there's a cycle.
```python
def has_cycle_undirected(graph, n):
    visited = set()

    def dfs(node, parent):
        visited.add(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                if dfs(neighbor, node):
                    return True
            elif neighbor != parent:
                return True  # back edge = cycle
        return False

    for i in range(n):
        if i not in visited:
            if dfs(i, -1):
                return True
    return False
```

Directed Graphs
In directed graphs, you can't just check "visited" — a node visited by a different DFS path doesn't imply a cycle. Instead, use three-color marking: white (unvisited), gray (in current DFS path), black (fully processed). A cycle exists if and only if you encounter a gray node during DFS — that's a back edge in the current path.
```python
def has_cycle_directed(graph, n):
    WHITE, GRAY, BLACK = 0, 1, 2
    color = [WHITE] * n

    def dfs(node):
        color[node] = GRAY
        for neighbor in graph[node]:
            if color[neighbor] == GRAY:
                return True  # back edge = cycle
            if color[neighbor] == WHITE and dfs(neighbor):
                return True
        color[node] = BLACK
        return False

    for i in range(n):
        if color[i] == WHITE:
            if dfs(i):
                return True
    return False
```

Alternatively, Kahn's topological sort detects directed cycles: if the topological order contains fewer than V nodes, the graph has a cycle. This is often simpler to implement and explain.
Interview problem — Course Schedule (LC 207): Determine if all courses can be finished given prerequisite pairs. This is cycle detection in a directed graph. If the prerequisite graph has a cycle, it's impossible. Use either three-color DFS or Kahn's algorithm — both are valid, but Kahn's is easier to get right under pressure.
Verbal walkthrough: "The question is whether the prerequisite graph has a cycle. If course A requires B, and B requires C, and C requires A, no valid ordering exists. I'll model prerequisites as directed edges and run Kahn's topological sort. If all nodes get processed, it's acyclic and the courses can be completed. If some nodes remain with nonzero in-degree, there's a cycle."
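A sketch of the Kahn's-based version of that walkthrough (the function name can_finish and the (course, prereq) pair orientation follow LC 207's signature; note the edge direction is prereq to course):

```python
from collections import deque, defaultdict

def can_finish(num_courses, prerequisites):
    # Each pair (course, prereq) becomes a directed edge prereq -> course.
    graph = defaultdict(list)
    in_degree = [0] * num_courses
    for course, prereq in prerequisites:
        graph[prereq].append(course)
        in_degree[course] += 1
    queue = deque([c for c in range(num_courses) if in_degree[c] == 0])
    processed = 0
    while queue:
        c = queue.popleft()
        processed += 1
        for nxt in graph[c]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                queue.append(nxt)
    return processed == num_courses  # all processed <=> no cycle

print(can_finish(2, [(1, 0)]))          # True
print(can_finish(2, [(1, 0), (0, 1)]))  # False
```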
The Directed vs. Undirected Trap
This is the single most common bug in cycle detection during interviews. In an undirected graph, when you run DFS and encounter a visited neighbor, you must check whether it's the parent — because every edge is traversed in both directions. If you skip the parent check, you'll falsely detect a cycle on every edge. In a directed graph, parent tracking is irrelevant — what matters is whether the neighbor is in the current DFS path (gray), not just whether it's been visited at all. Using undirected cycle detection logic on a directed graph (or vice versa) produces incorrect results. Before writing any cycle detection code, confirm with the interviewer whether the graph is directed, and choose your approach accordingly.
Putting It All Together: Algorithm Selection Guide
When you see a graph problem in an interview, run through this decision tree:
| Problem Type | Algorithm | Time Complexity | Key Signal |
|---|---|---|---|
| Shortest path, unweighted | BFS | O(V + E) | "minimum steps," "shortest path," grid traversal |
| Shortest path, positive weights | Dijkstra's | O((V+E) log V) | Weighted edges, no negative weights |
| Shortest path, negative weights | Bellman-Ford | O(V * E) | Negative edge weights, or limited stops/edges |
| All paths / exhaustive search | DFS + backtracking | O(V + E) to O(2^V) | "find all," "does any path exist" |
| Dependency ordering | Topological sort | O(V + E) | Prerequisites, build order, scheduling |
| Connected components | Union-Find or DFS | O(V + E) | "number of groups," "are X and Y connected" |
| Cycle detection (directed) | Three-color DFS or Kahn's | O(V + E) | "is it possible," circular dependencies |
| Cycle detection (undirected) | Union-Find or DFS with parent | O(V + E) | "redundant edge," "find the extra connection" |
Print this table or reconstruct it from memory before your interview. The ability to quickly map a problem statement to the right algorithm is more than half the battle.
One final note on practice: graph problems reward structured thinking more than any other category. If you find yourself writing code before you've identified whether the graph is directed or undirected, weighted or unweighted, cyclic or acyclic — stop. Those properties determine the algorithm, and the algorithm determines the code. Get the properties right first.
If you want to pressure-test your graph algorithm skills with timed practice, Hoppers AI's coding mock interviews include graph problems with real-time feedback on your approach and verbal explanation — a useful way to build the habit of articulating your algorithm choice before coding. For related patterns that frequently combine with graph algorithms, see our guides on dynamic programming (many graph problems use DP for optimal paths) and trees and binary search (trees are a special case of graphs, and the traversal intuition transfers directly).