Oct 17, 2024


Graph Representation and Coding Interview Questions

Graphs are a data structure that shows how objects relate to each other. A graph has vertices (or nodes) and edges that connect them. Graphs can be directed or undirected, weighted or unweighted, and can have cycles or be acyclic.

Types of Graph Representation

Adjacency Matrix

An adjacency matrix is a 2D array where the element at row i and column j tells you if there's an edge between vertex i and vertex j. This representation works well for dense graphs where there are many edges.

Example:

# Adjacency Matrix Representation
graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 0]
]

Weighted Adjacency Matrix

A weighted adjacency matrix is a 2D array where the element at row i and column j stores the weight of the edge between vertex i and vertex j. You can use 0 or None to show that no edge exists between those vertices.

Example:

# Weighted Adjacency Matrix Representation
graph = [
    [0, 2, 0, 0],
    [2, 0, 3, 1],
    [0, 3, 0, 4],
    [0, 1, 4, 0]
]

Adjacency List

An adjacency list is an array of lists. The index represents the vertex, and each element in the list contains the vertices adjacent to it. This is the go-to representation for sparse graphs where there aren't many edges.

Example:

# Adjacency List Representation
graph = {
    0: [1],
    1: [0, 2, 3],
    2: [1, 3],
    3: [1, 2]
}

Weighted Adjacency List

A weighted adjacency list is an array of lists where each list contains tuples. The index represents the vertex, and each tuple in the list has an adjacent vertex and the weight of the edge connecting them. This works well for sparse graphs with weighted edges.

Example:

# Weighted Adjacency List Representation
graph = {
    0: [(1, 2)],
    1: [(0, 2), (2, 3), (3, 1)],
    2: [(1, 3), (3, 4)],
    3: [(1, 1), (2, 4)]
}

Edge List

An edge list is exactly what it sounds like. It is a list of all edges in the graph, where each edge is represented as a pair of vertices. This format is handy for algorithms that work with edges directly.

Example:

# Edge List Representation
edges = [
    (0, 1),
    (1, 2),
    (1, 3),
    (2, 3)
]

Weighted Edge List

A weighted edge list is a list of all edges where each edge is a tuple containing two vertices and the weight of the edge. This format is useful for edge-centric algorithms that need to consider weights, like Kruskal's algorithm for finding a Minimum Spanning Tree.

Example:

# Weighted Edge List Representation
edges = [
    (0, 1, 2),
    (1, 2, 3),
    (1, 3, 1),
    (2, 3, 4)
]

Pros and Cons of Different Graph Representations

Adjacency Matrix

Pros:

Cons:

Adjacency List

Pros:

Cons:

Edge List

Pros:

Cons:

Each representation has trade-offs. The right choice depends on your specific problem. Think about how dense the graph is and what operations you need to perform.

Coding Interview Questions

For the questions below, we assume the graph is represented as an adjacency list. Here is the format:

graph = {
    0: [1],
    1: [0, 2, 3],
    2: [1, 3],
    3: [1, 2]
}

Graph Traversal

Problem: Implement Depth-First Search (DFS) and Breadth-First Search (BFS) for a given graph.

Solution:

def dfs(graph, start, visited=None):
    if visited is None:
        visited = set()
    visited.add(start)
    for next in graph[start]:
        if next not in visited:
            dfs(graph, next, visited)
    return visited

def bfs(graph, start):
    visited = set()
    queue = [start]
    while queue:
        vertex = queue.pop(0)
        if vertex not in visited:
            visited.add(vertex)
            queue.extend(set(graph[vertex]) - visited)
    return visited

Cycle Detection for Directed Graph

Problem: Determine if a cycle exists in a given directed graph. In a directed graph, being able to travel from A to B does not mean you can travel from B to A.

Main Idea: Use depth-first search (DFS) to detect cycles by tracking visited nodes and the recursion stack.

Solution:

def has_cycle(graph):
    def visit(vertex, visited, rec_stack):
        visited.add(vertex)
        rec_stack.add(vertex)
        for neighbor in graph[vertex]:
            if neighbor not in visited:
                if visit(neighbor, visited, rec_stack):
                    return True
            elif neighbor in rec_stack:
                return True
        rec_stack.remove(vertex)
        return False

    visited = set()
    rec_stack = set()
    for node in graph:
        if node not in visited:
            if visit(node, visited, rec_stack):
                return True
    return False

Cycle Detection for Undirected Graph

Explanation: In an undirected graph, each edge goes both ways. If there is an edge from node A to node B, there is also an edge from node B to node A. Detecting cycles in these graphs requires care. You do not want to mistake the two-way nature of edges for a cycle. The approach uses DFS while keeping track of the parent node so you do not count going back to where you came from as a cycle.

Solution:

def has_cycle_undirected(graph):
    def visit(vertex, visited, parent):
        visited.add(vertex)
        for neighbor in graph[vertex]:
            if neighbor not in visited:
                if visit(neighbor, visited, vertex):
                    return True
            elif parent is not None and neighbor != parent:
                return True
        return False

    visited = set()
    for node in graph:
        if node not in visited:
            if visit(node, visited, None):
                return True
    return False

This solution uses DFS similar to the directed graph case, but it adds a parent parameter to track which node we came from. This prevents the algorithm from falsely detecting a cycle when it encounters the edge that brought it to the current node.

Shortest Path for Unweighted Graph

Problem: Find the shortest path between two nodes in an unweighted graph.

Main Idea: Use breadth-first search (BFS) to explore all possible paths from the start node. BFS naturally finds the shortest path in unweighted graphs because it explores all nodes at distance k before moving to distance k+1.

Solution:

from collections import deque

def shortest_path(graph, start, goal):
    queue = deque([(start, [start])])
    while queue:
        (vertex, path) = queue.popleft()
        for next in graph[vertex] - set(path):
            if next == goal:
                return path + [next]
            else:
                queue.append((next, path + [next]))
    return None

Shortest Path for Weighted Graph

Problem: Find the shortest path between two nodes in a weighted graph.

Main Idea: Use Dijkstra's algorithm, which is the standard approach for finding shortest paths in graphs with non-negative edge weights. The algorithm uses a priority queue to always explore the node with the smallest current distance first. As it explores, it updates the shortest known distance to each node.

Solution:

import heapq

def dijkstra(graph, start, goal):
    # Priority queue to store (distance, vertex) tuples
    queue = [(0, start)]
    # Dictionary to store the shortest distance to each node
    distances = {start: 0}
    # Dictionary to store the path taken to reach each node
    previous_nodes = {start: None}

    while queue:
        current_distance, current_vertex = heapq.heappop(queue)

        # If we reach the goal, reconstruct the path
        if current_vertex == goal:
            path = []
            while current_vertex is not None:
                path.append(current_vertex)
                current_vertex = previous_nodes[current_vertex]
            return path[::-1]

        # Explore neighbors
        for neighbor, weight in graph[current_vertex].items():
            distance = current_distance + weight

            # If a shorter path to the neighbor is found
            if neighbor not in distances or distance < distances[neighbor]:
                distances[neighbor] = distance
                previous_nodes[neighbor] = current_vertex
                heapq.heappush(queue, (distance, neighbor))

    return None

This implementation of Dijkstra's algorithm uses a priority queue to always expand the least costly node first, which guarantees that the shortest path is found. The algorithm also keeps track of previous nodes so you can reconstruct the actual path once you reach the goal node.