Oct 12, 2024


Data structures are essential for organizing complex data, enabling efficient retrieval and manipulation. In software engineering interviews, candidates are frequently assessed through coding challenges that require selecting the most suitable data structure and algorithm for a given problem. Below is a thoughtfully curated list of data structures, complete with example questions and sample problems, designed to help you prepare effectively.

One-dimensional Data Structures

Arrays

  1. Subarray Sum Equals K: Given an array of integers and an integer k, find the total number of continuous subarrays whose sum equals to k.

  2. Maximum Product Subarray: Given an integer array nums, find a contiguous non-empty subarray within the array that has the largest product, and return the product.

  3. Container With Most Water: Given n non-negative integers a1, a2, ..., an, where each represents a point at coordinate (i, ai). n vertical lines are drawn such that the two endpoints of line i are at (i, ai) and (i, 0). Find two lines, which together with the x-axis forms a container, such that the container contains the most water.

  4. Jump Game: Given an array of non-negative integers nums, you are initially positioned at the first index of the array. Each element in the array represents your maximum jump length at that position. Determine if you are able to reach the last index.

  5. Product of Array Except Self: Given an integer array nums, return an array answer such that answer[i] is equal to the product of all the elements of nums except nums[i]. You must write an algorithm that runs in O(n) time and without using the division operation.

Strings

  1. Longest Palindromic Substring: Given a string s, find the longest palindromic substring in s.

  2. Group Anagrams: Given an array of strings, group the anagrams together.

  3. Minimum Window Substring: Given two strings s and t, return the minimum window in s which will contain all the characters in t.

  4. Word Ladder: Given two words (beginWord and endWord), and a dictionary's word list, find the length of the shortest transformation sequence from beginWord to endWord.

  5. Edit Distance: Given two strings word1 and word2, return the minimum number of operations required to convert word1 to word2.

  6. Longest Substring Without Repeating Characters: Given a string s, find the length of the longest substring without repeating characters.

Heaps

Heaps are a specialized tree-based data structure that satisfy the heap property. In a max heap, for any given node I, the value of I is greater than or equal to the values of its children, ensuring that the largest element is always at the root. Conversely, in a min heap, the value of I is less than or equal to the values of its children, ensuring that the smallest element is always at the root. Heaps are commonly used to implement priority queues, where the element with the highest (or lowest) priority is always efficiently accessible.

  1. Kth Largest Element in an Array: Given an integer array nums and an integer k, return the kth largest element in the array. This problem requires efficient use of a heap to manage the k largest elements seen so far.

  2. Find Median from Data Stream: Continuously receive numbers and return the median of the numbers seen so far. This involves maintaining two heaps to efficiently track the median as new numbers are added.

  3. Merge K Sorted Lists: Given an array of k linked-lists, each sorted in ascending order, merge them into one sorted linked-list. This problem can be efficiently solved using a min-heap to track the smallest current head among the lists.

  4. Top K Frequent Words: Given an array of strings words and an integer k, return the k most frequent strings. Use a heap to efficiently determine the top k frequent elements.

  5. Sliding Window Median: Given an array of integers nums, there is a sliding window of size k moving from the very left of the array to the very right. You can only see the k numbers in the window. Each time the sliding window moves right by one position, return the median of the current window. This problem requires maintaining a balance between two heaps to efficiently calculate the median as the window slides. very left of the array to the very right. You can only see the k numbers in the window. Each time the sliding window moves right by one position.

Linked Lists

A linked list is a linear data structure where elements, called nodes, are stored in a sequence. Each node contains two parts: data and a reference (or link) to the next node in the sequence. This structure allows for efficient insertion and deletion of elements, as it does not require shifting elements like an array. Linked lists can be singly linked, where each node points to the next node, or doubly linked, where nodes have references to both the next and previous nodes. They are commonly used in scenarios where dynamic memory allocation is needed, and the size of the data structure is not known in advance.

  1. Reverse Linked List: Given the head of a singly linked list, reverse the list and return the reversed list.

  2. Remove Nth Node From End of List: Given the head of a linked list, remove the nth node from the end of the list and return its head.

  3. Detect Cycle in a Linked List: Given the head of a linked list, determine if the linked list has a cycle in it.

  4. Merge K Sorted Lists: Given an array of k linked-lists, each sorted in ascending order, merge them into one sorted linked-list and return it.

  5. Copy List with Random Pointer: A linked list is given such that each node contains an additional random pointer which could point to any node in the list or null. Return a deep copy of the list.

Queues

A queue is a linear data structure that follows the First In First Out (FIFO) principle, meaning that the first element added to the queue will be the first one to be removed. It is analogous to a line of people waiting for a service, where the person who arrives first is served first. Queues are used in various applications such as scheduling processes in operating systems, handling requests in web servers, and managing tasks in print spooling.

Queues can be implemented using arrays or linked lists, and they typically support the following operations:

Queues can also be specialized into different types, such as:

Queues are fundamental in computer science and are used in various algorithms and systems to manage data efficiently.

Here are some illustrative problems that can be elegantly addressed using queues:

  1. Task Scheduler: Given a list of tasks and a cooldown period, determine the minimum time required to complete all tasks. Each task can be executed in one unit of time, and the same task must be separated by at least the cooldown period.

  2. Rotting Oranges: Given a grid of oranges where each cell can be empty, contain a fresh orange, or contain a rotten orange, determine the minimum time required for all fresh oranges to become rotten. A rotten orange can rot adjacent fresh oranges in one unit of time.

  3. Binary Tree Right Side View: Given a binary tree, return the values of the nodes you can see ordered from top to bottom when looking from the right side of the tree.

  4. Jump Game VI: Given an array of integers and an integer k, you can jump from index i to index j if i < j <= i + k. Your score is the sum of the values of the indices you visit. Return the maximum score you can achieve.

  5. Shortest Path in Binary Matrix: Given an n x n binary matrix, find the shortest path from the top-left corner to the bottom-right corner. You can move in 8 possible directions, and the path can only be formed by traversing cells with a value of 0.

Stacks

  1. Valid Parentheses: Given a string containing just the characters '(', ')', '{', '}', '[' and ']', determine if the input string is valid. An input string is valid if open brackets are closed by the same type of brackets and in the correct order.

  2. Evaluate Reverse Polish Notation: Evaluate the value of an arithmetic expression in Reverse Polish Notation. Valid operators are +, -, *, and /. Each operand may be an integer or another expression.

  3. Simplify Path: Given a string path, which is an absolute path (starting with a slash '/') to a file or directory in a Unix-style file system, convert it to the simplified canonical path.

  4. Largest Rectangle in Histogram: Given an array of integers heights representing the histogram's bar height where the width of each bar is 1, return the area of the largest rectangle in the histogram.

  5. Basic Calculator II: Implement a basic calculator to evaluate a simple expression string. The expression string contains non-negative integers, '+', '-', '*', '/' operators, and empty spaces. The integer division should truncate toward zero.

Strings

  1. Longest Substring Without Repeating Characters: Given a string, find the length of the longest substring without repeating characters.

  2. Group Anagrams: Given an array of strings, group the anagrams together. You can return the answer in any order.

  3. Longest Palindromic Substring: Given a string s, return the longest palindromic substring in s.

  4. Word Search: Given an m x n board and a word, find if the word exists in the grid. The word can be constructed from letters of sequentially adjacent cells, where "adjacent" cells are horizontally or vertically neighboring.

  5. Minimum Window Substring: Given two strings s and t, return the minimum window in s which will contain all the characters in t. If there is no such window in s that covers all characters in t, return the empty string "".

Two-dimensional Data Structures

Two-dimensional array

  1. Rotate Image: Given an n x n 2D matrix representing an image, rotate the image by 90 degrees (clockwise).

  2. Set Matrix Zeroes: Given an m x n matrix, if an element is 0, set its entire row and column to 0. Do it in place.

  3. Spiral Matrix: Given an m x n matrix, return all elements of the matrix in spiral order.

  4. Word Search: Given an m x n board and a word, find if the word exists in the grid. The word can be constructed from letters of sequentially adjacent cells, where "adjacent" cells are horizontally or vertically neighboring.

  5. Search a 2D Matrix: Write an efficient algorithm that searches for a value in an m x n matrix. This matrix has the following properties: Integers in each row are sorted from left to right. The first integer of each row is greater than the last integer of the previous row.

Hashtable

  1. Two Sum: Given an array of integers nums and an integer target, return indices of the two numbers such that they add up to target.

  2. Subarray Sum Equals K: Given an array of integers nums and an integer k, return the total number of continuous subarrays whose sum equals to k.

  3. Longest Consecutive Sequence: Given an unsorted array of integers, find the length of the longest consecutive elements sequence.

  4. Top K Frequent Elements: Given a non-empty array of integers, return the k most frequent elements.

  5. Group Anagrams: Given an array of strings, group the anagrams together. You can return the answer in any order.

Hierarchical Data Structures

Trees

  1. Binary Tree Inorder Traversal: Given the root of a binary tree, return the inorder traversal of its nodes' values.

  2. Construct Binary Tree from Preorder and Inorder Traversal: Given two integer arrays preorder and inorder where preorder is the preorder traversal of a binary tree and inorder is the inorder traversal of the same tree, construct and return the binary tree.

  3. Serialize and Deserialize Binary Tree: Design an algorithm to serialize and deserialize a binary tree. There is no restriction on how your serialization/deserialization algorithm should work. You just need to ensure that a binary tree can be serialized to a string and this string can be deserialized to the original tree structure.

  4. Binary Search Tree Iterator: Implement an iterator over a binary search tree (BST). Your iterator will be initialized with the root node of a BST. Calling next() will return the next smallest number in the BST.

  5. Lowest Common Ancestor of a Binary Tree: Given a binary tree, find the lowest common ancestor (LCA) of two given nodes in the tree.

Tries

A trie, also known as a prefix tree, is a tree-like data structure that stores a dynamic set of strings, where the keys are usually strings. It is used to efficiently store and retrieve keys in a dataset of strings. Each node in a trie represents a single character of a string, and the path from the root to a node represents a prefix of the stored strings. Tries are commonly used for tasks such as autocomplete, spell checking, and IP routing, as they allow for fast lookups, insertions, and deletions of words or prefixes.

Here are some example interview questions that can be efficiently solved by Tries:

  1. Autocomplete System: Design a system that suggests a list of words based on a given prefix. The system should efficiently handle a large dataset of words and provide suggestions in real-time as the user types.

  2. Spell Checker: Implement a spell checker that can quickly identify and suggest corrections for misspelled words in a document. The system should be able to handle a large dictionary of words and provide suggestions based on common spelling errors.

  3. Word Search: Given a 2D board and a list of words from the dictionary, find all words in the board. Each word must be constructed from letters of sequentially adjacent cells.

  4. Replace Words: Given a dictionary of roots and a sentence, replace all successors in the sentence with the root forming it. If a successor has multiple roots, replace it with the shortest root.

  5. Longest Word in Dictionary: Given an array of strings words representing an English Dictionary, return the longest word in words that can be built one character at a time by other words in words.

  6. Maximum XOR of Two Numbers in an Array: Given an integer array nums, return the maximum result of nums[i] XOR nums[j], where 0 ≤ i, j < n.

B+ Tree

A B+ tree is a type of self-balancing tree data structure that maintains sorted data and allows for efficient insertion, deletion, and search operations. It is an extension of the B-tree and is commonly used in databases and file systems to store large amounts of data that cannot fit entirely in memory.

Key features of a B+ tree include:

  1. Leaf Nodes: All data entries are stored in the leaf nodes, which are linked together in a linked list. This allows for efficient sequential access to the data.

  2. Internal Nodes: Internal nodes only store keys and act as a guide to direct the search to the appropriate leaf node. They do not store actual data.

  3. Balanced Structure: The tree remains balanced by ensuring that all leaf nodes are at the same depth, which guarantees that the time complexity for search, insertion, and deletion operations is O(log n).

  4. Order: The order of a B+ tree determines the maximum number of children each node can have. A higher order means fewer levels and potentially faster operations, but more space used per node.

  5. Range Queries: B+ trees are particularly efficient for range queries, as the linked list of leaf nodes allows for easy traversal of consecutive data entries.

B+ trees are widely used in database indexing and file systems due to their ability to handle large volumes of data efficiently, providing quick access and updates.

During an interview focused on solutions involving sorted binary trees, the interviewer might inquire about handling scenarios where the tree is too large to fit into memory. In such cases, candidates can enhance their solutions by utilizing B+ trees.

skip lists

A skip list is a data structure that allows for fast search, insertion, and deletion operations within an ordered sequence of elements. It is a probabilistic alternative to balanced trees and is used in various applications where efficient search operations are required.

Skip lists consist of multiple levels of linked lists. The bottom level is a standard sorted linked list, and each higher level acts as an "express lane" that skips over multiple elements, allowing for faster traversal. Each element in the list is represented by a node, and nodes are linked in such a way that they can be traversed both horizontally and vertically.

The key operations in a skip list are:

  1. Search: To search for an element, you start at the topmost level and move forward until you find a node with a value greater than or equal to the target. If the target is not found at the current level, you drop down to the next level and continue the search.

  2. Insertion: To insert an element, you first perform a search to find the appropriate position. Then, you insert the element at the bottom level and randomly decide whether to promote it to higher levels, creating new forward pointers as needed.

  3. Deletion: To delete an element, you perform a search to locate the element and then remove it from all levels where it appears.

Skip lists offer an elegant solution to many problems typically addressed by sorted trees. They are particularly advantageous in scenarios where simplicity and ease of implementation are prioritized, as they do not require complex balancing operations. Additionally, skip lists can be a better choice when average-case performance is sufficient, providing efficient search, insertion, and deletion operations with an average time complexity of O(log n). This makes them a compelling alternative for applications where these operations need to be performed quickly and efficiently.

Graphs

A graph is a data structure that consists of a set of nodes (also called vertices) and a set of edges that connect pairs of nodes. Graphs are used to model relationships between objects, where the objects are represented by nodes and the relationships are represented by edges. Graphs can be directed or undirected, depending on whether the edges have a direction. In a directed graph, each edge has a direction, indicating a one-way relationship, while in an undirected graph, edges have no direction, indicating a two-way relationship. Graphs are widely used in computer science for various applications, such as representing networks, social connections, and paths in a map.

Here are a few possible data structures to represent graphs

  1. Adjacency List: This representation uses a list of lists or a dictionary of lists to store the neighbors of each vertex. It is space-efficient for sparse graphs because it only stores existing edges. The tradeoff is that checking for the existence of a specific edge can be slower, as it may require iterating through a list.

  2. Adjacency Matrix: This representation uses a 2D array to store edge information, where the cell at row i and column j indicates the presence or absence of an edge between vertices i and j. It allows for quick edge existence checks and is efficient for dense graphs. However, it uses more space, especially for graphs with many vertices but few edges.

  3. Edge List: This representation stores a list of all edges in the graph, where each edge is represented as a pair (or tuple) of vertices. It is simple and space-efficient for storing the graph structure. The tradeoff is that operations like checking for the existence of a specific edge or finding all neighbors of a vertex can be slower, as they may require iterating through the entire list.

Here are some questions that can involves graphs:

  1. Course Schedule: Given the number of courses and a list of prerequisite pairs, determine if it is possible to finish all courses.

  2. Network Delay Time: Given a network of nodes, find the time it takes for all nodes to receive a signal sent from a starting node.

  3. Word Ladder: Given two words and a dictionary, find the shortest transformation sequence from the start word to the end word, changing one letter at a time.

  4. Redundant Connection: Given a graph with n nodes and n edges, find an edge that can be removed to make the graph a tree.

  5. Critical Connections in a Network: Given a network of servers and connections, find all critical connections that, if removed, will disconnect the network.


Join us on WeChat

Ming Dao School uses 1-1 coaching and group events to help high-tech professionals grow their careers and handle career transitions.

If you like to join our upcoming mock system design interview events or other coaching programs, please contact us on LinkedIn.