Professional & Direct
Microsoft Interview Prep
Microsoft Q&A: Interview Edition
Navigating the Microsoft Interview
The Microsoft Candidate's Guide
Creative & Playful
Cracking the Code: Microsoft Edition (good for technical roles)
Inside the Microsoft Interview Room
The Redmond Review (referencing Microsoft's HQ city)
Prepping for the Microsoft Maze
Clear & Descriptive
Microsoft Job Interview: Questions & Answers
Your Microsoft Interview Helper
Ask Me: Microsoft Interview Questions
My top recommendation for a chat name would be:
"Mastering the Microsoft Interview"
I
Question 1: Reverse a Linked List
Question: Write a function to reverse a singly linked list.
Explanation: The classic linked list problem. The iterative approach involves changing the next pointers of each node to point to the previous node. You need three pointers: previous, current, and next_temp. You traverse the list, point current.next to previous, and then move all three pointers one step forward.
Sample Code:
class ListNode: def __init__(self, val=0, next=None): self.val = val self.next = next def reverse_list(head: ListNode) -> ListNode: previous = None current = head while current is not None: next_temp = current.next # Store the next node current.next = previous # Reverse the current node's pointer previous = current # Move previous to this node current = next_temp # Move current to the next node return previous # previous is the new head of the reversed list # Example usage: # Original: 1 -> 2 -> 3 -> None # Reversed: 3 -> 2 -> 1 -> None
Question 2: Validate a Binary Search Tree (BST)
Question: Given the root of a binary tree, determine if it is a valid BST.
Explanation: A valid BST is defined as:
The left subtree of a node contains only nodes with keys less than the node's key.
The right subtree of a node contains only nodes with keys greater than the node's key.
Both the left and right subtrees must also be BSTs.
The efficient way is to do an in-order traversal (DFS) while keeping track of the previously visited node. Since an in-order traversal of a BST should yield nodes in strictly increasing order, we can check if each node is greater than the previous one.
Sample Code:
class TreeNode: def __init__(self, val=0, left=None, right=None): self.val = val self.left = left self.right = right def is_valid_bst(root: TreeNode) -> bool: # Helper function for in-order traversal def dfs(node, lower_bound, upper_bound): if not node: return True # Check if current node's value is within the allowed range if not (lower_bound < node.val < upper_bound): return False # For the left child, the upper bound becomes the current node's value. # For the right child, the lower bound becomes the current node's value. return (dfs(node.left, lower_bound, node.val) and dfs(node.right, node.val, upper_bound)) # Start with infinity as the bounds return dfs(root, float('-inf'), float('inf')) # Example of a valid BST: # 2 # / \ # 1 3
Question 3: Find the Missing Number
Question: You are given an array nums containing n distinct numbers in the range [0, n]. Since the array has only n numbers, it's missing one number from the range [0, n]. Find the missing one.
Explanation: The sum of numbers from 0 to n is given by the formula n * (n + 1) // 2. If we calculate this expected sum and subtract the actual sum of the array, the result will be the missing number.
Sample Code:
def missing_number(nums: list[int]) -> int: n = len(nums) expected_sum = n * (n + 1) // 2 actual_sum = sum(nums) return expected_sum - actual_sum # Example: # Input: [3, 0, 1] (n=3, range [0, 3]) # Expected Sum: (3*4)//2 = 6 # Actual Sum: 3 + 0 + 1 = 4 # Missing Number: 6 - 4 = 2
Question 4: Design a Data Structure (MinStack)
Question: Design a stack that supports push, pop, top, and retrieving the minimum element in constant time, O(1).
Explanation: The trick is to use two stacks:
The main stack to store all values.
A separate "min stack" that only stores the current minimum value. Each time you push a value, you also push the current minimum (which is the min of the new value and the previous top of the min stack). This way, each entry in the min stack corresponds to the minimum value at the time it was pushed, ensuring you always know the min, even after popping.
Sample Code:
class MinStack: def __init__(self): self.stack = [] self.min_stack = [] # Tracks the minimum at each level of the stack def push(self, val: int) -> None: self.stack.append(val) # If min_stack is empty, the min is `val`. # Otherwise, it's the min between `val` and the last min. new_min = val if not self.min_stack else min(val, self.min_stack[-1]) self.min_stack.append(new_min) def pop(self) -> None: self.stack.pop() self.min_stack.pop() # Pop from both stacks in sync def top(self) -> int: return self.stack[-1] def getMin(self) -> int: return self.min_stack[-1] # Constant time lookup # Your MinStack object will be instantiated and called as such: # obj = MinStack() # obj.push(5) # obj.push(2) # obj.pop() # param_3 = obj.top() # param_4 = obj.getMin()
These questions test fundamental concepts that are highly valued at companies like Microsoft. Practice understanding the "why" behind each solution, not just the code. Good luck
1. Two Sum (Unique Values)
Question: Given an array of integers and unique values, find if the sum of any two integers is equal to a given value.
Explanation: Use a hash set (or dictionary) to store visited numbers. For each number num in the array, calculate the complement (target - num). If the complement exists in the set, we found a pair. Since values are unique, we don't have to worry about duplicates.
Sample Code:
def two_sum_exists(nums, target): seen = set() for num in nums: complement = target - num if complement in seen: return True seen.add(num) return False # Example nums = [1, 4, 5, 7, 9] target = 12 print(two_sum_exists(nums, target)) # Output: True (5 + 7 = 12)
2. Set Matrix Zeroes
Question: Given a 2D array, make the entire row and column zero if any element is zero.
Explanation: Instead of using the matrix itself for storage (which can cause cascading zeros), use two sets to track which rows and columns need to be zeroed. First pass: record zero locations. Second pass: update the matrix.
Sample Code:
def set_zeroes(matrix): rows, cols = len(matrix), len(matrix[0]) zero_rows, zero_cols = set(), set() # First pass: find all zeros for i in range(rows): for j in range(cols): if matrix[i][j] == 0: zero_rows.add(i) zero_cols.add(j) # Second pass: set rows and columns to zero for i in range(rows): for j in range(cols): if i in zero_rows or j in zero_cols: matrix[i][j] = 0 # Example matrix = [ [1, 2, 3], [4, 0, 6], [7, 8, 9] ] set_zeroes(matrix) print(matrix) # Output: # [[1, 0, 3], # [0, 0, 0], # [7, 0, 9]]
3. Add Two Numbers (Linked Lists)
Question: Add two linked lists where each node contains a digit (numbers are stored in reverse order).
Explanation: Traverse both lists, add corresponding digits along with a carry. Create a new node for the sum % 10 and update the carry to sum // 10.
Sample Code:
class ListNode: def __init__(self, val=0, next=None): self.val = val self.next = next def add_two_numbers(l1, l2): dummy = ListNode() current = dummy carry = 0 while l1 or l2 or carry: sum_val = carry if l1: sum_val += l1.val l1 = l1.next if l2: sum_val += l2.val l2 = l2.next carry = sum_val // 10 current.next = ListNode(sum_val % 10) current = current.next return dummy.next # Example: 342 + 465 = 807 # l1: 2 -> 4 -> 3 # l2: 5 -> 6 -> 4 # Result: 7 -> 0 -> 8
4. Binary Tree Level Order Traversal
Question: Print node values at every level.
Explanation: Use BFS with a queue. Process each level by tracking the queue size.
Sample Code:
from collections import deque class TreeNode: def __init__(self, val=0, left=None, right=None): self.val = val self.left = left self.right = right def level_order(root): if not root: return [] result = [] queue = deque([root]) while queue: level_size = len(queue) current_level = [] for _ in range(level_size): node = queue.popleft() current_level.append(node.val) if node.left: queue.append(node.left) if node.right: queue.append(node.right) result.append(current_level) return result # Example: # 3 # / \ # 9 20 # / \ # 15 7 # Output: [[3], [9,20], [15,7]]
5. Correct BST with Two Nodes Swapped
Question: Fix a BST where two nodes are swapped.
Explanation: In-order traversal of BST should be sorted. Find two nodes where the order is violated: first anomaly is larger than next, second anomaly is smaller than previous.
Sample Code:
def recover_bst(root): stack = [] prev = first = middle = last = None current = root while current or stack: while current: stack.append(current) current = current.left current = stack.pop() if prev and prev.val > current.val: if not first: first = prev middle = current else: last = current prev = current current = current.right if first and last: first.val, last.val = last.val, first.val elif first and middle: first.val, middle.val = middle.val, first.val # Example: # Original BST: [3,1,4,null,null,2] # Inorder: [1,3,2,4] -> should be [1,2,3,4] # Swap nodes 3 and 2
6. Search in Rotated Sorted Array
Question: Find an element in a rotated sorted array.
Explanation: Use modified binary search. Find which half is sorted and check if target is in that sorted half.
Sample Code:
def search_rotated(nums, target): left, right = 0, len(nums) - 1 while left <= right: mid = (left + right) // 2 if nums[mid] == target: return mid # Left half is sorted if nums[left] <= nums[mid]: if nums[left] <= target < nums[mid]: right = mid - 1 else: left = mid + 1 # Right half is sorted else: if nums[mid] < target <= nums[right]: left = mid + 1 else: right = mid - 1 return -1 # Example: nums = [4,5,6,7,0,1,2], target = 0 # Output: 4
7. Validate BST
Question: Check if a binary tree is a valid BST.
Explanation: Use DFS with valid range (min, max) for each node.
Sample Code:
def is_valid_bst(root, low=float('-inf'), high=float('inf')): if not root: return True if not (low < root.val < high): return False return (is_valid_bst(root.left, low, root.val) and is_valid_bst(root.right, root.val, high)) # Example: # 2 # / \ # 1 3 -> Valid # # 5 # / \ # 1 4 # / \ # 3 6 -> Invalid (4 is not > 5)
8. Palindrome Permutation
Question: Check if any permutation of a string is a palindrome.
Explanation: For a string to have a palindromic permutation, at most one character can have an odd frequency.
Sample Code:
from collections import Counter def can_form_palindrome(s): freq = Counter(s) odd_count = 0 for count in freq.values(): if count % 2 != 0: odd_count += 1 if odd_count > 1: return False return True # Example: s = "code" -> False s = "aab" -> True ("aba")
9. Balanced Binary Tree
Question: Check if a binary tree is balanced (height difference ≤ 1 for all nodes).
Explanation: Use DFS to compute height. If any subtree is unbalanced, return -1.
Sample Code:
def is_balanced(root): def check_height(node): if not node: return 0 left_height = check_height(node.left) if left_height == -1: return -1 right_height = check_height(node.right) if right_height == -1 or abs(left_height - right_height) > 1: return -1 return max(left_height, right_height) + 1 return check_height(root) != -1 # Example: # Balanced: # 1 # / \ # 2 3 # # Unbalanced: # 1 # / # 2 # / # 3
10. Reverse Words in a String
Question: Reverse the order of characters in a string (word by word).
Explanation: Split the string into words, reverse the list, and join.
Sample Code:
def reverse_words(s): words = s.split() return ' '.join(words[::-1]) # Example: s = "the sky is blue" print(reverse_words(s)) # "blue is sky the"
11. Longest Palindromic Substring (Non-letter? Assuming alphanumeric)
Question: Find the longest palindromic substring.
Explanation: Expand around center for each character. Handle odd and even length palindromes.
Sample Code:
def longest_palindrome(s): def expand(l, r): while l >= 0 and r < len(s) and s[l] == s[r]: l -= 1 r += 1 return s[l+1:r] res = "" for i in range(len(s)): odd = expand(i, i) even = expand(i, i+1) res = max(res, odd, even, key=len) return res # Example: s = "babad" print(longest_palindrome(s)) # "bab" or "aba"
12. Best Time to Buy and Sell Stock
Question: Find maximum profit from buy and sell prices.
Explanation: Track the minimum price and update maximum profit.
Sample Code:
def max_profit(prices): min_price = float('inf') max_profit = 0 for price in prices: if price < min_price: min_price = price elif price - min_price > max_profit: max_profit = price - min_price return max_profit # Example: prices = [7,1,5,3,6,4] print(max_profit(prices)) # 5 (buy at 1, sell at 6)
13. Find Missing Number in Unsorted Array
Question: Find missing number in unsorted array of 1 to n.
Explanation: Use the sum formula: expected sum = n*(n+1)/2. Subtract actual sum.
Sample Code:
def missing_number(nums): n = len(nums) + 1 # Since one number is missing expected_sum = n * (n + 1) // 2 actual_sum = sum(nums) return expected_sum - actual_sum # Example: nums = [3, 7, 1, 2, 8, 4, 5] # n=8, missing 6 print(missing_number(nums)) # 6
14. Validate IP Address
Question: Validate IPv4 or IPv6 address.
Explanation: Split by '.' or ':' and validate each part.
Sample Code:
def valid_ip_address(ip): if '.' in ip: parts = ip.split('.') if len(parts) != 4: return "Neither" for part in parts: if not part.isdigit() or not (0 <= int(part) <= 255) or (len(part) > 1 and part[0] == '0'): return "Neither" return "IPv4" elif ':' in ip: parts = ip.split(':') if len(parts) != 8: return "Neither" for part in parts: if not (1 <= len(part) <= 4) or not all(c in '0123456789abcdefABCDEF' for c in part): return "Neither" return "IPv6" return "Neither" # Example: ip = "192.168.1.1" print(valid_ip_address(ip)) # IPv4
15. Clone Linked List with Random Pointer
Question: Clone a linked list with next and random pointers.
Explanation: Use a dictionary to map original nodes to new nodes. Two passes: first create nodes, then set pointers.
Sample Code:
class Node: def __init__(self, x, next=None, random=None): self.val = x self.next = next self.random = random def copy_random_list(head): if not head: return None mapping = {} current = head while current: mapping[current] = Node(current.val) current = current.next current = head while current: if current.next: mapping[current].next = mapping[current.next] if current.random: mapping[current].random = mapping[current.random] current = current.next return mapping[head]
16. LCA of BST
Question: Find lowest common ancestor in BST.
Explanation: Use BST property: LCA is the node where p and q split.
Sample Code:
def lowest_common_ancestor(root, p, q): while root: if p.val < root.val and q.val < root.val: root = root.left elif p.val > root.val and q.val > root.val: root = root.right else: return root # Example: # 6 # / \ # 2 8 # / \ / \ # 0 4 7 9 # LCA(2,4) = 2
17. Combination Sum (Digits)
Question: Find all combinations of digits in an integer that add to target.
Explanation: Use DFS/backtracking. Convert integer to string to access digits.
Sample Code:
def combination_sum(num, target): s = str(num) n = len(s) res = [] def backtrack(start, path, total): if total == target: res.append(path) return for i in range(start, n): digit = int(s[i]) if total + digit <= target: backtrack(i+1, path + [digit], total + digit) backtrack(0, [], 0) return res # Example: num = 123, target = 3 print(combination_sum(num, target)) # [[1,2], [3]]
18. Pattern Matching (Wildcard)
Question: Check if pattern matches text (with '*' and '?').
Explanation: Use dynamic programming. dp[i][j] is True if first i chars in text match first j chars in pattern.
Sample Code:
def is_match(text, pattern): m, n = len(text), len(pattern) dp = [[False] * (n+1) for _ in range(m+1)] dp[0][0] = True for j in range(1, n+1): if pattern[j-1] == '*': dp[0][j] = dp[0][j-1] for i in range(1, m+1): for j in range(1, n+1): if pattern[j-1] == '*': dp[i][j] = dp[i][j-1] or dp[i-1][j] elif pattern[j-1] == '?' or text[i-1] == pattern[j-1]: dp[i][j] = dp[i-1][j-1] return dp[m][n] # Example: text = "adceb", pattern = "*a*b" print(is_match(text, pattern)) # True
19. Clone Graph
Question: Clone a graph with nodes having neighbors.
Explanation: Use DFS and a dictionary to map original nodes to clones.
Sample Code:
class Node: def __init__(self, val=0, neighbors=None): self.val = val self.neighbors = neighbors if neighbors is not None else [] def clone_graph(node): if not node: return None mapping = {} def dfs(original): if original in mapping: return mapping[original] clone = Node(original.val) mapping[original] = clone for neighbor in original.neighbors: clone.neighbors.append(dfs(neighbor)) return clone return dfs(node)
20. Vertical Traversal of Binary Tree
Question: Print vertical traversal from leftmost to rightmost.
Explanation: Use BFS with column tracking. Use a dictionary to group nodes by column.
Sample Code:
from collections import deque, defaultdict def vertical_traversal(root): if not root: return [] column_table = defaultdict(list) queue = deque([(root, 0)]) while queue: node, col = queue.popleft() column_table[col].append(node.val) if node.left: queue.append((node.left, col-1)) if node.right: queue.append((node.right, col+1)) min_col = min(column_table.keys()) max_col = max(column_table.keys()) return [column_table[col] for col in range(min_col, max_col+1)] # Example: # 1 # / \ # 2 3 # / \ / \ # 4 5 6 7 # Vertical: [[4], [2], [1,5,6], [3], [7]]
21. Longest Substring Without Repeating Characters
Question: Find longest substring with distinct characters.
Explanation: Use sliding window with a set. Expand right, if duplicate found, shrink left.
Sample Code:
def length_of_longest_substring(s): char_set = set() left = 0 max_len = 0 for right in range(len(s)): while s[right] in char_set: char_set.remove(s[left]) left += 1 char_set.add(s[right]) max_len = max(max_len, right - left + 1) return max_len # Example: s = "abcabcbb" print(length_of_longest_substring(s)) # 3 ("abc")
22. Find Missing Element in Linked List
Question: Find missing element in linked list (assuming sorted and consecutive?).
Explanation: If linked list is sorted and consecutive, traverse and find gap.
Sample Code:
def find_missing_in_linked_list(head): current = head while current and current.next: if current.next.val != current.val + 1: return current.val + 1 current = current.next return None # No missing # Example: 1->2->4->5, missing 3
23. Middle of Linked List
Question: Find middle element of linked list.
Explanation: Use slow and fast pointers. When fast reaches end, slow is at middle.
Sample Code:
def middle_node(head): slow = fast = head while fast and fast.next: slow = slow.next fast = fast.next.next return slow # Example: 1->2->3->4->5, middle is 3 # Example: 1->2->3->4, middle is 3 (second middle)
24. Shortest Distance for All People to Meet
Question: Grid with people, find meeting point with min total distance.
Explanation: Best meeting point is median of x-coordinates and median of y-coordinates.
Sample Code:
def min_total_distance(grid): rows, cols = [], [] for i in range(len(grid)): for j in range(len(grid[0])): if grid[i][j] == 1: rows.append(i) cols.append(j) cols.sort() mid_row = rows[len(rows)//2] mid_col = cols[len(cols)//2] total = 0 for i in rows: total += abs(i - mid_row) for j in cols: total += abs(j - mid_col) return total # Example: grid = [ [1,0,0,0,1], [0,0,0,0,0], [0,0,1,0,0] ] print(min_total_distance(grid)) # 6
25. Merge Two Sorted Linked Lists
Question: Merge two sorted linked lists.
Explanation: Use dummy node. Compare nodes and attach the smaller one.
Sample Code:
def merge_two_lists(l1, l2): dummy = ListNode() current = dummy while l1 and l2: if l1.val < l2.val: current.next = l1 l1 = l1.next else: current.next = l2 l2 = l2.next current = current.next current.next = l1 if l1 else l2 return dummy.next # Example: # l1: 1->2->4, l2: 1->3->4 # Merged: 1->1->2->3->4->4
Of course. Designing systems is a key part of senior engineering interviews at companies like Microsoft. Here are detailed designs for the problems you've presented, following a structured approach.
1. Design an In-Memory Database System
Core Functionality: Fast read/write operations entirely in RAM, with optional persistence.
Key Components:
Data Store: A concurrent dictionary/hash map for O(1) access. Key → Value.
Persistence Engine:
Snapshotting: Periodically dump the entire dataset to a file (e.g., JSON, binary).
Write-Ahead Log (WAL): Append every write operation (SET, DELETE) to a log file. On restart, replay the log to reconstruct state.
Concurrency Control: Use fine-grained locking (e.g., per key or shard) or software transactional memory (STM) to allow concurrent operations.
Sample Code Skeleton:
import threading import json class InMemoryDB: def __init__(self, snapshot_file='snapshot.json', wal_file='wal.log'): self._store = {} self._lock = threading.Lock() # Coarse-grained lock for simplicity self.snapshot_file = snapshot_file self.wal_file = wal_file self._load_snapshot() # Replaying the WAL would happen here to get to the latest state def _load_snapshot(self): try: with open(self.snapshot_file, 'r') as f: self._store = json.load(f) except FileNotFoundError: self._store = {} def _write_to_wal(self, command, key, value=None): with open(self.wal_file, 'a') as f: log_entry = f"{command}|{key}|{value}\n" if value else f"{command}|{key}\n" f.write(log_entry) def set(self, key, value): with self._lock: self._store[key] = value self._write_to_wal('SET', key, value) def get(self, key): with self._lock: return self._store.get(key) def delete(self, key): with self._lock: if key in self._store: del self._store[key] self._write_to_wal('DELETE', key) def create_snapshot(self): with self._lock: with open(self.snapshot_file, 'w') as f: json.dump(self._store, f) # Can truncate the WAL after a successful snapshot
2. Design an IP Blocking System
Core Functionality: Quickly check if an IP address is blocked.
Key Components:
Storage: Use a Bloom Filter for memory-efficient, fast checks. It can give false positives (says an IP is blocked when it's not) but never false negatives. For exact checks, a Redis or Memcached set can be used as a second layer.
Blocking Rules: Support for exact IPs, subnets (e.g.,
192.168.1.0/24), and wildcards. Store subnets in a specialized structure like a Trie (Radix Tree) for IPs.API:
block(ip_address_or_subnet)unblock(ip_address_or_subnet)is_blocked(ip_address) -> Boolean
Sample High-Level Design:
from pybloom_live import ScalableBloomFilter # Requires pybloom package import ipaddress class IPBlocker: def __init__(self): # Scalable Bloom Filter handles growing data self._bloom = ScalableBloomFilter() # For exact matching and complex rules self._blocked_ips = set() self._blocked_subnets = [] # List of ipaddress.IPv4Network objects def block_ip(self, ip_str): try: ip_obj = ipaddress.ip_address(ip_str) self._bloom.add(ip_str) self._blocked_ips.add(ip_str) except ValueError: # Assume it's a subnet net_obj = ipaddress.ip_network(ip_str, strict=False) self._blocked_subnets.append(net_obj) # Add all IPs in a large subnet to Bloom Filter? Not efficient. # Rely on checking subnets separately for accuracy. def is_blocked(self, ip_str): ip_obj = ipaddress.ip_address(ip_str) # 1. Fast Bloom Filter check for exact IPs if ip_str in self._bloom: return True # 2. Check if IP belongs to any blocked subnet for blocked_net in self._blocked_subnets: if ip_obj in blocked_net: return True return False # Usage blocker = IPBlocker() blocker.block_ip("192.168.1.5") blocker.block_ip("10.0.0.0/8") print(blocker.is_blocked("192.168.1.5")) # True print(blocker.is_blocked("10.0.1.2")) # True print(blocker.is_blocked("8.8.8.8")) # False
3. Design a Voting System
Core Functionality: Cast votes and tally results accurately and securely.
Key Components:
Idempotency: A voter can only vote once. Use a persistent set (e.g., in Redis or DB) of
(voter_id, campaign_id)to check duplicates.Data Model:
Candidates(candidate_id, name, campaign_id)Votes(vote_id, candidate_id, voter_id, timestamp)
Tallying: Use a distributed counter. For a real-time tally, increment counters in Redis/Memcached for each candidate when a vote is cast.
INCR candidate:<id>:votes
Sample Architecture:
Request Flow:
User submits vote
{voter_id: "user123", candidate_id: 55}.API checks if
user123has already voted in this campaign.If not, it atomically increments the counter for candidate
55.It records the
(voter_id, candidate_id)in the database to prevent duplicate votes.
Getting Results: Simply
GET candidate:<id>:votesfor each candidate.
4. Design a System for Top 100 Products
Core Functionality: Continuously calculate the top 100 selling products in a category for a time window (e.g., last 24 hours).
Key Components:
Data Ingestion: A stream of sales events
(product_id, category, quantity, timestamp).Stream Processing: Use a framework like Apache Flink or Spark Streaming.
Sliding Window: Aggregate sales per product within the window.
Ranking: Maintain a Top-K Heap for each category. A min-heap of size 100 tracks the top sellers. If a new product's sales exceed the heap's minimum, pop the min and push the new product.
Sample Pseudocode:
from collections import defaultdict import heapq # In the streaming job for a specific category category_top_k = defaultdict(lambda: MinHeap(100)) # One heap per category def process_sale_event(event): product_id = event['product_id'] category = event['category'] quantity = event['quantity'] # Update the total sales for this product in the window product_sales[category][product_id] += quantity # Get the current sales count for this product current_sales = product_sales[category][product_id] # Update the Top-K heap for the category heap = category_top_k[category] if len(heap) < 100: heapq.heappush(heap, (current_sales, product_id)) else: # If sales are greater than the smallest top seller, replace it min_sales, min_id = heap[0] if current_sales > min_sales: heapq.heapreplace(heap, (current_sales, product_id)) # The heap now contains the top 100 products for the category
5. Design an Online Bookstore
Core Functionality: View products, prices, and make purchases.
Key Components (Microservices):
Catalog Service: Manages book information. Uses a SQL DB (PostgreSQL).
GET /books?category=Sci-FiInventory Service: Tracks stock levels. Uses a fast cache (Redis) for checks.
GET /inventory/{book_id}Shopping Cart Service: Manages user carts. Stored in Redis for speed and ephemeral data.
Order Service: Processes purchases. Uses a SQL DB for transactional safety. Publishes an
order_createdevent.Payment Service: Listens to
order_createdevents and handles payment gateway integration.Recommendation Service: Suggests books based on user history.
Sample Cart Service Code:
# shopping_cart_service.py (Using Flask & Redis) import redis from flask import Flask, request, jsonify app = Flask(__name__) redis_client = redis.Redis(host='localhost', port=6379, decode_responses=True) @app.route('/cart/<user_id>', methods=['GET']) def get_cart(user_id): cart = redis_client.hgetall(f'cart:{user_id}') return jsonify(cart) @app.route('/cart/<user_id>/add/<book_id>', methods=['POST']) def add_to_cart(user_id, book_id): quantity = request.json.get('quantity', 1) # Check inventory service first (HTTP call) # ... redis_client.hincrby(f'cart:{user_id}', book_id, quantity) return jsonify({"status": "added"})
6. Design a People Counter for a Building
Core Functionality: Track the number of people on each floor in real-time.
Key Components:
Sensors: Entry/exit sensors (e.g., IoT door counters) on each floor publish events to a message broker (e.g., MQTT or Kafka).
Event:
{sensor_id: "floor1_entry", type: "in", timestamp: 12345}
Stream Processor: A simple stateful service consumes the event stream.
State per Floor: Maintain a counter for each floor. An
inevent increments the counter, anoutevent decrements it.Dashboard: A web service queries the current value of these counters from the stream processor or a database like Redis.
Sample Stream Processing Logic:
# Using a Kafka consumer from pykafka import KafkaClient import redis redis_client = redis.Redis() client = KafkaClient(hosts="localhost:9092") topic = client.topics['sensor-events'] consumer = topic.get_simple_consumer() for message in consumer: if message is not None: event = json.loads(message.value.decode()) floor = event['sensor_id'].split('_')[0] # e.g., extract "floor1" change = 1 if event['type'] == 'in' else -1 redis_client.incrby(f'occupancy:{floor}', change) # Dashboard reads from Redis: occupancy:floor1, occupancy:floor2, etc.
7. Design an Image Tagging and Search System
Core Functionality: Upload images with tags, search images by tags.
Key Components:
Storage:
Object Store (S3): Store the actual image files.
Metadata DB (PostgreSQL): Store
image_id, user_id, S3_url, upload_date.
Tagging Index (Inverted Index):
Table:
Tags(tag_name, image_id)with an index ontag_name.OR Search:
SELECT image_id FROM Tags WHERE tag_name IN ('mountain', 'lake')AND Search:
SELECT image_id FROM Tags WHERE tag_name = 'mountain' INTERSECT SELECT image_id FROM Tags WHERE tag_name = 'lake'
API:
POST /upload(with image file and tags)GET /search?tags=beach,sunset(returns matching image metadata)
8. Design a Notification Service
Core Functionality: Send notifications (email, SMS, push) to many users/devices.
Key Components:
Notification Request:
{user_id, message_template, context_data, channels: ['email', 'push']}User Preferences Service: Checks if the user wants to be notified via the requested channel.
Message Queue (RabbitMQ): Decouples the request from sending. A high-priority queue for urgent messages.
Worker Pool: Consumers that take jobs from the queue.
Channel Connectors: Separate services or SDKs for Email (e.g., SendGrid), SMS (e.g., Twilio), Push (e.g., Firebase FCM).
Sample Architecture:
The main API receives a request, validates it, and places a message on the queue.
Email workers consume messages, render the HTML template using the context data, and send it via the Email connector.
9. Design a Pausable Workflow System
Core Functionality: A long-running workflow that can be paused and resumed.
Key Components:
Workflow Engine: Use an existing framework like Temporal or Airflow. They handle persistence, timers, and state.
State Persistence: The engine automatically persists the state of each workflow (including local variables) to a DB. This is called checkpointing.
Pause/Continue: Expose a signal API to the workflow. When a "pause" signal is received, the current activity finishes, and the workflow does not schedule the next one until a "continue" signal is received.
Workflow Definition:
python# Pseudocode for a Temporal workflow def my_workflow(params): result1 = yield workflow.execute_activity(activity1, params) # Checkpoint: state is saved here # Wait for a pause/continue signal yield workflow.wait_for_signal('continue_signal') result2 = yield workflow.execute_activity(activity2, result1) return result2
10. Design a Low-Latency Scheduler Service
Core Functionality: Manage millions of future tasks (schedules) and execute them on time with minimal delay.
Key Components:
Time-Ordered Data Structure: Store tasks in a Min-Heap prioritized by execution time. Not scalable for millions in a single heap.
Scalable Architecture (Sharded Timer Wheel):
Sharding: Partition schedules by time or a unique ID. Each shard has its own heap or wheel.
Priority Queues in DB: Use a database with sorted sets (Redis ZSET). Score = execution timestamp.
ZRANGEBYSCORE tasks_queue <current_timestamp> <current_timestamp>to get due tasks.
Polling Workers: Multiple workers constantly poll the database for due tasks. To reduce latency, use a notification mechanism where the writer notifies a worker when a new task is added that is the next to execute.
Deduplication: Use idempotent task IDs to prevent duplicate execution.
Sample Redis-Based Design:
Adding a Task:
ZADD tasks_queue <execution_timestamp> <task_id>. Store task details in a separate hashtask:<task_id>.Polling Worker Loop:
pythonwhile True: current_time = time.time() # Get all tasks whose score (timestamp) is <= current_time due_task_ids = redis_client.zrangebyscore('tasks_queue', 0, current_time) for task_id in due_task_ids: task_data = redis_client.hgetall(f'task:{task_id}') execute_task(task_data) # Remove from the queue after execution redis_client.zrem('tasks_queue', task_id) time.sleep(0.1) # Poll every 100ms for low latency
No comments:
Post a Comment
Note: Only a member of this blog may post a comment.