A special section on trees/heaps/graphs because these are the data structures I have used the least professionally. Find the introduction on data structures here.
Graphs can be implemented in three ways says Google/get that job at Google. These are objects and pointers, adjacency list and adjacency matrix, very nicely explained here. In this Leetcode question adjacency list is used backed up by a 2D array but it could have been a
Hashmap<Vertex, List<Vertex>, a
List<List<Integer>> (ok same as a 2D array) etc. Note that adjacency list as is is not appropriate for weighted graphs.
|Pointers||easy to recurse, (un)weighted|
|Matrix||dense graphs, (un)directed, (un)weighted|
|List||sparse graphs, unweighted|
For reference, there are other ways too, e.g. edge list (it is what the name suggests: a list of edges, the 3rd way listed here, incidence matrix? or this pointer-hell but you are not likely to encounter those.
Analysis is done based on edges and vertices (instead of just n).
|Operation||Add/delete edge||Are i-j connected||Find path i-j||Find successors of i|
|List of vertices to vertices||logn||logn||O(V+E) for DFS||O(k): k being the list of successors|
Pointers: easy to recurse, can support weights. Space analysis: for sparse graphs
TODO complete with Nutshell and wiki
We are mostly interested in binary trees: formally, a binary tree is either empty or a root with a left and right binary tree. Less formally, each node has at most two children (a special case of k-ary tree for k=2).
- Depth of root is 0. Height of the tree is the maximum depth of any node. A level consists of nodes on the same depth. A leaf is a node with no children (or descendants).
- Every node has a unique parent (except the root). The path to a node from the root is unique as well.
- Full binary tree: every node beside the leaves has 2 children.
- Perfect: a full tree in which all leaves are at the same depth
- Complete: every level except possibly the last is completely filled (so every node beside the leaves has 2 children=full tree??) and all nodes are as far left as possible.
- Balanced: a tree which for every node, the height of the left and right subtrees differ at most by 1. A perfect or complete tree is balanced but not vice versa. The height of a balanced binary tree is logn (which is also the minimum height a tree can take)
Contains reference to parent or not?
..to special trees
- Heap: a complete binary tree that fulfils the heap property: every node is >= then its children. Due to the complete-ness, it can be implemented as an array (left child if it exists is 2n and right 2n+1).
- Binary search tree: a binary tree (not complete!) where every node is greater than or equal to the left subtree (not just child) and the less than or equal to the right subtree.
- B-tree: a BST with more than 2 children
- Red-black and AVL: self-balanced BSTs
- Balancing a tree is probably off limits for an interview1 but checking for balance is a legit problem.
- Trie: a n-ary tree that stores characters
- BFS: uses a queue, iteratively or recursively (to print per layer)
- Good for optimization problems (find shortest path) or you are ready to quit if you go far from the root.
- Dijkstra is like BFS for a weighted graph.
- In fact it can be implemented with a priority queue BFS.
- DFS: 3 types depending by when the root/current node is processed
- In-order: left tree - node - right tree
- Pre-order: node - left - right, this is what people refer to usually when they say DFS.
- Post-order: left - right - node
- Recursive implementations are the easiest
Graphs in real Life
- Garbage collection algorithms (mark 'n' sweep) do a graph traversal on the mark part.
- Deadlock detection see 18.4 in 2
Just for reference
- Dijkstra and Bellman-Ford are single source shortest path
- Floyd-Warshall is to find shortest paths for all pairs of vertices.
- Prim's and Kruskal's are for constructing a MST (minimum spanning tree)
- The difference is that with a shortest path algorithm you might end up with something that is not a tree (e.g. has cycles)
Groups of problems
- You are given a tree or graph structure: check balance (4.1 in 1), connectivity of nodes (4.2 in 1), check BST property (4.5 in 1), sum (leetcode 112 or Ctci 4.9),
- Construct a tree from data structure e.g. 4.3 in 1
- Model a problem as a tree/graph e.g. the map in 18.10 1 is an adjacency list (Vertices are strings, edges are the key-value association (with direction))