A special section on trees/heaps/graphs because these are the data structures I have used the least professionally. Find the introduction on data structures here.

## From graphs..

Graphs can be implemented in three ways says Google/get that job at Google. These are objects and pointers, adjacency list and adjacency matrix, very nicely explained here. In this Leetcode question adjacency list is used backed up by a 2D array but it could have been a `Hashmap<Vertex, List<Vertex>`

, a `List<List<Integer>>`

(ok same as a 2D array) etc. Note that adjacency list as is is not appropriate for weighted graphs.

Method | Suitable for |
---|---|

Pointers | easy to recurse, (un)weighted |

Matrix | dense graphs, (un)directed, (un)weighted |

List | sparse graphs, unweighted |

For reference, there are other ways too, e.g. edge list (it is what the name suggests: a list of edges, the 3rd way listed here, incidence matrix? or this pointer-hell but you are not likely to encounter those.

### Analysis

Analysis is done based on edges and vertices (instead of just n).

Operation | Add/delete edge | Are i-j connected | Find path i-j | Find successors of i |
---|---|---|---|---|

Matrix | O(1) | O(1) | O(n^2) | O(n) |

List of vertices to vertices | logn | logn | O(V+E) for DFS | O(k): k being the list of successors |

Pointers: easy to recurse, can support weights. Space analysis: for sparse graphs

TODO complete with Nutshell and wiki

## ..to trees

Trees are special cases of graphs: an undirected graph that is connected but has no cycles 2. Terminology on trees is not universally agreed upon. Here I follow 2's names.

We are mostly interested in binary trees: formally, a binary tree is either empty or a root with a left and right binary tree. Less formally, each node has at most two children (a special case of k-ary tree for k=2).

- Depth of root is 0. Height of the tree is the maximum depth of any node. A level consists of nodes on the same depth. A leaf is a node with no children (or descendants).
- Every node has a unique parent (except the root). The path to a node from the root is unique as well.
- Full binary tree: every node beside the leaves has 2 children.
- Perfect: a full tree in which all leaves are at the same depth
- Complete: every level except possibly the last is completely filled (so every node beside the leaves has 2 children=full tree??) and all nodes are as far left as possible.
- Balanced: a tree which for every node, the height of the left and right subtrees differ at most by 1. A perfect or complete tree is balanced but not vice versa. The height of a balanced binary tree is logn (which is also the minimum height a tree can take)

Contains reference to parent or not?

## ..to special trees

- Heap: a complete binary tree that fulfils the heap property: every node is >= then its children. Due to the complete-ness, it can be implemented as an array (left child if it exists is 2n and right 2n+1).
- Binary
*search*tree: a binary tree (not complete!) where every node is greater than or equal to the left subtree (not just child) and the less than or equal to the right subtree. - B-tree: a BST with more than 2 children
- Red-black and AVL: self-balanced BSTs
- Balancing a tree is probably off limits for an interview
^{1}but checking for balance is a legit problem.

- Balancing a tree is probably off limits for an interview
- Trie: a n-ary tree that stores characters
- Suffix tree is a compressed trie
- Check solution of 18.8 on how to construct one.
- Video on how to use one.

- Suffix tree is a compressed trie

## Traversals

- BFS: uses a queue, iteratively or recursively (to print per layer)
- Good for optimization problems (find shortest path) or you are ready to quit if you go far from the root.
- Dijkstra is like BFS for a weighted graph.
- In fact it can be implemented with a priority queue BFS.

- DFS: 3 types depending by when the root/current node is processed
- In-order: left tree - node - right tree
- Pre-order: node - left - right, this is what people refer to usually when they say DFS.
- Post-order: left - right - node
- Recursive implementations are the easiest

### Graphs in real Life

- Garbage collection algorithms (mark 'n' sweep) do a graph traversal on the mark part.
- Deadlock detection see 18.4 in 2

### Just for reference

- Dijkstra and Bellman-Ford are
**single**source shortest path- A* is like Dijkstra with heuristics (when you have more information about the graph)
- An (admissible) heuristic can be the Hamming or Manhattan distance of the graph.

- A* is like Dijkstra with heuristics (when you have more information about the graph)
- Floyd-Warshall is to find shortest paths for
**all pairs**of vertices. - Prim's and Kruskal's are for constructing a MST (minimum spanning tree)
- The difference is that with a shortest path algorithm you might end up with something that is not a tree (e.g. has cycles)

## Groups of problems

Easy:

- You are given a tree or graph structure: check balance (4.1 in 1), connectivity of nodes (4.2 in 1), check BST property (4.5 in 1), sum (leetcode 112 or Ctci 4.9),
- Construct a tree from data structure e.g. 4.3 in 1

Hard:

- Model a problem as a tree/graph e.g. the map in 18.10 1 is an adjacency list (Vertices are strings, edges are the key-value association (with direction))

## Resources

- Cracking the Coding interview
^{1}and the accompanying repo - Elements of Programming Interviews in Java
^{2}and the repo - Algorithms in a Nutshell