# 14 numbers every developer should know

Jeff Dean, a famous Google engineer, popularized a list of latency numbers everyone should know. The list is a great resource for designing large-scale infrastructure systems.

Algorithms and their complexity often occur in critical parts of computer systems, but I find that few engineers have a good understanding of how an O(n!) algorithm compares to an O(n^{5}) one.
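To make the gap concrete, here is a quick comparison of the two growth rates, assuming a hypothetical machine that does about 10^8 simple operations per second:

```python
from math import factorial

# Rough comparison of O(n^5) vs O(n!) work, assuming ~10^8 ops/sec.
# The machine speed is an assumption for illustration, not a measurement.
for n in (10, 15, 20):
    poly = n ** 5        # operations for an O(n^5) algorithm
    fact = factorial(n)  # operations for an O(n!) algorithm
    print(f"n={n}: n^5 = {poly:,} (~{poly / 1e8:.6f}s), "
          f"n! = {fact:,} (~{fact / 1e8:.1f}s)")
```

Already at n = 20, the O(n^{5}) algorithm finishes in well under a second while the O(n!) one would run for centuries.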

In the coding contest world, competitors think about these tradeoffs all the time. No wonder there's a set of numbers every algorithm designer should know.

The table below shows the limits that can be reached in a few seconds by algorithms of different complexities, n being the input size. I've added some algorithms and data structure examples for each complexity class.

maximum n | complexity | algorithms | data structures |
---|---|---|---|
1,000,000,000 and higher | log n, sqrt n | binary search, ternary search, fast exponentiation, Euclid's algorithm | |
10,000,000 | n, n log log n, n log* n | set intersection, sieve of Eratosthenes, radix sort, KMP, topological sort, Euler tour, strongly connected components, 2-SAT | disjoint sets, tries, hash_map, rolling hash, deque |
1,000,000 | n log n | sorting, divide and conquer, sweep line, Kruskal, Dijkstra | segment trees, range trees, heaps, treaps, binary indexed trees, suffix arrays |
100,000 | n log^{2} n | divide and conquer | 2D range trees |
50,000 | n^{1.585}, n sqrt n | Karatsuba, square root trick | two-level tree |
1,000 - 10,000 | n^{2} | largest empty rectangle, Dijkstra, Prim (on dense graphs) | |
300 - 500 | n^{3} | all pairs shortest paths, largest sum submatrix, naive matrix multiplication, matrix chain multiplication, Gaussian elimination, network flow | |
30 - 50 | n^{4}, n^{5}, n^{6} | | |
25 - 40 | 3^{n/2}, 2^{n/2} | meet in the middle | hash tables (for set intersection) |
15 - 24 | 2^{n} | subset enumeration, brute force, dynamic programming with exponential states | |
15 - 20 | n^{2} 2^{n} | dynamic programming with exponential states | bitsets, hash_map |
13 - 17 | 3^{n} | dynamic programming with exponential states | hash_map (to store the states) |
11 | n! | brute force, backtracking, next_permutation | |
8 | n^{n} | brute force, cartesian product | |
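You can re-derive the limits in this table yourself. The sketch below searches for the largest n whose estimated operation count fits a time budget; the budget (~10^8 simple operations per second over a few seconds) is an assumption for illustration, and real constant factors will shift the results up or down:

```python
import math

BUDGET_OPS = 4 * 10**8  # assumption: ~10^8 simple ops/sec, a 4-second budget

def max_n(cost):
    """Largest n whose estimated op count fits BUDGET_OPS.

    Doubles an upper bound, then binary searches; assumes cost() is increasing.
    """
    hi = 1
    while cost(hi * 2) <= BUDGET_OPS:
        hi *= 2
    lo, hi = hi, hi * 2
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if cost(mid) <= BUDGET_OPS:
            lo = mid
        else:
            hi = mid - 1
    return lo

for name, cost in [
    ("n",       lambda n: n),
    ("n log n", lambda n: n * math.log2(n)),
    ("n^2",     lambda n: n ** 2),
    ("n^3",     lambda n: n ** 3),
    ("2^n",     lambda n: 2 ** n),
    ("n!",      lambda n: math.factorial(n)),
]:
    print(f"{name:8} -> n ~ {max_n(cost):,}")
```

The computed limits land in the same ballpark as the table; the table's smaller figures for the exponential rows reflect the larger constant factors those algorithms tend to carry.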

These numbers aren't very precise: they assume in-memory operations and somewhat varying constant factors. Still, they give a good starting point in your search for a solution that fits your problem and your data size.

Let's go through an example.

Suppose you work for a GPS company and your project is to improve their directions feature. In school you learned about using Dijkstra's algorithm to find the shortest path between two nodes in a graph. Knowing these numbers, you will understand that it takes seconds to process a graph with millions of edges, given that Dijkstra implementations run in O(m log n) time (where m is the number of edges and n the number of nodes).
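As a refresher, here is a minimal sketch of the binary-heap variant of Dijkstra, the one with the O(m log n) bound (each edge relaxation costs one heap operation); the tiny graph at the end is made-up data, not a real road network:

```python
import heapq

def dijkstra(adj, source):
    """Shortest distances from source.

    adj maps node -> list of (neighbor, weight).
    Each edge pushes at most one heap entry, and each heap
    operation costs O(log n), giving O(m log n) overall.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale entry, a shorter path was already found
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical toy graph: the best a->b route goes through c.
adj = {
    "a": [("b", 4), ("c", 1)],
    "c": [("b", 2)],
    "b": [("d", 5)],
}
print(dijkstra(adj, "a"))  # {'a': 0, 'c': 1, 'b': 3, 'd': 8}
```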

Now you face a few questions:

How fast do you want your code to be? Seconds? Hundreds of milliseconds?

A response on the web feels fast if it takes less than 500 milliseconds. So let's pick half a second.

How big is the graph? Do you want to solve the problem for a city, a country or even a continent?

Each is about an order of magnitude larger than the previous one and will be solved by different approaches. Let's say we want to solve the problem for the whole of Europe.

Here are sizes for a few input sets:

input | Europe | USA/CAN | USA (Tiger) |
---|---|---|---|
#nodes | 18,029,721 | 18,741,705 | 24,278,285 |
#directed edges | 42,199,587 | 47,244,849 | 58,213,192 |
#road categories | 13 | 13 | 4 |

Since we chose half a second as our execution time and the size of our problem is about 40 million edges, it's clear from our table that O(m log n) is too slow. So pure Dijkstra won't do. We need to look at how other algorithms, like A* search or ones based on highway hierarchies, behave on this problem.
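The back-of-envelope arithmetic, again under the assumption of roughly 10^8 simple operations per second:

```python
import math

# Assumption: ~10^8 simple ops/sec, so half a second buys ~5 * 10^7 operations.
budget = 5 * 10**7

m = 42_199_587  # directed edges, Europe (from the table above)
n = 18_029_721  # nodes, Europe

ops = m * math.log2(n)          # estimated work for O(m log n) Dijkstra
print(f"~{ops:.2e} operations")  # on the order of 10^9
print("fits in 0.5s?", ops <= budget)  # False: plain Dijkstra blows the budget
```

Roughly 10^9 operations against a budget of 5 * 10^7: about twenty times over, even before constant factors.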
