Differences for blog/numbers-everyone-should-know between revisions #29 and #35

No differences between the titles.

Differences in content:

'Jeff Dean':http://research.google.com/people/jeff/, a famous Google engineer, popularized a list of latency 'numbers everyone should know':http://highscalability.com/numbers-everyone-should-know. The list is a great resource for designing large-scale infrastructure systems.
*sidenote* 'Jeff Dean':http://research.google.com/people/jeff/, a famous Google engineer, popularized a list of latency 'numbers everyone should know':http://highscalability.com/numbers-everyone-should-know. The list is a great resource for designing large-scale infrastructure systems.
Algorithms and their complexity often occur in critical parts of computer systems, but I find that few engineers have a good understanding of how an O(n!) algorithm compares to an O(n^5^) one.
These numbers aren't very precise: they assume in-memory operations and some varying constant factors, but they do give a good starting point in your search for a solution that fits your problem and your data size.
Let's go through an example. Suppose you work for a GPS company and your project is to improve their directions feature. In school you learned about using Dijkstra's algorithm to find the shortest path between two nodes in a graph. Knowing these numbers, you will understand that it will take seconds to process a graph with millions of edges, given that Dijkstra implementations have O(m log n) time complexity (where m is the number of edges and n the number of nodes).
Let's go through an example.
 
Suppose you work for a GPS company and your project is to improve their directions feature. In school you learned about using Dijkstra's algorithm to find the shortest path between two nodes in a graph. Knowing these numbers, you will understand that it will take seconds to process a graph with millions of edges, given that Dijkstra implementations have O(m log n) time complexity (where m is the number of edges and n the number of nodes).
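The textbook algorithm mentioned above can be sketched as follows. This is a minimal Dijkstra implementation with a binary heap, which gives the O(m log n) running time the post refers to; the graph representation and function name are illustrative, not from the original post:

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source.

    graph maps a node to a list of (neighbor, weight) pairs.
    With a binary heap this runs in O(m log n) time.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Tiny example graph: the direct A -> C edge (weight 5)
# loses to the A -> B -> C path (weight 3).
graph = {
    "A": [("B", 1), ("C", 5)],
    "B": [("C", 2)],
    "C": [],
}
print(dijkstra(graph, "A"))  # {'A': 0, 'B': 1, 'C': 3}
```

On a road network with tens of millions of edges, the heap operations dominate, which is where the m log n operation count in the estimate below comes from.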
Now you face a few questions:
How fast do you want your code to be? Seconds? Hundreds of milliseconds?
Usually people feel something on the web is fast if it takes less than half a second to execute. So let's say half a second.
A response on the web feels fast if it takes less than 500 milliseconds. So let's pick half a second.
 
How big is the graph? Do you want to solve the problem for a city, a country or even a continent?
 
Each is about an order of magnitude larger than the other and will be solved by different approaches. Let's say we want to solve the problem for the whole of Europe.
How big is your graph? Do you want to solve the problem for a city, a country or even a continent?
Here are sizes for a few input sets:
Each is about an order of magnitude larger than the other and may mean you need different approaches. Here are sizes for a few input sets:
|_. input |_. Europe |_. USA/CAN |_. USA (Tiger) |
| #nodes | 18 029 721 | 18 741 705 | 24 278 285 |
| #directed edges | 42 199 587 | 47 244 849 | 58 213 192 |
| #road categories | 13 | 13 | 4 |
From the given time limit it's obvious that pure Dijkstra won't do, and we need to look at how other algorithms behave, like A star search or 'highway hierarchies':http://algo2.iti.kit.edu/schultes/hwy/esa06HwyHierarchies.pdf.
Since we chose half a second to be our execution time and the size of our problem to be about 40 million edges, it's clear from our table that m log n is too slow. So pure Dijkstra won't do. We need to look at how other algorithms, like A star search or one based on 'Highway hierarchies':http://algo2.iti.kit.edu/schultes/hwy/esa06HwyHierarchies.pdf, behave for this problem.
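The back-of-envelope arithmetic behind this conclusion is easy to check. Using the Europe figures from the table, a sketch of the estimate follows; the throughput of 10^8 simple operations per second is an assumed figure for pointer-heavy graph code, not a number from the post:

```python
import math

m = 42_199_587     # directed edges, Europe (from the table above)
n = 18_029_721     # nodes, Europe (from the table above)
ops_per_sec = 1e8  # assumed throughput for pointer-chasing graph code

operations = m * math.log2(n)       # roughly 1e9 basic operations
seconds = operations / ops_per_sec  # roughly 10 s, far over the 0.5 s budget
print(f"{operations:.2e} ops -> about {seconds:.1f} s")
```

Even if the assumed throughput is off by a factor of a few, the estimate lands an order of magnitude above the half-second budget, which is why the post moves on to A star search and highway hierarchies.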

Differences in security:

public
protected

The forum topic was not changed.