Diferente pentru blog/rolling-hash intre reviziile #14 si #15

Nu exista diferente intre titluri.

Diferente intre continut:

Let’s choose <tex>H(S[i..j]) = S[i]+S[i+1]a+S[i+2]a{2}+...+S[j]a^{j-i} </tex> Then we compute the array h which contains cummulative hashes: <tex>h[k]=S[0]+S[1]a+S[2]a^{2}+...+S[k]a^{k}</tex>. It’s easy to see that <tex>H(S[i..j])=(h[j]-h[i-1])/a^i</tex>

After spending O(n) time on preprocessing h, we have an efficient way to compute the hash code for any subsequence of S.
 
The division operation modulo a prime needs some number theory insight. You can do it using Fermat's little theorem or using the Extended Euclidean Algorithm. If you need to process a lot of sub strings it may be useful to pre process all the values for <tex>a^{-i} modulo p</tex>.

h2. Caveats
Hashing collisions are a problem. One way to deal with them is to check for a match when two fingerprints are equal, but this makes the solution inefficient if there are lots of matches. Another way is to use more hashing functions to decrease the collision probability.

infoarena informatica de performanta

Diferente pentru blog/rolling-hash intre reviziile #14 si #15

Nu exista diferente intre titluri.

Diferente intre continut:

Nu exista diferente intre securitate.

Topicul de forum nu a fost schimbat.