Re: FYI performance of hash algorithms [message #177853 is a reply to message #177852]
Mon, 23 April 2012 20:16
Jerry Stuckle
On 4/23/2012 3:42 PM, M. Strobel wrote:
> I am just asking myself which of the many hash algorithms performs best, so I can use
> it to index my translation strings.
>
> So I wrote a test script. It iterates over an array of 3,917,116 strings, calculating
> the hash of each.
>
> Here you have the results, with run time, and run time relative to hash length
> (smaller is better).
> Interesting that the tiger hashes seem to perform well.
>
> There were only collisions with adler32 (many), crc32 and crc32b.
>
> /Str.
>
<snip>
Interesting, but only applicable to your system, with your libraries and
your data. A different system could have different versions of the
libraries and completely different performance figures.
As for collisions - those are completely dependent on your data. While you can
say some things with certainty (e.g. hashing 2^16+1 distinct values with CRC-16
is guaranteed to produce at least one collision, by the pigeonhole principle),
picking a different set of strings to hash will give you a different set of
collisions.
For instance, it's not really surprising that adler32 had a lot of collisions.
It was designed as a checksum - a faster alternative to CRC-32 - and works
better with binary data; it was never meant to be a general-purpose hashing
algorithm. Rather, it was designed for speed. Its output is also very short
(32 bits).
The same is true, to a lesser extent, of the CRC-32 and CRC-32b algorithms, so
again I'm not surprised about the collisions there either.
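Collision counts are easy enough to check against your own data, by the way.
Something along these lines would do it - just a rough, untested sketch; the
file name and the choice of crc32b are placeholders, not Strobel's actual
script:

<?php
// Rough sketch: count collisions for one algorithm over your own strings.
// 'strings.txt' and 'crc32b' are placeholders -- substitute your translation
// strings and whatever algorithm you want to test (see hash_algos()).
$strings = file('strings.txt', FILE_IGNORE_NEW_LINES);
$seen = array();
$collisions = 0;
foreach ($strings as $s) {
    $h = hash('crc32b', $s);
    if (isset($seen[$h]) && $seen[$h] !== $s) {
        $collisions++;          // two different strings, same hash value
    } else {
        $seen[$h] = $s;         // first string seen with this hash value
    }
}
printf("%d collisions out of %d strings\n", $collisions, count($strings));

Run that with a few different algorithms from hash_algos() and compare the
counts - the results on your own translation strings are the only ones that
mean anything for your use case.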
But the real question here is - how much does it matter if hashing a
string takes 2 microseconds or 2.5 microseconds? Unless you're doing
millions of them at one time, no one is going to notice the difference.
Personally, when looking at such things, I always choose the fewest collisions
over the fastest time.
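And if you do want to quantify the speed side of it, time it on your own box
with your own strings instead of relying on anyone else's figures. Something
like this - again only a sketch, and the file name and algorithm list are just
examples:

<?php
// Rough timing sketch: same idea as the original test script, run locally.
// 'strings.txt' is a placeholder; pick algorithms from hash_algos().
$strings = file('strings.txt', FILE_IGNORE_NEW_LINES);
foreach (array('adler32', 'crc32b', 'md5', 'sha1', 'tiger160,3') as $algo) {
    $start = microtime(true);
    foreach ($strings as $s) {
        hash($algo, $s);
    }
    $elapsed = microtime(true) - $start;
    printf("%-12s %8.3f s total, %6.2f us/string\n",
           $algo, $elapsed, 1e6 * $elapsed / count($strings));
}

The per-string figures are what the 2 vs. 2.5 microsecond comparison above is
about - differences that small rarely matter next to the collision counts.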
--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex(at)attglobal(dot)net
==================