Recently motivated out of curiosity, I and a coworker were benchmarking different approaches to determining if a hash is empty in Perl. Expanding on this a little I present these curious results. In summary, it seems that there are roughly two code paths we’re hitting here. One for if (scalar %hash) and if (%hash), and another for if (keys %hash), if (values %hash), if (scalar keys %hash), and if (scalar values %hash). The latter is much faster when you hit a non-empty hash. It feels somewhat unfortunate that the simplest approach is (approximately) slowest.
hash size: 0
Rate if_scalar if if_values if_keys if_scalar_values if_scalar_keys
if_scalar 3125000/s -- -3% -21% -23% -24% -24%
if 3205128/s 3% -- -19% -21% -22% -22%
if_values 3937008/s 26% 23% -- -2% -4% -5%
if_keys 4032258/s 29% 26% 2% -- -2% -2%
if_scalar_values 4098361/s 31% 28% 4% 2% -- -1%
if_scalar_keys 4132231/s 32% 29% 5% 2% 1% --
hash size: 3
Rate if_scalar if if_keys if_scalar_keys if_values if_scalar_values
if_scalar 967118/s -- -1% -86% -87% -87% -88%
if 978474/s 1% -- -86% -86% -87% -88%
if_keys 7142857/s 639% 630% -- -1% -7% -10%
if_scalar_keys 7246377/s 649% 641% 1% -- -6% -9%
if_values 7692308/s 695% 686% 8% 6% -- -3%
if_scalar_values 7936508/s 721% 711% 11% 10% 3% --
hash size: 500
Rate if if_scalar if_scalar_values if_values if_scalar_keys if_keys
if 912409/s -- -2% -88% -88% -88% -88%
if_scalar 927644/s 2% -- -87% -88% -88% -88%
if_scalar_values 7352941/s 706% 693% -- -1% -4% -4%
if_values 7462687/s 718% 704% 1% -- -3% -3%
if_scalar_keys 7692308/s 743% 729% 5% 3% -- -0%
if_keys 7692308/s 743% 729% 5% 3% 0% --
From the script:
#!/usr/bin/env perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
my %hash_empty = ();
my %hash_small = (1..6);
my %hash_large = (1..1000);
foreach my $ref (\%hash_empty, \%hash_small, \%hash_large) {
print 'hash size: ' . scalar(keys(%$ref)) . "\n";
cmpthese(5_000_000, {
if => sub { if (%$ref) { 1 } else { 0 } },
if_scalar => sub { if (scalar %$ref) { 1 } else { 0 } },
if_keys => sub { if (keys %$ref) { 1 } else { 0 } },
if_values => sub { if (values %$ref) { 1 } else { 0 } },
if_scalar_keys => sub { if (scalar keys %$ref) { 1 } else { 0 } },
if_scalar_keys => sub { if (scalar keys %$ref) { 1 } else { 0 } },
if_scalar_values => sub { if (scalar values %$ref) { 1 } else { 0 } },
});
}
4 Comments
You don’t say what version of perl you’re using.
I have a couple extra methods to ponder.
if_array_cast => sub { my @a = %$ref; if (@a) { 1 } else { 0 } },
if_each => sub { if (each %$ref) { 1 } else { 0 } },
v5.10.1 on old 486 gives this:
hash size: 0
Rate if_array_cast if_keys if if_values if_scalar if_each if_scalar_values if_scalar_keys
if_array_cast 1724138/s — -28% -49% -53% -58% -62% -63% -67%
if_keys 2392344/s 39% — -29% -35% -41% -47% -48% -55%
if 3378378/s 96% 41% — -9% -17% -25% -27% -36%
if_values 3703704/s 115% 55% 10% — -9% -18% -20% -30%
if_scalar 4065041/s 136% 70% 20% 10% — -10% -12% -23%
if_each 4504505/s 161% 88% 33% 22% 11% — -3% -14%
if_scalar_values 4629630/s 169% 94% 37% 25% 14% 3% — -12%
if_scalar_keys 5263158/s 205% 120% 56% 42% 29% 17% 14% –
hash size: 3
Rate if_array_cast if if_scalar if_each if_scalar_values if_keys if_values if_scalar_keys
if_array_cast 407830/s — -60% -62% -92% -94% -96% -96% -96%
if 1016260/s 149% — -6% -79% -85% -90% -90% -90%
if_scalar 1079914/s 165% 6% — -78% -84% -89% -89% -90%
if_each 4901961/s 1102% 382% 354% — -25% -50% -52% -54%
if_scalar_values 6578947/s 1513% 547% 509% 34% — -33% -36% -38%
if_keys 9803922/s 2304% 865% 808% 100% 49% — -4% -8%
if_values 10204082/s 2402% 904% 845% 108% 55% 4% — -4%
if_scalar_keys 10638298/s 2509% 947% 885% 117% 62% 9% 4% –
hash size: 500
Rate if_array_cast if_scalar if if_each if_values if_scalar_keys if_keys if_scalar_values
if_array_cast 2853/s — -100% -100% -100% -100% -100% -100% -100%
if_scalar 948767/s 33151% — -0% -75% -83% -90% -92% -92%
if 952381/s 33277% 0% — -75% -83% -90% -92% -92%
if_each 3816794/s 133664% 302% 301% — -34% -60% -68% -68%
if_values 5747126/s 201315% 506% 503% 51% — -40% -52% -52%
if_scalar_keys 9615385/s 336883% 913% 910% 152% 67% — -19% -19%
if_keys 11904762/s 417117% 1155% 1150% 212% 107% 24% — 0%
if_scalar_values 11904762/s 417117% 1155% 1150% 212% 107% 24% 0% –
Not sure if you were involved on the recent thread on p5p, but it’s now been fixed in blead and there is a work around:
http://grokbase.com/t/perl/perl5-porters/128rpw08re/perl-114576-check-if-hash-500x-times-slower-than-if-keys-hash#20120826gltvul5jmg7acitqfviyz37ike
if(!!%hash) { … }
I was not. Thanks for the link. Interesting that I wasn’t the only one obsessing over the performance of this. :)
(Also looks like I need to set up a mail server on my web host, I totally wasn’t getting mail notifications from wordpress!)