Written by D. Jasmine Merced
Perl hashes are extremely useful data structures that allow you to associate
one piece of data (called a key) with another (its value). In this article, we will
review hashes and introduce some of their more advanced uses.
Assigning Key/Value Pairs
A hash consists of zero or more keys, each with an associated value. A key
and its value together are called a pair. There are several ways to insert these pairs
into hashes, outlined below.
If you know (at least some of) the key/value pairs that you would like to use,
the following is the most straightforward way to assign pairs to hashes:
%hash = (
apples => 6,
oranges => 5,
pears => 3,
grapes => 2,
);
The above is a more readable way to assign key/value pairs. Let's not forget
the importance of having easy to read code. A less readable way to assign
keys and values to hashes is below:
%hash = qw(apples 6 oranges 5 pears 3 grapes 2);
Perl treats the above as key/value pairs, exactly as if you had used
the => arrows from the first example. We recommend the first format
for readability, though the two formats can be used interchangeably.
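As a quick check that both forms build the same hash, the sketch below creates one hash with each syntax and compares them pair by pair. The names %with_arrows and %with_qw are made up for this illustration.

```perl
use strict;
use warnings;

# Both assignments hand Perl the same flat list: key, value, key, value, ...
my %with_arrows = (
    apples  => 6,
    oranges => 5,
    pears   => 3,
    grapes  => 2,
);
my %with_qw = qw(apples 6 oranges 5 pears 3 grapes 2);

# Same number of keys, and every key maps to the same value?
my $same = scalar(keys %with_arrows) == scalar(keys %with_qw);
for my $key (keys %with_arrows) {
    $same = 0
        unless exists($with_qw{$key}) && $with_qw{$key} == $with_arrows{$key};
}

print $same ? "identical\n" : "different\n";
```

Note that qw produces strings ("6", not 6), which Perl converts to numbers whenever they are used numerically.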
You can also add each key/value pair individually. The following line adds
a new key/value pair to our original hash.
$hash{peach} = 3;
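If the original hash did not exist, this line would have created a new hash and
inserted the first key/value pair as defined (under use strict, the hash must
first be declared with my %hash;). The process by which a variable
can spring into life like this is called autovivification, a term used most
often for nested data structures that are created on first use.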
This is useful if you need to loop through a data file and would like to insert
data from the file to a hash.
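open my $fh, '<', 'fruits.txt' or die "Cannot open fruits.txt: $!";
while (my $line = <$fh>) {
chomp $line;                              # strip the trailing newline
my ($fruit, $count) = split /\t/, $line;  # tab-separated: name, quantity
$hash{$fruit} = $count;
}
close $fh or die $!;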
Removing Pairs from Hashes
Now that we know how to add pairs to hashes, we need to know how to get rid of
them. Deleting a pair is as easy as knowing the key of the pair you want
deleted:
delete $hash{peach};
Now, the pair whose key is peach is gone. But what if you wanted to delete
the entire hash? You can either loop through the entire hash and delete each
key (inefficient) or you can undef it:
undef %hash;
Do not use:
%hash = undef;
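This will not empty the hash. It assigns a single pair whose key is the empty
string (undef used as a key becomes "") and whose value is undef, and under
warnings Perl complains about an odd number of elements in the hash assignment.
If you want to remove all keys from the hash, but still keep
%hash as an "active" variable, use: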
%hash = ();
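The short sketch below walks through both operations on a small sample hash (the data is invented for the example). It also shows that delete returns the value it removed, which is handy when you want to take a pair out and use it in one step.

```perl
use strict;
use warnings;

my %hash = (apples => 6, oranges => 5, peach => 3);

# delete removes the pair and returns the value that was stored under the key.
my $removed     = delete $hash{peach};
my $still_there = exists($hash{peach}) ? 1 : 0;

# Assigning the empty list removes every pair but keeps %hash usable.
my $count_before = scalar keys %hash;
%hash = ();
my $count_after = scalar keys %hash;

print "removed=$removed exists=$still_there ",
      "before=$count_before after=$count_after\n";
```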
Looking inside Hashes
We now know how to add and remove pairs from hashes, but how do we see what
pairs are there? As with nearly everything in Perl, TIMTOWTDI (there is more
than one way to do it). Here, we'll look at a few examples of how to loop
through hashes and take a peek at what's inside. These examples assume you're
already familiar with loops.
Using foreach
foreach my $key (keys %hash) {
print "$key = $hash{$key}\n";
}
The my $key limits the scalar's scope to this loop (and prevents the "Global
symbol requires explicit package name" errors when running under use strict).
Using map
print map "$_ = $hash{$_}\n", keys %hash;
If this is confusing, be sure to check out Simon Cozens'
map article last issue.
Using while/each
while (my ($key, $value) = each %hash) {
print "$key = $value\n";
}
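To confirm that the three techniques visit the same pairs, the following sketch collects the lines each one produces instead of printing them directly, then sorts the results so they can be compared. The sample data and the @via_* array names are assumptions made for this example.

```perl
use strict;
use warnings;

my %hash = (apples => 6, oranges => 5, pears => 3);

# foreach over the keys
my @via_foreach;
foreach my $key (keys %hash) {
    push @via_foreach, "$key = $hash{$key}";
}

# map over the keys (expression form, as in the article)
my @via_map = map "$_ = $hash{$_}", keys %hash;

# while/each, which hands back one key/value pair per call
my @via_each;
while (my ($key, $value) = each %hash) {
    push @via_each, "$key = $value";
}

# Hash order is not predictable, so sort before comparing.
my $foreach_out = join "\n", sort @via_foreach;
my $map_out     = join "\n", sort @via_map;
my $each_out    = join "\n", sort @via_each;

print "$foreach_out\n";
```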
Sorting Hashes
If you've actually tested the samples above, you'll have noticed that the
hashes printed out in seemingly random order. This is because Perl stores
pairs according to an internal hashing function applied to each key, not
alphabetically, numerically, or in the order you added them. But
have heart, it's easy to sort hashes.
There are 3 ways to sort in Perl: ASCIIbetically, numerically or
alphabetically.
Every character (digit, letter or metacharacter) has an ASCII code
associated with it. Letters have separate ASCII codes for each of the cases
(upper and lower case). For example, the letter A is 065 and the letter a is
097. So A is "less than" a (065 < 097). With this in mind, let's create a
simple hash that uses both upper and lower cases in its keys:
%hash = (
Apples => 1,
apples => 4,
artichokes => 3,
Beets => 9,
);
foreach my $key (sort keys %hash) {
print "$key = $hash{$key}\n";
}
The above code will print:
Apples = 1
Beets = 9
apples = 4
artichokes = 3
Because the letter B is 066 in ASCII code, it is "less than" 097, the code
for lowercase a. This yields some strange results, but you may wish to use it one day :)
"To sort strings without regard to case, run $a and $b through lc before comparing:"(2)
using the cmp comparison operator. This tells Perl to sort letters and ignore case.
foreach my $key (sort {lc($a) cmp lc($b)} keys %hash) {
print "$key = $hash{$key}\n";
}
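This prints the keys in alphabetical order regardless of case (Apples and
apples compare as equal here, so their relative order is not guaranteed):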
Apples = 1
apples = 4
artichokes = 3
Beets = 9
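If you need keys that differ only in case to come out in the same order every time, one option is to add a second comparison as a tie-breaker. The sketch below is one way to do that, not the only one; it falls back to a plain cmp when the lowercased keys are equal.

```perl
use strict;
use warnings;

my %hash = (
    Apples     => 1,
    apples     => 4,
    artichokes => 3,
    Beets      => 9,
);

# Compare case-insensitively first; if that is a tie, compare the raw keys
# so that "Apples" always lands before "apples".
my @sorted = sort { lc($a) cmp lc($b) or $a cmp $b } keys %hash;

print "$_ = $hash{$_}\n" for @sorted;
```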
Hash Slices
"A slice is a section or number of elements from a list, array or hash."(1)
Essentially, you can add or delete key/value pairs en masse using slices,
which are written with the @ (at) symbol. To give an example of slices,
consider the following:
@months = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
@nums{@months} = 1 .. $#months + 1;
Here, we've just created a hash named %nums using a hash slice. It added
each of the elements of the @months array as keys, with the values 1
through 12 assigned in order. Because @months is in calendar order, adding 1 through 12
assigns the correct month number value to each key.
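Slices work for reading and deleting as well as assigning. The following sketch, which declares its own %nums hash under use strict, assigns the twelve pairs, reads three values in one go, and then deletes two pairs with a single delete.

```perl
use strict;
use warnings;

my @months = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Assign twelve pairs at once with a hash slice.
my %nums;
@nums{@months} = 1 .. scalar @months;

# Read several values at once; the result is a list in the order requested.
my @summer = @nums{qw(Jun Jul Aug)};

# delete accepts a slice too, removing several pairs in one statement.
delete @nums{qw(Jan Feb)};

print "summer: @summer\n";
print "remaining: ", scalar(keys %nums), "\n";
```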
So you've done the hash slice and now want to print out the results to make
sure it's correct.
foreach my $key (sort {$nums{$a} <=> $nums{$b}} keys %nums){
print "$key = $nums{$key}\n";
}
Prints:
Jan = 1
Feb = 2
Mar = 3
Apr = 4
May = 5
Jun = 6
Jul = 7
Aug = 8
Sep = 9
Oct = 10
Nov = 11
Dec = 12
We've already seen how to sort hashes in Sorting Hashes above, and we've
just added to it.
sort {$nums{$a} <=> $nums{$b}}
sorts the hash numerically based on the values of the hash instead of the
keys. This way, our months appear in the correct year order.
For more on hash slices, please refer to Uri Guttman's article on this
topic.
Subbing out sorting
Worried that retyping the construct for alphabetical sorting will introduce typos?
Then "sub" it! Let's say you use sorting very frequently in your programming and find it
cumbersome to type in the above operations each time.
Let's handle this by example:
foreach my $key (sort ascend_alpha keys %hash){
print "$key = $hash{$key}\n";
}
You can easily see that the lc($a) cmp lc($b) has been
replaced by a subroutine call. Now, let's consider the following:
sub ascend_num {$a <=> $b}
sub descend_num {$b <=> $a}
sub ascend_alpha {lc($a) cmp lc($b)}
sub descend_alpha {lc($b) cmp lc($a)}
sub ascend_alphanum {$a <=> $b || lc($a) cmp lc($b)}
sub descend_alphanum {$b <=> $a || lc($b) cmp lc($a)}
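The ascend_alphanum and descend_alphanum
routines sort both numerically and alphabetically, so if you added "canadian", "5 Spice Seasoning", "10 Star Flour", and
"911 Hot Sauce" to %hash, the keys that begin with a number are sorted
numerically and the rest alphabetically. A key that does not begin with a number
counts as 0 in the numeric comparison, so with ascend_alphanum those keys come first: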
Apples = 1
apples = 4
artichokes = 3
Beets = 9
canadian = 9
5 Spice Seasoning = 1
10 Star Flour = 1
911 Hot Sauce = 1
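When warnings are enabled, comparing a key such as "Apples" with <=> produces an "isn't numeric" warning. One possible way around that, sketched below, is to pull out the leading digits explicitly before comparing. The helper leading_num and the routine ascend_alphanum_quiet are hypothetical names invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: the leading integer of a string, or 0 if there is none.
sub leading_num {
    my ($str) = @_;
    return $str =~ /^(\d+)/ ? $1 : 0;
}

# Numeric prefix first, then case-insensitive text, then raw text as tie-breaker.
sub ascend_alphanum_quiet {
    leading_num($a) <=> leading_num($b) || lc($a) cmp lc($b) || $a cmp $b;
}

my @keys = ('Beets', '911 Hot Sauce', 'apples', '10 Star Flour',
            'Apples', '5 Spice Seasoning', 'artichokes', 'canadian');

my @sorted = sort ascend_alphanum_quiet @keys;
print "$_\n" for @sorted;
```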
Working with Hash References
Have a hash reference and don't want to duplicate the subroutines to deal with them? It's easy...
just pass the dereferenced keys to the sort routines:
$hashref = \%hash;
foreach my $key (sort ascend_alpha keys %{$hashref}){
print "$key = $hashref->{$key}\n";
}
Notice the hash dereference %{$hashref} and the arrow operator used to reach the value,
$hashref->{$key}.
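Hash references are also the usual way to hand a hash to a subroutine. As a rough sketch, the routine below (sorted_lines is a made-up name) accepts any hash reference and returns its pairs as sorted lines, so the same code can serve several hashes.

```perl
use strict;
use warnings;

# Hypothetical helper: "key = value" lines for any hash reference,
# sorted case-insensitively by key.
sub sorted_lines {
    my ($hashref) = @_;
    my @lines;
    foreach my $key (sort { lc($a) cmp lc($b) or $a cmp $b } keys %{$hashref}) {
        push @lines, "$key = $hashref->{$key}";
    }
    return @lines;
}

my %fruit = (apples => 4, Beets => 9);
my %spice = ('5 Spice Seasoning' => 1);

# Pass a reference with the backslash operator; the hashes are not copied.
my @fruit_lines = sorted_lines(\%fruit);
my @spice_lines = sorted_lines(\%spice);

print "$_\n" for @fruit_lines, @spice_lines;
```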
Sorting by Hash Values
What if you wanted to sort the values instead of the keys? Consider the following:
foreach my $key (sort {$hash{$a} <=> $hash{$b}} keys %hash){
print "$key = $hash{$key}\n";
}
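Assuming %hash now also holds APples => 2 and 911 Hot Sauce has been changed to 4,
this will print out (pairs with equal values may appear in either order):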
Apples = 1
5 Spice Seasoning = 1
10 Star Flour = 1
APples = 2
artichokes = 3
apples = 4
911 Hot Sauce = 4
canadian = 9
Beets = 9
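When several keys share a value, their order depends on the hash's internal ordering. A simple remedy, sketched here on a smaller sample hash, is to fall back to comparing the keys whenever the values are equal.

```perl
use strict;
use warnings;

my %hash = (
    Apples     => 1,
    apples     => 4,
    artichokes => 3,
    Beets      => 9,
    canadian   => 9,
);

# Sort by value; when two values are equal, compare the keys instead
# so the output is the same on every run.
my @by_value = sort {
    $hash{$a} <=> $hash{$b} or lc($a) cmp lc($b) or $a cmp $b
} keys %hash;

print "$_ = $hash{$_}\n" for @by_value;
```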
But let's say you also had text values -- let's change $hash{'Beets'} to "cans: 4 - 8.oz."
and $hash{'Apples'} to "Delicious Red - 4 medium sized". You can have the hash sorted
numerically and alphabetically by using the following:
foreach my $key (sort {$hash{$a} <=> $hash{$b} || $hash{$a} cmp $hash{$b}} keys %hash){
print "$key = $hash{$key}\n";
}
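This will print out the two text values first (both count as 0 in the numeric
comparison, and cmp places "Delicious Red" before "cans" because uppercase
letters sort before lowercase ones), followed by the numeric values:
Apples = Delicious Red - 4 medium sized
Beets = cans: 4 - 8.oz.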
5 Spice Seasoning = 1
10 Star Flour = 1
APples = 2
artichokes = 3
apples = 4
911 Hot Sauce = 4
canadian = 9
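Under warnings, the mixed comparison above complains whenever a text value is used with <=>. A possible alternative, shown as a sketch rather than a drop-in replacement, is to test each value with looks_like_number from the core Scalar::Util module and pick the comparison accordingly. The comparator name by_value and the trimmed-down sample hash are assumptions for this example.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

my %hash = (
    Beets      => 'cans: 4 - 8.oz.',
    Apples     => 'Delicious Red - 4 medium sized',
    artichokes => 3,
    apples     => 4,
    canadian   => 9,
);

# Hypothetical comparator: text values first (compared with cmp), then
# numeric values (compared with <=>), using the key as a tie-breaker.
sub by_value {
    my ($va, $vb) = ($hash{$a}, $hash{$b});
    my ($na, $nb) = (looks_like_number($va), looks_like_number($vb));
    return $va <=> $vb || $a cmp $b if $na && $nb;     # both numeric
    return $va cmp $vb || $a cmp $b if !$na && !$nb;   # both text
    return $na ? 1 : -1;                               # text before numbers
}

my @sorted = sort by_value keys %hash;
print "$_ = $hash{$_}\n" for @sorted;
```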
Multidimensional Hashes
Using key/value pairs is great, but what if you wanted to associate more than
one value with a key? Using a slightly different construct, you can use
an anonymous array (written with square brackets) as a key's value:
%hash = (
Apples => [4, "Delicious red", "medium"],
"Canadian Bacon" => [1, "package", "1/2 pound"],
artichokes => [3, "cans", "8.oz."],
Beets => [4, "cans", "8.oz."],
"5 Spice Seasoning" => [1, "bottle", "3 oz."],
"10 Star Flour" => [1, "package", "16 oz."],
"911 Hot Sauce" => [1, "bottle", "8 oz."],
);
Now, to extract the values, you can treat them as an array of the hash:
print $hash{"Canadian Bacon"}[1];
will print package, because package is
the second element of Canadian Bacon's "array". You can also add a predefined array to a hash value:
@garlicstuff = (4, "cloves", "medium");
$hash{"Garlic"} = [@garlicstuff];
print $hash{"Garlic"}[1]; # prints cloves
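The square brackets in [@garlicstuff] matter: they store a copy of the array, whereas \@garlicstuff would store a reference to the original. The sketch below illustrates the difference; the GarlicAlias key exists only for this comparison.

```perl
use strict;
use warnings;

my %hash;
my @garlicstuff = (4, "cloves", "medium");

$hash{Garlic}      = [@garlicstuff];   # anonymous array holding a copy
$hash{GarlicAlias} = \@garlicstuff;    # reference to the original array

# Changing the original array affects only the entry that refers to it.
push @garlicstuff, "chopped";

my $copy_count  = scalar @{ $hash{Garlic} };
my $alias_count = scalar @{ $hash{GarlicAlias} };

print "copy: $copy_count, alias: $alias_count\n";
```

Use the copy when the hash should keep its own data, and the reference when later changes to the array should show up in the hash.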
But what if @garlicstuff had more elements than others? Let's say that
@garlicstuff is
@garlicstuff = (4, "cloves", "medium", "chopped");
instead? How do we print out all values for a key if one key can have 3 values, and another has 4 (or more)
values?
foreach my $key (sort ascend_alpha keys %hash){
print "$key: \n";
foreach my $val (@{$hash{$key}}){
print "\t$val\n";
}
print "\n";
}
Because each value in this multidimensional hash is a reference to an array, a key's group of
values can be dereferenced by using @{$hash{$key}}. The above code prints:
10 Star Flour:
1
package
16 oz.
5 Spice Seasoning:
1
bottle
3 oz.
911 Hot Sauce:
1
bottle
8 oz.
Apples:
4
Delicious red
medium
artichokes:
3
cans
8.oz.
Beets:
4
cans
8.oz.
Canadian Bacon:
1
package
1/2 pound
Garlic:
4
cloves
medium
chopped
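The same @{$hash{$key}} construct can be used to add values to a key's list after the hash has been built. In this small sketch (with invented sample data), pushing onto a key that does not exist yet creates its array automatically.

```perl
use strict;
use warnings;

my %hash = (
    Apples => [4, "Delicious red", "medium"],
);

# Append a value to an existing key's list.
push @{ $hash{Apples} }, "organic";

# Pushing onto a missing key creates the anonymous array on the fly.
push @{ $hash{Garlic} }, 4, "cloves";

my $apple_last   = $hash{Apples}[-1];           # last element of the list
my $garlic_count = scalar @{ $hash{Garlic} };   # number of values stored

print "$apple_last, $garlic_count\n";
```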
Resources
(1) Learning Perl, 3rd Edition. O'Reilly & Associates © 2000
(2) Programming Perl, 3rd Edition (page 790). O'Reilly & Associates © 2000
Written by Jasmine Merced-Ownbey
Jasmine Merced-Ownbey is the President/CEO of Tintagel Net Solutions
Group, Inc. and the administrator of The Perl Archive.