/usr/share/perl5/Tie/ShadowHash.pm is in libtie-shadowhash-perl 1.00-1.
This file is owned by root:root, with mode 0o644.
The actual contents of the file can be viewed below.
# Tie::ShadowHash -- Merge multiple data sources into a hash.
#
# Copyright 1999, 2002, 2010 by Russ Allbery <rra@stanford.edu>
#
# This program is free software; you may redistribute it and/or modify it
# under the same terms as Perl itself.
#
# This module combines multiple sources of data into a single tied hash, so
# that they can all be queried at once, with the source of any given
# key-value pair irrelevant to the client script. Data sources are searched
# in the order that they're added to the shadow hash. Changes to the hashed
# data aren't propagated back to the actual data sources; instead, they're
# saved within the tied hash and override any data obtained from the data
# sources.
##############################################################################
# Modules and declarations
##############################################################################
package Tie::ShadowHash;
require 5.006;
use strict;
use vars qw($VERSION);
$VERSION = '1.00';
##############################################################################
# Regular methods
##############################################################################
# This should pretty much never be called; tie calls TIEHASH.
sub new {
    my $class = shift;
    return $class->TIEHASH (@_);
}

# Given a file name and optionally a split regex, builds a hash out of the
# contents of the file. If the split sub exists, use it to split each line
# into an array; if the array has two elements, those are taken as the key and
# value. If there are more, the value is an anonymous array containing
# everything but the first. If there's no split sub, take the entire line
# modulo the line terminator as the key and the value the number of times it
# occurs in the file.
sub text_source {
    my ($self, $file, $split) = @_;
    unless (open (HASH, '<', $file)) {
        require Carp;
        Carp::croak ("Can't open file $file: $!");
    }
    local $_;
    my ($key, @rest, %hash);
    while (<HASH>) {
        chomp;
        if (defined $split) {
            ($key, @rest) = &$split ($_);
            $hash{$key} = (@rest == 1) ? $rest[0] : [ @rest ];
        } else {
            $hash{$_}++;
        }
    }
    close HASH;
    return \%hash;
}

# Add data sources to the shadow hash. This takes a list of either anonymous
# arrays (in which case the first element is the type of source and the rest
# are arguments), filenames (in which case it's taken to be a text file with
# each line being a key), or hash references (possibly to tied hashes).
sub add {
    my ($self, @sources) = @_;
    for my $source (@sources) {
        if (ref $source eq 'ARRAY') {
            my ($type, @args) = @$source;
            if ($type eq 'text') {
                $source = $self->text_source (@args);
            } else {
                require Carp;
                Carp::croak ("Invalid source type $type");
            }
        } elsif (!ref $source) {
            $source = $self->text_source ($source);
        }
        push (@{ $$self{SOURCES} }, $source);
    }
    return 1;
}
##############################################################################
# Tie methods
##############################################################################
# DELETED is a hash holding all keys that have been deleted; it's checked
# first on any access. EACH is a pointer to the current structure being
# traversed on an "each" of the shadow hash, so that they can all be traversed
# in order. OVERRIDE is a hash containing values set directly by the user,
# which override anything in the shadow hash's underlying data structures.
# And finally, SOURCES is an array of the data structures (all Perl hashes,
# possibly tied).
sub TIEHASH {
    my $class = shift;
    $class = ref $class || $class;
    my $self = {
        DELETED  => {},
        EACH     => -1,
        OVERRIDE => {},
        SOURCES  => []
    };
    bless ($self, $class);
    $self->add (@_) if @_;
    return $self;
}

# Note that this doesn't work quite right in the case of keys with undefined
# values, but we can't make it work right since that would require using
# exists and a lot of common data sources (such as NDBM_File tied hashes)
# don't implement exists.
sub FETCH {
    my ($self, $key) = @_;
    return if $self->{DELETED}{$key};
    return $self->{OVERRIDE}{$key} if exists $self->{OVERRIDE}{$key};
    for my $source (@{ $self->{SOURCES} }) {
        return $source->{$key} if defined $source->{$key};
    }
    return;
}

sub STORE {
    my ($self, $key, $value) = @_;
    delete $self->{DELETED}{$key};
    $self->{OVERRIDE}{$key} = $value;
}

sub DELETE {
    my ($self, $key) = @_;
    delete $self->{OVERRIDE}{$key};
    $self->{DELETED}{$key} = 1;
}

sub CLEAR {
    my ($self) = @_;
    $self->{DELETED} = {};
    $self->{OVERRIDE} = {};
    $self->{SOURCES} = [];
    $self->{EACH} = -1;
}

# This could throw an exception if any underlying source doesn't support
# exists (like NDBM_File).
sub EXISTS {
    my ($self, $key) = @_;
    return if exists $self->{DELETED}{$key};
    for my $source ($self->{OVERRIDE}, @{ $self->{SOURCES} }) {
        return 1 if exists $source->{$key};
    }
    return;
}

# We have to reset the each counter on all hashes. For tied hashes, we call
# FIRSTKEY directly because it's potentially more efficient than calling keys
# on the hash.
sub FIRSTKEY {
    my ($self) = @_;
    keys %{ $self->{OVERRIDE} };
    for my $source (@{ $self->{SOURCES} }) {
        my $tie = tied $source;
        if ($tie) {
            $tie->FIRSTKEY;
        } else {
            keys %$source;
        }
    }
    $self->{EACH} = -1;
    return $self->NEXTKEY;
}

# Walk the sources by calling each on each one in turn, skipping deleted
# keys and keys shadowed by earlier hashes and using $self->{EACH} to
# store the index of the source we're at (-1 means the OVERRIDE hash).
sub NEXTKEY {
    my ($self) = @_;
    my @result = ();
  SOURCE:
    while (!@result && $self->{EACH} < @{ $self->{SOURCES} }) {
        if ($self->{EACH} == -1) {
            @result = each %{ $self->{OVERRIDE} };
        } else {
            @result = each %{ $self->{SOURCES}[$self->{EACH}] };
        }
        if (@result && $self->{DELETED}{$result[0]}) {
            undef @result;
            next;
        }
        if (@result && $self->{EACH} > -1) {
            my $key = $result[0];
            if (exists $self->{OVERRIDE}{$key}) {
                undef @result;
                next;
            }
            for (my $index = $self->{EACH} - 1; $index >= 0; $index--) {
                if (defined $self->{SOURCES}[$index]{$key}) {
                    undef @result;
                    next SOURCE;
                }
            }
        }
        return (wantarray ? @result : $result[0]) if @result;
        $self->{EACH}++;
    }
    return;
}
##############################################################################
# Module return value and documentation
##############################################################################
# Make sure the module returns true.
1;
__DATA__

=head1 NAME

Tie::ShadowHash - Merge multiple data sources into a hash

=for stopwords
DBM Allbery

=head1 SYNOPSIS

    use Tie::ShadowHash;
    use DB_File;
    tie (%db, 'DB_File', 'file.db');
    $obj = tie (%hash, 'Tie::ShadowHash', \%db, "otherdata.txt");

    # Accesses search %db first, then the hashed "otherdata.txt".
    print "$hash{key}\n";

    # Changes override data sources, but don't change them.
    $hash{key} = 'foo';
    delete $hash{bar};

    # Add more data sources on the fly.
    %extra = (fee => 'fi', foe => 'fum');
    $obj->add (\%extra);

    # Add a text file as a data source, taking the first "word" up
    # to whitespace on each line as the key and the rest of the line
    # as the value.
    $split = sub { split (' ', $_[0], 2) };
    $obj->add ([text => "pairs.txt", $split]);

    # Add a text file as a data source, splitting each line on
    # whitespace and taking the first "word" to be the key and an
    # anonymous array consisting of the remaining words to be the
    # data.
    $split = sub { split (' ', $_[0]) };
    $obj->add ([text => "triples.txt", $split]);

=head1 DESCRIPTION

This module merges together multiple sets of data in the form of hashes
into a data structure that looks to Perl like a single simple hash. When
that hash is accessed, the data structures managed by that shadow hash are
searched for that key in the order they were added. This allows the rest
of a program simple and convenient access to a disparate set of data
sources.

Tie::ShadowHash can handle anything that looks like a hash; just give it a
reference as one of the additional arguments to tie(). This includes
other tied hashes, so you can include DB and DBM files as data sources for
a shadow hash. If given a plain file name instead of a reference, it will
build a hash to use internally, with each chomped line of the file being
the key and the number of times that line is seen in the file being the
value.
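
As a minimal sketch of the plain file name case, the following writes a
small temporary file and ties a shadow hash to it. The file contents and
the %seen variable are made up for illustration, and File::Temp is used
only to keep the sketch self-contained.

    use File::Temp qw(tempfile);
    use Tie::ShadowHash;

    # Hypothetical data: three lines, one of them repeated.
    my ($fh, $file) = tempfile (UNLINK => 1);
    print {$fh} "apple\npear\napple\n";
    close $fh;

    # A plain file name becomes a hash of line => occurrence count.
    tie (my %seen, 'Tie::ShadowHash', $file);
    print "apple seen $seen{apple} times\n";
    print "pear seen $seen{pear} times\n";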

Tie::ShadowHash also supports special tagged data sources that can take
options specifying their behavior. The only tagged data source currently
supported is C<text>, which takes a file name of a text file and a
reference to a sub. The sub is called for every line of the file, with
that line as an argument, and is expected to return a list. The first
element of the list will be the key, and the second and subsequent
elements will be the value or values. If there is more than one value,
the value stored in the hash and associated with that key is an anonymous
array containing all of them.

Tagged data sources are distinguished from normal data sources by passing
them to tie() (or to add() -- see below) as an anonymous array. The first
element is the data source tag and the remaining elements are arguments
for that data source. For a text data source, see the usage summary above
for examples.
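
The following sketch passes a C<text> source directly to tie() and shows
both shapes a stored value can take. The file contents, the %config
variable, and the keys are hypothetical and chosen only for illustration.

    use File::Temp qw(tempfile);
    use Tie::ShadowHash;

    # Hypothetical data: one line with two words, one with three.
    my ($fh, $file) = tempfile (UNLINK => 1);
    print {$fh} "port 8080\nhost web01 10.0.0.1\n";
    close $fh;

    # The sub receives each chomped line and returns the key followed
    # by one or more values.
    my $split = sub { split (' ', $_[0]) };
    tie (my %config, 'Tie::ShadowHash', [text => $file, $split]);

    # One value is stored as a plain scalar.
    print "port: $config{port}\n";

    # Several values are stored as an anonymous array.
    print "host: @{ $config{host} }\n";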

The shadow hash can be modified, and the modifications override the data
sources, but modifications aren't propagated back to the data sources. In
other words, the shadow hash treats all data sources as read-only and
saves your modifications only in internal memory. This lets you make
changes to the shadow hash for the rest of your program without affecting
the underlying data in any way (and this behavior is the main reason why
this is called a shadow hash).
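
A short sketch of this read-only treatment, using an ordinary hash as the
data source. The %source and %shadow names and their contents are
hypothetical.

    use Tie::ShadowHash;

    my %source = (color => 'red', size => 'large');
    tie (my %shadow, 'Tie::ShadowHash', \%source);

    # Stores and deletes are recorded inside the shadow hash only.
    $shadow{color} = 'blue';
    delete $shadow{size};

    print "shadow color: $shadow{color}\n";
    print "source color: $source{color}\n";
    print "size is hidden in the shadow hash\n" unless exists $shadow{size};
    print "size is still in the source\n" if exists $source{size};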

If the shadow hash is cleared, whether by assigning the empty list to it,
by explicitly calling CLEAR(), or by some other method, all data sources
are dropped from the shadow hash. There is no other way of removing a
data source from a shadow hash after it's been added. (You can, of
course, always untie the shadow hash and, if you saved the underlying
object, dispose of that object as well to destroy the shadow hash
completely.)
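
As an illustration of clearing, the sketch below empties a shadow hash and
then attaches the same source again through the saved object. The
variable names and data are hypothetical.

    use Tie::ShadowHash;

    my %source = (key => 'value');
    my $obj = tie (my %shadow, 'Tie::ShadowHash', \%source);

    # Assigning the empty list calls CLEAR, which drops every source
    # along with any overrides and deletions.
    %shadow = ();
    my @after_clear = keys %shadow;
    print "keys after clear: ", scalar (@after_clear), "\n";

    # The source itself is untouched and can be added again.
    $obj->add (\%source);
    print "restored: $shadow{key}\n";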

=head1 INSTANCE METHODS

=over 4

=item add(SOURCE [, SOURCE ...])

Adds the given sources to an existing shadow hash. This method can be
called on the object returned by the initial tie() call. It takes the
same arguments as the initial tie() and interprets them the same way.

=back
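
The following sketch shows how a source added later with add() is searched
after the sources that were already present. The hash names and contents
are hypothetical.

    use Tie::ShadowHash;

    my %first  = (name => 'from first');
    my %second = (name => 'from second', extra => 'only in second');

    my $obj = tie (my %shadow, 'Tie::ShadowHash', \%first);
    $obj->add (\%second);

    # %first was added earlier, so its value for "name" wins.
    print "name:  $shadow{name}\n";
    print "extra: $shadow{extra}\n";

    # Each key is reported once even though "name" is in both sources.
    my @keys = sort keys %shadow;
    print "keys:  @keys\n";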

=head1 DIAGNOSTICS

=over 4

=item Can't open file %s: %s

Tie::ShadowHash was given a file name to use as a source, but when it
tried to open that file, the open failed with that system error message.

=item Invalid source type %s

Tie::ShadowHash was given a tagged data source of an unknown type. The
only currently supported tagged data source is C<text>.

=back

=head1 CAVEATS

It's worth paying very careful attention to L<perltie/"The untie Gotcha">
when using this module. It's also important to be careful about what you
do with tied hashes that are included in a shadow hash. Tie::ShadowHash
stores a reference to such hashes; if you untie them out from under a
shadow hash, you may not get the results you expect. Remember that if you
put something in a shadow hash, you'll need to clean out the shadow hash
as well as everything else that references a variable if you want to free
it completely.

Not all tied hashes implement EXISTS; in particular, ODBM_File, NDBM_File,
and some old versions of GDBM_File don't, and therefore AnyDBM_File
doesn't either. Calling exists on a shadow hash that includes one of
those tied hashes as a data source may therefore result in an exception.
For this reason, Tie::ShadowHash calls exists on its data sources only
when implementing the EXISTS method.
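
One way to guard against that exception is to trap it with eval and fall
back to a defined test. This is only a sketch: NoExists is a hypothetical
stand-in for a tied class without an EXISTS method, not a real module, and
the fallback shares the undefined-value limitation of ordinary lookups.

    use Tie::ShadowHash;

    # Hypothetical tied class that has no EXISTS method.
    package NoExists;
    sub TIEHASH { my $class = shift; bless ({ key => 'value' }, $class) }
    sub FETCH   { my ($self, $key) = @_; return $self->{$key} }

    package main;
    tie (my %limited, 'NoExists');
    tie (my %shadow, 'Tie::ShadowHash', \%limited);

    # exists may die inside the data source, so trap the exception and
    # fall back to testing whether the fetched value is defined.
    my $found = eval { exists $shadow{key} };
    my $trapped = $@ ? 1 : 0;
    $found = defined $shadow{key} if $trapped;
    print "key is present\n" if $found;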

Because it can't use exists on its data sources due to the above problem,
Tie::ShadowHash cannot correctly distinguish between a non-existent key
and an existing key associated with an undefined value. This isn't a
large problem, since many tied hashes can't store undefined values anyway,
but it means that if one of your data sources contains a given key
associated with an undefined value and one of your later data sources
contains the same key but with a defined value, when the shadow hash is
accessed using that key, it will return the first defined value it finds.
This is an exception to the normal rule that all data sources are searched
in order and the value returned by an access is the first value found.
(Tie::ShadowHash does correctly handle undefined values stored directly in
the shadow hash.)
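
The sketch below illustrates both halves of this caveat with ordinary
hashes. The hash names and values are hypothetical.

    use Tie::ShadowHash;

    my %first  = (key => undef);
    my %second = (key => 'fallback');
    tie (my %shadow, 'Tie::ShadowHash', \%first, \%second);

    # The undefined value in %first is skipped, so the lookup falls
    # through to the defined value in %second.
    print "fetched: $shadow{key}\n";

    # An undefined value stored directly in the shadow hash is kept.
    $shadow{key} = undef;
    print "now undefined\n" unless defined $shadow{key};
    print "but still exists\n" if exists $shadow{key};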

=head1 AUTHOR

Russ Allbery <rra@stanford.edu>

=head1 COPYRIGHT AND LICENSE

Copyright 1999, 2002, 2010 by Russ Allbery <rra@stanford.edu>

This program is free software; you may redistribute it and/or modify it
under the same terms as Perl itself.

=head1 SEE ALSO

L<perltie>

The current version of this module is always available from its web site
at L<http://www.eyrie.org/~eagle/software/shadowhash/>.

=cut