/usr/share/perl5/MediaWiki/DumpFile/FastPages.pm is in libmediawiki-dumpfile-perl 0.2.2-1.

This file is owned by root:root, with mode 0o644.

The actual contents of the file can be viewed below.

#!/usr/bin/env perl

package MediaWiki::DumpFile::FastPages;

our $VERSION = '0.2.0';

use strict;
use warnings;

use base qw(MediaWiki::DumpFile::Pages);
use Carp qw(croak);

# Construct a parser in fast mode; $input is a filename or an open file handle.
sub new {
	my ($class, $input) = @_;

	if (! defined($input)) {
		croak "you must provide either a filename or an already open file handle";
	}

	my $self = $class->SUPER::new(input => $input, fast_mode => 1);
	bless($self, $class);

	return $self;
}

# Return the next (title, text) pair, or an empty list at the end of the dump.
sub next {
	my ($self) = @_;

	return $self->_fast_next;
}

1;

__END__

=head1 NAME

MediaWiki::DumpFile::FastPages - Fastest way to parse a page dump file

=head1 SYNOPSIS

  use MediaWiki::DumpFile::FastPages;
  
  $pages = MediaWiki::DumpFile::FastPages->new($file);
  $pages = MediaWiki::DumpFile::FastPages->new(\*FH);
  
  while(($title, $text) = $pages->next) {
    print "Title: $title\n";
    print "Text: $text\n";
  }
 
=head1 ABOUT

This is a subclass of MediaWiki::DumpFile::Pages that configures the parser
to run in fast mode and replaces the iterator with one that returns the title
and text directly. Skipping the duck-typed MediaWiki::DumpFile::Pages::Page
object that fast mode normally returns gives a slight processing speed boost.

See the MediaWiki::DumpFile::Pages documentation for information about fast mode. 

=head1 METHODS

All of the methods of MediaWiki::DumpFile::Pages are also available on this
subclass.

=head2 new

This is the constructor for this package. It is called with a single parameter: the location of
a MediaWiki pages dump file or a reference to an already open file handle. It croaks if the
parameter is not defined.

=head2 next

Returns a two element list where the first element is the article title and the second element
is the article text. Returns an empty list when there are no more pages available.
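
The empty-list return is what ends the loop shown in the SYNOPSIS: a list
assignment used as a loop condition evaluates to the number of elements on its
right-hand side, so the loop keeps going even for a page whose title is "0" or
whose text is empty. The following is a minimal sketch rather than part of the
module: it assumes the dump file path is passed as the first command line
argument, and the names $file, $count and $chars are hypothetical.

  use strict;
  use warnings;
  use MediaWiki::DumpFile::FastPages;

  # Assumption: the path to a pages dump file is given on the command line.
  my $file = shift @ARGV;
  my $pages = MediaWiki::DumpFile::FastPages->new($file);

  my ($count, $chars) = (0, 0);

  # next() returns (title, text), or an empty list once the dump is exhausted.
  while (my ($title, $text) = $pages->next) {
    $count++;
    # Guard against an undefined text value before measuring it.
    $chars += length(defined $text ? $text : '');
  }

  print "$count pages, $chars characters of text\n";

Declaring $title and $text with my inside the loop condition keeps them scoped
to the loop body and lets the example run under strict.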

=head1 AUTHOR

Tyler Riddle, C<< <triddle at gmail.com> >>

=head1 BUGS

Please see MediaWiki::DumpFile for information on how to report bugs in 
this software. 

=head1 HISTORY

This package started life as a very limited hack that used only
XML::LibXML::Reader and sought directly to the text and title nodes in the
document. Implementing a parser for the full document was a daunting task, so
this package was left as it was in the hope that other people might find it useful.

Because XML::TreePuller can expose the underlying XML::LibXML::Reader
object and sync itself back up after the cursor was moved out from
underneath it, I was able to integrate the logic from this package
into the main ::Pages parser. 

=head1 COPYRIGHT & LICENSE

Copyright 2009 "Tyler Riddle".

This program is free software; you can redistribute it and/or modify it
under the terms of either: the GNU General Public License as published
by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.