PDL::Pod::Parser
Parser(r)      User Contributed Perl Documentation      Parser(r)



NAME
       PDL::Pod::Parser - base class for creating pod filters and
       translators

SYNOPSIS
           use PDL::Pod::Parser;
           package PDL::MyParser;
           @ISA = qw(PDL::Pod::Parser);

           sub new {
               ## constructor code ...
           }

           ## implementation of appropriate subclass methods ...

           package main;
           $parser = PDL::MyParser->new;
           @ARGV = ('-')  unless (@ARGV > 0);
           for (@ARGV) {
               $parser->parse_from_file($_);
           }


DESCRIPTION
       PDL::Pod::Parser is an abstract base class for implement-
       ing filters and/or translators to parse pod documentation
       into other formats. It handles most of the difficulty of
       parsing the pod sections in a file and leaves it to the
       subclasses to override various methods to provide the
       actual translation. The other thing that PDL::Pod::Parser
       provides is the ability to process only selected sections
       of pod documentation from the input.

       SECTION SPECIFICATIONS

       Certain methods and functions provided by PDL::Pod::Parser
       may be given one or more "section specifications" to
       restrict the text processed to only the desired set of
       sections and their corresponding subsections.  A section
       specification is a string containing one or more Perl-
       style regular expressions separated by forward slashes
       ("/").  If you need to use a forward slash literally
       within a section title you can escape it with a backslash
       ("\/").

       The formal syntax of a section specification is:

       head1-title-regexp/head2-title-regexp/...

       Any omitted or empty regular expressions will default to
       ".*".  Please note that each regular expression given is
       implicitly anchored by adding "^" and "$" to the beginning
       and end.  Also, if a given regular expression starts with
       a "!" character, then the expression is negated (so "!foo"
       would match anything except "foo").

       Some example section specifications follow.

       Match the "NAME" and "SYNOPSIS" sections and all of their
       subsections:
           "NAME|SYNOPSIS"

       Match only the "Question" and "Answer" subsections of the
       "DESCRIPTION" section:
           "DESCRIPTION/Question|Answer"

       Match the "Comments" subsection of all sections:
           "/Comments"

       Match all subsections of "DESCRIPTION" except for "Com-
       ments":
           "DESCRIPTION/!Comments"

       Match the "DESCRIPTION" section but do not match any of
       its subsections:
           "DESCRIPTION/!.+"

       Match all top level sections but none of their subsec-
       tions:
           "/!.+"
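
       The rules above (splitting on unescaped slashes, implicit
       anchoring, the ".*" default, and "!" negation) can be
       sketched in plain Perl. The helper names compile_spec()
       and spec_matches() are hypothetical and are not part of
       PDL::Pod::Parser; treating a missing title as the empty
       string is an assumption of this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDL::Pod::Parser): split a section
# specification on unescaped "/" and build one matcher per heading level.
sub compile_spec {
    my ($spec) = @_;
    my @levels;
    for my $part (split m{(?<!\\)/}, $spec, -1) {
        (my $re = $part) =~ s{\\/}{/}g;        # unescape literal slashes
        my $negate = ($re =~ s/^!//) ? 1 : 0;  # leading "!" negates the match
        $re = '.*' if $re eq '';               # empty field defaults to ".*"
        push @levels, { negate => $negate, re => qr/^(?:$re)$/ };
    }
    return @levels;
}

# True if the heading titles (head1 first) satisfy every level of the spec.
# Assumption: a title that is not supplied is treated as the empty string.
sub spec_matches {
    my ($spec, @titles) = @_;
    my @levels = compile_spec($spec);
    for my $i (0 .. $#levels) {
        my $title = defined $titles[$i] ? $titles[$i] : '';
        my $hit   = ($title =~ $levels[$i]{re}) ? 1 : 0;
        return 0 if $hit == $levels[$i]{negate};
    }
    return 1;
}

# prints "yes" then "no"
print spec_matches('DESCRIPTION/!Comments', 'DESCRIPTION', 'Usage')    ? "yes\n" : "no\n";
print spec_matches('DESCRIPTION/!Comments', 'DESCRIPTION', 'Comments') ? "yes\n" : "no\n";
```

       Grouping each expression as "^(?:...)$" keeps the anchors
       applied to the whole alternation, so "NAME|SYNOPSIS" does
       not match "NAMES".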

FUNCTIONS
       PDL::Pod::Parser provides the following functions (please
       note that these are functions and not methods; they do not
       take an object reference as an implicit first parameter):

       version()

       Return the current version of this package.

INSTANCE METHODS
       PDL::Pod::Parser provides several methods, some of which
       should be overridden by subclasses.  They are as follows:

       new()

       This is the constructor for the base class. You should
       only use it if you want to create an instance of a
       PDL::Pod::Parser instead of one of its subclasses. The
       constructor for this class and all of its subclasses
       should return a blessed reference to an associative array
       (hash).

       initialize()

       This method performs any necessary base class initializa-
       tion.  It takes no arguments (other than the object
       instance of course).  If subclasses override this method
       then they must be sure to invoke the superclass' initial-
       ize() method.

       select($section_spec1, $section_spec2, ...)

       This is the method that is used to select the particular
       sections and subsections of pod documentation that are to
       be printed and/or processed. If the first argument is the
       string "+", then the remaining section specifications are
       added to the current list of selections; otherwise the
       given section specifications will replace the current list
       of selections.

       Each of the $section_spec arguments should be a section
       specification as described in "SECTION SPECIFICATIONS".
       The section specifications are parsed by this method and
       the resulting regular expressions are stored in the array
       referenced by "$self->{SELECTED}" (please see the descrip-
       tion of this member variable in "INSTANCE DATA").

       This method should not normally be overridden by sub-
       classes.
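
       The difference between replacing and appending selections
       can be illustrated with a small stand-in. The function
       select_sections() below is hypothetical; it stores the
       raw specification strings, whereas the real select()
       method stores the parsed regular expressions.

```perl
use strict;
use warnings;

# Hypothetical stand-in for select(): it stores the raw specifications,
# whereas the real method stores the parsed regular expressions.
sub select_sections {
    my ($self, @specs) = @_;
    if (@specs && $specs[0] eq '+') {
        shift @specs;                          # "+" means append
        push @{ $self->{SELECTED} }, @specs;
    }
    else {
        $self->{SELECTED} = [@specs];          # otherwise replace
    }
    return $self;
}

my $parser = {};
select_sections($parser, 'NAME|SYNOPSIS');          # replaces the list
select_sections($parser, '+', 'DESCRIPTION/!.+');   # appends to the list
print scalar @{ $parser->{SELECTED} }, "\n";        # prints 2
```

       With a real parser object the equivalent calls would use
       the method form given in the signature above, for example
       $parser->select('+', 'DESCRIPTION/!.+').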

       want_section($head1_title, $head2_title, ...)

       Returns a value of true if the given section and subsec-
       tion titles match any of the section specifications passed
       to the select() method (or if no section specifications
       were given). Returns a value of false otherwise. If
       $headN_title is omitted then it defaults to the current
       "headN" section title in the input.

       This method should not normally be overridden by sub-
       classes.

       begin_input()

       This method is invoked by parse_from_filehandle() immedi-
       ately before processing input from a filehandle. The base
       class implementation does nothing but subclasses may
       override it to perform any per-file initializations.

       end_input()

       This method is invoked by parse_from_filehandle() immedi-
       ately after processing input from a filehandle. The base
       class implementation does nothing but subclasses may over-
       ride it to perform any per-file cleanup actions.

       preprocess_line($text)

       This method should be overridden by subclasses that wish
       to perform any kind of preprocessing for each line of
       input (before it has been determined whether or not it is
       part of a pod paragraph). The parameter $text is the input
       line and the value returned should correspond to the new
       text to use in its place. If the empty string or an
       undefined value is returned then no further processing will be
       performed for this line. If desired, this method can call
       the parse_paragraph() method directly with any prepro-
       cessed text and return an empty string (to indicate that
       no further processing is needed).

       Please note that the preprocess_line() method is invoked
       before the preprocess_paragraph() method. After all (pos-
       sibly preprocessed) lines in a paragraph have been assem-
       bled together and it has been determined that the para-
       graph is part of the pod documentation from one of the
       selected sections, then preprocess_paragraph() is invoked.

       The base class implementation of this method returns the
       given text.

       preprocess_paragraph($text)

       This method should be overridden by subclasses that wish
       to perform any kind of preprocessing for each block (para-
       graph) of pod documentation that appears in the input
       stream.  The parameter $text is the pod paragraph from the
       input file and the value returned should correspond to the
       new text to use in its place.  If the empty string is
       returned or an undefined value is returned, then the given
       $text is ignored (not processed).

       This method is invoked by parse_paragraph(). After it
       returns, parse_paragraph() examines the current cutting
       state (which is stored in "$self->{CUTTING}"). If it
       evaluates to true then input text (including the given
       $text) is cut (not processed) until the next pod directive
       is encountered.

       Please note that the preprocess_line() method is invoked
       before the preprocess_paragraph() method. After all (pos-
       sibly preprocessed) lines in a paragraph have been assem-
       bled together and it has been determined that the para-
       graph is part of the pod documentation from one of the
       selected sections, then preprocess_paragraph() is invoked.

       The base class implementation of this method returns the
       given text.

       parse_pragmas($cmd, $text, $sep)

       This method is called when an "=pod" directive is encoun-
       tered. When such a pod directive is seen in the input,
       this method is called and is passed the command name $cmd
       (which should be "pod") and the remainder of the text
       paragraph $text which appeared immediately after the com-
       mand name. If desired, the text which separated the "=pod"
       directive from its corresponding text may be found in
       $sep.  Each word in $text is examined to see if it is a
       pragma specification.  Pragma specifications are of the
       form "pragma_name=pragma_value".

       Unless the given object is an instance of the
       PDL::Pod::Parser class, the base class implementation of
       this method will invoke the pragma() method for each
       pragma specification in $text.  If and only if the given
       object is an instance of the PDL::Pod::Parser class, the
       base class version of this method will simply reproduce
       the "=pod" command exactly as it appeared in the input.

       Derived classes should not usually need to reimplement
       this method.
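
       As a rough illustration of how the words of $text might be
       examined, the following sketch extracts every word of the
       form "pragma_name=pragma_value". The helper name
       extract_pragmas() is hypothetical, and lowercasing the
       name here is an assumption based on the description of
       pragma().

```perl
use strict;
use warnings;

# Hypothetical helper: pull "name=value" words out of the text that
# follows an "=pod" directive. Words without "=" are ignored.
sub extract_pragmas {
    my ($text) = @_;
    my @pragmas;
    for my $word (split ' ', $text) {
        next unless $word =~ /^(\w+)=(.*)$/;
        push @pragmas, [ lc $1, $2 ];      # [ name, value ]
    }
    return @pragmas;
}

my @found = extract_pragmas("fill=off Indent=+4 plain words");
print join(', ', map { "$_->[0] => $_->[1]" } @found), "\n";
# prints: fill => off, indent => +4
```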

       pragma($pragma_name, $pragma_value)

       This method is invoked for each pragma encountered inside
       an "=pod" paragraph (see the description of the
       parse_pragmas() method). The pragma name is passed in
       $pragma_name (which should always be lowercase) and the
       corresponding value is $pragma_value.

       The base class implementation of this method does nothing.
       Derived class implementations of this method should be
       able to recognize at least the following pragmas and take
       any necessary actions when they are encountered:

       fill=value
           The argument value should be one of "on", "off", or
            "previous".  Specifies that "filling-mode" should be set
           to 1, 0, or its previous value (respectively). If
           value is omitted then the default is "on".  Derived
           classes may use this to decide whether or not to per-
           form any filling (wrapping) of subsequent text.

       style=value
           The argument value should be one of "bold", "italic",
           "code", "plain", or "previous". Specifies that the
           current default paragraph font should be set to
            "bold", "italic", "code", the empty string, or its
           previous value (respectively).  If value is omitted
           then the default is "plain".  Derived classes may use
           this to determine the default font style to use for
           subsequent text.

       indent=value
           The argument value should be an integer value (with an
           optional sign).  Specifies that the current indenta-
           tion level should be reset to the given value. If a
           plus (minus) sign precedes the number then the inden-
           tation level should be incremented (decremented) by
           the given number. If only a plus or minus sign is
           given (without a number) then the current indentation
           level is incremented or decremented by some default
           amount (to be determined by subclasses).

       The value returned will be 1 if the pragma name was
       recognized and 0 if it wasn't (in which case the pragma was
       ignored).

       Derived classes should override this method if they wish
       to implement any pragmas. The base class implementation of
       this method does nothing but it does contain some com-
       mented-out code which subclasses may want to make use of
       when implementing pragmas.
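
       A subclass handler for the "fill" and "indent" pragmas
       might look roughly like the sketch below ("style" would
       follow the same pattern as "fill"). The function name
       handle_pragma(), the fields FILL, PREV_FILL and INDENT,
       and the default step of 4 are all assumptions made for
       illustration; none of them are defined by
       PDL::Pod::Parser.

```perl
use strict;
use warnings;

my %FILL_VALUE   = (on => 1, off => 0);
my $DEFAULT_STEP = 4;    # assumed default indentation step

# Hypothetical pragma handler; returns 1 if the pragma was recognized
# and applied, 0 otherwise.
sub handle_pragma {
    my ($self, $name, $value) = @_;
    $value = '' unless defined $value;

    if ($name eq 'fill') {
        $value = 'on' if $value eq '';             # omitted value means "on"
        if ($value eq 'previous') {                # swap current and previous
            @$self{qw(FILL PREV_FILL)} = @$self{qw(PREV_FILL FILL)};
            return 1;
        }
        return 0 unless exists $FILL_VALUE{$value};
        $self->{PREV_FILL} = $self->{FILL};
        $self->{FILL}      = $FILL_VALUE{$value};
        return 1;
    }

    if ($name eq 'indent') {
        return 0 unless $value =~ /^([+-]?)(\d*)$/;
        my ($sign, $num) = ($1, $2);
        return 0 if $sign eq '' && $num eq '';     # nothing usable given
        my $step = $num eq '' ? $DEFAULT_STEP : $num;
        if    ($sign eq '+') { $self->{INDENT} += $step }
        elsif ($sign eq '-') { $self->{INDENT} -= $step }
        else                 { $self->{INDENT}  = $step }
        return 1;
    }

    return 0;    # unrecognized pragma: ignored
}

my $state = { FILL => 1, INDENT => 0 };
handle_pragma($state, 'indent', '8');    # absolute: INDENT becomes 8
handle_pragma($state, 'indent', '+');    # relative, default step: 12
handle_pragma($state, 'fill',   'off');  # FILL becomes 0
```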

       command($cmd, $text, $sep)

       This method should be overridden by subclasses to take the
       appropriate action when a pod command paragraph (denoted
       by a line beginning with "=") is encountered.  When such a
       pod directive is seen in the input, this method is called
       and is passed the command name $cmd and the remainder of
       the text paragraph $text which appears immediately after
       the command name. If desired, the text which separated the
       command from its corresponding text may be found in $sep.
       Note that this method is not called for "=pod" paragraphs.

       The base class implementation of this method simply prints
       the raw pod command to the output filehandle and then
       invokes the textblock() method, passing it the $text
       parameter.

       verbatim($text)

       This method may be overridden by subclasses to take the
       appropriate action when a block of verbatim text is
       encountered. It is passed the text block $text as a param-
       eter.

       The base class implementation of this method simply prints
       the textblock (unmodified) to the output filehandle.

       textblock($text)

       This method may be overridden by subclasses to take the
       appropriate action when a normal block of pod text is
       encountered (although the base class method will usually
       do what you want). It is passed the text block $text as a
       parameter.

       In order to process interior sequences, subclass
       implementations of this method will probably want to invoke
       the interpolate() method, passing it the text block $text
       as a parameter, and then perform any desired processing
       upon the returned result.

       The base class implementation of this method simply prints
       the text block as it occurred in the input stream.

       interior_sequence($seq_cmd, $seq_arg)

       This method should be overridden by subclasses to take the
       appropriate action when an interior sequence is encoun-
       tered. An interior sequence is an embedded command within
       a block of text which appears as a command name (usually a
       single uppercase character) followed immediately by a
       string of text which is enclosed in angle brackets. This
       method is passed the sequence command $seq_cmd and the
       corresponding text $seq_arg and is invoked by the interpo-
       late() method for each interior sequence that occurs in
       the string that it is passed.  It should return the
       desired text string to be used in place of the interior
       sequence.

       Subclass implementations of this method may wish to
       examine the array referenced by "$self->{SEQUENCES}" which
       is a stack of all the interior sequences that are cur-
       rently being processed (they may be nested). The current
       interior sequence (the one given by "$seq_cmd<$seq_arg>")
       should always be at the top of this stack.

       The base class implementation of the interior_sequence()
       method simply returns the raw text of the interior
       sequence (as it occurred in the input).

       interpolate($text, $end_re)

       This method will translate all text (including any embed-
       ded interior sequences) in the given text string $text and
       return the interpolated result.  If a second argument is
       given, then it is taken to be a regular expression that
       indicates when to quit interpolating the string.  Upon
       return, the $text parameter will have been modified to
       contain only the un-processed portion of the given string
       (which will not contain any text matched by $end_re).

       This method should probably not be overridden by sub-
       classes.  It should be noted that this method invokes
       itself recursively to handle any nested interior
       sequences.
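
       The recursive handling of nested interior sequences can be
       sketched as follows. This is a simplified stand-alone
       illustration and not the module's implementation: it takes
       a reference to the text and a nesting flag rather than an
       $end_re argument, and the bracketed output format produced
       by the stand-in interior_sequence() is invented for the
       example.

```perl
use strict;
use warnings;

my @sequences;    # stack of sequence commands currently being processed

# Stand-in for interior_sequence(): returns the replacement text.
sub interior_sequence {
    my ($seq_cmd, $seq_arg) = @_;
    return "[$seq_cmd:$seq_arg]";
}

# Consumes text from $$text_ref and returns the interpolated result.
# When $nested is true, stops after consuming the closing ">".
sub interpolate {
    my ($text_ref, $nested) = @_;
    my $out = '';
    while (length $$text_ref) {
        if ($$text_ref =~ s/^([A-Z])<//) {         # start of a sequence
            my $cmd = $1;
            push @sequences, $cmd;
            my $arg = interpolate($text_ref, 1);   # recurse for the argument
            $out .= interior_sequence($cmd, $arg);
            pop @sequences;
        }
        elsif ($nested && $$text_ref =~ s/^>//) {  # end of current sequence
            return $out;
        }
        else {                                     # ordinary character
            $$text_ref =~ s/^(.)//s;
            $out .= $1;
        }
    }
    return $out;
}

my $text   = 'see B<bold I<both>> now';
my $result = interpolate(\$text);
print "$result\n";    # prints: see [B:bold [I:both]] now
```

       As in the description above, the text passed in is
       consumed as it is processed, so $text is empty once the
       outermost call returns.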

       parse_paragraph($text)

       This method takes the text of a pod paragraph to be pro-
       cessed and invokes the appropriate method (one of com-
       mand(), verbatim(), or textblock()).

       This method does not usually need to be overridden by sub-
       classes.

       parse_from_filehandle($infilehandle, $outfilehandle)

       This method takes a glob to a filehandle (which is assumed
       to already be opened for reading) and reads the entire
       input stream looking for blocks (paragraphs) of pod docu-
       mentation to be processed. For each block of pod documen-
       tation encountered it will call the parse_paragraph()
       method.

       If a second argument is given then it should be a filehan-
       dle glob where output should be sent (otherwise the
       default output filehandle is "STDOUT"). If no first argu-
       ment is given the default input filehandle "STDIN" is
       used.

       The input filehandle that is currently in use is stored in
       the member variable whose key is "INPUT" (e.g.
       "$self->{INPUT}").

       The output filehandle that is currently in use is stored
       in the member variable whose key is "OUTPUT" (e.g.
       "$self->{OUTPUT}").

       Input is read line-by-line and assembled into paragraphs
       (which are separated by lines containing nothing but
       whitespace). The current line number is stored in the mem-
       ber variable whose key is "LINE" (e.g.  "$self->{LINE}")
       and the current paragraph number is stored in the member
       variable whose key is "PARAGRAPH" (e.g.  "$self->{PARA-
       GRAPH}").

       This method does not usually need to be overridden by sub-
       classes.
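
       The line-by-line assembly of paragraphs, together with the
       line and paragraph counters, can be sketched as below. The
       function read_paragraphs() is hypothetical and only
       collects the paragraphs; the real method would pass each
       one to parse_paragraph(). The sample input is invented.

```perl
use strict;
use warnings;

# Hypothetical reader: returns a list of [paragraph_number,
# first_line_number, text] entries, one per paragraph.
sub read_paragraphs {
    my ($fh) = @_;
    my %state = (LINE => 0, PARAGRAPH => 0);
    my (@paragraphs, $current, $start);
    while (my $line = <$fh>) {
        $state{LINE}++;
        if ($line =~ /^\s*$/) {            # whitespace-only line ends a paragraph
            next unless defined $current;
            push @paragraphs, [ ++$state{PARAGRAPH}, $start, $current ];
            undef $current;
        }
        else {
            $start = $state{LINE} unless defined $current;
            $current .= $line;
        }
    }
    # Flush a final paragraph that was not followed by a blank line
    push @paragraphs, [ ++$state{PARAGRAPH}, $start, $current ]
        if defined $current;
    return @paragraphs;
}

my $pod = "=head1 NAME\n\nDemo - example\nsecond line\n\n    verbatim\n";
open my $in, '<', \$pod or die "cannot open in-memory input: $!";
my @paras = read_paragraphs($in);
close $in;
printf "paragraph %d starts at line %d\n", @{$_}[0, 1] for @paras;
```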

       parse_from_file($filename, $outfile)

       This method takes a filename and does the following:

       o   opens the input file for reading and the output file
           for writing (creating the appropriate filehandles)

       o   invokes the parse_from_filehandle() method passing it
           the corresponding input and output filehandles.

       o   closes the input and output files.

       If the special input filename "-" or "<&STDIN" is given
       then the STDIN filehandle is used for input (and no open
       or close is performed).  If no input filename is specified
       then "-" is implied.  If a reference is passed instead of
       a filename then it is assumed to be a glob-style reference
       to a filehandle.

       If a second argument is given then it should be the name
       of the desired output file.  If the special output file-
       name "-" or ">&STDOUT" is given then the STDOUT filehandle
       is used for output (and no open or close is performed). If
       the special output filename ">&STDERR" is given then the
       STDERR filehandle is used for output (and no open or close
       is performed).  If no output filename is specified then
       "-" is implied.  If a reference is passed instead of a
       filename then it is assumed to be a glob-style reference
       to a filehandle.

       The name of the input file that is currently being read is
       stored in the member variable whose key is "INFILE" (e.g.
       "$self->{INFILE}").

       The name of the output file that is currently being writ-
       ten is stored in the member variable whose key is "OUT-
       FILE" (e.g.  "$self->{OUTFILE}").

       This method does not usually need to be overridden by sub-
       classes.
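
       The rules for the output argument can be summarized in a
       short sketch. The helper resolve_output() is hypothetical;
       it returns the filehandle to use together with a flag
       saying whether the helper opened it (and so whether it
       should be closed afterwards).

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the rules for the output argument.
# Returns (filehandle, opened_here).
sub resolve_output {
    my ($outfile) = @_;
    $outfile = '-' unless defined $outfile && length $outfile;
    return ($outfile, 0) if ref $outfile;              # already a filehandle
    return (\*STDOUT, 0) if $outfile eq '-' || $outfile eq '>&STDOUT';
    return (\*STDERR, 0) if $outfile eq '>&STDERR';
    open my $fh, '>', $outfile
        or die "Can't open $outfile for writing: $!";
    return ($fh, 1);                                   # caller closes this one
}

my ($out, $opened) = resolve_output('-');
print {$out} "written to standard output\n";
close $out if $opened;
```

       The input argument follows the same pattern with STDIN,
       "-" and "<&STDIN" in place of the output names.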

INSTANCE DATA
       PDL::Pod::Parser uses the following data members for each
       of its instances (where $self is a reference to such an
       instance):

       $self->{INPUT}

       The current input filehandle.

       $self->{OUTPUT}

       The current output filehandle.

       $self->{INFILE}

       The name of the current input file.

       $self->{OUTFILE}

       The name of the current output file.

       $self->{LINE}

       The current line number from the input stream.

       $self->{PARAGRAPH}

       The current paragraph number from the input stream (which
       includes input paragraphs that are not part of the pod
       documentation).

       $self->{HEADINGS}

       A reference to an array of the current section heading
       titles for each heading level (note that the first heading
       level title is at index 0).

       $self->{SELECTED}

       A reference to an array of references to arrays. Each sub-
       array is a list of anchored regular expressions (preceded
       by a "!" if the regexp is to be negated). The index of the
       expression in the subarray should correspond to the index
       of the heading title in $self->{HEADINGS} that it is to be
       matched against.

       $self->{CUTTING}

       A boolean-valued scalar which evaluates to true if text
       from the input file is currently being "cut".

       $self->{SEQUENCES}

       An array reference to the stack of interior sequence com-
       mands that are currently in the middle of being processed.

NOTES
       To create a pod translator to translate pod documentation
       to some other format, you usually only need to create a
       subclass of PDL::Pod::Parser which overrides the base
       class implementation for the following methods:

       o   pragma()

       o   command()

       o   verbatim()

       o   textblock()

       o   interior_sequence()

       You may also want to implement the begin_input() and
       end_input() methods for your subclass (to perform any
       needed per-file initialization or cleanup).

       If you need to perform any preprocessing of input before
       it is parsed you may want to implement one or both of the
       preprocess_line() and preprocess_paragraph() methods.

       Also, don't forget to make sure your subclass constructor
       invokes the base class' initialize() method.

       Sometimes it may be necessary to make more than one pass
       over the input files. This isn't a problem as long as none
       of the input files correspond to "STDIN". You can override
       either the parse_from_filehandle() method or the
       parse_from_file() method to make the first pass yourself
       to collect all the information you need and then invoke
       the base class method to do the rest of the standard pro-
       cessing.

       Feel free to add any member data fields you need to keep
       track of things like current font, indentation, horizontal
       or vertical position, or whatever else you like.

       For the most part, the PDL::Pod::Parser base class should
       be able to do most of the input parsing for you and leave
       you free to worry about how to interpret the commands and
       translate the result.

AUTHOR
       Brad Appleton <Brad_Appleton-GBDA001@email.mot.com>

       Based on code for Pod::Text written by Tom Christiansen
       <tchrist@mox.perl.com>



perl v5.6.1                 1999-12-09                  Parser(r)