PDL::PP
PP(P)          User Contributed Perl Documentation          PP(P)



NAME
       PDL::PP - Generate PDL routines from concise descriptions

SYNOPSIS
       e.g.

               pp_def(
                       'sumover',
                       Pars => 'a(n); [o]b();',
                       Code => 'double tmp=0;
                               loop(n) %{ tmp += $a(); %}
                               $b() = tmp;
                               '
               );

               pp_done();


DESCRIPTION
       In much of what follows we will assume familiarity of the
       reader with the concepts of implicit and explicit thread-
       ing and index manipulations within PDL. If you have not
       yet heard of these concepts or are not very comfortable
       with them it is time to check PDL::Indexing.

       As you may appreciate from its name PDL::PP is a Pre-Pro-
       cessor, i.e.  it expands code via substitutions to make
       real C-code (well, actually it outputs XS code (See per-
       lxs) but that is very close to C).

Overview
       Why do we need PP? Several reasons: firstly, we want to be
       able to generate subroutine code for each of the PDL
       datatypes (PDL_Byte, PDL_Short, etc.) AUTOMATICALLY.
       Secondly, when referring to slices of PDL arrays in Perl
       (e.g. "$a->slice('0:10:2,:')" or other things such as
       transposes) it is nice to be able to do this transparently
       and 'in-place', i.e. without having to make a memory copy
       of the section. PP handles all the necessary element and
       offset arithmetic for you. PP also gives you threading
       (repeated calling of the same routine for multiple slices,
       see PDL::Indexing) and dataflow (see PDL::Dataflow).

       So how do you use PP? Well for the most part you just
       write ordinary C code except for special PP constructs
       which take the form:

          $something(something else)

       or:

          PPfunction %{
            <stuff>
          %}

       The most important PP construct is the form "$array()".
       Consider the very simple PP function to sum the elements
       of a 1D vector (in fact this is very similar to the actual
       code used by 'sumover'):


          pp_def('sumit',
                  Pars => 'a(n);  [o]b();',
                  Code => '
                       double tmp;
                       tmp = 0;
                       loop(n) %{
                         tmp += $a();
                       %}
                       $b() = tmp;
          ');

       What's going on? The "Pars =>" line is very important for
       PP - it specifies all the arguments and their
       dimensionality. We call this the signature of the PP
       function (compare also the explanations in PDL::Indexing).
       In this case the routine takes a 1-D vector as input and
       returns a 0-D scalar as output.  The "$a()" PP construct is
       used to access elements of the array a(n) for you - PP
       fills in all the required C code.

       [Aside: since PP uses "$var()" for its parsing you must
       single-quote all Code=> arguments since you don't want
       perl to interpolate "$var()" into another string - i.e.
       don't use "" unless you know what you are doing! Tjl: it's
       usually easiest to use single quotes and concatenation:
       'something'.$interpolatable.'somethingelse']

       In the simple case here where all elements are accessed
       the PP construct "loop(n) %{ ... %}" is used to loop over
       all elements in dimension "n".  Note this feature of PP:
       ALL DIMENSIONS ARE SPECIFIED BY NAME.

       This is made clearer if we avoid the PP loop() construct
       and write the loop explicitly using conventional C:

          pp_def('sumit',
                  Pars => 'a(n);  [o]b();',
                  Code => '
                       int i,n_size;
                       double tmp;
                       n_size = $SIZE(n);
                       tmp = 0;
                       for(i=0; i<n_size; i++) {
                         tmp += $a(n=>i);
                       }
                       $b() = tmp;
          ');

       which does the same as before, except more long-windedly.
       You can see to get element "i" of a() we say "$a(n=>i)" -
       we are specifying the dimension by name "n". In 2D we
       might say:

          Pars=>'a(m,n);',
             ...
             tmp += $a(m=>i,n=>j);
             ...

       The syntax 'm=>i' borrows from Perl hashes (which are in
       fact used in the implementation of PP). One could also say
       "$a(n=>j,m=>i)" as order is not important.

       You can also see in the above example the use of another
       PP construct - $SIZE(n) to get the length of the dimension
       "n".

       Note, however, that you shouldn't write an explicit C loop
       when you could have used the PP "loop" construct: PDL::PP
       automatically checks the loop limits for you, and using
       "loop" makes the code more concise. But there are
       certainly situations where you need explicit control of
       the loop, and now you know how to do it ;).

       To revisit 'Why PP?' - the above code for sumit() will be
       generated for each data-type. It will operate on slices of
       arrays 'in-place'. It will thread automatically - e.g. if
       a 2D array is given it will be called repeatedly for each
       1D row (again check PDL::Indexing for the details of
       threading).  And then b() will be a 1D array of sums of
       each row.  We could call it with $a->xchg(0,1) to sum the
       columns instead.  And Dataflow tracing etc. will be
       available.
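
       To make the last paragraph concrete, here is a small
       self-contained C sketch of what threading over a 2D array
       amounts to. It is NOT the code PP emits; the function
       names (sumit_kernel, sumit_threaded) and the increment
       parameters are hypothetical and only illustrate the idea
       that each dimension is walked with its own increment.

```c
#include <assert.h>
#include <stddef.h>

/* Kernel for the signature a(n); [o]b(): sums n_size elements
 * that lie n_inc elements apart in memory. */
double sumit_kernel(const double *a, ptrdiff_t n_inc, size_t n_size)
{
    double tmp = 0;
    for (size_t i = 0; i < n_size; i++)
        tmp += a[(ptrdiff_t)i * n_inc];
    return tmp;
}

/* Hypothetical thread loop: call the kernel once per slice along the
 * extra (thread) dimension and store one result per call in b. */
void sumit_threaded(const double *a,
                    ptrdiff_t n_inc, size_t n_size,
                    ptrdiff_t t_inc, size_t t_size,
                    double *b)
{
    assert(a != NULL && b != NULL);
    for (size_t t = 0; t < t_size; t++)
        b[t] = sumit_kernel(a + (ptrdiff_t)t * t_inc, n_inc, n_size);
}
```

       With a 3x2 array stored row after row, the increments
       (1, 3) give the row sums; swapping them to (3, 1), which
       is roughly what $a->xchg(0,1) does to a view, gives the
       column sums without copying any data.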

       You can see PP saves the programmer from writing a lot of
       needlessly repetitive C-code -- in our opinion this is one
       of the best features of PDL making writing new C subrou-
       tines for PDL an amazingly concise exercise. A second rea-
       son is the ability to make PP expand your concise code
       definitions into different C code based on the needs of
       the computer architecture in question. Imagine for example
       you are lucky enough to have a supercomputer at your
       disposal; in that case you certainly want PDL::PP to
       generate code that takes advantage of the
       vectorising/parallel computing features of your machine
       (this is a project for the future). In any case, the bottom
       line is that your unchanged code should still expand to
       working XS code even if the internals of PDL change.

       Also, because you are generating the code in an actual
       Perl script, there are many fun things that you can do.
       Let's say that you need to write both sumit (as above) and
       multit. With a little ingenuity, we can do

          for({Name => 'sumit', Init => '0', Op => '+='},
              {Name => 'multit', Init => '1', Op => '*='}) {
                  pp_def($_->{Name},
                          Pars => 'a(n);  [o]b();',
                          Code => '
                               double tmp;
                               tmp = '.$_->{Init}.';
                               loop(n) %{
                                 tmp '.$_->{Op}.' $a();
                               %}
                               $b() = tmp;
                  ');
          }

       which defines both the functions easily. Now, if you later
       need to change the signature or dimensionality or what-
       ever, you only need to change one place in your code.
       Yeah, sure, your editor does have 'cut and paste' and
       'search and replace' but it's still less bothersome and
       definitely more difficult to forget just one place and
       have strange bugs creep in.  Also, adding 'orit' (bitwise
       or) later is a one-liner.

       And remember, you really have perl's full abilities with
       you - you can very easily read any input file and make
       routines from the information in that file. For simple
       cases like the above, the author (Tjl) currently favors
       the hash syntax like the above - it doesn't take many more
       characters than the corresponding array syntax but is much
       easier to understand and change.

       We should mention here also the ability to get the pointer
       to the beginning of the data in memory - a prerequisite
       for interfacing PDL to some libraries. This is handled
       with the "$P(var)" directive, see below.

       So, after this quick overview of the general flavour of
       programming PDL routines using PDL::PP let's summarise in
       which circumstances you should actually use this prepro-
       cessor/precompiler. You should use PDL::PP if you want to

       o  interface PDL to some external library

       o  write some algorithm that would be slow if coded in
          perl (this is not as often as you think; take a look at
          threading and dataflow first).

       o  be a PDL developer (and even then it's not obligatory)

WARNING
       Because of its architecture, PDL::PP can be both flexible
       and easy to use (yet exuberantly complicated) at the same
       time. Currently, part of the problem is that error mes-
       sages are not very informative and if something goes
       wrong, you'd better know what you are doing and be able to
       hack your way through the internals (or be able to figure
       out by trial and error what is wrong with your args to
       "pp_def").

       An alternative, of course, is to ask someone about it
       (e.g., through the mailing lists).

ABANDON ALL HOPE, YE WHO ENTER HERE (DESCRIPTION)
       Now that you have some idea how to use "pp_def" to define
       new PDL functions it is time to explain the general syntax
       of "pp_def".  "pp_def" takes as arguments first the name
       of the function you are defining and then a hash list that
       can contain various keys.

       Based on these keys PP generates XS code and a .pm file.
       The function "pp_done" (see example in the SYNOPSIS) is
       used to tell PDL::PP that there are no more definitions in
       this file and it is time to generate the .xs and .pm
       files.

       As a consequence, there may be several pp_def() calls
       inside a file (by convention files with PP code have the
       extension .pd or .pp) but generally only one pp_done().

       There are two main different types of usage of pp_def(),
       the 'data operation' and 'slice operation' prototypes.

       The 'data operation' is used to take some data, mangle it
       and output some other data; this includes for example the
       '+' operation, matrix inverse, sumover etc and all the
       examples we have talked about in this document so far.
       Implicit and explicit threading and the creation of the
       result are taken care of automatically in those
       operations. You can even do dataflow with "sumit",
       "sumover", etc (don't be dismayed if you don't understand
       the concept of dataflow in PDL very well yet; it is still
       very much experimental).

       The 'slice operation' is a different kind of operation: in
       a slice operation, you are not changing any data, you are
       defining correspondences between different elements of two
       piddles (examples include the index manipulation/slicing
       function definitions in the file slices.pd that is part of
       the PDL distribution; but beware, this is not introductory
       level stuff).

       If PDL was compiled with support for bad values (ie
       "WITH_BADVAL => 1"), then additional keys are required for
       "pp_def", as explained below.

       If you are just interested in communicating with some
       external library (for example some linear algebra/matrix
       library), you'll usually want the 'data operation' so we
       are going to discuss that first.

Data operation
       A simple example

       In the data operation, you must know what dimensions of
       data you need. First, an example with scalars:

               pp_def('add',
                       Pars => 'a(); b(); [o]c();',
                       Code => '$c() = $a() + $b();'
               );

       That looks a little strange but let's dissect it. The
       first line is easy: we're defining a routine with the name
       'add'.  The second line simply declares our parameters and
       the parentheses mean that they are scalars. We call the
       string that defines our parameters and their dimensional-
       ity the signature of that function. For its relevance with
       regard to threading and index manipulations check the
       PDL::Indexing manpage.

       The third line is the actual operation. You need to use
       the dollar signs and parentheses to refer to your parame-
       ters (this will probably change at some point in the
       future, once a good syntax is found).

       These lines are all that is necessary to actually define
       the function for PDL (well, actually it isn't; you
       additionally need to write a Makefile.PL (see below) and
       build the module (something like 'perl Makefile.PL; make');
       but let's ignore that for the moment). So now you can do

               use MyModule;
               $a = pdl 2,3,4;
               $b = pdl 5;

               $c = add($a,$b);
               # or
               add($a,$b,($c=null)); # Alternative form, useful if $c has been
                                     # preset to something big, not useful here.

       and have threading work correctly (the result is $c == [7
       8 9]).
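
       Why does a one-element $b combine with a three-element $a?
       The C sketch below shows one way to picture it, under the
       assumption that a scalar is threaded over by giving it an
       increment of 0 so that the same element is reused in every
       iteration. The names add_kernel and add_threaded are
       hypothetical; this is an illustration, not PP output.

```c
#include <assert.h>
#include <stddef.h>

/* Kernel for the signature a(); b(); [o]c(): one scalar addition,
 * i.e. the C meaning of '$c() = $a() + $b();'. */
void add_kernel(const double *a, const double *b, double *c)
{
    *c = *a + *b;
}

/* Hypothetical thread loop.  Each argument advances by its own
 * increment; an increment of 0 keeps pointing at the same element,
 * which is how the scalar b is shared by all iterations here. */
void add_threaded(const double *a, ptrdiff_t a_inc,
                  const double *b, ptrdiff_t b_inc,
                  double *c, size_t n_threads)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t t = 0; t < n_threads; t++)
        add_kernel(a + (ptrdiff_t)t * a_inc,
                   b + (ptrdiff_t)t * b_inc,
                   c + t);
}
```

       Calling add_threaded with a = {2,3,4} (increment 1) and
       b = {5} (increment 0) fills c with 7, 8 and 9, matching
       the Perl session above.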

       The Pars section: the signature of a PP function

       Seeing the above example code you will most probably ask:
       what is this strange "$c=null" syntax in the second call
       to our new "add" function? If you take another look at the
       definition of "add" you will notice that the third argu-
       ment "c" is flagged with the qualifier "[o]" which tells
       PDL::PP that this is an output argument. So the above call
       to add means 'create a new $c from scratch with correct
       dimensions' - "null" is a special token for 'empty piddle'
       (you might ask why we haven't used the value "undef" to
       flag this instead of the PDL specific "null"; we are cur-
       rently thinking about it ;).

       [This should be explained in some other section of the
       manual as well!!]  The reason for having this syntax as an
       alternative is that if you have really huge piddles, you
       can do

               $c = PDL->null;
               for(some long loop) {
                       # munge a,b
                       add($a,$b,$c);
                       # munge c, put something back to a,b
               }

       and avoid allocating and deallocating $c each time. It is
       allocated once at the first add() and thereafter the mem-
       ory stays until $c is destroyed.

       If you just say

         $c =  add($a,$b);

       the code generated by PP will automatically fill in
       "$c=null" and return the result. If you want to learn more
       about the reasons why PDL::PP supports this style where
       output arguments are given as last arguments check the
       PDL::Indexing manpage.

       "[o]" is not the only qualifier a pdl argument can have in
       the signature.  Another important qualifier is the "[t]"
       option which flags a pdl as temporary.  What does that
       mean? You tell PDL::PP that this pdl is only used for tem-
       porary results in the course of the calculation and you
       are not interested in its value after the computation has
       been completed. But why should PDL::PP want to know about
       this in the first place?  The reason is closely related to
       the concepts of pdl auto creation (you heard about that
       above) and implicit threading. If you use implicit thread-
       ing the dimensionality of automatically created pdls is
       actually larger than that specified in the signature. Pdls
       flagged with "[o]" will be created so that they have the
       additional dimensions as required by the number of
       implicit thread dimensions. When creating a temporary pdl,
       however, it will always only be made big enough so that it
       can hold the result for one iteration in a threadloop,
       i.e. as large as required by the signature.  So less mem-
       ory is wasted when you flag a pdl as temporary. Secondly,
       you can use output auto creation with temporary pdls even
       when you are using explicit threading which is forbidden
       for normal output pdls flagged with "[o]" (see PDL::Index-
       ing).
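
       The difference in size between an "[o]" pdl and a "[t]"
       pdl can be pictured with the following C sketch of a
       routine in the spirit of 'a(n); [t] tmp(n); [o] b()'. The
       routine (a row-wise median, chosen only because it needs
       scratch space) and all names in it are hypothetical: the
       output has one element per thread iteration, whereas the
       temporary holds a single row and is reused every time.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort: ascending order of doubles. */
static int cmp_double(const void *x, const void *y)
{
    double a = *(const double *)x, b = *(const double *)y;
    return (a > b) - (a < b);
}

/* a:   t_size rows of n_size contiguous elements each (input)
 * tmp: n_size elements of scratch space, shared by all rows
 * b:   t_size results, one per row (output)                    */
void median_rows(const double *a, size_t n_size, size_t t_size,
                 double *tmp, double *b)
{
    assert(n_size > 0);
    for (size_t t = 0; t < t_size; t++) {
        /* Work on a copy so the input row is left untouched. */
        memcpy(tmp, a + t * n_size, n_size * sizeof *tmp);
        qsort(tmp, n_size, sizeof *tmp, cmp_double);
        b[t] = tmp[n_size / 2];  /* upper median when n_size is even */
    }
}
```

       However many rows are processed, tmp never needs more
       than n_size elements, which is the memory saving
       described above.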

       Here is an example where we use the [t] qualifier. We
       define the function "callf" that calls a C routine "f"
       which needs a temporary array of the same size and type as
       the array "a" (sorry about the forward reference for $P;
       it's a pointer access, see below) :




         pp_def('callf',
               Pars => 'a(n); [t] tmp(n); [o] b()',
               Code => 'int ns = $SIZE(n);
                        f($P(a),$P(b),$P(tmp),ns);
                       '
         );


       Argument dimensions and the signature

       Now we have just talked about dimensions of pdls and the
       signature. How are they related? Let's say that we want to
       add a scalar + the index number to a vector:

               pp_def('add2',
                       Pars => 'a(n); b(); [o]c(n);',
                       Code => 'loop(n) %{
                                       $c() = $a() + $b() + n;
                                %}'
               );

       There are several points to notice here: first, the "Pars"
       argument now contains the n arguments to show that we have
       a single dimension in a and c. It is important to note
       that dimensions are actual entities that are accessed by
       name so this declares a and c to have the same first
       dimensions. In most PP definitions the size of named
       dimensions will be set from the respective dimensions of
       non-output pdls (those with no "[o]" flag) but sometimes
       you might want to set the size of a named dimension
       explicitly through an integer parameter. See below in the
       description of the "OtherPars" section how that works.

       Type conversions and the signature

       The signature also determines the type conversions that
       will be performed when a PP function is invoked. So what
       happens when we invoke one of our previously defined func-
       tions with pdls of different type, e.g.

         add2($a,$b,($ret=null));

       where $a is of type "PDL_Float" and $b of type
       "PDL_Short"? With the signature as shown in the definition
       of "add2" above the datatype of the operation (as deter-
       mined at runtime) is that of the pdl with the 'highest'
       type (sequence is byte < short < ushort < long < float <
       double). In the add2 example the datatype of the operation
       is float ($a has that datatype). All pdl arguments are
       then type converted to that datatype (they are not con-
       verted inplace but a copy with the right type is created
       if a pdl argument doesn't have the type of the operation).
       Null pdls don't contribute a type in the determination of
       the type of the operation.  However, they will be created
       with the datatype of the operation; here, for example,
       $ret will be of type float. You should be aware of these
       rules when calling PP functions with pdls of different
       types to take the additional storage and runtime require-
       ments into account.
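
       The rule for picking the datatype of the operation can be
       written down in a few lines of C. This is a sketch of the
       rule as described above, not PDL's implementation: the
       enum and the function names are hypothetical, and the
       fallback to the lowest type when every argument is null
       is an arbitrary placeholder.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type codes, ordered as in the text:
 * byte < short < ushort < long < float < double.
 * T_NULL marks a null pdl, which contributes no type. */
typedef enum {
    T_NULL = -1,
    T_BYTE, T_SHORT, T_USHORT, T_LONG, T_FLOAT, T_DOUBLE
} pdl_type;

/* Operation type: the highest type among the non-null arguments. */
pdl_type operation_type(const pdl_type *args, size_t nargs)
{
    pdl_type op = T_BYTE;  /* placeholder if all arguments are null */
    assert(args != NULL || nargs == 0);
    for (size_t i = 0; i < nargs; i++)
        if (args[i] != T_NULL && args[i] > op)
            op = args[i];
    return op;
}

/* An existing argument of another type has to be copied into a
 * temporary of the operation type before the code runs. */
int needs_conversion(pdl_type arg, pdl_type op)
{
    return arg != T_NULL && arg != op;
}
```

       For the add2 call above the arguments are (float, short,
       null): the operation type is float, $b needs a converted
       copy, $a does not, and $ret is created as float.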

       These type conversions are correct for most functions you
       normally define with "pp_def". However, there are certain
       cases where slightly modified type conversion behaviour is
       desired. For these cases additional qualifiers in the sig-
       nature can be used to specify the desired properties with
       regard to type conversion. These qualifiers can be
       combined with those we have encountered already (the cre-
       ation qualifiers "[o]" and "[t]"). Let's go through the
       list of qualifiers that change type conversion behaviour.

       The most important is the "int" qualifier which comes in
       handy when a pdl argument represents indices into another
       pdl. Let's take a look at an example from "PDL::Ufunc":

          pp_def('maximum_ind',
                 Pars => 'a(n); int [o] b()',
                 Code => '$GENERIC() cur;
                          int curind;
                          loop(n) %{
                           if (!n || $a() > cur) {cur = $a(); curind = n;}
                          %}
                          $b() = curind;',
          );

       The function "maximum_ind" finds the index of the largest
       element of a vector. If you look at the signature you
       notice that the output argument "b" has been declared with
       the additional "int" qualifier.  This has the following
       consequences for type conversions: regardless of the type
       of the input pdl "a" the output pdl "b" will be of type
       "PDL_Long" which makes sense since "b" will represent an
       index into "a". Furthermore, if you call the function with
       an existing output pdl "b" its type will not influence the
       datatype of the operation (see above). Hence, even if "a"
       is of a smaller type than "b" it will not be converted to
       match the type of "b" but stays untouched, which saves
       memory and CPU cycles and is the right thing to do when
       "b" represents indices. Also note that you can use the
       'int' qualifier together with other qualifiers (the "[o]"
       and "[t]" qualifiers). Order is significant -- type quali-
       fiers precede creation qualifiers ("[o]" and "[t]").

       The above example also demonstrates typical usage of the
       "$GENERIC()" macro.  It expands to the current type in a
       so called generic loop. What is a generic loop? As you
       already heard a PP function has a runtime datatype as
       determined by the type of the pdl arguments it has been
       invoked with.  The PP generated XS code for this function
       therefore contains a switch like "switch (type) {case
       PDL_Byte: ... case PDL_Double: ...}" that selects a case
       based on the runtime datatype of the function (it's called
       a type ``loop'' because there is a loop in PP code that
       generates the cases).  In any case your code is inserted
       once for each PDL type into this switch statement. The
       "$GENERIC()" macro just expands to the respective type in
       each copy of your parsed code in this "switch" statement,
       e.g., in the "case PDL_Byte" section "cur" will be declared
       as a "PDL_Byte" and so on for the other case statements. As
       you can see, this is a useful macro for declaring
       variables that hold values of pdls in your code.
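
       The following C sketch imitates such a generic loop by
       hand for a function like "maximum_ind". It is not what PP
       generates: the three-member enum, the macro
       MAXIMUM_IND_BODY and its GENERIC parameter (which plays
       the role of "$GENERIC()") are all hypothetical, and only
       three types are covered to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { T_BYTE, T_SHORT, T_DOUBLE } pdl_type; /* hypothetical subset */

/* One copy of the user code; GENERIC is replaced by the C type of
 * the case it is pasted into, so cur always has the element type. */
#define MAXIMUM_IND_BODY(GENERIC)                              \
    do {                                                       \
        const GENERIC *a = (const GENERIC *)data;              \
        GENERIC cur = a[0];                                    \
        for (size_t n = 1; n < n_size; n++)                    \
            if (a[n] > cur) { cur = a[n]; curind = (long)n; }  \
    } while (0)

/* Index of the first largest element; the result is a long no
 * matter what the element type is (compare the "int" qualifier). */
long maximum_ind(const void *data, size_t n_size, pdl_type type)
{
    long curind = 0;
    assert(data != NULL && n_size > 0);
    switch (type) {
    case T_BYTE:   MAXIMUM_IND_BODY(unsigned char); break;
    case T_SHORT:  MAXIMUM_IND_BODY(short);         break;
    case T_DOUBLE: MAXIMUM_IND_BODY(double);        break;
    }
    return curind;
}
```

       The body is written once and compiled three times, each
       time with a different meaning for GENERIC, which is the
       service "$GENERIC()" provides inside a "Code" section.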

       There are a couple of other qualifiers with similar
       effects as "int".  For your convenience there are the
       "float" and "double" qualifiers with analogous conse-
       quences on type conversions as "int". Let's assume you
       have a very large array for which you want to compute row
       and column sums with an equivalent of the "sumover" func-
       tion.  However, with the normal definition of "sumover"
       you might run into problems when your data is, e.g. of
       type short. A call like

         sumover($large_pdl,($sums = null));

       will result in $sums being of type short and is therefore
       prone to overflow errors if $large_pdl is a very large
       array. On the other hand calling

         @dims = $large_pdl->dims; shift @dims;
         sumover($large_pdl,($sums = zeroes(double,@dims)));

       is not a good alternative either. Now we don't have over-
       flow problems with $sums but at the expense of a type con-
       version of $large_pdl to double, something bad if this is
       really a large pdl. That's where "double" comes in handy:

         pp_def('sumoverd',
                Pars => 'a(n); double [o] b()',
                Code => 'double tmp=0;
                         loop(n) %{ tmp += $a(); %}
                         $b() = tmp;',
         );

       This gets us around the type conversion and overflow
       problems. Again, analogous to the "int" qualifier, "double"
       results in "b" always being of type double regardless of
       the type of "a", without leading to a type conversion of
       "a" as a side effect.
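
       In plain C the effect of "sumoverd" on short data looks
       roughly like the sketch below (the function name
       sumoverd_short is hypothetical). The input is read as it
       is, element by element; only the accumulator and the
       result are double, so no double-sized copy of the input
       is ever made.

```c
#include <assert.h>
#include <stddef.h>

/* Sum n_size shorts into a double.  Each element is promoted on the
 * fly in 'tmp += a[n]'; the array itself stays of type short. */
double sumoverd_short(const short *a, size_t n_size)
{
    double tmp = 0;
    assert(a != NULL || n_size == 0);
    for (size_t n = 0; n < n_size; n++)
        tmp += a[n];
    return tmp;
}
```

       Four elements of 30000 sum to 120000, which does not fit
       into a 16-bit short but is represented exactly by the
       double result.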

       Finally, there are the "type+" qualifiers where type is
       one of "int" or "float". What does that mean? Let's
       illustrate the "int+" qualifier with the actual definition
       of sumover:

         pp_def('sumover',
                Pars => 'a(n); int+ [o] b()',
                Code => '$GENERIC(b) tmp=0;
                         loop(n) %{ tmp += $a(); %}
                         $b() = tmp;',
         );

       As we had already seen for the "int", "float" and "double"
       qualifiers, a pdl marked with a "type+" qualifier does not
       influence the datatype of the pdl operation. Its meaning
       is "make this pdl at least of type "type" or higher, as
       required by the type of the operation". In the sumover
       example this means that when you call the function with an
       "a" of type PDL_Short the output pdl will be of type
       PDL_Long (just as would have been the case with the "int"
       qualifier). This again tries to avoid overflow problems
       when using small datatypes (e.g. byte images).  However,
       when the datatype of the operation is higher than the type
       specified in the "type+" qualifier "b" will be created
       with the datatype of the operation, e.g. when "a" is of
       type double then "b" will be double as well. We hope you
       agree that this is sensible behaviour for "sumover". It
       should be obvious how the "float+" qualifier works by
       analogy.  It may become necessary to be able to specify a
       set of alternative types for the parameters. However, this
       will probably not be implemented until someone comes up
       with a reasonable use for it.
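
       The difference between a plain type qualifier and a
       "type+" qualifier boils down to one comparison. The C
       sketch below restates the rules given above; the enum and
       both function names are hypothetical and are not part of
       PDL.

```c
#include <assert.h>

/* Hypothetical type codes in ascending order of 'height'. */
typedef enum {
    T_BYTE, T_SHORT, T_USHORT, T_LONG, T_FLOAT, T_DOUBLE
} pdl_type;

/* "int [o] b()": the output always has the named type,
 * whatever the type of the operation is. */
pdl_type fixed_output_type(pdl_type op_type, pdl_type named)
{
    assert(op_type >= T_BYTE && op_type <= T_DOUBLE);
    return named;
}

/* "int+ [o] b()": the output has at least the named type and
 * follows the operation type when that is higher. */
pdl_type plus_output_type(pdl_type op_type, pdl_type named)
{
    assert(op_type >= T_BYTE && op_type <= T_DOUBLE);
    return op_type > named ? op_type : named;
}
```

       So for "int+" a short input yields a long output and a
       double input yields a double output, whereas plain "int"
       would yield long in both cases.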

       Note that we now had to specify the $GENERIC macro with
       the name of the pdl to derive the type from that argument.
       Why is that? If you carefully followed our explanations
       you will have realised that in some cases "b" will have a
       different type than the type of the operation.  Calling
       the '$GENERIC' macro with "b" as argument makes sure that
       the type will always be the same as that of "b" in that
       part of the generic loop.

       This is about all there is to say about the "Pars" section
       in a "pp_def" call. You should remember that this section
       defines the signature of a PP defined function, you can
       use several options to qualify certain arguments as output
       and temporary args and all dimensions that you can later
       refer to in the "Code" section are defined by name.

       It is important that you understand the meaning of the
       signature since in the latest PDL versions you can use it
       to define threaded functions from within perl, i.e. what
       we call perl level threading. Please check PDL::Indexing
       for details.

       The Code section

       The "Code" section contains the actual XS code that will
       be in the innermost part of a threadloop (if you don't
       know what a thread loop is then you still haven't read
       PDL::Indexing; do it now ;) after any PP macros (like
       $GENERIC) and PP functions have been expanded (like the
       "loop" function we are going to explain next).

       Let's quickly reiterate the "sumover" example:

         pp_def('sumover',
                Pars => 'a(n); int+ [o] b()',
                Code => '$GENERIC(b) tmp=0;
                         loop(n) %{ tmp += $a(); %}
                         $b() = tmp;',
         );

       The "loop" construct in the "Code" section also refers to
       the dimension name so you don't need to specify any lim-
       its: the loop is correctly sized and everything is done
       for you, again.

       Next, there is the surprising fact that "$a()" and "$b()"
       do not contain the index. This is not necessary because
       we're looping over n and both variables know which dimen-
       sions they have so they automatically know they're being
       looped over.

       This feature comes in very handy in many places and makes
       for much shorter code. Of course, there are times when you
       want to circumvent this; here is a function which sym-
       metrizes a matrix and serves as an example of how to code
       explicit looping:

               pp_def('symm',
                       Pars => 'a(n,n); [o]c(n,n);',
                       Code => 'loop(n) %{
                                       int n2;
                                       for(n2=n; n2<$SIZE(n); n2++) {
                                               $c(n0 => n, n1 => n2) =
                                               $c(n0 => n2, n1 => n) =
                                                $a(n0 => n, n1 => n2);
                                       }
                               %}
                       '
               );

       Let's dissect what is happening. Firstly, what is this
       function supposed to do? From its signature you see that
       it takes a 2D matrix with equal numbers of columns and
       rows and outputs a matrix of the same size. From a given
       input matrix $a it computes a symmetric output matrix $c
       (symmetric in the matrix sense that A^T = A where ^T means
       matrix transpose, or in PDL parlance $c == $c->xchg(0,1)).
       It does this by using only the values on and below the
       diagonal of $a. In the output matrix $c all values on and
       below the diagonal are the same as those in $a while those
       above the diagonal are a mirror image of those below the
       diagonal (above and below are here interpreted in the way
       that PDL prints 2D pdls). If this explanation still sounds
       a bit strange just go ahead, make a little file into which
       you write this definition, build the new PDL extension
       (see section on Makefiles for PP code) and try it out with
       a couple of examples.

       Having explained what the function is supposed to do there
       are a couple of points worth noting from the syntactical
       point of view. First, we get the size of the dimension
       named "n" again by using the $SIZE macro. Second, there
       are suddenly these funny "n0" and "n1" index names in the
       code though the signature defines only the dimension "n".
       Why this? The reason becomes clear when you note that both
       the first and second dimension of $a and $c are named "n"
       in the signature of "symm". This tells PDL::PP that the
       first and second dimension of these arguments should have
       the same size. Otherwise the generated function will raise
       a runtime error.  However, now in an access to $a and $c
       PDL::PP cannot figure out which index "n" refers to any
       more just from the name of the index.  Therefore, the
       indices with equal dimension names get numbered from left
       to right starting at 0, e.g. in the above example "n0"
       refers to the first dimension of $a and $c, "n1" to the
       second and so on.
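
       If you would like to check the behaviour of "symm"
       without building a PDL extension, the same two loops can
       be written in plain C. The sketch assumes a contiguous
       square matrix whose first index (n0) varies fastest; the
       AT macro and the function signature are hypothetical
       stand-ins for the index arithmetic PP does for you.

```c
#include <assert.h>
#include <stddef.h>

/* Element (n0, n1) of a size x size matrix stored with n0 fastest. */
#define AT(p, n0, n1, size) ((p)[(n0) + (size) * (n1)])

void symm(const double *a, double *c, size_t size)
{
    assert(a != NULL && c != NULL);
    for (size_t n = 0; n < size; n++) {          /* loop(n) %{ ... %}     */
        for (size_t n2 = n; n2 < size; n2++) {   /* the explicit C loop   */
            double v = AT(a, n, n2, size);       /* $a(n0 => n, n1 => n2) */
            AT(c, n, n2, size) = v;              /* $c(n0 => n, n1 => n2) */
            AT(c, n2, n, size) = v;              /* $c(n0 => n2, n1 => n) */
        }
    }
}
```

       For the 3x3 input whose rows print as "1 2 3", "4 5 6",
       "7 8 9" the output rows are "1 4 7", "4 5 8", "7 8 9":
       the values on and below the diagonal are kept and
       mirrored above it.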

       In all examples so far, we have only used the "Pars" and
       "Code" members of the hash that was passed to "pp_def".
       There are certainly other keys that are recognised by
       PDL::PP and we will hear about some of them in the course
       of this document. A (non-exhaustive) list of keys is given
       in Appendix A.  The macros and PP functions that are
       expanded in values of the hash argument to "pp_def" (so
       far we have encountered only some of them in the examples
       above) are summarised in Appendix B.

       At this point, it might be appropriate to mention that
       PDL::PP is not a completely static, well designed set of
       routines (as Tuomas puts it: "stop thinking of PP as a set
       of routines carved in stone") but rather a collection of
       things that the PDL::PP author (Tuomas J. Lukka) consid-
       ered he would have to write often into his PDL extension
       routines. PP tries to be expandable so that in the future,
       as new needs arise, new common code can be abstracted back
       into it. If you want to learn more on why you might want
       to change PDL::PP and how to do it check the section on
       PDL::PP internals.

       Handling bad values

       If you do not have bad-value support compiled into PDL you
       can ignore this section and the related keys: "BadCode",
       "HandleBad", ...  (try printing out the value of
       $PDL::Bad::Status - if it equals 0 then move straight on).

       There are several keys and macros used when writing code
       to handle bad values. The first one is the "HandleBad"
       key:

       HandleBad => 0
           This flags a pp-routine as NOT handling bad values. If
           this routine is sent piddles with their "badflag" set,
           then a warning message is printed to STDOUT and the
           piddles are processed as if the value used to repre-
           sent bad values is a valid number. The "badflag" value
           is not propagated to the output piddles.

           An example of when this is used is for FFT routines,
           which generally do not have a way of ignoring part of
           the data.

       HandleBad => 1
           This causes PDL::PP to write extra code that ensures
           the BadCode section is used, and that the "$ISBAD()"
           macro (and its brethren) work.

       HandleBad is not given
           If any of the input piddles have their "badflag" set,
           then the output piddles will have their "badflag" set,
           but any supplied BadCode is ignored.

       The value of "HandleBad" is used to define the contents of
       the "BadDoc" key, if it is not given.

       To handle bad values, code must be written somewhat dif-
       ferently; for instance,

        $c() = $a() + $b();

       becomes something like

        if ( $a() != BADVAL && $b() != BADVAL ) {
           $c() = $a() + $b();
        } else {
           $c() = BADVAL;
        }

       However, we only want the second version if bad values are
       present in the input piddles (and that bad-value support
       is wanted!) - otherwise we actually want the original
       code. This is where the "BadCode" key comes in; you use it
       to specify the code to execute if bad values may be pre-
       sent, and PP uses both it and the "Code" section to create
       something like:

        if ( bad_values_are_present ) {
           fancy_threadloop_stuff {
              BadCode
           }
        } else {
           fancy_threadloop_stuff {
              Code
           }
        }

       This approach means that there is virtually no overhead
       when bad values are not present (i.e. the badflag routine
       returns 0).
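
       The branching scheme above can be mimicked in plain C. The
       following is a minimal sketch of the idea, not the code PP
       emits: the sentinel BADVAL_SKETCH, the "has_bad" flag and
       the function name are all invented for illustration, and a
       real PDL build chooses its own bad value for each
       datatype.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sentinel marking a bad element (illustration only). */
#define BADVAL_SKETCH (-9999.0)

/* Element-wise c = a + b. The per-element bad checks are only paid for
 * when the caller says bad values may be present. */
void add_with_bad(const double *a, const double *b, double *c,
                  size_t n, int has_bad)
{
    assert(a != NULL && b != NULL && c != NULL);
    if (has_bad) {
        /* corresponds to the BadCode branch */
        for (size_t i = 0; i < n; i++) {
            if (a[i] != BADVAL_SKETCH && b[i] != BADVAL_SKETCH)
                c[i] = a[i] + b[i];
            else
                c[i] = BADVAL_SKETCH;
        }
    } else {
        /* corresponds to the Code branch: no checks at all */
        for (size_t i = 0; i < n; i++)
            c[i] = a[i] + b[i];
    }
}
```

       The flag is tested once, outside the loops, which is why
       the good-data path costs no more than the original code.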

       The BadCode section can use the same macros and looping
       constructs as the Code section.  However, it wouldn't be
       much use without the following additional macros:

       $ISBAD(var)
           To check whether a piddle's value is bad, use the
           $ISBAD macro:

            if ( $ISBAD(a()) ) { printf("a() is bad\n"); }

           You can also access given elements of a piddle:

            if ( $ISBAD(a(n=>l)) ) { printf("element %d of a() is bad\n", l); }


       $ISGOOD(var)
           This is the opposite of the $ISBAD macro.

       $SETBAD(var)
           For when you want to set an element of a piddle bad.

       $ISBADVAR(c_var,pdl)
           If you have cached the value of a piddle "$a()" into a
           c-variable ("foo" say), then to check whether it is
           bad, use "$ISBADVAR(foo,a)".

       $ISGOODVAR(c_var,pdl)
           As above, but this time checking that the cached value
           isn't bad.

       $SETBADVAR(c_var,pdl)
           To copy the bad value for a piddle into a c variable,
           use "$SETBADVAR(foo,a)".

       TODO: mention "$PPISBAD()" etc macros.

       Using these macros, the above code could be specified as:

        Code => '$c() = $a() + $b();',
        BadCode => '
           if ( $ISBAD(a()) || $ISBAD(b()) ) {
              $SETBAD(c());
           } else {
              $c() = $a() + $b();
           }',

       Since this is perl, TMTOWTDI, so you could also write:

        BadCode => '
           if ( $ISGOOD(a()) && $ISGOOD(b()) ) {
              $c() = $a() + $b();
           } else {
              $SETBAD(c());
           }',

       If you want access to the value of the badflag for a given
       piddle, you can use the "$PDLSTATExxxx()" macros:

       $PDLSTATEISBAD(pdl)
       $PDLSTATEISGOOD(pdl)
       $PDLSTATESETBAD(pdl)
       $PDLSTATESETGOOD(pdl)

       TODO: mention the "FindBadStatusCode" and "CopyBadStatus-
       Code" options to "pp_def", as well as the "BadDoc" key.

       Interfacing your own/library functions using PP

       Now, consider the following: you have your own C function
       (that may in fact be part of some library you want to
       interface to PDL) which takes as arguments two pointers to
       vectors of double:

               void myfunc(int n,double *v1,double *v2);

       The correct way of defining the PDL function is

               pp_def('myfunc',
                       Pars => 'a(n); [o]b(n);',
                       GenericTypes => ['D'],
                       Code => 'myfunc($SIZE(n),$P(a),$P(b));'
               );

       The "$P(par)" syntax returns a pointer to the first
       element of the data, and the other elements are guaranteed
       to lie after it.

       Notice that here it is possible to make many mistakes.
       First, $SIZE(n) must be used instead of "n". Second, you
       shouldn't put any loops in this code. Third, here we
       encounter a new hash key recognised by PDL::PP: the
       "GenericTypes" declaration tells PDL::PP to ONLY GENERATE
       THE TYPELOOP FOR THE LIST OF TYPES SPECIFIED, in this case
       "double". This has two advantages. Firstly, the size of
       the compiled code is reduced vastly. Secondly, if
       non-double arguments are passed to "myfunc()", PDL will
       automatically convert them to double before passing them
       to the external C routine and convert them back
       afterwards.
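
       What this automatic conversion amounts to can be sketched
       in plain C. This is only an illustration of the idea, not
       PDL's conversion code: "myfunc_stub" stands in for the
       external routine (here it simply doubles each element) and
       "call_via_double" is a hypothetical wrapper name.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the external library routine (assumed behaviour:
 * writes twice each input element to the output vector). */
void myfunc_stub(int n, double *v1, double *v2)
{
    for (int i = 0; i < n; i++)
        v2[i] = 2.0 * v1[i];
}

/* Convert float data to double, call the double-only routine, and
 * convert the result back. Returns 0 on success, -1 if temporary
 * storage could not be allocated. */
int call_via_double(int n, const float *in, float *out)
{
    assert(n > 0 && in != NULL && out != NULL);

    double *tmp_in  = malloc((size_t)n * sizeof *tmp_in);
    double *tmp_out = malloc((size_t)n * sizeof *tmp_out);
    if (tmp_in == NULL || tmp_out == NULL) {
        free(tmp_in);
        free(tmp_out);
        return -1;
    }

    for (int i = 0; i < n; i++)        /* convert to double */
        tmp_in[i] = (double)in[i];

    myfunc_stub(n, tmp_in, tmp_out);   /* the routine only ever sees double */

    for (int i = 0; i < n; i++)        /* convert back to the caller's type */
        out[i] = (float)tmp_out[i];

    free(tmp_in);
    free(tmp_out);
    return 0;
}
```

       The external routine is compiled once, for one type only;
       the cost is the two extra copies around the call.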

       One can also use "Pars" to qualify the types of individual
       arguments. Thus one could also write this as:

               pp_def('myfunc',
                       Pars => 'double a(n); double [o]b(n);',
                       Code => 'myfunc($SIZE(n),$P(a),$P(b));'
               );

       The type specification in "Pars" exempts the argument from
       variation in the typeloop; instead it is automatically
       converted to and from the type specified. This is
       obviously useful in a more general example, e.g.:

               void myfunc(int n,float *v1,long *v2);

               pp_def('myfunc',
                       Pars => 'float a(n); long [o]b(n);',
                       GenericTypes => ['F'],
                       Code => 'myfunc($SIZE(n),$P(a),$P(b));'
               );

       Note that we still use "GenericTypes" to reduce the size
       of the type loop. PP could in principle spot this and do
       it automatically, but the code has yet to attain that
       level of sophistication!

       Finally, note that when types are converted automatically
       one MUST use the "[o]" qualifier for output variables, or
       your hard-won changes will get optimised away by PP!

       If you interface a large library you can automate the
       interfacing even further. Perl can help you again(!) in
       doing this. In many libraries you have certain calling
       conventions. This can be exploited. In short, you can
       write a little parser (which is really not difficult in
       perl) that then generates the calls to "pp_def" from
       parsed descriptions of the functions in that library. For
       an example, please check the Slatec interface in the "Lib"
       tree of the PDL distribution. If you want to check (during
       debugging) which calls to PP functions your perl code
       generated, a little helper package comes in handy: it
       replaces the PP functions by identically named ones that
       dump their arguments to stdout.

       Just say

          perl -MPDL::PP::Dump myfile.pd

       to see the calls to "pp_def" and friends. Try it with
       ops.pd and slatec.pd. If you're interested (or want to
       enhance it), the source is in Basic/Gen/PP/Dump.pm.

       Other macros and functions in the Code section

       Macros: So far we have encountered the $SIZE, $GENERIC and
       $P macros.  Now we are going to quickly explain the other
       macros that are expanded in the "Code" section of PDL::PP
       along with examples of their usage.

       $T The $T macro is used for type switches. This is very
          useful when you have to use different external (e.g.
          library) functions depending on the input type of argu-
          ments. The general syntax is

                  $Ttypeletters(type_alternatives)

          where "typeletters" is a permutation of a subset of the
          letters "BSULFD" which stand for Byte, Short, Ushort,
          etc. and "type_alternatives" are the expansions when
          the type of the PP operation is equal to that indicated
          by the respective letter. Let's illustrate this incom-
          prehensible description by an example. Assuming you
          have two C functions with prototypes

            void float_func(float *in, float *out);
            void double_func(double *in, double *out);

          which do basically the same thing but one accepts float
          and the other double pointers. You could interface them
          to PDL by defining a generic function "foofunc" (which
          will call the correct function depending on the type of
          the transformation):

            pp_def('foofunc',
                  Pars => ' a(n); [o] b();',
                  Code => ' $TFD(float_func,double_func) ($P(a),$P(b));',
                  GenericTypes => ['F','D'],
            );

          Please note that you can't say

                 Code => ' $TFD(float,double)_func ($P(a),$P(b));'

          since the $T macro expands with trailing spaces, analo-
          gously to C preprocessor macros.  The slightly longer
          form illustrated above is correct.  If you really want
          brevity, you can of course do

                  '$TBSULFD('.(join ',',map {"long_identifier_name_$_"}
                          qw/byt short unseigned lounge flotte dubble/).');'


       $PP
          The $PP macro is used for a so called physical pointer
          access. The physical refers to some internal optimisa-
          tions of PDL (for those who are familiar with the PDL
          core we are talking about the vaffine optimisations).
          This macro is mainly for internal use and you shouldn't
          need to use it in any of your normal code.

       $COMP (and the "OtherPars" section)
          The $COMP macro is used to access non-pdl values in the
          code section. Its name is derived from the implementa-
          tion of transformations in PDL. The variables you can
          refer to using $COMP are members of the ``compiled''
          structure that represents the PDL transformation in
          question but does not yet contain any information about
          dimensions (for further details check PDL::Internals).
          However, you can treat $COMP just as a black box with-
          out knowing anything about the implementation of trans-
          formations in PDL. So when would you use this macro?
          Its main usage is to access values of arguments that
          are declared in the "OtherPars" section of a "pp_def"
          definition. But then you haven't heard about the "Oth-
          erPars" key yet?!  Let's have another example that
          illustrates typical usage of both new features:

            pp_def('pnmout',
                  Pars => 'a(m)',
                  OtherPars => "char* fd",
                  GenericTypes => ['B','U','S','L'],
                  Code => 'PerlIO *fp;
                           IO *io;
                           int len;

                           io = GvIO(gv_fetchpv($COMP(fd),FALSE,SVt_PVIO));
                           if (!io || !(fp = IoIFP(io)))
                                  croak("Can\'t figure out FP");

                           /* number of bytes in one row of a() */
                           len = $SIZE(m) * sizeof($GENERIC());
                           if (PerlIO_write(fp,$P(a),len) != len)
                                  croak("Error writing pnm file");
            ');

          This function is used to write data from a pdl to a
          file. The file descriptor is passed as a string into
          this function. This parameter does not go into the
          "Pars" section since it cannot be usefully treated like
          a pdl but rather into the aptly named "OtherPars" sec-
          tion. Parameters in the "OtherPars" section follow
          those in the "Pars" section when invoking the function,
          i.e.

             open FILE,">out.dat" or die "couldn't open out.dat";
             pnmout($pdl,'FILE');

          When you want to access this parameter inside the code
          section you have to tell PP by using the $COMP macro,
          i.e. you write "$COMP(fd)" as in the example. Otherwise
          PP wouldn't know that the "fd" you are referring to is
          the same as that specified in the "OtherPars" section.

          Another use for the "OtherPars" section is to set a
          named dimension in the signature. Let's have an example
          how that is done:

            pp_def('setdim',
                  Pars => '[o] a(n)',
                  OtherPars => 'int ns => n',
                  Code => 'loop(n) %{ $a() = n; %}',
            );

          This says that the named dimension "n" will be ini-
          tialised from the value of the other parameter "ns"
          which is of integer type (I guess you have realised
          that we use the "CType From => named_dim" syntax).  Now
          you can call this function in the usual way:

            setdim(($a=null),5);
            print $a;
              [ 0 1 2 3 4 ]

          Admittedly this function is not very useful but it
          demonstrates how it works. If you call the function
          with an existing pdl, you don't need to explicitly
          specify the size of "n" since PDL::PP can figure it out
          from the dimensions of the non-null pdl. In that case
          you just give the dimension parameter as "-1":

            $a = hist($b);
            setdim($a,-1);

          That should do it.

       The only PP function that we have used in the examples so
       far is "loop".  Additionally, there are currently two
       other functions which are recognised in the "Code" sec-
       tion:

       threadloop
         As we heard above the signature of a PP defined function
         defines the dimensions of all the pdl arguments involved
         in a primitive operation.  However, you often call the
         functions that you defined with PP with pdls that have
         more dimensions than those specified in the signature.
         In this case the primitive operation is performed on all
         subslices of appropriate dimensionality in what is
         called a threadloop (see also overview above and
         PDL::Indexing). Assuming you have some notion of this
         concept you will probably appreciate that the operation
         specified in the code section should be optimised since
         this is the tightest loop inside a threadloop.  However,
         if you revisit the example where we define the "pnmout"
         function, you will quickly realise that looking up the
         "IO" file descriptor in the inner threadloop is not very
         efficient when writing a pdl with many rows. A better
         approach would be to look up the "IO" descriptor once
         outside the threadloop and use its value then inside the
         tightest threadloop. This is exactly where the "thread-
         loop" function comes in handy. Here is an improved defi-
         nition of "pnmout" which uses this function:

           pp_def('pnmout',
                 Pars => 'a(m)',
                 OtherPars => "char* fd",
                 GenericTypes => ['B','U','S','L'],
                 Code => 'PerlIO *fp;
                          IO *io;
                          int len;

                          /* looked up once, outside the threadloop */
                          io = GvIO(gv_fetchpv($COMP(fd),FALSE,SVt_PVIO));
                          if (!io || !(fp = IoIFP(io)))
                                 croak("Can\'t figure out FP");

                          len = $SIZE(m) * sizeof($GENERIC());

                          threadloop %{
                             if (PerlIO_write(fp,$P(a),len) != len)
                                    croak("Error writing pnm file");
                          %}
           ');

         This works as follows. Normally the C code you write
         inside the "Code" section is placed inside a threadloop
         (i.e., PP generates the appropriate wrapping XS code
         around it). However, when you explicitly use the
         "threadloop" function, PDL::PP recognises this and
         doesn't wrap your code with an additional threadloop.
         This has the effect that code you write outside the
         threadloop is only executed once per transformation and
         just the code within the surrounding "%{ ... %}" pair
         is placed within the tightest threadloop. This also
         comes in handy when you want to perform a decision (or
         any other code, especially CPU intensive code) only once
         per thread, i.e.

           pp_addhdr('
             #define RAW 0
             #define ASCII 1
           ');
           pp_def('do_raworascii',
                  Pars => 'a(); b(); [o]c()',
                  OtherPars => 'int mode',
                Code => ' switch ($COMP(mode)) {
                             case RAW:
                                 threadloop %{
                                     /* do raw stuff */
                                 %}
                                 break;
                             case ASCII:
                                 threadloop %{
                                     /* do ASCII stuff */
                                 %}
                                 break;
                             default:
                                 croak("unknown mode");
                            }'
            );


       types
         The types function works similarly to the $T macro.
         However, with the "types" function the code in the
         following block (delimited by "%{" and "%}" as usual) is
         executed for all those cases in which the datatype of the
         operation is any of the types represented by the letters
         in the argument to "types", e.g.

              Code => '...

                      types(BSUL) %{
                          /* do integer type operation */
                      %}
                      types(FD) %{
                          /* do floating point operation */
                      %}
                      ...'
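
         Both the $T macro and the "types" function boil down to
         selecting code according to the datatype of the
         operation. As a rough analogy in plain C (PP resolves
         this when it generates the code for each type, whereas
         this sketch decides at run time), the enum "sketch_type"
         and the function "halve" below are hypothetical names
         introduced only for this example:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type tags standing in for PDL datatype letters. */
enum sketch_type { SK_LONG, SK_FLOAT, SK_DOUBLE };

/* Two routines doing the same job for different pointer types. */
static void float_func(float *in, float *out)    { *out = *in * 0.5f; }
static void double_func(double *in, double *out) { *out = *in * 0.5; }

/* Halve one value of the given type.
 * Returns 0 on success, -1 for an unknown type tag. */
int halve(enum sketch_type t, void *in, void *out)
{
    assert(in != NULL && out != NULL);
    switch (t) {
    case SK_LONG:     /* integer branch: truncating division */
        *(long *)out = *(long *)in / 2;
        return 0;
    case SK_FLOAT:    /* pick the float variant of the routine */
        float_func((float *)in, (float *)out);
        return 0;
    case SK_DOUBLE:   /* pick the double variant of the routine */
        double_func((double *)in, (double *)out);
        return 0;
    }
    return -1;
}
```

         The integer case shows why separate branches matter: the
         same "halve" request means truncating division for
         integers but exact scaling for floating point types.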






       Other useful PP keys in data operation definitions

       You have already heard about the "OtherPars" key. Cur-
       rently, there are not many other keys for a data operation
       that will be useful in normal (whatever that is) PP pro-
       gramming. In fact, it would be interesting to hear about a
       case where you think you need more than what is provided
       at the moment.  Please speak up on one of the PDL mailing
       lists. Most other keys recognised by "pp_def" are only
       really useful for what we call slice operations (see also
       above).

       One thing that is strongly being planned is variable num-
       ber of arguments, which will be a little tricky.

       An incomplete list of the available keys:

       Inplace
           Setting this key marks the routine as working inplace
           - ie the input and output piddles are the same. An
           example is "$a->inplace->sqrt()" (or
           "sqrt(inplace($a))").

           Inplace => 1
               Use when the routine is a unary function, such as
               "sqrt".

           Inplace => ['a']
               If there is more than one input piddle, specify
               the name of the one that can be changed inplace
               using an array reference.

           Inplace => ['a','b']
               If there is more than one output piddle, specify
               the names of the input piddle and the output piddle
               in a 2-element array reference. This probably isn't
               needed, but is left in for completeness.

           If bad values are being used, care must be taken to
           ensure the propagation of the badflag when inplace is
           being used; consider this excerpt from
           Basic/Bad/bad.pd:

             pp_def('replacebad',HandleBad => 1,
               Pars => 'a(); [o]b();',
               OtherPars => 'double newval',
               Inplace => 1,
               CopyBadStatusCode =>
               '/* propogate badflag if inplace AND it has changed */
                if ( a == b && $ISPDLSTATEBAD(a) )
                  PDL->propogate_badflag( b, 0 );

                /* always make sure the output is "good" */
                $SETPDLSTATEGOOD(b);
               ',
               ...

           Since this routine removes all bad values, the output
           piddle has its bad flag cleared. If run inplace (so
           "a == b"), then we have to tell all the children of
           "a" that the bad flag has been cleared (to save time
           we make sure that we call "PDL->propogate_badflag"
           only if the input piddle had its bad flag set).

           NOTE: one idea is that the documentation for the rou-
           tine could be automatically flagged to indicate that
           it can be executed inplace, ie something similar to
           how "HandleBad" sets "BadDoc" if it's not supplied
           (it's not an ideal solution).

       Other PDL::PP functions to support concise package defini-
       tion

       So far, we have described the "pp_def" and "pp_done" func-
       tions. PDL::PP exports a few other functions to aid you in
       writing concise PDL extension package definitions.

       Often when you interface library functions as in the above
       example you have to include additional C include files.
       Since the XS file is generated by PP we need some means to
       make PP insert the appropriate include directives in the
       right place into the generated XS file.  To this end there
       is the "pp_addhdr" function. This is also the function to
       use when you want to define some C functions for internal
       use by some of the XS functions (which are mostly func-
       tions defined by "pp_def").  By including these functions
       here you make sure that PDL::PP inserts your code before
       the point where the actual XS module section begins and
       will therefore be left untouched by xsubpp (cf. perlxs and
       perlxstut manpages).

       A typical call would be

         pp_addhdr('
         #include <unistd.h>       /* we need defs of XXXX */
         #include "libprotos.h"    /* prototypes of library functions */
         #include "mylocaldecs.h"  /* Local decs */

         static void do_the_real_work(PDL_Byte * in, PDL_Byte * out, int n)
         {
               /* do some calculations with the data */
         }
         ');

       This ensures that all the constants and prototypes you
       need will be properly included and that you can use the
       internal functions defined here in the "pp_def"s, e.g.:

         pp_def('barfoo',
                Pars => ' a(n); [o] b(n)',
                GenericTypes => ['B'],
                Code => ' int ns = $SIZE(n);
                          do_the_real_work($P(a),$P(b),ns);
                        ',
         );

       In many cases the actual PP code (meaning the arguments to
       "pp_def" calls) is only part of the package you are cur-
       rently implementing. Often there is additional perl code
       and XS code you would normally have written into the pm
       and XS files which are now automatically generated by PP.
       So how to get this stuff into those dynamically generated
       files? Fortunately, there are a couple of functions, gen-
       erally called "pp_addXXX" that assist you in doing this.

       Let's assume you have additional perl code that should go
       into the generated pm-file. This is easily achieved with
       the "pp_addpm" command:

          pp_addpm(<<'EOD');

          =head1 NAME

          PDL::Lib::Mylib -- a PDL interface to the Mylib library

          =head1 DESCRIPTION

          This package implements an interface to the Mylib package with full
          threading and indexing support (see L<PDL::Indexing>).

          =cut

          use PGPLOT;

          =head2 use_myfunc
               this function applies the myfunc operation to all the
               elements of the input pdl regardless of dimensions
               and returns the sum of the result
          =cut

          sub use_myfunc {
               my $pdl = shift;

               myfunc($pdl->clump(-1),($res=null));

               return $res->sum;
          }

          EOD

       You have probably got the idea. In some cases you also
       want to export your additional functions. To avoid getting
       into trouble with PP which also messes around with the
       @EXPORT array you just tell PP to add your functions to
       the list of exported functions:

         pp_add_exported('', 'use_myfunc gethynx');

       Note the initial empty string argument (reason for it?).

       The "pp_add_isa" command works like the "pp_add_exported"
       function.  The arguments to "pp_add_isa" are added to the
       @ISA list, e.g.

         pp_add_isa(' Some::Other::Class ');

       Sometimes you want to add extra XS code of your own (that
       is generally not involved with any threading/indexing
       issues but supplies some other functionality you want to
       access from the perl side) to the generated XS file, for
       example

         pp_addxs('','

         # Determine endianness of machine

         int
         isbigendian()
            CODE:
              unsigned short i;
              PDL_Byte *b;

              i = 42; b = (PDL_Byte*) (void*) &i;



              if (*b == 42)
                 RETVAL = 0;
              else if (*(b+1) == 42)
                 RETVAL = 1;
              else
                 croak("Impossible - machine is neither big nor little endian!!\n");
              OUTPUT:
                RETVAL
         ');
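
       The same byte-inspection test can be tried outside XS.
       Below is a standalone C sketch of it, with "croak"
       replaced by a return value of -1; the function name
       "is_big_endian_sketch" is made up for this example.

```c
#include <assert.h>

/* Returns 1 on a big-endian machine, 0 on a little-endian one,
 * and -1 if the byte pattern matches neither. */
int is_big_endian_sketch(void)
{
    unsigned short i = 42;
    const unsigned char *b = (const unsigned char *)&i;

    assert(sizeof i >= 2);          /* need at least two bytes to tell */

    if (b[0] == 42)                 /* low-order byte stored first */
        return 0;
    if (b[sizeof i - 1] == 42)      /* low-order byte stored last */
        return 1;
    return -1;
}
```

       Inspecting an object through an "unsigned char" pointer is
       well defined in C, which is what makes this trick
       portable.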

       Especially "pp_add_exported" and "pp_addxs" should be used
       with care. PP uses PDL::Exporter, hence letting PP export
       your functions means that they get added to the standard
       list of functions exported by default (the list defined by
       the export tag ``:Func''). If you use "pp_addxs" you
       shouldn't try to do anything that involves threading or
       indexing directly. PP is much better at generating the
       appropriate code from your definitions.

       Finally, you may want to add some code to the BOOT section
       of the XS file (if you don't know what that is check per-
       lxs). This is easily done with the "pp_add_boot" command:

         pp_add_boot(<<EOB);
               descrip = mylib_initialize(KEEP_OPEN);

               if (descrip == NULL)
                  croak("Can't initialize library");

               GlobalStruc->descrip = descrip;
               GlobalStruc->maxfiles = 200;
         EOB

       By default, PP.pm puts all subs defined using the pp_def
       function into the output .pm file's EXPORT list. This can
       create problems if you are creating a subclassed object
       where you don't want any methods exported. (i.e. the meth-
       ods will only be called using the $object->method syntax).

       For these cases you can call pp_export_nothing() to clear
       out the export list. Example (At the end of the .pd file):

         pp_export_nothing();
         pp_done();

       By default, PP.pm puts the 'use Core;' line into the out-
       put .pm file. This imports Core's exported names into the
       current namespace, which can create problems if you are
       over-riding one of Core's methods in the current file.
       You end up getting messages like "Warning: sub sumover
       redefined in file subclass.pm" when running the program.

       For these cases the pp_core_importList can be used to
       change what is imported from Core.pm.  For example:

         pp_core_importList('()')

       This would result in

         use Core();

       being generated in the output .pm file. This would result
       in no names being imported from Core.pm. Similarly, call-
       ing

         pp_core_importList(' qw/ barf /')

       would result in

         use Core qw/ barf/;

       being generated in the output .pm file. This would result
       in just 'barf' being imported from Core.pm.

Slice operation
       The slice operation section of this manual is provided
       using dataflow and lazy evaluation: when you need it, ask
       Tjl to write it.  A delivery in a week from when I receive
       the email is 95% probable and two week delivery is 99%
       probable.

       And anyway, the slice operations require a much more inti-
       mate knowledge of PDL internals than the data operations.
       Furthermore, the complexity of the issues involved is con-
       siderably higher than that in the average data operation.
       If you would like to convince yourself of this fact take a
       look at the Basic/Slices/slices.pd file in the PDL distri-
       bution :-). Nevertheless, functions generated using the
       slice operations are at the heart of the index manipula-
       tion and dataflow capabilities of PDL.

       Also, there are a lot of dirty issues with virtual piddles
       and vaffines which we shall entirely skip here.

       Slices and bad values

       Slice operations need to be able to handle bad values (if
       support is compiled into PDL). The easiest thing to do is
       look at Basic/Slices/slices.pd to see how this works.

       Along with "BadCode", there are also the "BadBackCode" and
       "BadRedoDimsCode" keys for "pp_def". However, any "Equiv-
       CPOffsCode" should not need changing, since any changes
       are absorbed into the definition of the "$EQUIVCPOFFS()"
       macro (i.e. it is handled automatically by PDL::PP).

USEFUL ROUTINES
       The PDL "Core" structure, defined in
       Basic/Core/pdlcore.h.PL, contains pointers to a number of
       routines that may be useful to you.  The majority of these
       routines deal with manipulating piddles, but some are more
       general:

       PDL->qsort_B( PDL_Byte *xx, int a, int b )
           Sort the array "xx" between the indices "a" and "b".
           There are also versions for the other PDL datatypes,
           with postfix "_S", "_U", "_L", "_F", and "_D".  Any
           module using this must ensure that "PDL::Ufunc" is
           loaded.

       PDL->qsort_ind_B( PDL_Byte *xx, int *ix, int a, int b )
           As for "PDL->qsort_B", but this time sorting the
           indices rather than the data.

       The routine "med2d" in Lib/Image2D/image2d.pd shows how
       such routines are used.
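
       As a rough illustration of the calling convention
       described above, the following .pd fragment copies its
       input and sorts the copy. It is only a sketch: the name
       "sorted_copy" is made up, the operation is restricted to
       doubles so that only "PDL->qsort_D" is needed, and it
       assumes that both indices passed to the sort routine are
       inclusive. The fragment is meant to be processed by
       PDL::PP, not run as a standalone Perl script.

```perl
# Hypothetical example: sort a copy of each row of a double piddle.
pp_def('sorted_copy',
    Pars         => 'a(n); [o]b(n);',
    GenericTypes => ['D'],          # doubles only, so qsort_D suffices
    Code         => '
        /* copy the input into the output, then sort the output */
        loop(n) %{ $b() = $a(); %}
        /* assumption: first and last index are both inclusive */
        PDL->qsort_D( $P(b), 0, $SIZE(n) - 1 );
    ',
);
```

       As noted above, a module built from such a fragment must
       ensure that "PDL::Ufunc" is loaded.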

MAKEFILES FOR PP FILES
       If you are going to generate a package from your PP file
       (typical file extensions are ".pd" or ".pp" for the files
       containing PP code) it is easiest and safest to leave gen-
       eration of the appropriate commands to the Makefile. In
       the following we will outline the typical format of a perl
       Makefile to automatically build and install your package
       from a description in a PP file. Most of the rules to
       build the xs, pm and other required files from the PP file
       are already predefined in the PDL::Core::Dev package. We
       just have to tell MakeMaker to use it.

       In most cases you can write your Makefile.PL like this:

         # Makefile.PL for a package defined by PP code.

         use PDL::Core::Dev;            # Pick up development utilities
         use ExtUtils::MakeMaker;

         $package = ["mylib.pd", "Mylib", "PDL::Lib::Mylib"];
         %hash = pdlpp_stdargs($package);
         $hash{OBJECT} .= ' additional_Ccode$(OBJ_EXT) ';
         $hash{clean}->{FILES} .= ' todelete_Ccode$(OBJ_EXT) ';
         $hash{'VERSION_FROM'} = 'mylib.pd';
         WriteMakefile(%hash);

         sub MY::postamble { pdlpp_postamble($package); }

       Here, the list in $package contains, in order: the PP
       source file name, the prefix for the generated files, and
       the full package name.  You can modify the hash in
       whatever way you like, but it is sensible to stay within
       some limits so that your package will continue to work
       with later versions of PDL.

       If you don't want to use prepackaged arguments, here is a
       generic Makefile.PL that you can adapt for your own needs:

         # Makefile.PL for a package defined by PP code.

         use PDL::Core::Dev;            # Pick up development utilities
         use ExtUtils::MakeMaker;

         WriteMakefile(
          'NAME'       => 'PDL::Lib::Mylib',
          'VERSION_FROM'       => 'mylib.pd',
          'TYPEMAPS'     => [&PDL_TYPEMAP()],
          'OBJECT'       => 'Mylib$(OBJ_EXT) additional_Ccode$(OBJ_EXT)',
          'PM'         => { 'Mylib.pm'            => '$(INST_LIBDIR)/Mylib.pm'},
          'INC'          => &PDL_INCLUDE(), # add include dirs as required by your lib
          'LIBS'         => [''],   # add link directives as necessary
          'clean'        => {'FILES'  =>
                                 'Mylib.pm Mylib.xs Mylib$(OBJ_EXT) ' .
                                 'additional_Ccode$(OBJ_EXT)'},
         );

         # Add genpp rule; this will invoke PDL::PP on our PP file
         # the argument is an array reference where the array has three string elements:
         #   arg1: name of the source file that contains the PP code
         #   arg2: basename of the xs and pm files to be generated
         #   arg3: name of the package that is to be generated
         sub MY::postamble { pdlpp_postamble(["mylib.pd", "Mylib", "PDL::Lib::Mylib"]); }

       To make life even easier PDL::Core::Dev defines the func-
       tion "pdlpp_stdargs" that returns a hash with default val-
       ues that can be passed (either directly or after appropri-
       ate modification) to a call to WriteMakefile.  Currently,
       "pdlpp_stdargs" returns a hash where the keys are filled
       in as follows:



               (
                'NAME'         => $mod,
                'TYPEMAPS'     => [&PDL_TYPEMAP()],
                'OBJECT'       => "$pref\$(OBJ_EXT)",
                PM     => {"$pref.pm" => "\$(INST_LIBDIR)/$pref.pm"},
                MAN3PODS => {"$src" => "\$(INST_MAN3DIR)/$mod.\$(MAN3EXT)"},
                'INC'          => &PDL_INCLUDE(),
                'LIBS'         => [''],
                'clean'        => {'FILES'  => "$pref.xs $pref.pm $pref\$(OBJ_EXT)"},
               )

       Here, $src is the name of the source file with PP code,
       $pref the prefix for the generated .pm and .xs files and
       $mod the name of the extension module to generate.
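
       To see what these templates expand to for a concrete
       package, here is a small plain-Perl sketch. The function
       "sketch_stdargs" is a hypothetical stand-in, not part of
       PDL::Core::Dev; it leaves out the 'TYPEMAPS' and 'INC'
       keys because their values come from PDL_TYPEMAP() and
       PDL_INCLUDE() at build time.

```perl
use strict;
use warnings;

# Hypothetical stand-in that mirrors the key layout listed above.
# TYPEMAPS and INC are omitted: they depend on an installed PDL.
sub sketch_stdargs {
    my ($package) = @_;
    my ($src, $pref, $mod) = @$package;
    return (
        NAME     => $mod,
        OBJECT   => "$pref\$(OBJ_EXT)",
        PM       => { "$pref.pm" => "\$(INST_LIBDIR)/$pref.pm" },
        MAN3PODS => { $src => "\$(INST_MAN3DIR)/$mod.\$(MAN3EXT)" },
        LIBS     => [''],
        clean    => { FILES => "$pref.xs $pref.pm $pref\$(OBJ_EXT)" },
    );
}

my %args = sketch_stdargs(['mylib.pd', 'Mylib', 'PDL::Lib::Mylib']);
print "NAME   => $args{NAME}\n";
print "OBJECT => $args{OBJECT}\n";
print "clean  => $args{clean}{FILES}\n";
```

       The make variables such as $(OBJ_EXT) are kept literal in
       the strings; they are only expanded later, by make.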

INTERNALS
       The internals of the current version consist of a large
       table which gives the rules according to which things are
       translated and the subs which implement these rules.

       Later on, it would be good to make the table modifiable by
       the user so that different things may be tried.

       [Meta comment: here will hopefully be more in the future;
       currently, your best bet will be to read the source code
       :-( or ask on the list (try the latter first) ]

Appendix A: Some keys recognised by PDL::PP
       Unless otherwise specified, the arguments are strings.
       Keys marked with (bad) are only used if bad-value support
       is compiled into PDL.

       Pars
           define the signature of your function

       OtherPars
           arguments which are not pdls. Default: nothing.

       Code
           the actual code that implements the functionality;
           several PP macros and PP functions are recognised in
           the string value

       HandleBad (bad)
           If set to 1, the routine is assumed to support bad
           values and the code in the BadCode key is used if bad
           values are present; it also sets things up so that the
           "$ISBAD()" etc macros can be used.  If set to 0, the
           routine prints a warning if any of the input piddles
           have their bad flag set.

       BadCode (bad)
           Give the code to be used if bad values may be present
           in the input piddles.  Only used if "HandleBad => 1".

       GenericTypes
           An array reference. The array may contain any subset
           of the strings `B', `S', `U', `L', `F' and `D', which
           specify which types your operation will accept.  This
           is very useful (and important!) when interfacing an
           external library.  Default: [qw/B S U L F D/]

       Inplace
           Mark a function as being able to work inplace.


            Inplace => 1          if  Pars => 'a(); [o]b();'
            Inplace => ['a']      if  Pars => 'a(); b(); [o]c();'
            Inplace => ['a','b']  if  Pars => 'a(); b(); [o]c(); [o]d();'

           If bad values are being used, care must be taken to
           ensure the propagation of the badflag when inplace is
           being used; for instance see the code for "replacebad"
           in Basic/Bad/bad.pd.

       Doc Used to specify a documentation string in Pod format.
           See PDL::Doc for information on PDL documentation con-
           ventions. Note: in the special case where the PP 'Doc'
           string is one line this is implicitly used for the
           quick reference AND the documentation!

           If the Doc field is omitted PP will generate default
           documentation (after all it knows about the Signa-
           ture).

           If you really want the function NOT to be documented
           in any way at this point (e.g. for an internal
           routine, or because you are doing it elsewhere in the
           code) explicitly specify "Doc=>undef".

       BadDoc (bad)
           Contains the text returned by the "badinfo" command
           (in "perldl") or the "-b" switch to the "pdldoc" shell
           script. In many cases, you will not need to specify
           this, since the information can be automatically cre-
           ated by PDL::PP. However, as befits computer-generated
           text, it's rather stilted; it may be much better to do
           it yourself!
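
       The following .pd fragment sketches how several of these
       keys might be combined in one definition. The function
       name "scale_by" and the parameter "factor" are invented
       for illustration, and the fragment assumes that bad-value
       support is compiled into PDL. It belongs in a PP file
       ahead of the final pp_done() and is not a standalone Perl
       script.

```perl
# Hypothetical example: multiply a piddle by a scalar factor.
pp_def('scale_by',
    Pars         => 'a(); [o]b();',      # signature: one input, one output
    OtherPars    => 'double factor',     # non-pdl argument, read via $COMP()
    GenericTypes => ['F', 'D'],          # floating-point types only
    HandleBad    => 1,                   # BadCode is used if bad values occur
    Code         => '$b() = $a() * $COMP(factor);',
    BadCode      => 'if ( $ISBAD(a()) ) { $SETBAD(b()); }
                     else { $b() = $a() * $COMP(factor); }',
    Doc          => 'Multiply each element of a by a scalar factor.',
);
```

       The "Inplace" key is deliberately left out here, since
       combining it with bad values needs the extra care
       described above.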

Appendix B: PP macros and functions
       Macros

       Macros labelled by (bad) are only used if bad-value sup-
       port is compiled into PDL.

       $variablename_from_sig()
              access a pdl (by its name) that was specified in
              the signature

       $COMP(x)
              access a value in the private data structure of
              this transformation (mainly used to use an argument
              that is specified in the "OtherPars" section)

       $SIZE(n)
              replaced at runtime by the actual size of a named
              dimension (as specified in the signature)

       $GENERIC()
              replaced by the C type that is equal to the runtime
              type of the operation

       $P(a)  a pointer access to the PDL named "a" in the
              signature. Useful for interfacing to C functions

       $PP(a) a physical pointer access to pdl "a"; mainly for
              internal use

       $TXXX(Alternative,Alternative)
              expansion alternatives according to runtime type of
              operation, where XXX is some string that is matched
              by "/[BSULFD+]/".

       $PDL(a)
              return a pointer to the pdl data structure (pdl *)
              of piddle "a"

       $ISBAD(a()) (bad)
              returns true if the value stored in "a()" equals
              the bad value for this piddle.  Requires "Handle-
              Bad" being set to 1.

       $ISGOOD(a()) (bad)
              returns true if the value stored in "a()" does not
              equal the bad value for this piddle.  Requires
              "HandleBad" being set to 1.

       $SETBAD(a()) (bad)
              Sets "a()" to equal the bad value for this piddle.
              Requires "HandleBad" being set to 1.
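
       A short .pd fragment may help to show a few of these
       macros working together. It is a sketch only: the name
       "meanover_sketch" is made up, and the types are limited to
       floats and doubles so that the division is not an integer
       division. The fragment is meant to be processed by PDL::PP
       as part of a PP file.

```perl
# Hypothetical example: mean over the first dimension.
pp_def('meanover_sketch',
    Pars         => 'a(n); [o]b();',
    GenericTypes => ['F', 'D'],
    Code         => '
        /* $GENERIC() becomes the C type of the current operation */
        $GENERIC() tmp = 0;
        /* $a() refers to the element of a at the current n index */
        loop(n) %{ tmp += $a(); %}
        /* $SIZE(n) is the actual length of the dimension named n */
        $b() = tmp / $SIZE(n);
    ',
);
```

       No check is made for a zero-length dimension "n"; a real
       routine would probably want to handle that case.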

       Functions

       "loop(DIMS) %{ ... %}"
          loop over named dimensions; limits are generated auto-
          matically by PP

       "threadloop %{ ... %}"
          enclose following code in a threadloop

       "types(TYPES) %{ ... %}"
          execute following code if type of operation is any of
          "TYPES"

SEE ALSO
       PDL

       For the concepts of threading and slicing check
       PDL::Indexing.

       PDL::Internals

       PDL::BadValues for information on bad values

       perlxs, perlxstut

CURRENTLY UNDOCUMENTED
       RedoDimsCode, $RESIZE()

BUGS
       PDL::PP is still, even in its rewritten form, too compli-
       cated.  It needs to be rethought a little as well as
       deconvoluted and modularized some more (e.g. all the NS
       things).

       After the rewrite, this can happen little by little,
       though.

       Undocumented functions

       The following functions have been added since this manual
       was written and are as yet undocumented:

       pp_export_nothing
       pp_core_importList
       pp_beginwrap
       pp_setversion
       pp_addbegin

AUTHOR
       Copyright (c) 1997 Tuomas J. Lukka (lukka@fas.harvard.edu),
       Karl Glazebrook (kgb@aaocbn1.aao.GOV.AU) and Christian
       Soeller (c.soeller@auckland.ac.nz). All rights reserved.
       Although destined for release as a man page with the stan-
       dard PDL distribution, it is not public domain. Permission
       is granted to freely distribute verbatim copies of this
       document provided that no modifications outside of format-
       ting be made, and that this notice remain intact.  You are
       permitted and encouraged to use its code and derivatives
       thereof in your own source code for fun or for profit as
       you see fit.



perl v5.6.1                 2000-10-02                      PP(P)