Annotated edit history of perlsub(1) version 2, including all changes.
Rev Author # Line
1 perry 1 PERLSUB
2 !!!PERLSUB
3 NAME
4 SYNOPSIS
5 DESCRIPTION
6 SEE ALSO
7 ----
8 !!NAME
9
10
11 perlsub - Perl subroutines
12 !!SYNOPSIS
13
14
15 To declare subroutines:
16
17
18 sub NAME; # A "forward" declaration.
19 sub NAME BLOCK # A declaration and a definition.
20 sub NAME(PROTO) BLOCK # ditto, but with prototypes
21 sub NAME : ATTRS BLOCK # with attributes
22 sub NAME(PROTO) : ATTRS BLOCK # with prototypes and attributes
23 To define an anonymous subroutine at runtime:
24
25
26 $subref = sub BLOCK; # no proto
27 $subref = sub (PROTO) BLOCK; # with proto
28 $subref = sub : ATTRS BLOCK; # with attributes
29 $subref = sub (PROTO) : ATTRS BLOCK; # with proto and attributes
30 To import subroutines:
31
32
33 use MODULE qw(NAME1 NAME2 NAME3);
34 To call subroutines:
35
36
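37 NAME(LIST); # & is optional with parentheses.
NAME LIST; # Parentheses optional if predeclared/imported.
&NAME(LIST); # Circumvent prototypes.
&NAME; # Makes current @_ visible to called subroutine.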
38 !!DESCRIPTION
39
40
41 Like many languages, Perl provides for user-defined
42 subroutines. These may be located anywhere in the main
43 program, loaded in from other files via the do,
44 require, or use keywords, or generated on
45 the fly using eval or anonymous subroutines. You
46 can even call a function indirectly using a variable
47 containing its name or a CODE
48 reference.
49
50
51 The Perl model for function call and return values is
52 simple: all functions are passed as parameters one single
53 flat list of scalars, and all functions likewise return to
54 their caller one single flat list of scalars. Any arrays or
55 hashes in these call and return lists will collapse, losing
56 their identities--but you may always use pass-by-reference
57 instead to avoid this. Both call and return lists may
58 contain as many or as few scalar elements as you'd like.
59 (Often a function without an explicit return statement is
60 called a subroutine, but there's really no difference from
61 Perl's perspective.)
62
63
64 Any arguments passed in show up in the array @_.
65 Therefore, if you called a function with two arguments,
66 those would be stored in $_[[0] and $_[[1].
67 The array @_ is a local array, but its elements are
68 aliases for the actual scalar parameters. In particular, if
69 an element $_[[0] is updated, the corresponding
70 argument is updated (or an error occurs if it is not
71 updatable). If an argument is an array or hash element which
72 did not exist when the function was called, that element is
73 created only when (and if) it is modified or a reference to
74 it is taken. (Some earlier versions of Perl created the
75 element whether or not the element was assigned to.)
76 Assigning to the whole array @_ removes that
77 aliasing, and does not update any arguments.
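
The aliasing rules above can be seen in a minimal sketch. The helper names double_in_place and no_effect are made up for this illustration and do not come from the original text.

```perl
use strict;
use warnings;

# Hypothetical helper: doubles its arguments in place through the @_ aliases.
sub double_in_place {
    $_ *= 2 for @_;
    return;
}

# Hypothetical helper: assigning to the whole of @_ breaks the aliasing,
# so the caller's variables are left alone.
sub no_effect {
    @_ = map { $_ * 2 } @_;
    return;
}

my ($x, $y) = (1, 2);
double_in_place($x, $y);
print "$x $y\n";    # prints "2 4"

no_effect($x, $y);
print "$x $y\n";    # still prints "2 4"
```

Copying @_ into my variables first, as described later in this section, is the usual way to avoid changing the caller's values by accident.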
78
79
80 The return value of a subroutine is the value of the last
81 expression evaluated. More explicitly, a return
82 statement may be used to exit the subroutine, optionally
83 specifying the returned value, which will be evaluated in
84 the appropriate context (list, scalar, or void) depending on
85 the context of the subroutine call. If you specify no return
86 value, the subroutine returns an empty list in list context,
87 the undefined value in scalar context, or nothing in void
88 context. If you return one or more aggregates (arrays and
89 hashes), these will be flattened together into one large
90 indistinguishable list.
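
As a rough illustration of how the calling context affects the return value, here is a short sketch. The subs context and nothing are hypothetical names chosen for this example only.

```perl
use strict;
use warnings;

# Hypothetical sub that reports the context it was called in.
sub context {
    return wantarray ? "list" : defined(wantarray) ? "scalar" : "void";
}

# A bare return gives an empty list or undef, depending on context.
sub nothing { return }

my @list   = context();     # list context
my $scalar = context();     # scalar context
my @empty  = nothing();
my $undef  = nothing();

print "$list[0] $scalar\n";    # prints "list scalar"
print scalar(@empty), " ", defined($undef) ? "defined" : "undef", "\n";    # prints "0 undef"
```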
91
92
93 Perl does not have named formal parameters. In practice all
94 you do is assign to a my() list of these. Variables
95 that aren't declared to be private are global variables. For
96 gory details on creating private variables, see ``Private
97 Variables via ''my()'''' and ``Temporary Values via
98 ''local()''''. To create protected environments for a set
99 of functions in a separate package (and probably a separate
100 file), see ``Packages'' in perlmod.
101
102
103 Example:
104
105
106 sub max {
107 my $max = shift(@_);
108 foreach $foo (@_) {
109 $max = $foo if $max < $foo;
}
return $max;
}
$bestday = max($mon,$tue,$wed,$thu,$fri);
110 Example:
111
112
113 # get a line, combining continuation lines
114 # that start with whitespace
115 sub get_line {
116 $thisline = $lookahead; # global variables!
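117 LINE: while (defined($lookahead = <STDIN>)) {
if ($lookahead =~ /^[[ \t]/) {
$thisline .= $lookahead;
}
else {
last LINE;
}
}
return $thisline;
}
118 $lookahead = <STDIN>; # get first line
while (defined($line = get_line())) {
...
}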
119 Assigning to a list of private variables to name your arguments:
120
121
122 sub maybeset {
123 my($key, $value) = @_;
124 $Foo{$key} = $value unless $Foo{$key};
125 }
126 Because the assignment copies the values, this also has the effect of turning call-by-reference into call-by-value. Otherwise a function is free to do in-place modifications of @_ and change its caller's values.
127
128
129 upcase_in($v1, $v2); # this changes $v1 and $v2
130 sub upcase_in {
131 for (@_) { tr/a-z/A-Z/ }
132 }
133 You aren't allowed to modify constants in this way, of course. If an argument were actually literal and you tried to change it, you'd take a (presumably fatal) exception. For example, this won't work:
134
135
136 upcase_in("frederick");
137 It would be much safer if the upcase_in() function were written to return a copy of its parameters instead of changing them in place:
138
139
140 ($v3, $v4) = upcase($v1, $v2); # this doesn't change $v1 and $v2
141 sub upcase {
142 return unless defined wantarray; # void context, do nothing
143 my @parms = @_;
144 for (@parms) { tr/a-z/A-Z/ }
145 return wantarray ? @parms : $parms[[0];
146 }
147 Notice how this (unprototyped) function doesn't care whether it was passed real scalars or arrays. Perl sees all arguments as one big, long, flat parameter list in @_. This is one area where Perl's simple argument-passing style shines. The upcase() function would work perfectly well without changing the upcase() definition even if we fed it things like this:
148
149
150 @newlist = upcase(@list1, @list2);
151 @newlist = upcase( split /:/, $var );
152 Do not, however, be tempted to do this:
153
154
155 (@a, @b) = upcase(@list1, @list2);
156 Like the flattened incoming parameter list, the return list is also flattened on return. So all you have managed to do here is store everything in @a and make @b an empty list. See ``Pass by Reference'' for alternatives.
157
158
159 A subroutine may be called using an explicit &
160 prefix. The & is optional in modern Perl, as
161 are parentheses if the subroutine has been predeclared. The
162 & is ''not'' optional when just naming the
163 subroutine, such as when it's used as an argument to
164 ''defined()'' or ''undef()''. Nor is it optional when
165 you want to do an indirect subroutine call with a subroutine
166 name or reference using the &$subref() or
167 &{$subref}() constructs, although the
168 $subref->() notation solves that problem. See
169 perlref for more about all that.
170
171
172 Subroutines may be called recursively. If a subroutine is
173 called using the & form, the argument list is
174 optional, and if omitted, no @_ array is set up for
175 the subroutine: the @_ array at the time of the
176 call is visible to the called subroutine instead. This is an efficiency
177 mechanism that new users may wish to avoid.
178
179
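180
&foo(1,2,3); # pass three arguments
foo(1,2,3); # the same
181 foo(); # pass a null list
&foo(); # the same
&foo; # foo() gets current args, like foo(@_) !!
foo; # like foo() IFF sub foo predeclared, else "foo"
182
183 Not only does the & form make the argument list optional, it also disables any prototype checking on arguments you do provide. This is partly for historical reasons, and partly for having a convenient way to cheat if you know what you're doing. See Prototypes below.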
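
The difference between calling with and without an argument list can be sketched as follows. The names count_args and outer are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper that reports how many arguments it can see.
sub count_args { return scalar @_ }

sub outer {
    my $explicit = count_args();   # fresh, empty @_
    my $implicit = &count_args;    # no parentheses: sees outer()'s @_
    return ($explicit, $implicit);
}

my ($explicit, $implicit) = outer('a', 'b', 'c');
print "$explicit $implicit\n";     # prints "0 3"
```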
184
185
186 Functions whose names are in all upper case are reserved to
187 the Perl core, as are modules whose names are in all lower
188 case. A function in all capitals is a loosely-held
189 convention meaning it will be called indirectly by the
190 run-time system itself, usually due to a triggered event.
191 Functions that do special, pre-defined things include
192 BEGIN, CHECK, INIT, END,
193 AUTOLOAD, and DESTROY--plus all functions
194 mentioned in perltie.
195
196
197 __Private Variables via__ ''my()''
198
199
200 Synopsis:
201
202
203 my $foo; # declare $foo lexically local
204 my (@wid, %get); # declare list of variables local
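205 my $foo = "flurp"; # declare $foo lexical, and init it
my @oof = @bar; # declare @oof lexical, and init it
my $x : Foo = $y; # similar, with an attribute applied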
206 __WARNING__ : The use of attribute lists on my declarations is experimental. This feature should not be relied upon. It may change or disappear in future releases of Perl. See attributes.
207
208
209 The my operator declares the listed variables to be
210 lexically confined to the enclosing block, conditional
211 (if/unless/elsif/else), loop
212 (for/foreach/while/until/continue), subroutine,
213 eval, or do/require/use'd file. If more
214 than one value is listed, the list must be placed in
215 parentheses. All listed elements must be legal lvalues. Only
216 alphanumeric identifiers may be lexically scoped--magical
217 built-ins like $/ must currently be
218 localized with local instead.
219
220
221 Unlike dynamic variables created by the local
222 operator, lexical variables declared with my are
223 totally hidden from the outside world, including any called
224 subroutines. This is true if it's the same subroutine called
225 from itself or elsewhere--every call gets its own
226 copy.
227
228
229 This doesn't mean that a my variable declared in a
230 statically enclosing lexical scope would be invisible. Only
231 dynamic scopes are cut off. For example, the
232 bumpx() function below has access to the lexical
233 $x variable because both the my and the
234 sub occurred at the same scope, presumably file
235 scope.
236
237
238 my $x = 10;
239 sub bumpx { $x++ }
240 An eval(), however, can see lexical variables of the scope it is being evaluated in, so long as the names aren't hidden by declarations within the eval() itself. See perlref.
241
242
243 The parameter list to ''my()'' may be assigned to if
244 desired, which allows you to initialize your variables. (If
245 no initializer is given for a particular variable, it is
246 created with the undefined value.) Commonly this is used to
247 name input parameters to a subroutine.
248 Examples:
249
250
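251 $arg = "fred"; # "global" variable
$n = cube_root(27);
print "$arg thinks the root is $n\n";
# prints "fred thinks the root is 3"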
252 sub cube_root {
253 my $arg = shift; # name doesn't matter
254 $arg **= 1/3;
255 return $arg;
256 }
257 The my is simply a modifier on something you might assign to. So when you do assign to variables in its argument list, my doesn't change whether those variables are viewed as a scalar or an array. So
258
259
260 my ($foo) = <STDIN>; # WRONG?
my @FOO = <STDIN>;
261 both supply a list context to the right-hand side, while
262
263
264 my $foo = <STDIN>;
265 supplies a scalar context. But the following declares only one variable:
266
267
268 my $foo, $bar = 1; # WRONG
269 That has the same effect as
270
271
272 my $foo;
273 $bar = 1;
274 The declared variable is not introduced (is not visible) until after the current statement. Thus,
275
276
277 my $x = $x;
278 can be used to initialize a new $x with the value of the old $x, and the expression
279
280
281 my $x = 123 and $x == 123
282 is false unless the old $x happened to have the value 123.
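
A small sketch of the rule that a new lexical only becomes visible after the statement that declares it. The variable names are arbitrary.

```perl
use strict;
use warnings;

my $x = 10;
my $inner;
{
    # The $x on the right-hand side is still the outer variable, because
    # the new $x is not visible until this statement has finished.
    my $x = $x + 1;
    $inner = $x;
}
print "$inner $x\n";    # prints "11 10"
```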
283
284
285 Lexical scopes of control structures are not bounded
286 precisely by the braces that delimit their controlled
287 blocks; control expressions are part of that scope, too.
288 Thus in the loop
289
290
291 while (my $line = <>) {
$line = lc $line;
} continue {
print $line;
}
292 the scope of $line extends from its declaration throughout the rest of the loop construct (including the continue clause), but not beyond it. Similarly, in the conditional
293
294
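295 if ((my $answer = <STDIN>) =~ /^yes$/i) {
user_agrees();
} elsif ($answer =~ /^no$/i) {
user_disagrees();
} else {
chomp $answer;
die "'$answer' is neither 'yes' nor 'no'";
}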
296 the scope of $answer extends from its declaration through the rest of that conditional, including any elsif and else clauses, but not beyond it.
297
298
299 None of the foregoing text applies to if/unless or
300 while/until modifiers appended to simple
301 statements. Such modifiers are not control structures and
302 have no effect on scoping.
303
304
305 The foreach loop defaults to scoping its index
306 variable dynamically in the manner of local.
307 However, if the index variable is prefixed with the keyword
308 my, or if there is already a lexical by that name
309 in scope, then a new lexical is created instead. Thus in the
310 loop
311
312
313 for my $i (1, 2, 3) {
314 some_function();
315 }
316 the scope of $i extends to the end of the loop, but not beyond it, rendering the value of $i inaccessible within some_function().
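
To contrast the two kinds of loop variable, here is a sketch that assumes a package variable $i declared with our. The subs some_function and peek are hypothetical callees that can only see that package variable.

```perl
use strict;
use warnings;

our $i = "global";
my (@seen, @dyn);

# Hypothetical callees: both read the package variable $i.
sub some_function { push @seen, $i }
sub peek          { push @dyn,  $i }

for my $i (1, 2, 3) {
    some_function();    # the lexical loop variable is invisible in here
}

for $i (1, 2, 3) {
    peek();             # the package variable is temporarily aliased
}

print "@seen\n";        # prints "global global global"
print "@dyn\n";         # prints "1 2 3"
```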
317
318
319 Some users may wish to encourage the use of lexically scoped
320 variables. As an aid to catching implicit uses to package
321 variables, which are always global, if you say
322
323
324 use strict 'vars';
325 then any variable mentioned from there to the end of the enclosing block must either refer to a lexical variable, be predeclared via our or use vars, or else must be fully qualified with the package name. A compilation error results otherwise. An inner block may countermand this with no strict 'vars'.
326
327
328 A my has both a compile-time and a run-time effect.
329 At compile time, the compiler takes notice of it. The
330 principal usefulness of this is to quiet use strict
331 'vars', but it is also essential for generation of
332 closures as detailed in perlref. Actual initialization is
333 delayed until run time, though, so it gets executed at the
334 appropriate time, such as each time through a loop, for
335 example.
336
337
338 Variables declared with my are not part of any
339 package and are therefore never fully qualified with the
340 package name. In particular, you're not allowed to try to
341 make a package variable (or other global)
342 lexical:
343
344
345 my $pack::var; # ERROR! Illegal syntax
346 my $_; # also illegal (currently)
347 In fact, dynamic variables (also known as package or global variables) are still accessible using the fully qualified :: notation even while a lexical of the same name is also visible:
348
349
350 package main;
351 local $x = 10;
352 my $x = 20;
353 print "$x and $::x\n";
354 That will print out 20 and 10.
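
The same idea in a form that runs under use strict, where the package variable has to be written fully qualified. This is only a sketch, and the variable $report is made up for it.

```perl
use strict;
use warnings;

$main::x = 10;      # package (global) variable, fully qualified
my $x = 20;         # lexical of the same name

my $report = "$x and $main::x";
print "$report\n";  # prints "20 and 10"
```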
355
356
357 You may declare my variables at the outermost scope
358 of a file to hide any such identifiers from the world
359 outside that file. This is similar in spirit to C's static
360 variables when they are used at the file level. To do this
361 with a subroutine requires the use of a closure (an
362 anonymous function that accesses enclosing lexicals). If you
363 want to create a private subroutine that cannot be called
364 from outside that block, you can declare a lexical variable
365 containing an anonymous sub reference:
366
367
368 my $secret_version = '1.001-beta';
369 my $secret_sub = sub { print $secret_version };
370 As long as the reference is never returned by any function within the module, no outside module can see the subroutine, because its name is not in any package's symbol table. Remember that it's not ''REALLY'' called $some_pack::secret_version or anything; it's just $secret_version, unqualified and unqualifiable.
371
372
373 This does not work with object methods, however; all object
374 methods have to be in the symbol table of some package to be
375 found. See ``Function Templates'' in perlref for something
376 of a work-around to this.
377
378
379 __Persistent Private Variables__
380
381
382 Just because a lexical variable is lexically (also called
383 statically) scoped to its enclosing block, eval, or
384 do FILE , this doesn't mean that
385 within a function it works like a C static. It normally
386 works more like a C auto, but with implicit garbage
387 collection.
388
389
390 Unlike local variables in C or C ++ , Perl's
391 lexical variables don't necessarily get recycled just
392 because their scope has exited. If something more permanent
393 is still aware of the lexical, it will stick around. So long
394 as something else references a lexical, that lexical won't
395 be freed--which is as it should be. You wouldn't want memory
396 being freed until you were done using it, or kept around once
397 you were done. Automatic garbage collection takes care of
398 this for you.
399
400
401 This means that you can pass back or save away references to
402 lexical variables, whereas to return a pointer to a C auto
403 is a grave error. It also gives us a way to simulate C's
404 function statics. Here's a mechanism for giving a function
405 private variables with both lexical scoping and a static
406 lifetime. If you do want to create something like C's static
407 variables, just enclose the whole function in an extra
408 block, and put the static variable outside the function but
409 in the block.
410
411
412 {
413 my $secret_val = 0;
414 sub gimme_another {
415 return ++$secret_val;
416 }
417 }
418 # $secret_val now becomes unreachable by the outside
419 # world, but retains its value between calls to gimme_another
420 If this function is being sourced in from a separate file via require or use, then this is probably just fine. If it's all in the main program, you'll need to arrange for the my to be executed early, either by putting the whole block above your main program, or more likely, placing merely a BEGIN sub around it to make sure it gets executed before your program starts to run:
421
422
423 sub BEGIN {
424 my $secret_val = 0;
425 sub gimme_another {
426 return ++$secret_val;
427 }
428 }
429 See ``Package Constructors and Destructors'' in perlmod about the special triggered functions, BEGIN, CHECK, INIT and END.
430
431
432 If declared at the outermost scope (the file scope), then
433 lexicals work somewhat like C's file statics. They are
434 available to all functions in that same file declared below
435 them, but are inaccessible from outside that file. This
436 strategy is sometimes used in modules to create private
437 variables that the whole module can see.
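
Because a lexical stays alive for as long as something refers to it, a sub can also hand out closures that each carry their own private state. The following is a sketch only, and make_counter is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical generator: each call returns a new counter with its own
# private $count, kept alive by the closure that references it.
sub make_counter {
    my $count = shift // 0;
    return sub { return ++$count };
}

my $a_counter = make_counter();
my $b_counter = make_counter(10);

$a_counter->() for 1 .. 2;          # first counter is now at 2
my @values = ($a_counter->(), $b_counter->());
print "@values\n";                  # prints "3 11"
```

Unlike the block-plus-named-sub idiom above, this gives one independent private variable per generated counter.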
438
439
440 __Temporary Values via__ ''local()''
441
442
443 __WARNING__ : In general, you should be
444 using my instead of local, because it's
445 faster and safer. Exceptions to this include the global
446 punctuation variables, filehandles and formats, and direct
447 manipulation of the Perl symbol table itself. Format
448 variables often use local though, as do other
449 variables whose current value must be visible to called
450 subroutines.
451
452
453 Synopsis:
454
455
456 local $foo; # declare $foo dynamically local
457 local (@wid, %get); # declare list of variables local
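458 local $foo = "flurp"; # declare $foo dynamic, and init it
local @oof = @bar; # declare @oof dynamic, and init it
459 local *FH; # localize $FH, @FH, %FH, &FH ...
local *merlyn = *randal; # now $merlyn is really $randal, plus
# @merlyn is really @randal, etc
local *merlyn = 'randal'; # SAME THING: promote 'randal' to *randal
local *merlyn = \$randal; # just alias $merlyn, not @merlyn etc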
460 A local modifies its listed variables to be ``local'' to the enclosing block, eval, or do FILE--and to ''any subroutine called from within that block''. A local just gives temporary values to global (meaning package) variables. It does ''not'' create a local variable. This is known as dynamic scoping. Lexical scoping is done with my, which works more like C's auto declarations.
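
A short sketch of the difference between dynamic and lexical scoping. The package variable $level and the subs show, with_local and with_my are invented for this illustration.

```perl
use strict;
use warnings;

our $level = "global";

sub show { return $level }          # reads the package variable

sub with_local {
    local $level = "temporary";     # visible to everything called from here
    return show();
}

sub with_my {
    my $level = "lexical";          # visible only inside this sub
    return show();
}

my @results = (with_local(), with_my(), show());
print "@results\n";                 # prints "temporary global global"
```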
461
462
463 If more than one variable is given to local, they
464 must be placed in parentheses. All listed elements must be
465 legal lvalues. This operator works by saving the current
466 values of those variables in its argument list on a hidden
467 stack and restoring them upon exiting the block, subroutine,
468 or eval. This means that called subroutines can also
469 reference the local variable, but not the global one. The
470 argument list may be assigned to if desired, which allows
471 you to initialize your local variables. (If no initializer
472 is given for a particular variable, it is created with an
473 undefined value.) Commonly this is used to name the
474 parameters to a subroutine. Examples:
475
476
477 for $i ( 0 .. 9 ) {
478 $digits{$i} = $i;
479 }
480 # assume this function uses global %digits hash
481 parse_num();
482 # now temporarily add to %digits hash
483 if ($base12) {
484 # (NOTE: not claiming this is efficient!)
485 local %digits = (%digits, 't' => 10, 'e' => 11);
parse_num(); # parse_num gets this new %digits!
}
# old %digits restored here
486 Because local is a run-time operator, it gets executed each time through a loop. In releases of Perl previous to 5.0, this used more stack storage each time until the loop was exited. Perl now reclaims the space each time through, but it's still more efficient to declare your variables outside the loop.
487
488
489 A local is simply a modifier on an lvalue
490 expression. When you assign to a localized
491 variable, the local doesn't change whether its list
492 is viewed as a scalar or an array. So
493
494
495 local($foo) = <STDIN>;
local @FOO = <STDIN>;
496 both supply a list context to the right-hand side, while
497
498
499 local $foo = <STDIN>;
500 supplies a scalar context.
501
502
503 A note about local() and composite types is in
504 order. Something like local(%foo) works by
505 temporarily placing a brand new hash in the symbol table.
506 The old hash is left alone, but is hidden ``behind'' the new
507 one.
508
509
510 This means the old variable is completely invisible via the
511 symbol table (i.e. the hash entry in the *foo
512 typeglob) for the duration of the dynamic scope within which
513 the local() was seen. This has the effect of
514 allowing one to temporarily occlude any magic on composite
515 types. For instance, this will briefly alter a tied hash to
516 some other implementation:
517
518
519 tie %ahash, 'APackage';
520 [[...]
521 {
522 local %ahash;
523 tie %ahash, 'BPackage';
524 [[..called code will see %ahash tied to 'BPackage'..]
525 {
526 local %ahash;
527 [[..%ahash is a normal (untied) hash here..]
528 }
529 }
530 [[..%ahash back to its initial tied self again..]
531 As another example, a custom implementation of %ENV might look like this:
532
533
534 {
535 local %ENV;
2 perry 536 tie %ENV, '!MyOwnEnv';
1 perry 537 [[..do your own fancy %ENV manipulation here..]
538 }
539 [[..normal %ENV behavior here..]
540 It's also worth taking a moment to explain what happens when you localize a member of a composite type (i.e. an array or hash element). In this case, the element is localized ''by name''. This means that when the scope of the local() ends, the saved value will be restored to the hash element whose key was named in the local(), or the array element whose index was named in the local(). If that element was deleted while the local() was in effect (e.g. by a delete() from a hash or a shift() of an array), it will spring back into existence, possibly extending an array and filling in the skipped elements with undef. For instance, if you say
541
542
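543 %hash = ( 'This' => 'is', 'a' => 'test' );
@ary = ( 0..5 );
{
local($ary[[5]) = 6;
local($hash{'a'}) = 'drill';
while (my $e = pop(@ary)) {
print "$e . . .\n";
last unless $e > 3;
}
if (@ary) {
$hash{'only a'} = 'test';
delete $hash{'a'};
}
}
print join(' ', map { "$_ $hash{$_}" } sort keys %hash),".\n";
print "The array has ",scalar(@ary)," elements: ",
join(', ', map { defined $_ ? $_ : 'undef' } @ary),"\n";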
544 Perl will print
545
546
547 6 . . .
548 4 . . .
549 3 . . .
550 This is a test only a test.
551 The array has 6 elements: 0, 1, 2, undef, undef, 5
552 The behavior of ''local()'' on non-existent members of composite types is subject to change in future.
553
554
555 __Lvalue subroutines__
556
557
558 __WARNING__ : Lvalue subroutines are still
559 experimental and the implementation may change in future
560 versions of Perl.
561
562
563 It is possible to return a modifiable value from a
564 subroutine. To do this, you have to declare the subroutine
565 to return an lvalue.
566
567
568 my $val;
569 sub canmod : lvalue {
570 $val;
571 }
572 sub nomod {
573 $val;
574 }
575 canmod() = 5; # assigns to $val
576 nomod() = 5; # ERROR
577 The scalar/list context for the subroutine and for the right-hand side of assignment is determined as if the subroutine call is replaced by a scalar. For example, consider:
578
579
580 data(2,3) = get_data(3,4);
581 Both subroutines here are called in a scalar context, while in:
582
583
584 (data(2,3)) = get_data(3,4);
585 and in:
586
587
588 (data(2),data(3)) = get_data(3,4);
589 all the subroutines are called in a list context.
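
An lvalue sub is not limited to a plain scalar. As a sketch, the hypothetical accessor setting below returns a hash element that the caller can assign to.

```perl
use strict;
use warnings;

my %config = (depth => 1);

# Hypothetical accessor: the :lvalue attribute lets callers assign to the
# hash element that the sub evaluates last.
sub setting : lvalue {
    my $key = shift;
    $config{$key};
}

setting('depth') = 5;           # assigns to $config{depth}
setting('name')  = "tree";      # creates $config{name}
print "$config{depth} $config{name}\n";   # prints "5 tree"
```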
590
591
592 __Passing Symbol Table Entries (typeglobs)__
593
594
595 __WARNING__ : The mechanism described in
596 this section was originally the only way to simulate
597 pass-by-reference in older versions of Perl. While it still
598 works fine in modern versions, the new reference mechanism
599 is generally easier to work with. See below.
600
601
602 Sometimes you don't want to pass the value of an array to a
603 subroutine but rather the name of it, so that the subroutine
604 can modify the global copy of it rather than working with a
605 local copy. In perl you can refer to all objects of a
606 particular name by prefixing the name with a star:
607 *foo. This is often known as a ``typeglob'',
608 because the star on the front can be thought of as a
609 wildcard match for all the funny prefix characters on
610 variables and subroutines and such.
611
612
613 When evaluated, the typeglob produces a scalar value that
614 represents all the objects of that name, including any
615 filehandle, format, or subroutine. When assigned to, it
616 causes the name mentioned to refer to whatever *
617 value was assigned to it. Example:
618
619
620 sub doubleary {
621 local(*someary) = @_;
622 foreach $elem (@someary) {
623 $elem *= 2;
624 }
625 }
626 doubleary(*foo);
627 doubleary(*bar);
628 Scalars are already passed by reference, so you can modify scalar arguments without using this mechanism by referring explicitly to $_[[0] etc. You can modify all the elements of an array by passing all the elements as scalars, but you have to use the * mechanism (or the equivalent reference mechanism) to push, pop, or change the size of an array. It will certainly be faster to pass the typeglob (or reference).
629
630
631 Even if you don't want to modify an array, this mechanism is
632 useful for passing multiple arrays in a single
633 LIST , because normally the
634 LIST mechanism will merge all the array
635 values so that you can't extract out the individual arrays.
636 For more on typeglobs, see ``Typeglobs and Filehandles'' in
637 perldata.
638
639
640 __When to Still Use__ ''local()''
641
642
643 Despite the existence of my, there are three
644 places where the local operator still shines. In
645 fact, in these three places, you ''must'' use
646 local instead of my.
647
648
649 1.
650
651
652 You need to give a global variable a temporary value,
653 especially $_.
654
655
656 The global variables, like @ARGV or the punctuation
657 variables, must be localized with local().
658 This block reads in ''/etc/motd'', and splits it up into
659 chunks separated by lines of equal signs, which are placed
660 in @Fields.
661
662
663 {
664 local @ARGV = ("/etc/motd");
local $/ = undef;
local $_ = <>;
@Fields = split /^\s*=+\s*$/m;
}
665 In particular, it's important to localize $_ in any routine that assigns to it. Look out for implicit assignments in while conditionals.
666
667
668 2.
669
670
671 You need to create a local file or directory handle or a
672 local function.
673
674
675 A function that needs a filehandle of its own must use
676 local() on a complete typeglob. This can be used to
677 create new symbol table entries:
678
679
680 sub ioqueue {
681 local (*READER, *WRITER); # not my!
682 pipe (READER, WRITER) or die "pipe: $!";
return (*READER, *WRITER);
}
($head, $tail) = ioqueue();
683 See the Symbol module for a way to create anonymous symbol table entries.
684
685
686 Because assignment of a reference to a typeglob creates an
687 alias, this can be used to create what is effectively a
688 local function, or at least, a local alias.
689
690
691 {
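692 local *grow = \&shrink; # only until this block exits
grow(); # really calls shrink()
move(); # if move() grow()s, it shrink()s too
}
grow(); # get the real grow() again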
693 See ``Function Templates'' in perlref for more about manipulating functions by name in this way.
694
695
696 3.
697
698
699 You want to temporarily change just one element of an array
700 or hash.
701
702
703 You can localize just one element of an aggregate.
704 Usually this is done on dynamics:
705
706
707 {
708 local $SIG{INT} = 'IGNORE';
709 funct(); # uninterruptible
710 }
711 # interruptibility automatically restored here
712 But it also works on lexically declared aggregates. Prior to 5.005, this operation could on occasion misbehave.
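
The same technique applied to an ordinary package hash, as a minimal sketch. The hash %opt and the sub report are hypothetical.

```perl
use strict;
use warnings;

our %opt = (verbose => 0);

sub report { return $opt{verbose} ? "loud" : "quiet" }

my @log;
{
    local $opt{verbose} = 1;    # only this one element is saved and replaced
    push @log, report();
}
push @log, report();            # the old value is back
print "@log\n";                 # prints "loud quiet"
```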
713
714
715 __Pass by Reference__
716
717
718 If you want to pass more than one array or hash into a
719 function--or return them from it--and have them maintain
720 their integrity, then you're going to have to use an
721 explicit pass-by-reference. Before you do that, you need to
722 understand references as detailed in perlref. This section
723 may not make much sense to you otherwise.
724
725
726 Here are a few simple examples. First, let's pass in several
727 arrays to a function and have it pop all of them,
728 returning a new list of all their former last
729 elements:
730
731
732 @tailings = popmany ( \@a, \@b, \@c, \@d );
733 sub popmany {
734 my $aref;
735 my @retlist = ();
736 foreach $aref ( @_ ) {
737 push @retlist, pop @$aref;
738 }
739 return @retlist;
740 }
741 Here's how you might write a function that returns a list of keys occurring in all the hashes passed to it:
742
743
744 @common = inter( \%foo, \%bar, \%joe );
745 sub inter {
746 my ($k, $href, %seen); # locals
747 foreach $href (@_) {
748 while ( $k = each %$href ) {
749 $seen{$k}++;
750 }
751 }
752 return grep { $seen{$_} == @_ } keys %seen;
753 }
754 So far, we're using just the normal list return mechanism. What happens if you want to pass or return a hash? Well, if you're using only one of them, or you don't mind them concatenating, then the normal calling convention is ok, although a little expensive.
755
756
757 Where people get into trouble is here:
758
759
760 (@a, @b) = func(@c, @d);
761 or
762 (%a, %b) = func(%c, %d);
763 That syntax simply won't work. It sets just @a or %a and clears the @b or %b. Plus the function didn't get passed into two separate arrays or hashes: it got one long list in @_, as always.
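
One way around the flattening, sketched here with a hypothetical function split_even_odd, is to pass references in and return references out so that each array keeps its identity.

```perl
use strict;
use warnings;

# Hypothetical function: takes two array references and returns
# references to two new arrays, so neither list is flattened.
sub split_even_odd {
    my ($first, $second) = @_;
    my (@even, @odd);
    for my $n (@$first, @$second) {
        if ($n % 2) { push @odd, $n } else { push @even, $n }
    }
    return (\@even, \@odd);
}

my @c = (1, 2, 3);
my @d = (4, 5, 6);
my ($even, $odd) = split_even_odd(\@c, \@d);
print "@$even | @$odd\n";   # prints "2 4 6 | 1 3 5"
```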
764
765
766 If you can arrange for everyone to deal with this through
767 references, it's cleaner code, although not so nice to look
768 at. Here's a function that takes two array references as
769 arguments, returning the two references in order of how
770 many elements their arrays contain:
771
772
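773 ($aref, $bref) = func(\@c, \@d);
774 print "@$aref has more than @$bref\n";
sub func {
my ($cref, $dref) = @_;
if (@$cref > @$dref) {
return ($cref, $dref);
} else {
return ($dref, $cref);
}
}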
775 It turns out that you can actually do this also:
776
777
778 (*a, *b) = func(\@c, \@d);
779 print "@a has more than @b\n";
780 Here we're using the typeglobs to do symbol table aliasing. It's a tad subtle, though, and also won't work if you're using my variables, because only globals (even in disguise as locals) are in the symbol table.
781
782
783 If you're passing around filehandles, you could usually just
784 use the bare typeglob, like *STDOUT, but typeglob
785 references work, too. For example:
786
787
788 splutter(\*STDOUT);
789 sub splutter {
790 my $fh = shift;
791 print $fh "her um well a hmmm\n";
}
792 $rec = get_rec(\*STDIN);
793 sub get_rec {
794 my $fh = shift;
795 return scalar <$fh>;
}
796 If you're planning on generating new filehandles, you could do this. Note that it passes back just the bare *FH, not its reference.
797
798
799 sub openit {
800 my $path = shift;
801 local *FH;
802 return open (FH, $path) ? *FH : undef;
803 }
804
805
806 __Prototypes__
807
808
809 Perl supports a very limited kind of compile-time argument
810 checking using function prototyping. If you
811 declare
812
813
814 sub mypush (\@@)
815 then mypush() takes arguments exactly like push() does. The function declaration must be visible at compile time. The prototype affects only interpretation of new-style calls to the function, where new-style is defined as not using the & character. In other words, if you call it like a built-in function, then it behaves like a built-in function. If you call it like an old-fashioned subroutine, then it behaves like an old-fashioned subroutine. It naturally falls out from this rule that prototypes have no influence on subroutine references like \&foo or on indirect subroutine calls like &{$subref} or $subref->().
816
817
818 Method calls are not influenced by prototypes either,
819 because the function to be called is indeterminate at
820 compile time, since the exact code called depends on
821 inheritance.
822
823
824 Because the intent of this feature is primarily to let you
825 define subroutines that work like built-in functions, here
826 are prototypes for some other functions that parse almost
827 exactly like the corresponding built-in.
828
829
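830 Declared as                Called as
831 sub mylink ($$)            mylink $old, $new
832 sub myvec ($$$)            myvec $var, $offset, 1
833 sub myindex ($$;$)         myindex &getstring, "substr"
sub mysyswrite ($$$;$)     mysyswrite $buf, 0, length($buf) - $off, $off
sub myreverse (@)          myreverse $a, $b, $c
sub myjoin ($@)            myjoin ":", $a, $b, $c
sub mypop (\@)             mypop @array
sub mysplice (\@$$@)       mysplice @array, @array, 0, @pushme
sub mykeys (\%)            mykeys %{$hashref}
sub myopen (*;$)           myopen HANDLE, $name
sub mypipe (**)            mypipe READHANDLE, WRITEHANDLE
sub mygrep (&@)            mygrep { /foo/ } $a, $b, $c
sub myrand ($)             myrand 42
sub mytime ()              mytime
834 Any backslashed prototype character represents an actual argument that absolutely must start with that character. The value passed as part of @_ will be a reference to the actual argument given in the subroutine call, obtained by applying \ to that argument.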
834 Any backslashed prototype character represents an actual argument that absolutely must start with that character. The value passed as part of @_ will be a reference to the actual argument given in the subroutine call, obtained by applying \ to that argument.
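
As a sketch of a backslashed prototype in use, here is one possible body for a push look-alike. The body is an assumption made for illustration, since the text above gives only the declaration.

```perl
use strict;
use warnings;

# Hypothetical push look-alike: \@ makes Perl pass a reference to the
# array named in the call, and the trailing @ slurps the remaining values.
sub mypush (\@@) {
    my ($aref, @values) = @_;
    push @$aref, @values;
    return scalar @$aref;
}

my @stack = (1, 2);
my $size  = mypush @stack, 3, 4;    # called like the built-in, no backslash
print "$size: @stack\n";            # prints "4: 1 2 3 4"
```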
835
836
837 Unbackslashed prototype characters have special meanings.
838 Any unbackslashed @ or % eats all
839 remaining arguments, and forces list context. An argument
840 represented by $ forces scalar context. An &
841 requires an anonymous subroutine, which, if
842 passed as the first argument, does not require the
843 sub keyword or a subsequent comma.
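Here is a small sketch of the & rule. The function apply_each is a made-up name; its (&@) prototype lets the first argument be a bare block, while an explicit anonymous sub in the same slot still needs its comma.

```perl
use strict;
use warnings;

# Hypothetical helper: calls the code once per remaining argument.
sub apply_each (&@) {
    my ($code, @items) = @_;
    return map { $code->($_) } @items;
}

# Bare block in first position: no "sub" keyword, no comma.
my @doubled = apply_each { $_[0] * 2 } 1, 2, 3;

# Explicit anonymous sub: allowed, but the comma is required.
my @squared = apply_each(sub { $_[0] ** 2 }, 1, 2, 3);

print "@doubled | @squared\n";
```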
844
845
846 A * allows the subroutine to accept a bareword,
847 constant, scalar expression, typeglob, or a reference to a
848 typeglob in that slot. The value will be available to the
849 subroutine either as a simple scalar, or (in the latter two
850 cases) as a reference to the typeglob. If you wish to always
851 convert such arguments to a typeglob reference, use
852 ''Symbol::qualify_to_ref()'' as follows:
853
854
855 use Symbol 'qualify_to_ref';
856 sub foo (*) {
857 my $fh = qualify_to_ref(shift, caller);
858 ...
859 }
860 A semicolon separates mandatory arguments from optional arguments. It is redundant before @ or %, which gobble up everything else.
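To make the semicolon concrete, here is a sketch with a hypothetical function repeat that takes one mandatory and one optional scalar. Since prototype violations are reported at compile time, the failing call is wrapped in a string eval so the error can be inspected.

```perl
use strict;
use warnings;

# One mandatory scalar, then one optional scalar.
sub repeat ($;$) {
    my ($text, $count) = @_;
    $count = 2 unless defined $count;
    return $text x $count;
}

my $twice  = repeat('ab');       # optional argument omitted
my $thrice = repeat('ab', 3);    # optional argument supplied

# A third argument violates the prototype, so this call fails to compile.
my $ok    = eval q{ repeat('ab', 3, 4); 1 };
my $error = $@;

print "$twice $thrice\n";
print "rejected: $error" unless $ok;
```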
861
862
863 Note how the last three examples in the table above are
864 treated specially by the parser. mygrep() is parsed
865 as a true list operator, myrand() is parsed as a
866 true unary operator with unary precedence the same as
867 rand(), and mytime() is truly without
868 arguments, just like time(). That is, if you
869 say
870
871
872 mytime +2;
873 you'll get mytime() + 2, not mytime(2), which is how it would be parsed without a prototype.
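The parsing difference can be checked directly. In this sketch mytime returns a fixed number so the arithmetic is predictable, and mydouble is a hypothetical ($) function standing in for myrand.

```perl
use strict;
use warnings;

sub mytime ()    { 1000 }          # takes no arguments at all
sub mydouble ($) { $_[0] * 2 }     # parsed as a unary operator

my $sum  = mytime +2;              # mytime() + 2, not mytime(2)
my @pair = (mydouble 4, 5);        # (mydouble(4), 5), not mydouble(4, 5)

print "$sum @pair\n";
```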
874
875
876 The interesting thing about & is that you can
877 generate new syntax with it, provided it's in the initial
878 position:
879
880
    sub try (&@) {
        my ($try, $catch) = @_;
        eval { &$try };
        if ($@) {
            local $_ = $@;
            &$catch;
        }
    }
    sub catch (&) { $_[0] }
    try {
        die "phooey";
    } catch {
        /phooey/ and print "unphooey\n";
    };
884 That prints "unphooey". (Yes, there are still unresolved issues having to do with visibility of @_. I'm ignoring that question for the moment. (But note that if we make @_ lexically scoped, those anonymous subroutines can act like closures... (Gee, is this sounding a little Lispish? (Never mind.))))
885
886
887 And here's a reimplementation of the Perl grep
888 operator:
889
890
    sub mygrep (&@) {
        my $code = shift;
        my @result;
        foreach $_ (@_) {
            push(@result, $_) if &$code;
        }
        @result;
    }
892 Some folks would prefer full alphanumeric prototypes. Alphanumerics have been intentionally left out of prototypes for the express purpose of someday in the future adding named, formal parameters. The current mechanism's main goal is to let module writers provide better diagnostics for module users. Larry feels the notation quite understandable to Perl programmers, and that it will not intrude greatly upon the meat of the module, nor make it harder to read. The line noise is visually encapsulated into a small pill that's easy to swallow.
893
894
895 It's probably best to prototype new functions, not retrofit
896 prototyping into older ones. That's because you must be
897 especially careful about silent impositions of differing
898 list versus scalar contexts. For example, if you decide that
899 a function should take just one parameter, like
900 this:
901
902
    sub func ($) {
        my $n = shift;
        print "you gave me $n\n";
    }
906 and someone has been calling it with an array or expression returning a list:
907
908
909 func(@foo);
910 func( split /:/ );
911 Then you've just supplied an automatic scalar in front of their argument, which can be more than a bit surprising. The old @foo which used to hold one thing doesn't get passed in. Instead, func() now gets passed in a 1; that is, the number of elements in @foo. And the split gets called in scalar context so it starts scribbling on your @_ parameter list. Ouch!
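The surprise is easy to reproduce. This sketch uses a hypothetical function echo_arg with a ($) prototype and compares a prototyped call against an &-style call that bypasses the prototype.

```perl
use strict;
use warnings;

# Returns whatever it receives as its first argument.
sub echo_arg ($) {
    my $n = shift;
    return $n;
}

my @foo = ('only');

my $via_prototype = echo_arg(@foo);     # @foo is put in scalar context
my $via_ampersand = &echo_arg(@foo);    # prototype bypassed, list passed

print "$via_prototype $via_ampersand\n";
```

The first call hands the function the element count, the second hands it the element itself.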
912
913
914 This is all very powerful, of course, and should be used
915 only in moderation to make the world a better
916 place.
917
918
919 __Constant Functions__
920
921
922 Functions with a prototype of () are potential
923 candidates for inlining. If the result after optimization
924 and constant folding is either a constant or a
925 lexically-scoped scalar which has no other references, then
926 it will be used in place of function calls made without
927 &. Calls made using & are never
928 inlined. (See ''constant.pm'' for an easy way to declare
929 most constants.)
930
931
932 The following functions would all be inlined:
933
934
    sub pi ()           { 3.14159 }             # Not exact, but close.
    sub PI ()           { 4 * atan2 1, 1 }      # As good as it gets,
                                                # and it's inlined, too!
    sub ST_DEV ()       { 0 }
    sub ST_INO ()       { 1 }
    sub FLAG_FOO ()     { 1 << 8 }
    sub FLAG_BAR ()     { 1 << 9 }
    sub FLAG_MASK ()    { FLAG_FOO | FLAG_BAR }
    sub OPT_BAZ ()      { not (0x1B58 & FLAG_MASK) }
    sub BAZ_VAL () {
        if (OPT_BAZ) {
            return 23;
        }
        else {
            return 42;
        }
    }
    sub N () { int(BAZ_VAL) / 3 }
    BEGIN {
        my $prod = 1;
        for (1..N) { $prod *= $_ }
        sub N_FACTORIAL () { $prod }
    }
948 If you redefine a subroutine that was eligible for inlining, you'll get a mandatory warning. (You can use this warning to tell whether or not a particular subroutine is considered constant.) The warning is considered severe enough not to be optional because previously compiled invocations of the function will still be using the old value of the function. If you need to be able to redefine the subroutine, you need to ensure that it isn't inlined, either by dropping the () prototype (which changes calling semantics, so beware) or by thwarting the inlining mechanism in some other way, such as
949
950
951 sub not_inlined () {
952 23 if $];
953 }
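The stale-value problem behind that warning can be seen in a few lines. In this sketch ANSWER and early_reader are made-up names; the redefinition is done in a string eval so that it is compiled only after early_reader already exists, and a local warn handler collects whatever perl complains about.

```perl
use strict;
use warnings;

sub ANSWER () { 42 }

# Compiled while ANSWER is still 42, so the value is inlined here.
sub early_reader { return ANSWER }

my @warnings;
{
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    # Redefine at run time; the string eval delays compilation until now.
    eval 'sub ANSWER () { 54 } 1' or die $@;
}

# The earlier call site keeps the value it was compiled with.
my $stale = early_reader();

print "early_reader still returns $stale\n";
print "perl said: $_" for @warnings;
```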
954
955
956 __Overriding Built-in Functions__
957
958
959 Many built-in functions may be overridden, though this
960 should be tried only occasionally and for good reason.
961 Typically this might be done by a package attempting to
962 emulate missing built-in functionality on a non-Unix
963 system.
964
965
966 Overriding may be done only by importing the name from a
967 module--ordinary predeclaration isn't good enough. However,
968 the use subs pragma lets you, in effect, predeclare
969 subs via the import syntax, and these names may then
970 override built-in ones:
971
972
973 use subs 'chdir', 'chroot', 'chmod', 'chown';
974 chdir $somewhere;
975 sub chdir { ... }
976 To unambiguously refer to the built-in form, precede the built-in name with the special package qualifier CORE::. For example, saying CORE::open() always refers to the built-in open(), even if the current package has imported some other subroutine called &open() from elsewhere. Even though it looks like a regular function call, it isn't: you can't take a reference to it, such as the incorrect \&CORE::open might appear to produce.
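A short sketch of the two spellings side by side. It overrides time rather than chdir so that it has no side effects; the replacement returning 42 is purely illustrative.

```perl
use strict;
use warnings;

use subs 'time';            # predeclare via import so the override takes effect

sub time { return 42 }      # stand-in for the real clock

my $fake = time();          # calls the override
my $real = CORE::time();    # always the built-in

print "override: $fake, built-in: $real\n";
```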
977
978
979 Library modules should not in general export built-in names
980 like open or chdir as part of their
981 default @EXPORT list, because these may sneak into
982 someone else's namespace and change the semantics
983 unexpectedly. Instead, if the module adds that name to
984 @EXPORT_OK, then it's possible for a user to import
985 the name explicitly, but not implicitly. That is, they could
986 say
987
988
989 use Module 'open';
990 and it would import the open override. But if they said
991
992
993 use Module;
994 they would get the default imports without overrides.
995
996
997 The foregoing mechanism for overriding built-ins is
998 restricted, quite deliberately, to the package that requests
999 the import. There is a second method that is sometimes
1000 applicable when you wish to override a built-in everywhere,
1001 without regard to namespace boundaries. This is achieved by
1002 importing a sub into the special namespace
1003 CORE::GLOBAL::. Here is an example that quite
1004 brazenly replaces the glob operator with something
1005 that understands regular expressions.
1006
1007
    package REGlob;
    require Exporter;
    @ISA = 'Exporter';
    @EXPORT_OK = 'glob';
    sub import {
        my $pkg = shift;
        return unless @_;
        my $sym = shift;
        my $where = ($sym =~ s/^GLOBAL_// ? 'CORE::GLOBAL' : caller(0));
        $pkg->export($where, $sym, @_);
    }
    sub glob {
        my $pat = shift;
        my @got;
        local *D;
        if (opendir D, '.') {
            @got = grep /$pat/, readdir D;
            closedir D;
        }
        return @got;
    }
    1;
1029 And here's how it could be (ab)used:
1030
1031
1032 #use REGlob 'GLOBAL_glob'; # override glob() in ALL namespaces
1033 package Foo;
1034 use REGlob 'glob'; # override glob() in Foo:: only
1035 print for <^[a-z_]+\.pm\$>; # show all pragmatic modules
1036 The initial comment shows a contrived, even dangerous example. By overriding glob globally, you would be forcing the new (and subversive) behavior for the glob operator for ''every'' namespace, without the complete cognizance or cooperation of the modules that own those namespaces. Naturally, this should be done with extreme caution--if it must be done at all.
1037
1038
1039 The REGlob example above does not implement all the
1040 support needed to cleanly override perl's glob
1041 operator. The built-in glob has different behaviors
1042 depending on whether it appears in a scalar or list context,
1043 but our REGlob doesn't. Indeed, many perl built-ins
1044 have such context-sensitive behaviors, and these must be
1045 adequately supported by a properly written override. For a
1046 fully functional example of overriding glob, study
2 perry 1047 the implementation of File::!DosGlob in the standard
1 perry 1048 library.
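For a smaller taste of the CORE::GLOBAL:: mechanism, this sketch replaces sleep, which has no context-dependent behavior to worry about. The Worker package and the @naps log are invented for the example, and the glob is assigned directly in a BEGIN block instead of going through Exporter.

```perl
use strict;
use warnings;

our @naps;    # records every requested sleep, in seconds

BEGIN {
    no warnings 'once';
    # Installed at compile time, so sleep() calls compiled after this
    # point, in any package, are routed to this sub.
    *CORE::GLOBAL::sleep = sub {
        my ($seconds) = @_;
        push @naps, $seconds;
        return $seconds;           # pretend the full time was slept
    };
}

package Worker;                    # a different namespace on purpose

sub rest { return sleep(30) }

package main;

my $slept = Worker::rest();        # returns at once, nothing really sleeps
print "asked to sleep: @naps\n";
```

The same caution applies as with glob: every namespace now sees the replacement, whether it asked for it or not.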
1049
1050
1051 __Autoloading__
1052
1053
1054 If you call a subroutine that is undefined, you would
1055 ordinarily get an immediate, fatal error complaining that
1056 the subroutine doesn't exist. (Likewise for subroutines
1057 being used as methods, when the method doesn't exist in any
1058 base class of the class's package.) However, if an
1059 AUTOLOAD subroutine is defined in the package or
1060 packages used to locate the original subroutine, then that
1061 AUTOLOAD subroutine is called with the arguments
1062 that would have been passed to the original subroutine. The
1063 fully qualified name of the original subroutine magically
1064 appears in the global $AUTOLOAD variable of the
1065 same package as the AUTOLOAD routine. The name is
1066 not passed as an ordinary argument because, er, well, just
1067 because, that's why...
1068
1069
1070 Many AUTOLOAD routines load in a definition for the
1071 requested subroutine using ''eval()'', then execute that
1072 subroutine using a special form of ''goto()'' that erases
1073 the stack frame of the AUTOLOAD routine without a
1074 trace. (See the source to the standard module documented in
2 perry 1075 !AutoLoader, for example.) But an AUTOLOAD routine
1 perry 1076 can also just emulate the routine and never define it. For
1077 example, let's pretend that a function that wasn't defined
1078 should just invoke system with those arguments. All
1079 you'd do is:
1080
1081
1082 sub AUTOLOAD {
1083 my $program = $AUTOLOAD;
1084 $program =~ s/.*:://;
1085 system($program, @_);
1086 }
1087 date();
1088 who('am', 'i');
1089 ls('-l');
1090 In fact, if you predeclare functions you want to call that way, you don't even need parentheses:
1091
1092
    use subs qw(date who ls);
    date;
    who "am", "i";
    ls '-l';
1096 A more complete example of this is the standard Shell module, which can treat undefined subroutine calls as calls to external programs.
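The other style mentioned earlier, defining the missing subroutine and then jumping to it with goto, might look roughly like this. The Greeter package, the say_ naming scheme and the hit counter are all invented for the sketch; a real autoloader would normally load the definition from a file instead of building a closure.

```perl
use strict;
use warnings;

package Greeter;    # hypothetical package, for illustration only

our $AUTOLOAD;
our $autoload_hits = 0;

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;                  # strip the package qualifier
    return if $name eq 'DESTROY';       # never autoload destructors
    $name =~ /^say_(\w+)$/
        or die "Undefined subroutine $AUTOLOAD called";
    my $word = $1;
    $autoload_hits++;

    my $code = sub { return "$word, $_[0]!" };
    {
        no strict 'refs';
        *{$AUTOLOAD} = $code;           # install under the requested name
    }
    goto &$code;                        # re-dispatch with the original @_
}

package main;

my $first  = Greeter::say_hello('world');    # goes through AUTOLOAD
my $second = Greeter::say_hello('again');    # now an ordinary call

print "$first $second (autoloaded $Greeter::autoload_hits time)\n";
```

Because the sub is installed in the symbol table on first use, AUTOLOAD runs only once per name.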
1097
1098
1099 Mechanisms are available to help module writers split their
2 perry 1100 modules into autoloadable files. See the standard !AutoLoader
1101 module described in !AutoLoader and in !AutoSplit, the
1102 standard !SelfLoader modules in !SelfLoader, and the document
1 perry 1103 on adding C functions to Perl code in perlxs.
1104
1105
1106 __Subroutine Attributes__
1107
1108
1109 A subroutine declaration or definition may have a list of
1110 attributes associated with it. If such an attribute list is
1111 present, it is broken up at space or colon boundaries and
1112 treated as though a use attributes had been seen.
1113 See attributes for details about what attributes are
1114 currently supported. Unlike the limitation with the
1115 obsolescent use attrs, the sub : ATTRLIST
1116 syntax works to associate the attributes with a
1117 pre-declaration, and not just with a subroutine
1118 definition.
1119
1120
1121 The attributes must be valid as simple identifier names
1122 (without any punctuation other than the '_' character). They
1123 may have a parameter list appended, which is only checked
1124 for whether its parentheses ('(',')') nest
1125 properly.
1126
1127
1128 Examples of valid syntax (even though the attributes are
1129 unknown):
1130
1131
    sub fnord (&\%) : switch(10,foo(7,3))  :  expensive ;
    sub plugh () : Ugly('\(") :Bad ;
    sub xyzzy : _5x5 { ... }
1133 Examples of invalid syntax:
1134
1135
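    sub fnord : switch(10,foo() ;   # ()-string not balanced
    sub snoid : Ugly('(') ;         # ()-string not balanced
    sub xyzzy : 5x5 ;               # "5x5" not a valid identifier
    sub plugh : Y2::north ;         # "Y2::north" not a simple identifier
    sub snurt : foo + bar ;         # "+" not a colon or space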
1139 The attribute list is passed as a list of constant strings to the code which associates them with the subroutine. In particular, the second example of valid syntax above currently looks like this in terms of how it's parsed and invoked:
1140
1141
1142 use attributes __PACKAGE__, \&plugh, q[Ugly('\(")], 'Bad';
1143 For further details on attribute lists and their manipulation, see attributes.
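To watch the attribute strings arrive, a package can define the MODIFY_CODE_ATTRIBUTES hook described in attributes. The sketch below merely records what it is given; the attribute names Cached and Slow, the sub fetch and the @seen_attrs array are made up for the example.

```perl
use strict;
use warnings;

our @seen_attrs;

# Called at compile time for each sub in this package that carries an
# attribute list. Returning an empty list accepts every attribute.
sub MODIFY_CODE_ATTRIBUTES {
    my ($package, $coderef, @attrs) = @_;
    push @seen_attrs, @attrs;
    return;
}

sub fetch : Cached Slow(3) { return 42 }

print "attributes seen: @seen_attrs\n";
```

Each attribute, with its parameter list still attached, shows up as one constant string.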
1144 !!SEE ALSO
1145
1146
1147 See ``Function Templates'' in perlref for more about
1148 references and closures. See perlxs if you'd like to learn
1149 about calling C subroutines from Perl. See perlembed if
1150 you'd like to learn about calling Perl subroutines from C.
1151 See perlmod to learn about bundling up your functions in
1152 separate files. See perlmodlib to learn what library modules
1153 come standard on your system. See perltoot to learn how to
1154 make object method calls.
1155 ----
This page is a man page (or other imported legacy content). We are unable to automatically determine the license status of this page.