Annotated edit history of perlfaq5(1) version 2 showing authors affecting page license.
Rev Author # Line
1 perry 1 PERLFAQ5
2 !!!PERLFAQ5
3 NAME
4 DESCRIPTION
5 AUTHOR AND COPYRIGHT
6 ----
7 !!NAME
8
9
10 perlfaq5 - Files and Formats ($Revision: 1.38 $, $Date: 1999/05/23 16:08:30 $)
11 !!DESCRIPTION
12
13
14 This section deals with I/O and the ``f'' issues:
15 filehandles, flushing, formats, and footers.
16
17
18 __How do I flush/unbuffer an output filehandle? Why must I
19 do this?__
20
21
22 The C standard I/O library (stdio) normally buffers
23 characters sent to devices. This is done for efficiency
24 reasons so that there isn't a system call for each byte. Any
25 time you use ''print()'' or ''write()'' in Perl, you
26 go through this buffering. ''syswrite()'' circumvents
27 stdio and buffering.
28
29
30 In most stdio implementations, the type of output buffering
31 and the size of the buffer varies according to the type of
32 device. Disk files are block buffered, often with a buffer
33 size of more than 2k. Pipes and sockets are often buffered
34 with a buffer size between 1/2 and 2k. Serial devices (e.g.
35 modems, terminals) are normally line-buffered, and stdio
36 sends the entire line when it gets the newline.
37
38
39 Perl does not support truly unbuffered output (except
40 insofar as you can syswrite(OUT, $char, 1)). What
41 it does instead support is ``command buffering'', in which a
42 physical write is performed after every output command. This
43 isn't as hard on your system as unbuffering, but does get
44 the output where you want it when you want it.
45
46
47 If you expect characters to get to your device when you
48 print them there, you'll want to autoflush its handle. Use
49 ''select()'' and the $| variable to control
50 autoflushing (see perlvar/$| and ``select'' in
51 perlfunc):
52
53
54 $old_fh = select(OUTPUT_HANDLE);
55 $| = 1;
56 select($old_fh);
57 Or using the traditional idiom:
58
59
60 select((select(OUTPUT_HANDLE), $| = 1)[[0]);
61 Or, if you don't mind slowly loading several thousand lines of module code just because you're afraid of the $| variable:
62
63
64 use !FileHandle;
65 open(DEV,
66 or the newer IO::* modules:
67
68
69 use IO::Handle;
70 open(DEV,
71 or even this:
72
73
74 use IO::Socket; # this one is kinda a pipe?
75 $sock = IO::Socket::INET-
76 $sock-
77 Note the bizarrely hardcoded carriage return and newline in their octal equivalents. This is the ONLY way (currently) to assure a proper flush on all platforms, including Macintosh. That's the way things work in network programming: you really should specify the exact bit pattern of the network line terminator. In practice, \n\n often works, but this is not portable.
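As a rough sketch of command buffering through the IO::Handle interface: the temporary file from File::Temp is only a stand-in for a device or socket (an assumption made so that the example is self-contained), and the request line ends with explicit octal CR LF pairs.

```perl
use strict;
use warnings;
use IO::Handle;                    # supplies the autoflush() method
use File::Temp qw(tempfile);

# Assumption: a temporary file stands in for a real device or socket.
my ($out, $path) = tempfile(UNLINK => 1);
binmode($out);                     # keep the bytes exactly as written

$out->autoflush(1);                # command buffering for this handle only

# Spell out the line terminator as octal CR LF instead of relying on "\n".
print {$out} "GET / HTTP/1.0" . "\015\012" x 2;

# With autoflush on, the bytes have reached the file before close().
my $size_after_print = -s $path;

close($out) or die "can't close $path: $!";
```

Without the autoflush() call, the size check would probably still see an empty file, because the data would be sitting in the handle's buffer until close().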
78
79
80 See perlfaq9 for other examples of fetching URLs over the
81 web.
82
83
84 __How do I change one line in a file/delete a line in a
85 file/insert a line in the middle of a file/append to the
86 beginning of a file?__
87
88
89 Those are operations of a text editor. Perl is not a text
90 editor. Perl is a programming language. You have to
91 decompose the problem into low-level calls to read, write,
92 open, close, and seek.
93
94
95 Although humans have an easy time thinking of a text file as
96 being a sequence of lines that operates much like a stack of
97 playing cards--or punch cards--computers usually see the
98 text file as a sequence of bytes. In general, there's no
99 direct way for Perl to seek to a particular line of a file,
100 insert text into a file, or remove text from a
101 file.
102
103
104 (There are exceptions in special circumstances. You can add
105 or remove data at the very end of the file. A sequence of
106 bytes can be replaced with another sequence of the same
107 length. The $DB_RECNO array bindings as documented
108 in DB_File also provide a direct way of modifying a file.
109 Files where all lines are the same length are also easy to
110 alter.)
111
112
113 The general solution is to create a temporary copy of the
114 text file with the changes you want, then copy that over the
115 original. This assumes no locking.
116
117
118 $old = $file;
119 $new =
120 open(OLD,
121 # Correct typos, preserving case
122 while (
123 close(OLD) or die
124 rename($old, $bak) or die
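The same copy-then-rename approach as a self-contained sketch. The scratch directory, the file name notes.txt, and the sample text are assumptions chosen so the example can run anywhere; like the description above, it assumes no locking.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumption: a hypothetical scratch file seeded with two sample lines.
my $dir = tempdir(CLEANUP => 1);
my $old = "$dir/notes.txt";
my $new = "$old.tmp.$$";
my $bak = "$old.orig";

open(my $seed, '>', $old)    or die "can't create $old: $!";
print {$seed} "Pearl is fun\nI like pearl\n";
close($seed)                 or die "can't close $old: $!";

open(my $in,  '<', $old)     or die "can't open $old: $!";
open(my $out, '>', $new)     or die "can't open $new: $!";

# Correct typos, preserving case
while (my $line = <$in>) {
    $line =~ s/\b(p)earl\b/${1}erl/gi;
    print {$out} $line       or die "can't write to $new: $!";
}

close($in)                   or die "can't close $old: $!";
close($out)                  or die "can't close $new: $!";

# Keep the original as a backup, then move the corrected copy into place.
rename($old, $bak)           or die "can't rename $old to $bak: $!";
rename($new, $old)           or die "can't rename $new to $old: $!";
```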
125 Perl can do this sort of thing for you automatically with the -i command-line switch or the closely-related $^I variable (see perlrun for more details). Note that -i may require a suffix on some non-Unix systems; see the platform-specific documentation that came with your port.
126
127
128 # Renumber a series of tests from the command line
129 perl -pi -e 's/(^\s+test\s+)\d+/ $1 . ++$count /e' t/op/taint.t
130 # from a script
131 local($^I, @ARGV) = ('.orig', glob(
132 If you need to seek to an arbitrary line of a file that changes infrequently, you could build up an index of byte positions of where the line ends are in the file. If the file is large, an index of every tenth or hundredth line end would allow you to seek and read fairly efficiently. If the file is sorted, try the look.pl library (part of the standard perl distribution).
133
134
135 In the unique case of deleting lines at the end of a file,
136 you can use ''tell()'' and ''truncate()''. The
137 following code snippet deletes the last line of a file
138 without making a copy or reading the whole file into
139 memory:
140
141
142 open (FH,
143 Error checking is left as an exercise for the reader.
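For illustration, here is one way the tell()/truncate() technique might look with the error checks filled in. The scratch file and its three sample lines are assumptions made for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Assumption: a hypothetical scratch file with three lines.
my ($seed, $file) = tempfile(UNLINK => 1);
print {$seed} "line one\nline two\nline three\n";
close($seed) or die "can't close $file: $!";

open(my $fh, '+<', $file) or die "can't update $file: $!";

my $addr = 0;
while (<$fh>) {
    # Remember where the next line starts, except after the final line,
    # so $addr ends up at the start of the last line.
    $addr = tell($fh) unless eof($fh);
}

truncate($fh, $addr) or die "can't truncate $file: $!";
close($fh)           or die "can't close $file: $!";
```

The file is read one line at a time, so only a single line is held in memory at any moment.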
144
145
146 __How do I count the number of lines in a
147 file?__
148
149
150 One fairly efficient way is to count newlines in the file.
151 The following program uses a feature of tr///, as documented
152 in perlop. If your text file doesn't end with a newline,
153 then it's not really a proper text file, so this may report
154 one fewer line than you expect.
155
156
157 $lines = 0;
158 open(FILE, $filename) or die
159 This assumes no funny games with newline translations.
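A minimal sketch of the newline-counting idea, reading the file in 4096-byte chunks with sysread() and counting with tr///. The scratch file and the chunk size are assumptions for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Assumption: a hypothetical scratch file with three newline-terminated lines.
my ($seed, $filename) = tempfile(UNLINK => 1);
print {$seed} "alpha\nbeta\ngamma\n";
close($seed) or die "can't close $filename: $!";

my $lines = 0;
open(my $fh, '<', $filename) or die "can't open $filename: $!";
binmode($fh);                        # no newline translation games

# sysread returns 0 at end of file, which ends the loop.
while (sysread($fh, my $buffer, 4096)) {
    $lines += ($buffer =~ tr/\n//);  # tr/// returns the number of matches
}
close($fh) or die "can't close $filename: $!";
```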
160
161
162 __How do I make a temporary file name?__
163
164
165 Use the new_tmpfile class method from the IO::File
166 module to get a filehandle opened for reading and writing.
167 Use it if you don't need to know the file's
168 name:
169
170
171 use IO::File;
172 $fh = IO::File-
173 If you do need to know the file's name, you can use the tmpnam function from the POSIX module to get a filename that you then open yourself:
174
175
176 use Fcntl;
177 use POSIX qw(tmpnam);
178 # try new temporary filenames until we get one that didn't already
179 # exist; the check should be unnecessary, but you can't be too careful
180 do { $name = tmpnam() }
181 until sysopen(FH, $name, O_RDWR|O_CREAT|O_EXCL);
182 # install atexit-style handler so that when we exit or die,
183 # we automatically delete this temporary file
184 END { unlink($name) or die
185 # now go on to use the file ...
186 If you're committed to creating a temporary file by hand, use the process ID and/or the current time-value. If you need to have many temporary files in one process, use a counter:
187
188
189 BEGIN {
190 use Fcntl;
191 my $temp_dir = -d '/tmp' ? '/tmp' : $ENV{TMP} || $ENV{TEMP};
192 my $base_name = sprintf(
193
194
195 __How can I manipulate fixed-record-length
196 files?__
197
198
199 The most efficient way is using ''pack()'' and
200 ''unpack()''. This is faster than using ''substr()''
201 when taking many, many strings. It is slower for just a
202 few.
203
204
205 Here is a sample chunk of code to break up and put back
206 together again some fixed-format input lines, in this case
207 from the output of a normal, Berkeley-style ps:
208
209
210 # sample input line:
211 # 15158 p5 T 0:00 perl /home/tchrist/scripts/now-what
212 $PS_T = 'A6 A4 A7 A5 A*';
213 open(PS,
214 We've used $$var in a way that is forbidden by use strict 'refs'. That is, we've promoted a string to a scalar variable reference using symbolic references. This is OK in small programs, but doesn't scale well. It also only works on global variables, not lexicals.
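A small sketch of the pack()/unpack() round trip that stays within use strict by unpacking into a plain list of lexicals instead of symbolic references. The field values are assumptions modelled on the sample ps line, and the variable names are hypothetical.

```perl
use strict;
use warnings;

my $PS_T = 'A6 A4 A7 A5 A*';       # field widths, as in the template above

# Assumption: illustrative field values modelled on the sample ps line.
my @original = ('15158', 'p5', 'T', '0:00',
                'perl /home/tchrist/scripts/now-what');

# pack() pads each A field with spaces to its fixed width.
my $line = pack($PS_T, @original);

# unpack() splits the line again; A fields lose their trailing spaces.
my ($pid, $tt, $stat, $time, $command) = unpack($PS_T, $line);

# Putting the fields back together reproduces the fixed-format line.
my $rebuilt = pack($PS_T, $pid, $tt, $stat, $time, $command);
```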
215
216
217 __How can I make a filehandle local to a subroutine? How do
218 I pass filehandles between subroutines? How do I make an
219 array of filehandles?__
220
221
222 The fastest, simplest, and most direct way is to localize
223 the typeglob of the filehandle in question:
224
225
226 local *!TmpHandle;
227 Typeglobs are fast (especially compared with the alternatives) and reasonably easy to use, but they also have one subtle drawback. If you had, for example, a function named ''!TmpHandle()'', or a variable named %!TmpHandle, you just hid it from yourself.
228
229
230 sub findme {
231 local *!HostFile;
232 open(!HostFile,
233 Here's how to use typeglobs in a loop to open and store a bunch of filehandles. We'll use as values of the hash an ordered pair to make it easy to sort the hash in insertion order.
234
235
236 @names = qw(motd termcap passwd hosts);
237 my $i = 0;
238 foreach $filename (@names) {
239 local *FH;
240 open(FH,
241 # Using the filehandles in the array
242 foreach $name (sort { $file{$a}[[0]
243 For passing filehandles to functions, the easiest way is to preface them with a star, as in func(*STDIN). See ``Passing Filehandles'' in perlfaq7 for details.
244
245
246 If you want to create many anonymous handles, you should
247 check out the Symbol, !FileHandle, or IO::Handle (etc.)
248 modules. Here's the equivalent code with Symbol::gensym,
249 which is reasonably light-weight:
250
251
252 foreach $filename (@names) {
253 use Symbol;
254 my $fh = gensym();
255 open($fh,
256 Here's using the semi-object-oriented !FileHandle module, which certainly isn't light-weight:
257
258
259 use !FileHandle;
260 foreach $filename (@names) {
261 my $fh = !FileHandle-
262 Please understand that whether the filehandle happens to be a (probably localized) typeglob or an anonymous handle from one of the modules in no way affects the bizarre rules for managing indirect handles. See the next question.
263
264
265 __How can I use a filehandle indirectly?__
266
267
268 An indirect filehandle is using something other than a
269 symbol in a place that a filehandle is expected. Here are
270 ways to get indirect filehandles:
271
272
273 $fh = SOME_FH; # bareword is strict-subs hostile
274 $fh =
275 Or, you can use the new method from the !FileHandle or IO modules to create an anonymous filehandle, store that in a scalar variable, and use it as though it were a normal filehandle.
276
277
278 use !FileHandle;
279 $fh = !FileHandle-
280 use IO::Handle; # 5.004 or higher
281 $fh = IO::Handle-
282 Then use any of those as you would a normal filehandle. Anywhere that Perl is expecting a filehandle, an indirect filehandle may be used instead. An indirect filehandle is just a scalar variable that contains a filehandle. Functions like print, open, seek, or the diamond operator will accept either a real filehandle or a scalar variable containing one:
283
284
285 ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR);
286 print $ofh
287 If you're passing a filehandle to a function, you can write the function in two ways:
288
289
290 sub accept_fh {
291 my $fh = shift;
292 print $fh
293 Or it can localize a typeglob and use the filehandle directly:
294
295
296 sub accept_fh {
297 local *FH = shift;
298 print FH
299 Both styles work with either objects or typeglobs of real filehandles. (They might also work with strings under some circumstances, but this is risky.)
300
301
302 accept_fh(*STDOUT);
303 accept_fh($handle);
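As a self-contained sketch of the first style, the function below takes its handle in a scalar and is called once with a handle held in a lexical variable and once with a typeglob. The in-memory handle (opened on a reference to a scalar) is an assumption used only to make the output easy to inspect.

```perl
use strict;
use warnings;

sub accept_fh {
    my $fh = shift;
    # The block form makes it explicit that $fh is the handle, not data.
    print {$fh} "Sending to indirect filehandle\n";
}

# Assumption: an in-memory handle, so the output lands in $buffer.
my $buffer = '';
open(my $mem, '>', \$buffer) or die "can't open in-memory handle: $!";

accept_fh($mem);        # a handle stored in a scalar
accept_fh(*STDOUT);     # a typeglob of a real filehandle

close($mem) or die "can't close in-memory handle: $!";
```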
304 In the examples above, we assigned the filehandle to a scalar variable before using it. That is because only simple scalar variables, not expressions or subscripts of hashes or arrays, can be used with built-ins like print, printf, or the diamond operator. Using something other than a simple scalar variable as a filehandle is illegal and won't even compile:
305
306
307 @fd = (*STDIN, *STDOUT, *STDERR);
308 print $fd[[1]
309 With print and printf, you get around this by using a block and an expression where you would place the filehandle:
310
311
312 print { $fd[[1] }
313 That block is a proper block like any other, so you can put more complicated code there. This sends the message out to one of two places:
314
315
316 $ok = -x
317 This approach of treating print and printf like object method calls doesn't work for the diamond operator. That's because it's a real operator, not just a function with a comma-less argument. Assuming you've been storing typeglobs in your structure as we did above, you can use the built-in function named readline to read a record just as <> does. Given the initialization shown above for @fd, this would work, but only because ''readline()'' requires a typeglob. It doesn't work with objects or strings, which might be a bug we haven't fixed yet.
318
319
320 $got = readline($fd[[0]);
321 Let it be noted that the flakiness of indirect filehandles is not related to whether they're strings, typeglobs, objects, or anything else. It's the syntax of the fundamental operators. Playing the object game doesn't help you at all here.
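To make the syntax concrete, here is a small sketch that keeps handles in an array, prints through a block, and reads with readline(). The in-memory handles and the sample text are assumptions so that the example needs no real files.

```perl
use strict;
use warnings;

# Assumption: in-memory handles stand in for real files.
my $in_text = "first line\nsecond line\n";
my $out_buf = '';
open(my $in,  '<', \$in_text) or die "can't open input: $!";
open(my $out, '>', \$out_buf) or die "can't open output: $!";

my @fd = ($in, $out, \*STDERR);

# A subscripted handle must go inside a block when used with print.
print { $fd[1] } "funny stuff\n";

# The block is ordinary code, so it can choose the destination.
my $ok = 1;
print { $ok ? $fd[1] : $fd[2] } "more stuff\n";

# The diamond operator cannot take $fd[0]; readline() can.
my $got = readline($fd[0]);

close($out) or die "can't close output: $!";
```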
322
323
324 __How can I set up a footer format to be used with__
325 ''write()''__?__
326
327
328 There's no builtin way to do this, but perlform has a couple
329 of techniques to make it possible for the intrepid
330 hacker.
331
332
333 __How can I__ ''write()'' __into a
334 string?__
335
336
337 See ``Accessing Formatting Internals'' in perlform for an
338 ''swrite()'' function.
339
340
341 __How can I output my numbers with commas
342 added?__
343
344
345 This one will do it for you:
346
347
348 sub commify {
349 local $_ = shift;
350 1 while s/^([[-+]?\d+)(\d{3})/$1,$2/;
351 return $_;
352 }
353 $n = 23659019423.2331;
354 print
355 GOT: 23,659,019,423.2331
356 You can't just:
357
358
359 s/^([[-+]?\d+)(\d{3})/$1,$2/g;
360 because you have to put the comma in and then recalculate your position.
361
362
363 Alternatively, this code commifies all numbers in a line
364 regardless of whether they have decimal portions, are
365 preceded by + or -, or whatever:
366
367
368 # from Andrew Johnson
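One way to commify every number in a line, sketched here as an illustration and not as the code credited above, is to reverse the string, insert a comma after each group of three digits that is followed by another digit and does not belong to a fractional part, and then reverse it back. The function name commify_line and the sample sentence are hypothetical.

```perl
use strict;
use warnings;

sub commify_line {
    my ($input) = @_;
    my $reversed = scalar reverse $input;

    # In the reversed string the fractional digits come first and run
    # straight into the decimal point, so the negative lookahead skips them.
    $reversed =~ s/(\d\d\d)(?=\d)(?!\d*\.)/$1,/g;

    return scalar reverse $reversed;
}

my $result = commify_line("cost 1234567.891 and -9876543 units");
# $result is "cost 1,234,567.891 and -9,876,543 units"
```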
369
370
371 __How can I translate tildes (~) in a
372 filename?__
373
374
375 Use the ''glob()'' operator, documented in
376 perlfunc. Older versions of Perl require that you have a
377 shell installed that groks tildes. Recent perl versions have
378 this feature built in. The Glob::KGlob module (available
379 from CPAN) gives more portable glob
380 functionality.
381
382
383 Within Perl, you may use this directly:
384
385
386 $filename =~ s{
387 ^ ~ # find a leading tilde
388 ( # save this in $1
389 [[^/] # a non-slash character
390 * # repeated 0 or more times (0 means me)
391 )
392 }{
393 $1
394 ? (getpwnam($1))[[7]
395 : ( $ENV{HOME} || $ENV{LOGDIR} )
396 }ex;
397
398
399 __How come when I open a file read-write it wipes it
400 out?__
401
402
403 Because you're using something like this, which truncates
404 the file and ''then'' gives you read-write
405 access:
406
407
408 open(FH,
409 Whoops. You should instead use this, which will fail if the file doesn't exist.
410
411
412 open(FH,
413 Using ``
414
415
416 Here are examples of many kinds of file opens. Those using
417 ''sysopen()'' all assume
418
419
420 use Fcntl;
421 To open file for reading:
422
423
424 open(FH,
425 To open file for writing, create new file if needed or else truncate old file:
426
427
428 open(FH,
429 To open file for writing, create new file, file must not exist:
430
431
432 sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT) || die $!;
433 sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT, 0666) || die $!;
434 To open file for appending, create if necessary:
435
436
437 open(FH,
438 To open file for appending, file must exist:
439
440
441 sysopen(FH, $path, O_WRONLY|O_APPEND) || die $!;
442 To open file for update, file must exist:
443
444
445 open(FH,
446 To open file for update, create file if necessary:
447
448
449 sysopen(FH, $path, O_RDWR|O_CREAT) || die $!;
450 sysopen(FH, $path, O_RDWR|O_CREAT, 0666) || die $!;
451 To open file for update, file must not exist:
452
453
454 sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT) || die $!;
455 sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT, 0666) || die $!;
456 To open a file without blocking, creating if necessary:
457
458
459 sysopen(FH,
460 Be warned that neither creation nor deletion of files is guaranteed to be an atomic operation over NFS. That is, two processes might both successfully create or unlink the same file! Therefore O_EXCL isn't as exclusive as you might wish.
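A short sketch that exercises a few of these modes on a local file system. The scratch directory and the name example.txt are assumptions; lexical handles are used instead of FH so that each step is independent.

```perl
use strict;
use warnings;
use Fcntl;
use File::Temp qw(tempdir);

# Assumption: a hypothetical file inside a scratch directory.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/example.txt";

# Open for writing, create new file, file must not exist.
sysopen(my $fh, $path, O_WRONLY|O_EXCL|O_CREAT, 0666)
    or die "can't create $path: $!";
print {$fh} "first\n";
close($fh) or die "can't close $path: $!";

# A second exclusive create fails, because the file now exists.
my $second_create = sysopen(my $dup, $path, O_WRONLY|O_EXCL|O_CREAT, 0666);

# Open for appending, file must exist (no O_CREAT).
sysopen(my $app, $path, O_WRONLY|O_APPEND) or die "can't append to $path: $!";
print {$app} "second\n";
close($app) or die "can't close $path: $!";

# Open for update, file must exist; nothing is truncated.
open(my $upd, '+<', $path) or die "can't update $path: $!";
my $first_line = <$upd>;
close($upd) or die "can't close $path: $!";
```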
461
462
463 See also the new perlopentut if you have it (new for
464 5.6).
465
466
467 __Why do I sometimes get an ``Argument list too long'' when
468 I use <*>?__
469
470
471 The <> operator performs a globbing operation
472 (see above). In Perl versions earlier than v5.6.0, the
473 internal ''glob()'' operator forks csh(1) to do
474 the actual glob expansion, but csh can't handle more than
475 127 items and so gives the error message Argument list
476 too long. People who installed tcsh as csh won't have
477 this problem, but their users may be surprised by
478 it.
479
480
481 To get around this, either upgrade to Perl v5.6.0 or later,
482 do the glob yourself with ''readdir()'' and patterns, or
483 use a module like Glob::KGlob, one that doesn't use the
484 shell to do globbing.
485
486
487 __Is there a leak/bug in__
488 ''glob()''__?__
489
490
491 Due to the current implementation on some operating systems,
492 when you use the ''glob()'' function or its angle-bracket
493 alias in a scalar context, you may cause a memory leak
494 and/or unpredictable behavior. It's best therefore to use
495 ''glob()'' only in list context.
496
497
498 __How can I open a file with a leading ``>'' or trailing
499 blanks?__
500
501
502 Normally perl ignores trailing blanks in filenames, and
503 interprets certain leading characters (or a trailing ``|'')
504 to mean something special. To avoid this, you might want to
505 use a routine like the one below. It turns incomplete
506 pathnames into explicit relative ones, and tacks a trailing
507 null byte on the name to make perl leave it
508 alone:
509
510
511 sub safe_filename {
512 local $_ = shift;
513 s#^([[^./])#./$1#;
514 $_ .=
515 $badpath =
516 This assumes that you are using POSIX (portable operating systems interface) paths. If you are on a closed, non-portable, proprietary system, you may have to adjust the above.
517
518
519 It would be a lot clearer to use ''sysopen()'',
520 though:
521
522
523 use Fcntl;
524 $badpath =
525 For more information, see also the new perlopentut if you have it (new for 5.6).
526
527
528 __How can I reliably rename a file?__
529
530
531 Well, usually you just use Perl's ''rename()'' function.
532 That may not work everywhere, though, particularly when
533 renaming files across file systems. Some sub-Unix systems
534 have broken ports that corrupt the semantics of
535 ''rename()''--for example, WinNT does this right, but
536 Win95 and Win98 are broken. (The last two parts are not
537 surprising, but the first is. :-)
538
539
540 If your operating system supports a proper mv(1)
541 program or its moral equivalent, this works:
542
543
544 rename($old, $new) or system(
545 It may be more compelling to use the File::Copy module instead. You just copy the file to the new name (checking return values), then delete the old one. This isn't really the same semantically as a real ''rename()'', though, which preserves metainformation like permissions, timestamps, inode info, etc.
546
547
548 Newer versions of File::Copy export a ''move()''
549 function.
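A minimal sketch of File::Copy's move(), with the return value checked. The scratch directory and the two file names are assumptions for the example.

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Temp qw(tempdir);

# Assumption: hypothetical file names inside a scratch directory.
my $dir = tempdir(CLEANUP => 1);
my ($old, $new) = ("$dir/old.txt", "$dir/new.txt");

open(my $fh, '>', $old) or die "can't create $old: $!";
print {$fh} "payload\n";
close($fh) or die "can't close $old: $!";

# move() tries a rename first and falls back to copy-and-delete.
move($old, $new) or die "can't move $old to $new: $!";
```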
550
551
552 __How can I lock a file?__
553
554
555 Perl's builtin ''flock()'' function (see perlfunc for
556 details) will call flock(2) if that exists,
557 fcntl(2) if it doesn't (on perl version 5.004 and
558 later), and lockf(3) if neither of the two previous
559 system calls exists. On some systems, it may even use a
560 different form of native locking. Here are some gotchas with
561 Perl's ''flock()'':
562
563
564 1
565
566
567 Produces a fatal error if none of the three system calls (or
568 their close equivalent) exists.
569
570
571 2
572
573
574 lockf(3) does not provide shared locking, and
575 requires that the filehandle be open for writing (or
576 appending, or read/writing).
577
578
579 3
580
581
582 Some versions of ''flock()'' can't lock files over a
583 network (e.g. on NFS file systems), so you'd
584 need to force the use of fcntl(2) when you build
585 Perl. But even this is dubious at best. See the flock entry
586 of perlfunc and the ''INSTALL'' file in
587 the source distribution for information on building Perl to
588 do this.
589
590
591 Two potentially non-obvious but traditional flock semantics
592 are that it waits indefinitely until the lock is granted,
593 and that its locks are ''merely advisory''. Such
594 discretionary locks are more flexible, but offer fewer
595 guarantees. This means that files locked with ''flock()''
596 may be modified by programs that do not also use
597 ''flock()''. Cars that stop for red lights get on well
598 with each other, but not with cars that don't stop for red
599 lights. See the perlport manpage, your port's specific
600 documentation, or your system-specific local manpages for
601 details. It's best to assume traditional behavior if you're
602 writing portable programs. (If you're not, you should as
603 always feel perfectly free to write for your own system's
604 idiosyncrasies (sometimes called ``features''). Slavish
605 adherence to portability concerns shouldn't get in the way
606 of your getting your job done.)
607
608
609 For more information on file locking, see also ``File
610 Locking'' in perlopentut if you have it (new for
611 5.6).
612
613
614 __Why can't I just open(FH,
615 ``>file.lock'')?__
616
617
618 A common bit of code __NOT TO USE__ is
619 this:
620
621
622 sleep(3) while -e
623 This is a classic race condition: you take two steps to do something which must be done in one. That's why computer hardware provides an atomic test-and-set instruction. In theory, this ``ought'' to work:
624
625
626 sysopen(FH,
627 except that lamentably, file creation (and deletion) is not atomic over NFS, so this won't work (at least, not every time) over the net. Various schemes involving ''link()'' have been suggested, but these tend to involve busy-wait, which is also subdesirable.
628
629
630 __I still don't get locking. I just want to increment the
631 number in the file. How can I do this?__
632
633
634 Didn't anyone ever tell you web-page hit counters were
635 useless? They don't count number of hits, they're a waste of
636 time, and they serve only to stroke the writer's vanity.
637 It's better to pick a random number; they're more
638 realistic.
639
640
641 Anyway, this is what you can do if you can't help
642 yourself.
643
644
645 use Fcntl qw(:DEFAULT :flock);
646 sysopen(FH,
647 Here's a much better web-page hit counter:
648
649
650 $hits = int( (time() - 850_000_000) / rand(1_000) );
651 If the count doesn't impress your friends, then the code might. :-)
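Returning to the locked counter: a self-contained sketch of the lock, read, rewind, truncate, and write sequence. The function name bump_counter and the scratch path are hypothetical, and the locking is only as reliable as flock() is on the file system in use.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);
use File::Temp qw(tempdir);

sub bump_counter {
    my ($path) = @_;

    sysopen(my $fh, $path, O_RDWR|O_CREAT) or die "can't open $path: $!";
    flock($fh, LOCK_EX)                    or die "can't flock $path: $!";

    my $num = <$fh>;
    $num = '' unless defined $num;
    chomp $num;
    $num = 0 unless $num =~ /^\d+$/;       # empty or damaged file: start at zero

    seek($fh, 0, 0)                        or die "can't rewind $path: $!";
    truncate($fh, 0)                       or die "can't truncate $path: $!";
    (print {$fh} $num + 1, "\n")           or die "can't write $path: $!";

    # Closing the handle flushes the new value and releases the lock.
    close($fh)                             or die "can't close $path: $!";
    return $num + 1;
}

# Assumption: a hypothetical counter file inside a scratch directory.
my $dir     = tempdir(CLEANUP => 1);
my $numfile = "$dir/numfile";
my @counts  = map { bump_counter($numfile) } 1 .. 3;
```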
652
653
654 __How do I randomly update a binary file?__
655
656
657 If you're just trying to patch a binary, in many cases
658 something as simple as this works:
659
660
661 perl -i -pe 's{window manager}{window mangler}g' /usr/bin/emacs
662 However, if you have fixed sized records, then you might do something more like this:
663
664
665 $RECSIZE = 220; # size of record, in bytes
666 $recno = 37; # which record to update
667 open(FH,
668 Locking and error checking are left as an exercise for the reader. Don't forget them or you'll be quite sorry.
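For illustration, a sketch of the fixed-size record update with the error checks written out (locking is still omitted). The record size of 8 bytes, the scratch file, and the upper-casing step are assumptions chosen to keep the example small.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $RECSIZE = 8;    # size of record, in bytes (kept small for the example)
my $recno   = 3;    # which record to update, counting from zero

# Assumption: a hypothetical scratch file holding five space-padded records.
my ($seed, $path) = tempfile(UNLINK => 1);
binmode($seed);
my @records = map { pack("A$RECSIZE", "rec$_") } 0 .. 4;
print {$seed} @records;
close($seed) or die "can't close $path: $!";

open(my $fh, '+<', $path) or die "can't update $path: $!";
binmode($fh);

seek($fh, $recno * $RECSIZE, 0) or die "can't seek in $path: $!";
read($fh, my $record, $RECSIZE) == $RECSIZE
    or die "can't read record $recno: $!";

$record = uc $record;                      # munge the record

seek($fh, -$RECSIZE, 1) or die "can't seek back in $path: $!";
print {$fh} $record     or die "can't write record $recno: $!";
close($fh)              or die "can't close $path: $!";
```

The seek between the read and the write matters: it moves back to the start of the record and is also what makes it safe to switch from reading to writing on the same handle.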
669
670
671 __How do I get a file's timestamp in perl?__
672
673
674 If you want to retrieve the time at which the file was last
675 read, written, or had its meta-data (owner, etc) changed,
676 you use the __-A__, __-M__, or __-C__ filetest
677 operations as documented in perlfunc. These retrieve the age
678 of the file (measured against the start-time of your
679 program) in days as a floating point number. To retrieve the
680 ``raw'' time in seconds since the epoch, you would call the
681 stat function, then use ''localtime()'', ''gmtime()'',
682 or ''POSIX::strftime()'' to convert this into
683 human-readable form.
684
685
686 Here's an example:
687
688
689 $write_secs = (stat($file))[[9];
690 printf
691 If you prefer something more legible, use the File::stat module (part of the standard distribution in version 5.004 and later):
692
693
694 # error checking left as an exercise for reader.
695 use File::stat;
696 use Time::localtime;
697 $date_string = ctime(stat($file)-
698 The ''POSIX::strftime()'' approach has the benefit of being, in theory, independent of the current locale. See perllocale for details.
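A small sketch of the stat() and POSIX::strftime() route. So that the result is predictable, the example first stamps a scratch file with a known time using utime(); that step, the file itself, and the use of gmtime() (to stay independent of the local time zone) are assumptions of the example.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use File::Temp qw(tempfile);

# Assumption: a hypothetical scratch file given a known timestamp.
my ($fh, $file) = tempfile(UNLINK => 1);
close($fh) or die "can't close $file: $!";
utime(1_000_000_000, 1_000_000_000, $file) or die "can't set times on $file: $!";

# Element 9 of the stat list is the last-modified time in epoch seconds.
my $write_secs = (stat($file))[9];

# Convert the raw seconds into a readable UTC string.
my $stamp = strftime("%Y-%m-%d %H:%M:%S", gmtime($write_secs));

# -M gives the age in days, measured from the start of the program.
my $age_days = -M $file;
```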
699
700
701 __How do I set a file's timestamp in perl?__
702
703
704 You use the ''utime()'' function documented in ``utime''
705 in perlfunc. By way of example, here's a little program that
706 copies the read and write times from its first argument to
707 all the rest of them.
708
709
710 if (@ARGV
711 Error checking is, as usual, left as an exercise for the reader.
712
713
714 Note that ''utime()'' currently doesn't work correctly
715 with Win95/NT ports. A bug has been reported. Check it
716 carefully before using ''utime()'' on those
717 platforms.
718
719
720 __How do I print to more than one file at
721 once?__
722
723
724 If you only have to do this once, you can do
725 this:
726
727
728 for $fh (FH1, FH2, FH3) { print $fh
729 To connect one filehandle to several output filehandles, it's easiest to use the tee(1) program if you have it, and let it take care of the multiplexing:
730
731
732 open (FH,
733 Or even:
734
735
736 # make STDOUT go to three files, plus original STDOUT
737 open (STDOUT,
738 Otherwise you'll have to write your own multiplexing print function--or your own tee program--or use Tom Christiansen's, at http://www.perl.com/CPAN/authors/id/TOMC/scripts/tct.gz, which is written in Perl and offers much greater functionality than the stock version.
739
740
741 __How can I read in an entire file all at
742 once?__
743
744
745 The customary Perl approach for processing all the lines in
746 a file is to do so one line at a time:
747
748
749 open (INPUT, $file) die
750 This is tremendously more efficient than reading the entire file into memory as an array of lines and then processing it one element at a time, which is often--if not almost always--the wrong approach. Whenever you see someone do this:
751
752
753 @lines =
754 you should think long and hard about why you need everything loaded at once. It's just not a scalable solution. You might also find it more fun to use the standard DB_File module's $DB_RECNO bindings, which allow you to tie an array to a file so that accessing an element of the array actually accesses the corresponding line in the file.
755
756
757 On very rare occasion, you may have an algorithm that
758 demands that the entire file be in memory at once as one
759 scalar. The simplest solution to that is
760
761
762 $var = `cat $file`;
763 Being in scalar context, you get the whole thing. In list context, you'd get a list of all the lines:
764
765
766 @lines = `cat $file`;
767 This tiny but expedient solution is neat, clean, and portable to all systems on which decent tools have been installed. For those who prefer not to use the toolbox, you can of course read the file manually, although this makes for more complicated code.
768
769
770 {
771 local(*INPUT, $/);
772 open (INPUT, $file) die
773 That temporarily undefs your record separator, and will automatically close the file at block exit. If the file is already open, just use this:
774
775
776 $var = do { local $/;
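A self-contained sketch of slurping a file with a localized $/. The scratch file and its contents are assumptions, and a lexical handle is used so that the file is closed when the block ends.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Assumption: a hypothetical scratch file with three lines.
my ($seed, $file) = tempfile(UNLINK => 1);
print {$seed} "one\ntwo\nthree\n";
close($seed) or die "can't close $file: $!";

my $var = do {
    local $/;                       # undef: no record separator inside this block
    open(my $in, '<', $file) or die "can't open $file: $!";
    <$in>;                          # a single read returns the whole file
};                                  # $/ is restored and $in is closed here

# If the lines are needed after all, split the scalar at line starts.
my @lines = split /^/, $var;
```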
777
778
779 __How can I read in a file by paragraphs?__
780
781
782 Use the $/ variable (see perlvar for details). You
783 can either set it to "" to eliminate
784 empty paragraphs ("abc\n\n\n\ndef", for
785 instance, gets treated as two paragraphs and not three), or
786 "\n\n" to accept empty
787 paragraphs.
788
789
790 Note that a blank line must have no blanks in it. Thus
791 "fred\n \nstuff\n\n" is one paragraph, but
792 "fred\n\nstuff\n\n" is two.
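A small sketch of how the setting of $/ changes the number of records read. The helper count_records is hypothetical, and the sample strings are read through in-memory filehandles (an assumption that keeps the example free of real files).

```perl
use strict;
use warnings;

# Hypothetical helper: count the records in $text for a given separator.
sub count_records {
    my ($text, $separator) = @_;
    local $/ = $separator;
    open(my $in, '<', \$text) or die "can't open in-memory handle: $!";
    my @records = <$in>;
    close($in);
    return scalar @records;
}

# "" is paragraph mode: a run of blank lines counts as one separator.
my $collapsed   = count_records("abc\n\n\n\ndef\n", "");

# "\n\n" is taken literally, so the extra pair forms an empty paragraph.
my $literal     = count_records("abc\n\n\n\ndef\n", "\n\n");

# A line holding a space is not blank, so it does not end the paragraph.
my $with_space  = count_records("fred\n \nstuff\n\n", "");
my $truly_blank = count_records("fred\n\nstuff\n\n", "");
```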
793
794
795 __How can I read a single character from a file? From the
796 keyboard?__
797
798
799 You can use the builtin getc() function for most
800 filehandles, but it won't (easily) work on a terminal
801 device. For STDIN , either use the
802 Term::!ReadKey module from CPAN or use the
803 sample code in ``getc'' in perlfunc.
804
805
806 If your system supports the portable operating system
807 programming interface ( POSIX ), you can use
808 the following code, which you'll note turns off echo
809 processing as well.
810
811
812 #!/usr/bin/perl -w
813 use strict;
814 $| = 1;
815 for (1..4) {
816 my $got;
817 print
818 BEGIN {
819 use POSIX qw(:termios_h);
820 my ($term, $oterm, $echo, $noecho, $fd_stdin);
821 $fd_stdin = fileno(STDIN);
822 $term = POSIX::Termios-
823 $echo = ECHO | ECHOK | ICANON;
824 $noecho = $oterm
825 sub cbreak {
826 $term-
827 sub cooked {
828 $term-
829 sub getone {
830 my $key = '';
831 cbreak();
832 sysread(STDIN, $key, 1);
833 cooked();
834 return $key;
835 }
836 }
837 END { cooked() }
838 The Term::!ReadKey module from CPAN may be easier to use. Recent versions also include support for non-portable systems.
839
840
841 use Term::!ReadKey;
842 open(TTY,
843 For legacy DOS systems, Dan Carson reports the following:
844
845
846 To put the PC in ``raw'' mode, use ioctl with
847 some magic numbers gleaned from msdos.c (Perl source file)
848 and Ralf Brown's interrupt list (comes across the net every
849 so often):
850
851
852 $old_ioctl = ioctl(STDIN,0,0); # Gets device info
853 $old_ioctl
854 Then to read a single character:
855
856
857 sysread(STDIN,$c,1); # Read a single character
858 And to put the PC back to ``cooked'' mode:
859
860
861 ioctl(STDIN,1,$old_ioctl); # Sets it back to cooked mode.
862 So now you have $c. If ord($c) == 0, you have a two byte code, which means you hit a special key. Read another byte with sysread(STDIN,$c,1), and that value tells you what combination it was according to this table:
863
864
865 # PC 2-byte keycodes = ^@ + the following:
866 # HEX KEYS
867 # --- ----
868 # 0F SHF TAB
869 # 10-19 ALT QWERTYUIOP
870 # 1E-26 ALT ASDFGHJKL
871 # 2C-32 ALT ZXCVBNM
872 # 3B-44 F1-F10
873 # 47-49 HOME,UP,!PgUp
874 # 4B LEFT
875 # 4D RIGHT
876 # 4F-53 END,DOWN,!PgDn,Ins,Del
877 # 54-5D SHF F1-F10
878 # 5E-67 CTR F1-F10
879 # 68-71 ALT F1-F10
880 # 73-77 CTR LEFT,RIGHT,END,!PgDn,HOME
881 # 78-83 ALT 1234567890-=
882 # 84 CTR !PgUp
883 This is all trial and error I did a long time ago; I hope I'm reading the file that worked...
884
885
886 __How can I tell whether there's a character waiting on a
887 filehandle?__
888
889
890 The very first thing you should do is look into getting the
891 Term::!ReadKey extension from CPAN. As we
892 mentioned earlier, it now even has limited support for
893 non-portable (read: not open systems, closed, proprietary,
894 not POSIX, not Unix, etc.)
895 systems.
896
897
898 You should also check out the Frequently Asked Questions
899 list in comp.unix.* for things like this: the answer is
900 essentially the same. It's very system dependent. Here's one
901 solution that works on BSD
902 systems:
903
904
905 sub key_ready {
906 my($rin, $nfd);
907 vec($rin, fileno(STDIN), 1) = 1;
908 return $nfd = select($rin,undef,undef,0);
909 }
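The same select() test can be sketched for an arbitrary handle. Here it is tried on a pipe, which is an assumption made so that the example does not depend on a keyboard; the function name handle_ready is hypothetical, and the approach assumes a Unix-like system where select() works on pipes.

```perl
use strict;
use warnings;

# Hypothetical generalization of key_ready() to any filehandle.
sub handle_ready {
    my ($fh) = @_;
    my $rin = '';
    vec($rin, fileno($fh), 1) = 1;               # set the bit for this descriptor
    my $nfound = select($rin, undef, undef, 0);  # timeout 0: poll, don't wait
    return $nfound > 0;
}

pipe(my $reader, my $writer) or die "can't create pipe: $!";

my $before = handle_ready($reader) ? 1 : 0;      # nothing written yet

syswrite($writer, "x") or die "can't write to pipe: $!";

my $after = handle_ready($reader) ? 1 : 0;       # one byte is now waiting
```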
910 If you want to find out how many characters are waiting, there's also the FIONREAD ioctl call to be looked at. The ''h2ph'' tool that comes with Perl tries to convert C include files to Perl code, which can be required. FIONREAD ends up defined as a function in the ''sys/ioctl.ph'' file:
911
912
913 require 'sys/ioctl.ph';
914 $size = pack(
915 If ''h2ph'' wasn't installed or doesn't work for you, you can ''grep'' the include files by hand:
916
917
918 % grep FIONREAD /usr/include/*/*
919 /usr/include/asm/ioctls.h:#define FIONREAD 0x541B
920 Or write a small C program using the editor of champions:
921
922
923 % cat
924 And then hard-code it, leaving porting as an exercise to your successor.
925
926
927 $FIONREAD = 0x4004667f; # XXX: opsys dependent
928 $size = pack(
929 FIONREAD requires a filehandle connected to a stream, meaning that sockets, pipes, and tty devices work, but ''not'' files.
930
931
932 __How do I do a__ tail -f __in
933 perl?__
934
935
936 First try
937
938
939 seek(GWFILE, 0, 1);
940 The statement seek(GWFILE, 0, 1) doesn't change the current position, but it does clear the end-of-file condition on the handle, so that the next <GWFILE> makes Perl try again to read something.
941
942
943 If that doesn't work (it relies on features of your stdio
944 implementation), then you need something more like
945 this:
946
947
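948 for (;;) {
949 for ($curpos = tell(GWFILE); <GWFILE>; $curpos = tell(GWFILE)) { } # search for some stuff in the loop body
sleep 1; seek(GWFILE, $curpos, 0); # sleep for a while, then seek to where we had been
}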
950 If this still doesn't work, look into the POSIX module. POSIX defines the ''clearerr()'' method, which can remove the end of file condition on a filehandle. The method: read until end of file, ''clearerr()'', read some more. Lather, rinse, repeat.
951
952
953 There's also a File::Tail module from CPAN.
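As a rough sketch of the first approach, the helper below clears the end-of-file condition with a zero-offset seek and then returns whatever has been appended since the last call. The name read_new_lines is hypothetical, and the trick still depends on how your I/O layer treats end of file.

```perl
use strict;
use warnings;

# Return the lines appended to $fh since the previous read.
# seek($fh, 0, 1) keeps the current position but clears the EOF condition.
sub read_new_lines {
    my ($fh) = @_;
    seek($fh, 0, 1);
    my @lines = <$fh>;
    return @lines;
}
```

Call it in a loop, sleeping between calls, to follow a growing file. The last element may be a partial line if the writer has not finished it yet.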
955
956
957 __How do I__ ''dup()'' __a filehandle in
958 Perl?__
959
960
961 If you check ``open'' in perlfunc, you'll see that several
962 of the ways to call ''open()'' should do the trick. For
963 example:
964
965
966 open(LOG, ">>/tmp/logfile"); # append to a log file
open(STDERR, ">&LOG"); # make STDERR a dup of LOG
967 Or even with a literal numeric descriptor:
968
969
970 $fd = $ENV{MHCONTEXTFD};
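971 open(MHCONTEXT, "<&=$fd"); # like fdopen(3S)
972 Note that ``<&STDIN'' makes a copy, but ``<&=STDIN'' makes an alias. That means if you close an aliased handle, all aliases become inaccessible. This is not true with a copied one.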
973
974
975 Error checking, as always, has been left as an exercise for
976 the reader.
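One way to see the difference between the two modes is to compare descriptor numbers: a ``&'' open gets a new descriptor from dup(2), while a ``&='' open reuses the existing one. The sketch below assumes a Perl recent enough for the three-argument ''open()'' and lexical filehandles; copy_and_alias is a made-up name.

```perl
use strict;
use warnings;

# Given an open output handle, return a dup'ed copy and an fdopen-style alias.
sub copy_and_alias {
    my ($orig) = @_;
    open(my $copy,  '>&',  $orig)         or die "Can't dup handle: $!";
    open(my $alias, '>&=', fileno($orig)) or die "Can't alias handle: $!";
    return ($copy, $alias);
}
```

After the call, fileno($copy) differs from fileno($orig), while fileno($alias) is the same number.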
977
978
979 __How do I close a file descriptor by
980 number?__
981
982
983 This should rarely be necessary, as the Perl ''close()''
984 function is to be used for things that Perl opened itself,
985 even if it was a dup of a numeric descriptor as with
986 MHCONTEXT above. But if you really have to,
987 you may be able to do this:
988
989
990 require 'sys/syscall.ph';
991 $rc = syscall(&SYS_close, $fd + 0); # must force numeric
die "can't sysclose $fd: $!" if $rc == -1;
992 Or, just use the fdopen(3S) feature of ''open()'':
993
994
995 {
996 local *F;
997 open F, "<&=$fd" or die "Cannot reopen fd=$fd: $!";
close F;
}
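A third option, not mentioned above, is the ''close()'' function in the standard POSIX module, which takes a descriptor number directly and needs no ''.ph'' files. This is a sketch with a hypothetical wrapper name, close_fd. It should only be used on descriptors that no Perl filehandle still owns.

```perl
use strict;
use warnings;
use POSIX ();

# Close a raw descriptor number. POSIX::close() returns undef on failure
# and leaves the reason in $!.
sub close_fd {
    my ($fd) = @_;
    defined POSIX::close($fd) or die "Can't close fd $fd: $!";
    return 1;
}
```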
998
999
1000 __Why can't I use ``C:\temp\foo'' in DOS
1001 paths? Why doesn't `C:\temp\foo.exe` work?__
1002
1003
1004 Whoops! You just put a tab and a formfeed into that
1005 filename! Remember that within double quoted strings
1006 (``like\this''), the backslash is an escape character. The
1007 full list of these is in ``Quote and Quote-like Operators''
1008 in perlop. Unsurprisingly, you don't have a file called
1009 ``c:(tab)emp(formfeed)oo'' or ``c:(tab)emp(formfeed)oo.exe''
1010 on your legacy DOS filesystem.
1011
1012
1013 Either single-quote your strings, or (preferably) use
1014 forward slashes. Because every DOS and Windows
1015 version since about MS-DOS 2.0 has treated
1016 / and \ the same in a path, you might as
1017 well use the one that doesn't clash with Perl--or the
1018 POSIX shell, ANSI C and C++,
1019 awk, Tcl, Java, or Python, just to
1020 mention a few. POSIX paths are more portable,
1021 too.
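To make the quoting issue concrete, here is a small sketch comparing the three spellings. The variable names are only for illustration.

```perl
use strict;
use warnings;

my $double  = "C:\temp\foo";   # \t becomes a tab, \f becomes a formfeed
my $single  = 'C:\temp\foo';   # single quotes keep both backslashes
my $forward = "C:/temp/foo";   # forward slashes need no escaping

# The double-quoted string is two characters shorter than the others,
# because each backslash pair collapsed into one control character.
printf "%d %d %d\n", length($double), length($single), length($forward);
```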
1022
1023
1024 __Why doesn't glob(``*.*'') get all the
1025 files?__
1026
1027
1028 Because even on non-Unix ports, Perl's glob function follows
1029 standard Unix globbing semantics. You'll need
1030 glob("*") to get all (non-hidden) files.
1031 This makes ''glob()'' portable even to legacy systems.
1032 Your port may include proprietary globbing functions as
1033 well. Check its documentation for details.
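A quick way to compare the two patterns is to run both in the same directory. The helper below is a sketch with a hypothetical name, glob_in. It changes directory temporarily so the pattern itself contains no path.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);

# Return the sorted names in $dir that match $pattern under Perl's glob().
sub glob_in {
    my ($dir, $pattern) = @_;
    my $old = getcwd();
    chdir($dir) or die "Can't chdir to $dir: $!";
    my @names = sort { $a cmp $b } glob($pattern);
    chdir($old) or die "Can't chdir back to $old: $!";
    return @names;
}
```

In a directory holding README, notes.txt and .hidden, the pattern * should match the first two, while *.* should match only notes.txt, since it demands a literal dot in the name.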
1034
1035
1036 __Why does Perl let me delete read-only files? Why does__
1037 -i __clobber protected files? Isn't this a bug in
1038 Perl?__
1039
1040
1041 This is elaborately and painstakingly described in the ``Far
1042 More Than You Ever Wanted To Know'' in
1043 http://www.perl.com/CPAN/doc/FMTEYEWTK/file-dir-perms
1044 .
1045
1046
1047 The executive summary: learn how your filesystem works. The
1048 permissions on a file say what can happen to the data in
1049 that file. The permissions on a directory say what can
1050 happen to the list of files in that directory. If you delete
1051 a file, you're removing its name from the directory (so the
1052 operation depends on the permissions of the directory, not
1053 of the file). If you try to write to the file, the
1054 permissions of the file govern whether you're allowed
1055 to.
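The point about directories can be checked with a small experiment. This sketch assumes Unix-style permission semantics (other systems may refuse the delete). The sub name and the file name are made up for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Create a read-only file in a writable directory, then try to delete it.
# Returns true if the file is gone afterwards.
sub delete_read_only_file {
    my $dir  = tempdir(CLEANUP => 1);
    my $path = "$dir/protected.txt";          # hypothetical file name
    open(my $fh, '>', $path) or die "Can't create $path: $!";
    print {$fh} "precious data\n";
    close($fh) or die "Can't close $path: $!";
    chmod(0444, $path) or die "Can't chmod $path: $!";
    my $removed = unlink($path);              # needs write access to $dir only
    return $removed && !-e $path;
}
```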
1056
1057
1058 __How do I select a random line from a
1059 file?__
1060
1061
1062 Here's an algorithm from the Camel Book:
1063
1064
1065 srand;
1066 rand($.) < 1 && ($line = $_) while <>;
1067 This has a significant advantage in space over reading the whole file in. A simple proof by induction is available upon request if you doubt the algorithm's correctness.
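The idea is that line number n replaces the current choice with probability 1/n. By induction, if each of the first n-1 lines was held with probability 1/(n-1), then after line n each one survives with probability (1/(n-1)) * ((n-1)/n) = 1/n, the same as the new line. Here is a sketch of the same algorithm as a sub. The name random_line is hypothetical, and it keeps its own counter instead of relying on $.

```perl
use strict;
use warnings;

# Return one line chosen uniformly at random from $fh, reading it only once.
# Returns undef if the handle yields no lines.
sub random_line {
    my ($fh) = @_;
    my $line;
    my $count = 0;
    while (my $candidate = <$fh>) {
        $count++;
        # Keep the new line with probability 1/$count.
        $line = $candidate if rand($count) < 1;
    }
    return $line;
}
```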
1068
1069
1070 __Why do I get weird spaces when I print an array of
1071 lines?__
1072
1073
1074 Saying
1075
1076
1077 print "@lines\n";
1078 joins together the elements of @lines with a space between them. If @lines were ("little", "fluffy", "clouds") then the above statement would print
1079
1080
1081 little fluffy clouds
1082 but if each element of @lines was a line of text, ending in a newline character ("little\n", "fluffy\n", "clouds\n"), then it would print:
1083
1084
1085 little
1086  fluffy
1087  clouds
1088 If your array contains lines, just print them:
1089
1090
1091 print @lines;
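The space comes from the list separator variable $", which Perl inserts between elements whenever an array is interpolated into a double-quoted string. A short sketch of the two behaviours, with illustrative variable names:

```perl
use strict;
use warnings;

my @lines = ("little\n", "fluffy\n", "clouds\n");

my $interpolated = "@lines";          # elements joined with $" (a space by default)
my $joined       = join('', @lines);  # elements run together, nothing added

print $interpolated;   # the second and third lines start with a space
print $joined;         # no stray spaces
```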
1092 !!AUTHOR AND COPYRIGHT
1093
1094
1095 Copyright (c) 1997-1999 Tom Christiansen and Nathan
1096 Torkington. All rights reserved.
1097
1098
1099 When included as an integrated part of the Standard
1100 Distribution of Perl or of its documentation (printed or
1101 otherwise), this work is covered under Perl's Artistic
1102 License. For separate distributions of all or part of this
1103 FAQ outside of that, see
1104 perlfaq.
1105
1106
1107 Irrespective of its distribution, all code examples here are
1108 in the public domain. You are permitted and encouraged to
1109 use this code and any derivatives thereof in your own
1110 programs for fun or for profit as you see fit. A simple
1111 comment in the code giving credit to the FAQ
1112 would be courteous but is not required.
1113 ----