PERLREFTUT
!!!PERLREFTUT
NAME
DESCRIPTION
Who Needs Complicated Data Structures?
The Solution
Syntax
An Example
Arrow Rule
Solution
The Rest
Summary
Credits
----
!!NAME

perlreftut - Mark's very short tutorial about references
!!DESCRIPTION

One of the most important new features in Perl 5 was the
capability to manage complicated data structures like
multidimensional arrays and nested hashes. To enable these,
Perl 5 introduced a feature called `references', and using
references is the key to managing complicated, structured
data in Perl. Unfortunately, there's a lot of funny syntax
to learn, and the main manual page can be hard to follow.
The manual is quite complete, and sometimes people find that
a problem, because it can be hard to tell what is important
and what isn't.

Fortunately, you only need to know 10% of what's in the main
page to get 90% of the benefit. This page will show you that
10%.
!!Who Needs Complicated Data Structures?

One problem that came up all the time in Perl 4 was how to
represent a hash whose values were lists. Perl 4 had hashes,
of course, but the values had to be scalars; they couldn't
be lists.

Why would you want a hash of lists? Let's take a simple
example: You have a file of city and country names, like
this:

  Chicago, USA
  Frankfurt, Germany
  Berlin, Germany
  Washington, USA
  Helsinki, Finland
  New York, USA

and you want to produce an output like this, with each
country mentioned once, and then an alphabetical list of the
cities in that country:

  Finland: Helsinki.
  Germany: Berlin, Frankfurt.
  USA: Chicago, New York, Washington.

The natural way to do this is to have a hash whose keys are
country names. Associated with each country name key is a
list of the cities in that country. Each time you read a
line of input, split it into a country and a city, look up
the list of cities already known to be in that country, and
append the new city to the list. When you're done reading
the input, iterate over the hash as usual, sorting each list
of cities before you print it out.

If hash values can't be lists, you lose. In Perl 4, hash
values can't be lists; they can only be strings. You lose.
You'd probably have to combine all the cities into a single
string somehow, and then when time came to write the output,
you'd have to break the string into a list, sort the list,
and turn it back into a string. This is messy and
error-prone. And it's frustrating, because Perl already has
perfectly good lists that would solve the problem if only
you could use them.
!!The Solution

By the time Perl 5 rolled around, we were already stuck with
this design: Hash values must be scalars. The solution to
this is references.

A reference is a scalar value that ''refers to'' an
entire array or an entire hash (or to just about anything
else). Names are one kind of reference that you're already
familiar with. Think of the President: a messy, inconvenient
bag of blood and bones. But to talk about him, or to
represent him in a computer program, all you need is the
easy, convenient scalar string ``Bill Clinton''.

References in Perl are like names for arrays and hashes.
They're Perl's private, internal names, so you can be sure
they're unambiguous. Unlike ``Bill Clinton'', a reference
only refers to one thing, and you always know what it refers
to. If you have a reference to an array, you can recover the
entire array from it. If you have a reference to a hash, you
can recover the entire hash. But the reference is still an
easy, compact scalar value.

You can't have a hash whose values are arrays; hash values
can only be scalars. We're stuck with that. But a single
reference can refer to an entire array, and references are
scalars, so you can have a hash of references to arrays, and
it'll act a lot like a hash of arrays, and it'll be just as
useful as a hash of arrays.

We'll come back to this city-country problem later, after
we've seen some syntax for managing references.
!!Syntax

There are just two ways to make a reference, and just two
ways to use it once you have it.

__Making References__

__Make Rule 1__

If you put a \ in front of a variable, you get a
reference to that variable.

  $aref = \@array;   # $aref now holds a reference to @array
  $href = \%hash;    # $href now holds a reference to %hash

Once the reference is stored in a variable like $aref or
$href, you can copy it or store it just the same as any
other scalar value:

  $xy = $aref;       # $xy now holds a reference to @array
  $p[[3] = $href;    # $p[[3] now holds a reference to %hash
  $z = $p[[3];       # $z now holds a reference to %hash
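Here is a small runnable sketch of Make Rule 1. The contents of @array and %hash are made up for illustration, and the final push uses the @{...} notation that is explained under Use Rule 1 below; it is there only to show that every copy of a reference leads back to the same array.

```perl
use strict;
use warnings;

# Illustrative data; any array and hash would do.
my @array = (10, 20, 30);
my %hash  = (red => 1, blue => 2);

# Make Rule 1: a backslash in front of a variable yields a reference to it.
my $aref = \@array;
my $href = \%hash;

# References are ordinary scalars, so they can be copied and stored.
my $xy = $aref;      # a second reference to the same @array
my @p;
$p[3] = $href;       # stored as an array element
my $z = $p[3];       # copied back out; still refers to %hash

# A change made through one copy of the reference shows up in @array itself.
push @{$xy}, 40;
my $count = scalar @array;
print "array now has $count elements\n";
```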
These examples show how to make references to variables with
names. Sometimes you want to make an array or a hash that
doesn't have a name. This is analogous to the way you like
to be able to use the string "\n" or the number 80 without
having to store it in a named variable first.

__Make Rule 2__

[[ ITEMS ] makes a new, anonymous array, and returns
a reference to that array. { ITEMS } makes a new,
anonymous hash, and returns a reference to that
hash.

  $aref = [[ 1, "foo", undef, 13 ];   # $aref now holds a reference to an array
  $href = { APR => 4, AUG => 8 };     # $href now holds a reference to a hash

The references you get from rule 2 are the same kind of
references that you get from rule 1:

  # This:
  $aref = [[ 1, 2, 3 ];
  # Does the same as this:
  @array = (1, 2, 3);
  $aref = \@array;

The first line is an abbreviation for the following two
lines, except that it doesn't create the superfluous array
variable @array.
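To check that the two rules really produce the same kind of reference, you can try something like the following sketch. The month values are only sample data, and it borrows the ref function, which is described near the end of this page, to ask each reference what it points to.

```perl
use strict;
use warnings;

# Make Rule 2: square brackets build an anonymous array, curly
# brackets build an anonymous hash; each returns a reference.
my $aref = [ 1, 2, 3 ];
my $href = { APR => 4, AUG => 8 };   # sample data

# The long way round: a named array plus Make Rule 1.
my @array = (1, 2, 3);
my $aref2 = \@array;

# ref() reports what kind of thing each reference points to.
my $kind_a  = ref $aref;
my $kind_a2 = ref $aref2;
my $kind_h  = ref $href;
print "$kind_a $kind_a2 $kind_h\n";
```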

__Using References__

What can you do with a reference once you have it? It's a
scalar value, and we've seen that you can store it as a
scalar and get it back again just like any scalar. There are
just two more ways to use it:

__Use Rule 1__

If $aref contains a reference to an array, then you
can put {$aref} anywhere you would normally put the
name of an array. For example, @{$aref} instead of
@array.

Here are some examples of that:

Arrays:

  @a              @{$aref}             An array
  reverse @a      reverse @{$aref}     Reverse the array
  $a[[3]          ${$aref}[[3]         An element of the array
  $a[[3] = 17     ${$aref}[[3] = 17    Assigning an element

On each line are two expressions that do the same thing. The
left-hand versions operate on the array @a, and the
right-hand versions operate on the array that is referred to
by $aref, but once they find the array they're operating on,
they do the same things to the arrays.

Using a hash reference is ''exactly'' the same:

  %h                 %{$href}                A hash
  keys %h            keys %{$href}           Get the keys from the hash
  $h{'red'}          ${$href}{'red'}         An element of the hash
  $h{'red'} = 17     ${$href}{'red'} = 17    Assigning an element
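The right-hand columns of both tables can be tried out directly. This is just a sketch with invented contents for the array and the hash:

```perl
use strict;
use warnings;

my $aref = [ 5, 6, 7, 8 ];               # invented sample data
my $href = { red => 1, green => 2 };

my @copy     = @{$aref};             # the whole array
my @backward = reverse @{$aref};     # any array operation works
my $fourth   = ${$aref}[3];          # one element
${$aref}[3]  = 17;                   # assigning an element

my @names = sort keys %{$href};      # the keys of the hash
my $red   = ${$href}{'red'};         # one element
${$href}{'red'} = 17;                # assigning an element

print "@backward\n";
print "@names\n";
```

Note that @copy is a separate array: the later assignment to ${$aref}[3] changes the referred-to array but not the copy.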
__Use Rule 2__

${$aref}[[3] is too hard to read, so you can write
$aref->[[3] instead.

${$href}{red} is too hard to read, so you can write
$href->{red} instead.

Most often, when you have an array or a hash, you want to
get or set a single element from it. ${$aref}[[3]
and ${$href}{'red'} have too much punctuation, and
Perl lets you abbreviate.

If $aref holds a reference to an array, then
$aref->[[3] is the fourth element of the array.
Don't confuse this with $aref[[3], which is the
fourth element of a totally different array, one deceptively
named @aref. $aref and @aref are
unrelated the same way that $item and
@item are.

Similarly, $href->{'red'} is part of the hash
referred to by the scalar variable $href, perhaps
even one with no name. $href{'red'} is part of the
deceptively named %href hash. It's easy to leave out
the ->, and if you do, you'll get
bizarre results when your program gets array and hash
elements out of totally unexpected hashes and arrays that
weren't the ones you wanted to use.
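The difference between $aref->[[3] and $aref[[3] is easy to see in a small experiment. In this sketch the array @aref and all the element values are invented purely to make the mix-up visible:

```perl
use strict;
use warnings;

my $aref = [ 'a', 'b', 'c', 'd' ];
my $href = { red => 'rouge' };

# The arrow form and the curly-bracket form name the same element.
my $same_elem = $aref->[3]   eq ${$aref}[3];
my $same_val  = $href->{red} eq ${$href}{red};

# A separate array that merely shares the name "aref".
my @aref = ( 'w', 'x', 'y', 'z' );

my $via_ref   = $aref->[3];   # element of the array $aref refers to
my $via_array = $aref[3];     # element of the unrelated array @aref

print "$via_ref $via_array\n";
```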
!!An Example

Let's see a quick example of how all this is
useful.

First, remember that [[1, 2, 3] makes an anonymous
array containing (1, 2, 3), and gives you a
reference to that array.

Now think about

  @a = ( [[1, 2, 3],
         [[4, 5, 6],
         [[7, 8, 9]
       );

@a is an array with three elements, and each one is a
reference to another array.

$a[[1] is one of these references. It refers to an
array, the array containing (4, 5, 6), and because
it is a reference to an array, __USE RULE
2__ says that we can write $a[[1]->[[2] to get
the third element from that array. $a[[1]->[[2] is
the 6. Similarly, $a[[0]->[[1] is the 2. What we
have here is like a two-dimensional array; you can write
$a[[ROW]->[[COLUMN] to get or set the element in
any row and any column of the array.

The notation still looks a little cumbersome, so there's one
more abbreviation:
!!Arrow Rule

In between two __subscripts__, the arrow is
optional.

Instead of $a[[1]->[[2], we can write
$a[[1][[2]; it means the same thing. Instead of
$a[[0]->[[1], we can write $a[[0][[1]; it
means the same thing.

Now it really looks like two-dimensional
arrays!

You can see why the arrows are important. Without them, we
would have had to write ${$a[[1]}[[2] instead of
$a[[1][[2]. For three-dimensional arrays, they let us
write $x[[2][[3][[5] instead of the unreadable
${${$x[[2]}[[3]}[[5].
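All three spellings can be compared on the @a from the example. The following is a sketch; the assignment of 20 and the loop that prints the rows are additions for illustration only.

```perl
use strict;
use warnings;

my @a = ( [1, 2, 3],
          [4, 5, 6],
          [7, 8, 9],
        );

my $long  = ${$a[1]}[2];   # Use Rule 1
my $arrow = $a[1]->[2];    # Use Rule 2
my $short = $a[1][2];      # Arrow Rule

$a[0][1] = 20;             # setting an element works the same way

# Walk the rows, dereferencing each row reference in turn.
my @lines;
for my $row (@a) {
    push @lines, join(' ', @{$row});
}
print "$_\n" for @lines;
```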
!!Solution

Here's the answer to the problem I posed earlier, of
reformatting a file of city and country names.

   1  while (<>) {
   2    chomp;
   3    my ($city, $country) = split /, /;
   4    push @{$table{$country}}, $city;
   5  }
   6
   7  foreach $country (sort keys %table) {
   8    print "$country: ";
   9    my @cities = @{$table{$country}};
  10    print join ', ', sort @cities;
  11    print ".\n";
  12  }

The program has two pieces: Lines 1--5 read the input and
build a data structure, and lines 7--12 analyze the data and
print out the report.

In the first part, line 4 is the important one. We're going
to have a hash, %table, whose keys are country
names, and whose values are (references to) arrays of city
names. After acquiring a city and country name, the program
looks up $table{$country}, which holds (a reference
to) the list of cities seen in that country so far. Line 4
is totally analogous to

  push @array, $city;

except that the name array has been replaced by the
reference {$table{$country}}. The push adds a city name to
the end of the referred-to array.

In the second part, line 9 is the important one. Again,
$table{$country} is (a reference to) the list of
cities in the country, so we can recover the original list,
and copy it into the array @cities, by using
@{$table{$country}}. Line 9 is totally analogous
to

  @cities = @array;

except that the name array has been replaced by the
reference {$table{$country}}. The @ tells Perl to get the
entire array.

The rest of the program is just familiar uses of
chomp, split, sort,
print, and doesn't involve references at
all.

There's one fine point I skipped. Suppose the program has
just read the first line in its input that happens to
mention Greece. Control is at line 4, $country is
'Greece', and $city is 'Athens'.
Since this is the first city in Greece,
$table{$country} is undefined---in fact there isn't
a 'Greece' key in %table at all. What
does line 4 do here?

   4    push @{$table{$country}}, $city;

This is Perl, so it does the exact right thing. It sees that
you want to push Athens onto an array that doesn't exist, so
it helpfully makes a new, empty, anonymous array for you,
installs it in the table, and then pushes Athens onto it.
This is called `autovivification'.
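If you'd like to try the idea without preparing an input file, here is a self-contained variation. It is a sketch, not the program above: it reads from an in-memory list (@input, a stand-in for the file), collects the report lines in @report before printing, and finishes with the Greece case to show autovivification at work.

```perl
use strict;
use warnings;

# Stand-in for the input file of city and country names.
my @input = (
    'Chicago, USA',
    'Frankfurt, Germany',
    'Berlin, Germany',
    'Washington, USA',
    'Helsinki, Finland',
    'New York, USA',
);

# Build the hash of (references to) arrays of city names.
my %table;
for my $line (@input) {
    my ($city, $country) = split /, /, $line;
    push @{ $table{$country} }, $city;
}

# Produce one report line per country, cities sorted.
my @report;
for my $country (sort keys %table) {
    my @cities = @{ $table{$country} };
    push @report, "$country: " . join(', ', sort @cities) . '.';
}
print "$_\n" for @report;

# Autovivification: there is no 'Greece' key yet, and the push creates
# a new anonymous array for it before adding Athens.
my $before = exists $table{Greece} ? 'yes' : 'no';
push @{ $table{Greece} }, 'Athens';
my $after = ref $table{Greece};
print "Greece key before: $before; value afterwards: $after\n";
```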
!!The Rest

I promised to give you 90% of the benefit with 10% of the
details, and that means I left out 90% of the details. Now
that you have an overview of the important parts, it should
be easier to read the perlref manual page, which discusses
100% of the details.

Some of the highlights of perlref:

You can make references to anything, including scalars,
functions, and other references.
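As a taste of that, here is a short sketch of references to a scalar, to a function, and to another reference. The variable names and values are invented; the syntax used to call a function through a reference, $cref->(...), is one of the details covered in perlref.

```perl
use strict;
use warnings;

my $n    = 42;
my $sref = \$n;                        # reference to a scalar
my $cref = sub { return $_[0] * 2 };   # reference to an anonymous function
my $rref = \$sref;                     # reference to a reference

${$sref} = 43;                  # changes $n through the reference
my $doubled = $cref->(21);      # calls the function
my $through = ${${$rref}};      # follows two levels of reference to $n

print "$n $doubled $through\n";
```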

In __USE RULE 1__, you can omit the curly
brackets whenever the thing inside them is an atomic scalar
variable like $aref. For example, @$aref
is the same as @{$aref}, and $$aref[[1] is
the same as ${$aref}[[1]. If you're just starting
out, you may want to adopt the habit of always including the
curly brackets.
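A quick sketch of the shortcut, with made-up data. The last lines show a case where the thing inside the brackets is not a plain scalar variable, so the curly brackets stay.

```perl
use strict;
use warnings;

my $aref = [ 'x', 'y', 'z' ];

# With a plain scalar variable, the curly brackets may be dropped.
my $same_list = "@$aref"  eq "@{$aref}";
my $same_elem = $$aref[1] eq ${$aref}[1];

# Here the reference comes from a hash element rather than a plain
# scalar variable, so the brackets are kept.
my %h = ( list => $aref );
my @items = @{ $h{list} };

print "@items\n";
```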

To see if a variable contains a reference, use the `ref'
function. It returns true if its argument is a reference.
Actually it's a little better than that: It returns
HASH for hash references and
ARRAY for array references.
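For instance, a sketch like this one (the list of sample values is invented) shows what ref returns for a few kinds of reference, and the empty string it returns for something that isn't a reference at all:

```perl
use strict;
use warnings;

my $text = 'just a string';
my @things = ( [ 1, 2 ], { a => 1 }, \$text, sub { 1 }, $text );

# Ask ref() about each value in turn.
my @kinds = map { ref($_) } @things;

for my $kind (@kinds) {
    print $kind eq '' ? "(not a reference)\n" : "$kind\n";
}
```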

If you try to use a reference like a string, you get strings
like

  ARRAY(0x80f5dec) or HASH(0x826afc0)

If you ever see a string that looks like this, you'll know
you printed out a reference by mistake.

A side effect of this representation is that you can use
eq to see if two references refer to the same
thing. (But you should usually use == instead
because it's much faster.)
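Both points can be checked with a few lines. This is a sketch; the addresses printed will differ from run to run, so the code only tests the general shape of the string.

```perl
use strict;
use warnings;

my $aref  = [ 1, 2, 3 ];
my $copy  = $aref;          # another reference to the same array
my $other = [ 1, 2, 3 ];    # a different array with equal contents

# Using a reference as a string gives something like ARRAY(0x...).
my $str   = "$aref";
my $looks = $str =~ /^ARRAY\(0x[0-9a-f]+\)$/ ? 1 : 0;

# == compares what the references refer to, not the contents.
my $same = ( $aref == $copy )  ? 1 : 0;
my $diff = ( $aref == $other ) ? 1 : 0;

print "$str\n";
print "same: $same, different arrays compare equal: $diff\n";
```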

You can use a string as if it were a reference. If you use
the string "foo" as an array reference,
it's taken to be a reference to the array @foo.
This is called a ''soft reference'' or ''symbolic
reference''.
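Here is a sketch of a symbolic reference in action. Two things in it go beyond what this page says and come from how Perl behaves: the array has to be a package variable (declared with our here) rather than a my variable, and under use strict the symbolic lookup has to be allowed explicitly with no strict 'refs'.

```perl
use strict;
use warnings;

our @foo = (1, 2, 3);    # a package variable, so it can be found by name

my $name = "foo";
my @got;
{
    no strict 'refs';    # strict normally forbids symbolic references
    @got = @{$name};     # the string "foo" is taken to mean the array @foo
}
print "@got\n";
```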

You might prefer to go on to perllol instead of perlref; it
discusses lists of lists and multidimensional arrays in
detail. After that, you should move on to perldsc; it's a
Data Structure Cookbook that shows recipes for using and
printing out arrays of hashes, hashes of arrays, and other
kinds of data.
!!Summary

Everyone needs compound data structures, and in Perl the way
you get them is with references. There are four important
rules for managing references: Two for making references and
two for using them. Once you know these rules you can do
most of the important things you need to do with
references.
!!Credits

Author: Mark-Jason Dominus, Plover Systems
(mjd-perl-ref+@plover.com)

This article originally appeared in ''The Perl Journal''
(http://tpj.com) volume 3, #2. Reprinted with
permission.

The original title was ''Understand References
Today''.

__Distribution Conditions__

Copyright 1998 The Perl Journal.

When included as part of the Standard Version of Perl, or as
part of its complete documentation whether printed or
otherwise, this work may be distributed only under the terms
of Perl's Artistic License. Any distribution of this file or
derivatives thereof outside of that package require that
special arrangements be made with copyright
holder.

Irrespective of its distribution, all code examples in these
files are hereby placed into the public domain. You are
permitted and encouraged to use this code in your own
programs for fun or for profit as you see fit. A simple
comment in the code giving credit would be courteous but is
not required.
----