Penguin
Blame: DatabaseTreeStructures
EditPageHistoryDiffInfoLikePages
Annotated edit history of DatabaseTreeStructures version 3, including all changes. View license author blame.
Rev Author # Line
1 NickJohnson 1 There are a number of commonly used ways to represent tree structures in a database, each suited for different access patterns. This page attempts to outline the most common ones, along with what operations are easy and hard to do from a database.
2
3 !!!Adjacency List
4 The Adjacency List model is the most common and most obvious way of representing a tree structure in a database. Unfortunately, retrieving all descendants of a node, usually the most common operation, is difficult and expensive, requiring many queries on large trees.
5
6 <?plugin OldStyleTable
7 |< *Columns* |< parent_id
8 |< *Cheap operations* |< Retrieve parent, Retrieve children, Insert, Change parent, Delete
9 |< *Expensive operations* |< Retrieve tree, Retrieve descendants, Retrieve ancestors
10 ?>
11
12 !!!Adjacency list with root ID
13 This method is similar to the Adjacency List method, with the exception that it stores the ID of the root of the tree in addition to the ID of the direct parent node. When storing multiple trees in a table, such as threads in a discussion board, this makes retrieving the entire tree easy. Retrieving descendants of a small part of the tree remains difficult, however.
14
15 <?plugin OldStyleTable
16 |< *Columns* |< root_id, parent_id
17 |< *Cheap operations* |< Retrieve parent, Retrieve children, Insert, Change parent, Delete, Retrieve tree
18 |< *Expensive operations* |< Retrieve descendants, Retrieve ancestors
19 ?>
20
21 !!!Ancestor List
22 This method works by storing in each node a complete list of ancestors, from the root down. The list is typically either stored as a delimited string (eg, "1,24,500") or as an array of IDs. For ease of querying, the id of the node itself is usually included as the last element in the list. This method makes most tree operations easy, but requires the ancestor list for the parent node to be retrieved before a node can be inserted. On deep trees, this method also results in very large lists, resulting in a lot of storage overhead.
23
24 <?plugin OldStyleTable
25 |< *Columns* |< ancestors (array or string)
26 |< *Cheap operations* |< Retrieve parent, Retrieve children, Insert, Change parent, Delete, Retrieve tree, Retrieve descendants, Retrieve ancestors
27 |< *Expensive operations* |< None
28 ?>
29
30 !!!Nested Set
31 This method works by assigning a 'left' and 'right' ID to each node. Nodes are numbered in depth-first order, so every node is guaranteed to have a 'left' value greater than the 'left' of all its ancestors, and a 'right' value greater than the 'right' of all its ancestors, and sorting a (sub)tree by its 'left' values will return it in depth-first order.
32
33 This is also the only method listed here that allows nodes to have an explicit internal ordering of children of each node other than any external ordering provided.
34
35 Note that 'retrieve parent' and 'retrieve children' are listed as expensive operations because they are non-optimal. However, they are still no less efficient than 'retrieve ancestors' and 'retrieve descendants'.
36
37 'Delete' is listed as a cheap operation under the assumption that 'left' and 'right' IDs do not need to remain contiguous.
38
39 <?plugin OldStyleTable
40 |< *Columns* |< left, right
41 |< *Cheap operations* |< Delete, Retrieve tree, Retrieve descendants, Retrieve ancestors
42 |< *Expensive operations* |< Retrieve parent, Retrieve children, Insert, Change parent
43 ?>
44
45 !!!Nested Set with Adjacency List
46 This method functions as a combination of the Nested Set and Adjacency List methods, conferring most of the benefits of each.
47
48 <?plugin OldStyleTable
49 |< *Columns* |< parent_id, left, right
50 |< *Cheap operations* |< Retrieve parent, Retrieve children, Delete, Retrieve tree, Retrieve descendants, Retrieve ancestors
2 NickJohnson 51 |< *Expensive operations* |< Insert, Change parent
52 ?>
53
54 !!!Nested Set with depth
55 This method operates identically to the Nested Set method, described above, with the addition of a depth field, which contains the count of nodes between the current node and the root of the tree. This makes querying direct children and parents cheap.
56
57 This is generally an improvement on Nested Set with Adjacency List, since you can retrieve children only to a certain depth.
58
59 <?plugin OldStyleTable
60 |< *Columns* |< left, right, depth
61 |< *Cheap operations |< Retrieve parent, Retrieve children, Delete, Retrieve tree, Retrieve descendants, Retrieve ancestors
1 NickJohnson 62 |< *Expensive operations* |< Insert, Change parent
63 ?>
3 CraigBox 64
65 CategoryProgramming