GarbageCollection means letting the computer track memory usage and free unused blocks automatically. This works surprisingly well, and is used in many ProgrammingLanguages -- in just about every very high level language, and also Java. Since manually managing memory is notoriously difficult, GarbageCollection can cut a huge portion out of development time and nowadays won't even negatively impact (in fact, may improve) the performance of an application.
GarbageCollection protects programmers from
- forgetting to deallocate memory, ie creating memory leaks
- getting in trouble with responsibility, where one part of the code may deallocate a DataStructures still needed by another part
but not from
- filling up all avaliable memory
- attempting to dereference null pointers or references (although languages such as ML use the type system to do this)
There are several algorithms
- Reference counting. Antique GarbageCollection systems (and some C++ Templates) use reference counting algorithms. This involves marking every block in memory with a counter to how many times it is referenced and deallocating it when the count reaches zero. This is gentle on the cache and closely ties the deallocation to the operation that made a block unreachable. Unfortunately they are unable to reclaim cyclic DataStructures (such as doubly linked lists).
- Copying algorithms periodically sweep the entire memory for objects still in use and move these to one "end" of the memory, much as some disk de fragmenters work. This technique can be very efficient, since only the active objects are moved the cost is dependant only on the size of the active memory rather than the total memory (this assumes that no finalisers are being run and that page-level memory operations are avaliable). Unfortunately this copying requires a global read/write lock on memory and the ability to change pointer on the stack to point to the new locations of objects. Because copying algorithms leave all the free memory in a single block they also solve memory fragmentation problems.
- Mark and Sweep algorithms periodically sweep the entire memory for objects no longer in use and deallocate them. With a little inguenity this can be done in a back-ground thread or process. These are less efficient than copying algorithms, because the deallocation cannot normally be achived using the page-based memory operations.
- Train Algorithms are an innovation on copying and mark and sweep algorithms. Based on the observation that most objects are very short lived they divide memory into generations (either physically or virtually) and concentrate their attention on the most recently allocated generations.
StuartYeates did his MSc on GarbageCollection