Differences between version 16 and revision by previous author of VersionControlSystem.
Newer page: | version 16 | Last edited on Saturday, June 20, 2009 6:22:55 am | by AristotlePagaltzis |
Older page: | version 10 | Last edited on Monday, June 15, 2009 7:55:18 pm | by LawrenceDoliveiro |
@@ -8,13 +8,42 @@
* GnuArch
* [RCS]
* SubVersion
-See EricRaymond's essay ''[Understanding Version Control|http://www.catb.org/~esr/writings/version-control/version-control.html]'' for an introduction to the main concepts and a comparison of the major systems. The main stages of development can be summarized as follows:
-* The very earliest systems were centralized. Not only that, but they required users to check out files specifically for writing before they were allowed to be changed. This was to prevent multiple users from modifying the same file at the same time, which was thought to lead to chaos.
-* This one-writer-at-a-time restriction turned out to be a lot more trouble than it was worth. Later systems (CVS, Subversion) dispensed with locking. Instead, when one user tried to check in a set of changes, and someone else had modified files (whether the same or different ones) since the first user did their checkout, the first user would be blocked from doing the checkin until they had manually incorporated those changes that had been checked in in the meantime.
-* Even later, clever techniques were invented to handle ''merging'' of overlapped changes. This was immensely liberating, by allowing parallel development, in a more collaborative way than before. The natural step from this was the evolution of [DistributedVersionControlSystem]s. From this point on, the evolution of VCSes has basically been about developing increasingly sophisticated architectures for handling merging.
+See EricRaymond's essay ''[Understanding Version Control|http://www.catb.org/~esr/writings/version-control/version-control.html]'' for an introduction to the main concepts and a comparison of the major systems.
+
+!!! Evolutionary stages of Version Control
+
+The main stages of development can be summarized as follows:
+
+# The design of the earliest systems revolved around versioning a single working copy, directly edited by all users. To prevent attempts at simultaneous modification of a single file, editing was not allowed without checking files out, which only one user at a time could do for any given file.
+
+ Having to give each user access to the same machine and FileSystem in order to work on code was natural at the time these systems were designed, in the mainframe era, but today would obviously be a problem. Also, the requirement to check files out was a cause of friction even at the time, since everyone had to wait on one another – not to mention that someone might forget to check a file back in before leaving on vacation.
+
+# The next evolutionary step was to decouple the repository from the working copy, so that there can be many working copies. The exemplar in this class of systems, known as centralised [VCS]s, is [CVS]. It lifts the obvious restrictions of earlier systems with a design in which the repository is mediated by a server. Multiple users can collaborate by each checking out a private working copy of the project.
+
+ Note that “checking out” no longer implies locking. Checking in changes is blocked if someone else has already checked in other changes in the meantime. The latecomer simply has to update their working copy with the upstream changes, resolving any conflicts manually, before they are allowed to check in their own changes.
+
+ This works reasonably well, so [CVS] was the de facto standard for a decade. However, its single-repository nature, subsequently adopted by most following major systems, perpetuates problems harking back to the earlier model and adds new ones.
+
+ Firstly, checking in changes under such a system requires a network connection, as do most operations related to the project history. Besides the fact that this makes offline work nearly impossible, it also imposes a major performance penalty, since networked operations are inescapably slow. Some systems, like SubVersion, try to selectively speed up some of these operations by keeping more data in the working copy, but the benefit of this is uneven across operations. Further, high-traffic repositories may require rather beefy servers and connections to sustain them.
+
+ Anything checked in is always public; this means one has to be very careful about the state of commits. It also makes it impossible to touch up history (eg. to fix common mistakes like forgetting to include a new file in a commit). Branches become a big deal: all commits are publicly visible, no matter how experimental. Also, branch names are forced into a global namespace so a lot of thought has to be given to choosing them.
+
+ Branching is problematic for more reasons too. Most of these systems do not support branch merging very well: after you do it once, the changes from the merged-in branch are mixed in without any tracking, so later attempts to merge the same branch will result in lots of artificial conflicts. This makes it very difficult to keep branches in synch. But the longer branches go without merging, the more effort it takes to merge them. All this adds up to a large barrier, psychological and otherwise, against branching.
+
+ Finally, the single-repository nature means that anyone who wants the safety of revision control needs to have write access to the same repository. And since branching is badly supported, everyone with access to the repository is generally going to be working on the same trunk. This means write access has to be given out selectively, to competent people only, resulting in political headaches within projects, while outsiders are forced to create their patches in an unversioned ghetto.
+
+# The solution to all this was to give each collaborator not only a separate working copy but also a separate repository. This class of system is known as DistributedVersionControlSystem~s. The technical basis that allows this is algorithmic merging: 3-way merging allows combining non-overlapping changes automatically, and merge point tracking allows repeatedly merging branches without unnecessary conflicts.
+
+ Since each collaborator has their own repository and can make commits, the effect is that everyone has their own private branch, with full versioning for local changes, and these branches can be published at the discretion of their author and can be merged by others easily. Actually, each collaborator often has several local branches – since merging is easy and branches never ''need'' be published, it is painless to create short-lived branches for experiments or tests, to use them as a general workflow aspect (eg. start a new branch for every separate bug fix), or for any other purpose, whether intended for public consumption or not.
+
+ Everyone has full offline access to the project history, and all repository operations (except pushing or pulling changes, obviously) take place at full local disk speed.
+
+ All this immensely accelerates collaborative development and removes the political headaches surrounding commit access.
+
+From this point on, the evolution of [VCS]s has basically been about developing increasingly sophisticated architectures for handling merging.
Each development took a long time, much of which was spent simply coming to recognize that there was a problem that needed to be solved. Many of the new developments remained controversial to adherents of older ways of doing things, even to the present day.
----
CategoryVersionControl
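
The 3-way merging and merge point tracking that the newer revision describes can be illustrated with a small sketch. This is not taken from any particular system: the `Snapshot` type, the `three_way_merge` function and the file names are hypothetical, and the merge works on whole files, whereas real systems typically merge at the level of lines or hunks. It only aims to show why knowing the common ancestor lets non-overlapping changes combine automatically, and why remembering the last merge point avoids artificial conflicts on a repeated merge.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical model for illustration: a snapshot maps file path -> content.
Snapshot = Dict[str, str]


def three_way_merge(
    base: Snapshot, ours: Snapshot, theirs: Snapshot
) -> Tuple[Snapshot, Set[str]]:
    """Merge two snapshots that both descend from `base`.

    Returns the merged snapshot and the set of conflicting paths.
    Granularity is the whole file, which is a simplification.
    """
    merged: Snapshot = {}
    conflicts: Set[str] = set()
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b: Optional[str] = base.get(path)    # None means "file absent"
        o: Optional[str] = ours.get(path)
        t: Optional[str] = theirs.get(path)
        if o == t:        # both sides agree, or neither touched the file
            result = o
        elif o == b:      # only their side changed the file
            result = t
        elif t == b:      # only our side changed the file
            result = o
        else:             # both changed it differently: needs a human
            conflicts.add(path)
            result = o    # keep our version as a placeholder
        if result is not None:
            merged[path] = result
    return merged, conflicts


# First merge: the two sides changed different files, so no conflicts.
base = {"a.txt": "1", "b.txt": "1"}
ours = {"a.txt": "2", "b.txt": "1"}
theirs = {"a.txt": "1", "b.txt": "2", "c.txt": "new"}
merged, conflicts = three_way_merge(base, ours, theirs)

# Their branch moves on; we merge it a second time.
theirs_later = {"a.txt": "1", "b.txt": "3", "c.txt": "new"}

# With merge point tracking, the base is the tip we merged last time.
tracked, tracked_conflicts = three_way_merge(theirs, merged, theirs_later)

# Without tracking, the original base is reused and b.txt looks as if
# both sides had edited it independently.
untracked, untracked_conflicts = three_way_merge(base, merged, theirs_later)
```

In this sketch the first merge and the tracked second merge both complete without conflicts, while the untracked second merge reports `b.txt` as a conflict even though only one side really changed it since the last merge. That is the kind of artificial conflict the page attributes to systems that mix merged changes in without any tracking.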