Differences between version 20 and previous revision of VersionControlSystem.
Newer page: | version 20 | Last edited on Wednesday, June 24, 2009 8:54:13 pm | by AristotlePagaltzis |
Older page: | version 19 | Last edited on Wednesday, June 24, 2009 8:45:11 pm | by AristotlePagaltzis |
@@ -24,19 +24,21 @@
# ! 1+n: One Repository, Many Working Copies
The next evolutionary step was to decouple the repository from the working copy, so that there may then be many working copies. The exemplar in this class of systems, known as centralised [VCS]s, is [CVS]. It lifts the obvious restrictions of earlier systems with a design in which the repository is mediated by a server. Multiple users can collaborate by each checking out a private working copy of the project.
- Note that in [CVS], “checking out” no longer implies locking. (In other centralised [VCS]s, such as VisualSourceSafe, it may. In some, such as PerForce, it is optional.) Checking in changes is simply blocked if someone else has already checked in other changes in the meantime. The latecomer simply has to update their working copy with the upstream changes, resolving any conflicts manually, before they are allowed to check in their own changes.
+ Note that in [CVS], “checking out” no longer implies locking. (In other centralised [VCS]s, such as VisualSourceSafe, it may. In some, such as PerForce, it is optional.) Checking in changes is simply blocked if someone else has already checked in other changes in the meantime. Before the latecomer is allowed to check in their own changes, they have to update their working copy with the upstream changes, resolving any conflicts manually.
- This works reasonably well, so [CVS] was the de facto standard for a decade. However, its single-repository nature, subsequently adopted by most following major systems, perpetuates problems harking back to the earlier model and adds new ones.
+ This works reasonably well. [CVS] ended up as the de facto standard for a decade.
- Firstly, checking in changes under such a system requires a network connection, as do most operations related to the project history. Besides the fact that this makes offline work nearly impossible, it also imposes a major performance penalty, since networked operations are inescapably slow. Some systems, like SubVersion, try to selectively speed up some of these operations by keeping more data in the working copy, but the benefit of this is uneven across operations. Further, high traffic repositories may require rather beefy servers and connections to sustain.
+ However, its single-repository nature, subsequently adopted by most following major systems, perpetuates problems harking back to the earlier model – and adds new ones:''''
- Anything checked in is always public; this means one has to be very careful about the state of commits. It also makes it impossible to touch up history (eg. to fix common mistakes like forgetting to include a new file in a commit). Branches become a big deal: all commits are publicly visible, no matter how experimental. Also, branch names are forced into a global namespace so a lot of thought has to be given to choosing them.
+ * Checking in changes under such a system requires a network connection, as do most operations related to the project history. Besides the fact that this makes offline work nearly impossible, it also imposes a major performance penalty, since networked operations are inescapably slow. Some systems, like SubVersion, try to selectively speed up some of these operations by keeping more data in the working copy, but the benefit of this is uneven across operations. Further, high traffic repositories may require rather beefy servers and connections to sustain.<br><br>
- Branching is problematic for more reasons too. Most of these systems do not support branch merging very well: after you do it once, the changes from the merged-in branch are mixed in without any tracking, so if later attempts to merge the same branch will result in lots of artificial conflicts. This makes it very difficult to keep branches in synch. But the longer branches go without merging, the more effort it takes to merge them. All this adds up to a large barrier, psychological and otherwise, against branching.
+ * Anything checked in is always public; this means one has to be very careful about the state of commits. It also makes it impossible to touch up history (eg. to fix common mistakes like forgetting to include a new file in a commit). Branches become a big deal: all commits are publicly visible, no matter how experimental. Also, branch names are forced into a global namespace so a lot of thought has to be given to choosing them.<br><br>
- Finally, the single-repository nature means that anyone who wants the safety of revision control needs to have write access to the same repository. And since branching is badly supported, everyone with access to the repository is generally going to be working on the same trunk. This means write access has to be given out selectively, to competent people only, resulting in political headaches within projects, while outsiders are forced to create their patches in an unversioned ghetto.
+ * Branching is problematic for more reasons too. Most of these systems do not support branch merging very well: after you do it once, the changes from the merged-in branch are mixed in without any tracking, so later attempts to merge the same branch will result in lots of artificial conflicts. This makes it very difficult to keep branches in synch. But the longer branches go without merging, the more effort it takes to merge them. All this adds up to a large barrier, psychological and otherwise, against branching.<br><br>
+
+ * The single-repository nature means that anyone who wants the safety of revision control needs to have write access to the same repository. And since branching is badly supported, everyone with access to the repository is generally going to be working on the same trunk. This means write access has to be given out selectively, to competent people only, resulting in political headaches within projects, while outsiders are forced to create their patches in an unversioned ghetto.
# ! n+n: Many Working Copies, Paired With Equally Many Repositories
The solution to all this was to not only give each collaborator a separate working copy, but a separate repository also. This class of system, whose pioneering solid implementation was BitKeeper, is known as DistributedVersionControlSystem~s. The technical basis that allows this is algorithmic merging: 3-way merging allows combining non-overlapping changes automatically, and merge point tracking allows repeatedly merging branches without unnecessary conflicts.
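As a rough illustration of the 3-way merging idea mentioned above, here is a minimal Python sketch. It is not how BitKeeper or any particular system implements merging; the function name `merge3` is hypothetical, and the sketch makes the simplifying assumption that all three versions have the same number of lines (that is, only in-place line edits, no insertions or deletions). Real tools first align the versions with a diff algorithm.

```python
def merge3(base, ours, theirs):
    """Line-wise three-way merge (illustrative sketch only).

    base   -- lines of the common ancestor
    ours   -- lines of our version
    theirs -- lines of the other version

    Returns (merged_lines, conflict_indices).
    Assumption: all three lists have equal length.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")

    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:
            # Both sides agree: unchanged, or both made the same change.
            merged.append(o)
        elif o == b:
            # Only the other side changed this line: take their version.
            merged.append(t)
        elif t == b:
            # Only we changed this line: keep our version.
            merged.append(o)
        else:
            # Both sides changed the line differently: a real conflict.
            conflicts.append(i)
            merged.append(o)  # placeholder; needs manual resolution
    return merged, conflicts


base = ["a", "b", "c"]
ours = ["A", "b", "c"]      # we changed line 0
theirs = ["a", "b", "C"]    # they changed line 2
result, conflicts = merge3(base, ours, theirs)
```

The common ancestor is what makes this work: with only the two end versions, there would be no way to tell which side changed a given line. Merge point tracking, in this picture, amounts to choosing the most recent merge as `base`, so that changes already merged are not seen as new differences.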
@@ -46,10 +48,10 @@
Everyone has full offline access to the project history, and all repository operations (except pushing or pulling changes, obviously) take place at full local disk speed.
All this immensely accelerates collaborative development and removes the political headaches surrounding commit access.
-From this point on, the evolution of [VCS]s has basically been about developing increasingly sophisticated architectures for handling merging.
+From this point on, the evolution of [VCS]s has basically been about developing increasingly sophisticated architectures for handling merging. (Classic 3-way merging is only the simplest possibility.)
Each development took a long time, much of which was spent simply coming to recognize that there was a problem that needed to be solved. Many of the new developments remained controversial to adherents of older ways of doing things, even to the present day.
----
CategoryVersionControl