Version control is an essential tool for software developers, but with so many concepts it can be difficult for beginners to understand. This high-level introduction describes its key concepts and the reasons you should be using it on all your projects.
What is version control?
Developers use version control, also known as revision control or source control, to keep track of changes made to source code over time. This allows developers to go back in time and see the code at any point in its history.
You might wonder why this is useful. If it's old code, surely you don't need it anymore, right? It turns out that recording the changes to code has some really beneficial effects.
How is version control used?
Source code is stored in a version control repository. The repository contains all the changes that have been made to the source code.
There are two main types of version control. In the distributed model, a copy of the entire repository is stored on the developer's machine. This is in contrast to the client/server model where the repository is stored on a central server and developers only have a copy of a single commit, usually the most recent.
The client/server model of version control is outdated, but you might still see it in use. Its main disadvantage is that most operations on the repository need a connection to the server, which makes them slower and dependent on the network. Distributed version control software stores the entire repository locally, so most commands are fast and developers can keep working without a network connection.
When using distributed version control it's still common to have a repository act as a central repository so developers can synchronize their changes. After committing to their local copy of the repository, developers push their changes to the central repository. Other developers can pull from that repository to get those changes.
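The push and pull cycle described above can be modeled with a toy example. This is only a sketch: the `Repository` class, its methods, and the commit names are made up for illustration, and it ignores branches and merging entirely.

```python
class Repository:
    """Toy repository that stores commit ids in the order they were made."""

    def __init__(self, name):
        self.name = name
        self.commits = []

    def clone(self, name):
        """Create a full local copy, as in the distributed model."""
        copy = Repository(name)
        copy.commits = list(self.commits)
        return copy

    def commit(self, commit_id):
        """Record a change locally; no network access is needed."""
        self.commits.append(commit_id)

    def push(self, remote):
        """Send commits that the remote repository does not have yet."""
        remote._receive(self.commits)

    def pull(self, remote):
        """Fetch commits that this repository does not have yet."""
        self._receive(remote.commits)

    def _receive(self, incoming):
        for commit_id in incoming:
            if commit_id not in self.commits:
                self.commits.append(commit_id)


central = Repository("central")
alice = central.clone("alice")
bob = central.clone("bob")

alice.commit("a1: add login form")  # hypothetical commit
alice.push(central)                 # share Alice's work
bob.pull(central)                   # Bob now has Alice's commit too

print(bob.commits)
```

Note that committing only touches the local copy. The central repository is involved only when a developer pushes or pulls.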
The distributed model has led to the creation of online services centered around social coding, which I'll talk about a bit later on.
When a developer has finished making changes to the code, those changes can be saved in version control. This is known as a commit or revision.
The following image shows four commits. If you're wondering why the arrows point backwards, it's because each commit points to its parent.
Each commit has metadata associated with it, including:
- A unique identifier known as a commit or revision id;
- The date and time the commit was made;
- The name and email address of the person who made the commit;
- A message about what the commit contains.
Saving this information alongside the code changes has some surprisingly useful side-effects, which you'll see in the section on the benefits of version control.
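To make the metadata and the parent pointers concrete, here is a minimal sketch of a commit as a data structure. The `Commit` class, the way the id is derived from a SHA-1 hash, and the author and messages are all assumptions for illustration; real version control software stores more than this and computes ids differently.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Commit:
    """Toy commit: metadata plus a pointer to the parent commit."""
    message: str
    author: str          # name and email address of the committer
    timestamp: datetime  # date and time the commit was made
    parent: Optional["Commit"] = None

    @property
    def commit_id(self) -> str:
        # Derive the id from the metadata and the parent's id, so the
        # same history always produces the same identifiers.
        parent_id = self.parent.commit_id if self.parent else ""
        payload = "\n".join(
            [parent_id, self.author, self.timestamp.isoformat(), self.message]
        )
        return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def history(commit: Optional[Commit]) -> List[str]:
    """Follow parent pointers from a commit back to the first commit."""
    messages = []
    while commit is not None:
        messages.append(commit.message)
        commit = commit.parent
    return messages


# Hypothetical author and commits, used only for this example.
when = datetime(2020, 1, 1, tzinfo=timezone.utc)
author = "Ada Example <ada@example.com>"
first = Commit("Add README", author, when)
second = Commit("Add build script", author, when, parent=first)
third = Commit("Fix typo in README", author, when, parent=second)

print(history(third))
```

Because each commit only knows its parent, walking the history always starts at the newest commit and moves backwards in time.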
Each commit is made in a branch. When a repository is first created it's initialized with a default branch, known as the trunk or master. Depending on the workflow being used, additional branches are created to allow development tasks to continue in parallel.
In one such workflow a branch is created for each release. This allows developers to support a released version in one branch and work on the next release in another.
When a change is made in one branch it's often needed in another branch. In fact, some workflows, such as feature branching, require this.
Changes in one branch are merged into another branch. It's often good practice to first merge other people's changes into your branch to test that your changes work with the changes other developers have made. This helps ensure bugs are caught as early as possible.
In some cases a merge will result in a merge conflict. This happens when the version control software can't automatically merge because the commits contain different changes to the same lines and it doesn't know which change is the correct one.
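The rule behind a merge conflict can be sketched with a simplified three-way merge. This is a toy under strong assumptions: the `merge_lines` function is hypothetical, it compares files line by line, and it only handles files with the same number of lines, whereas real tools also cope with inserted and deleted lines.

```python
from typing import List, Tuple


def merge_lines(
    base: List[str], ours: List[str], theirs: List[str]
) -> Tuple[List[str], List[int]]:
    """Merge two edited copies of a file against their common ancestor.

    Returns the merged lines and the indexes of lines that conflict.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles files of equal length")

    merged, conflicts = [], []
    for index, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:      # both sides agree, or neither changed the line
            merged.append(o)
        elif o == b:    # only their side changed the line
            merged.append(t)
        elif t == b:    # only our side changed the line
            merged.append(o)
        else:           # both sides changed the same line differently
            merged.append(f"<<<<<<< {o} ======= {t} >>>>>>>")
            conflicts.append(index)
    return merged, conflicts


base = ["title = 'Demo'", "port = 8000", "debug = False"]
ours = ["title = 'Demo'", "port = 8080", "debug = False"]
theirs = ["title = 'Demo App'", "port = 9000", "debug = False"]

merged, conflicts = merge_lines(base, ours, theirs)
print(merged[0])   # the title change merges cleanly
print(conflicts)   # the port line needs a human decision
```

The title line was changed on one side only, so it merges automatically. The port line was changed on both sides, so the software cannot tell which value is correct.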
Although modern version control software is good at automatically merging, it's still good practice to commit and merge frequently to reduce the chance of a conflict occurring.
Some commits represent milestones in the project, such as a release. Developers could refer to this commit by its unique id, but that isn't fun. Instead these commits are given a meaningful tag, such as the release version.
Think of a tag as an alias for the commit id. Instead of talking about commit ec3ff781da9b7b, you talk about version
Which version control software should I use?
There is a lot of version control software available. The most common are shown in the table below, but you can see a much larger list on Wikipedia.
I strongly recommend you use distributed version control software. The client/server model is outdated and lacks many of the benefits of distributed version control.
What is social coding?
Social coding is the combination of a social network, like Facebook or Twitter, with software development. Communities of developers form out of a shared interest in solving a problem. Social coding platforms are also great places for discovering new and interesting software.
The social coding platform acts as the central repository for the project. It also provides additional features such as issue tracking, continuous integration, and a wiki.
There are a few big players in the social coding space, summarized in the table below, with GitHub being by far the most popular.
|Platform||Version Control Software|
Forks and pull requests
Users can fork a project on social coding platforms. This creates an identical copy of the repository under their own account.
The most common reason for forking a project is to implement a feature. Because users don't have write permissions to projects they don't own, they first need their own copy. The changes made in the fork can be sent upstream by submitting a pull request, also known as a merge request, to the originating project.
There is another reason for forking a project. When an open source project stagnates or goes in a direction that users aren't happy with, some fork the project to take it in a new direction. Both the original and the fork live on as separate projects.
What are the benefits of version control?
Now that you know what version control is and how it relates to social coding, you might be wondering what the benefits are.
1. Work on the same code as another developer
When multiple developers are working on the same code it's important they can share their changes. Doing this by emailing files or preventing some developers from accessing certain files is a nightmare to coordinate even for two developers, and on much larger teams it borders on the impossible.
Version control makes it easy for developers to share the code changes they've made, and in most cases even takes care of merging them together.
2. Undo mistakes
Imagine you are working on an issue and you've made some changes to several files when you realize your approach is wrong and you need to try something else. It would be a pain to have to manually undo those changes. Even worse, imagine you had already committed those changes and shared them with other developers.
With version control you can undo most changes in a single command. For changes that have been committed or pushed, additional commands may be required. In all cases it's much faster and more manageable than manually backing up files before making changes.
3. Find code that introduced a bug
Not all bugs that are introduced are discovered straight away. They can sit in the repository for a while before someone notices them. It's also possible that someone notices the incorrect behavior of the software but doesn't know which code is causing the problem.
Modern version control software contains a bisect command. Given a commit that's known to work and the current failing commit, it runs a user-defined test against commits in between, narrowing the range with a binary search until it finds the first commit where that test fails.
Looking at the changes made in that commit reveals the bug that was introduced. It also has an interesting side-effect: you can see who made that change.
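The search a bisect command performs can be sketched as a binary search over the commit history. This is an illustration rather than how any particular tool is implemented: the `first_bad_commit` function and the commit names are hypothetical, and it assumes that every commit after the one that introduced the bug also fails the test.

```python
from typing import Callable, Sequence


def first_bad_commit(
    commits: Sequence[str], is_good: Callable[[str], bool]
) -> str:
    """Find the first failing commit.

    `commits` is ordered oldest to newest. The first entry is known to
    pass the test and the last entry is known to fail it.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")

    low, high = 0, len(commits) - 1  # low is good, high is bad
    while high - low > 1:
        mid = (low + high) // 2
        if is_good(commits[mid]):
            low = mid    # the bug was introduced after mid
        else:
            high = mid   # the bug was introduced at or before mid
    return commits[high]


# Hypothetical history of 16 commits where c11 introduced the bug.
commits = [f"c{number}" for number in range(16)]
tested = []


def is_good(commit: str) -> bool:
    tested.append(commit)            # record which commits get tested
    return int(commit[1:]) < 11


culprit = first_bad_commit(commits, is_good)
print(culprit)      # c11
print(len(tested))  # 4
```

Only four of the sixteen commits had to be tested. Halving the range each time is what makes this approach practical on long histories.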
4. Maintain multiple versions
When software is released, developers need to support both the released version and work on the next release in parallel. When a bug is discovered in the release version it needs to be fixed and released. At the same time, the bug probably also exists in the code for the next release, so it needs to be fixed there as well.
Version control allows you to branch your code. Developers can easily switch between branches, and even share changes with multiple branches.
5. Distribute software
In the context of social coding, version control makes it really easy to distribute your software. Not only do social coding platforms turn tags into archived releases, but software development platforms such as Node.js can use social coding platforms as a backend for package management.
6. Contribute to projects
Using version control, especially in the context of social coding, makes it much easier to contribute to projects. The more contributors, the faster the software grows and evolves, and the more your own skills as a developer improve.
I hope this gives you a better understanding of the core concepts of version control. If you're not using version control now on your projects I hope it comes as no surprise that you should. Even if you are working alone, version control offers benefits, including helping you learn a tool that many employers require.
Did you find this post useful? If so, please share it with others or follow me on Twitter.