Diffstat (limited to 'lets_not_store_versions_in_versioned_files.html')
-rw-r--r-- | lets_not_store_versions_in_versioned_files.html | 69 |
1 files changed, 69 insertions, 0 deletions
diff --git a/lets_not_store_versions_in_versioned_files.html b/lets_not_store_versions_in_versioned_files.html new file mode 100644 index 0000000..725decd --- /dev/null +++ b/lets_not_store_versions_in_versioned_files.html @@ -0,0 +1,69 @@ +<!doctype html> +<html lang="en"> +<meta charset="utf-8"> +<meta name="viewport" content="width=device-width, initial-scale=1"> +<meta name="author" content="aki"> +<meta name="tags" content="versioning, deployment, software development"> +<link rel="icon" type="image/png" href="cylo.png"> +<link rel="stylesheet" href="style.css"> + +<title>Let's Not Store Versions in Versioned Files</title> + +<nav><p><a href="https://ignore.pl">ignore.pl</a></p></nav> + +<article> +<h1>Let's Not Store Versions in Versioned Files</h1> +<p class="subtitle">Published on 2022-07-02 23:33:33+02:00 +<p>It is a rather common practice to include version numbers directly in the meta files or sources that are stored +inside a code repository. To list a few examples: <a href="https://www.npmjs.com/">npm</a> makes it part of its regular +practices, Python's <a href="https://setuptools.pypa.io/">setuptools</a> usually involves various hacks, be it normal +imports, reading a file, or anything else. Even in its more declarative approach, the support is limited to string +literals and reading a file or a module attribute. +<p>OK, we have a version number in a configuration file, or some other special file, or directly in the code. What's so +wrong about it? The natural enemy of this blog - duplication. In this case, what gets duplicated is responsibility. +<p>There's a rather high chance that you are using Git to track changes. If not Git, then Mercurial, SVN, Fossil, Darcs, +or really any other version control system, distributed or not; it doesn't matter. What matters is that these tools are +designed to help you control different versions of your software. 
When you add your software version into the source +code of said software, you create a new independent layer of versioning. +<p>Now, not everyone is a minimalist, and the mere threat of an additional entity handling the same thing might not scare +you. The same goes for the duplication of the version data. The problems begin when the version data is actually not +duplicated between the VCS and the source code. The native identifier of commits in Git - the SHA hash - isn't really fit for +distribution use, where <a href="https://semver.org/">Semantic Versioning</a> makes much more sense for users. The point +stands for other VCSes as well.</p> +<img src="lets_not_store_versions_in_versioned_files-1.png" alt="say no!"> +<p>We end up with two distinct layers of versioning where one controls the other. This usually leads to very awkward +workflows. In a commercial project I have worked on, we wanted to mark the mainline branch as unstable and make it visible +through a version number. For a released (and validated; the whole workflow was heavily oriented on paperwork) piece of +software we wanted a regular version number. This resulted in an interesting process that was required for back-fixing: +find the merge base between the release branch and mainline, make the fix, merge to mainline, merge the release branch into the fix branch, +increment the version file, merge to the release branch. +<p>Another tendency that I see as a result of the duplication: the versions are out of sync or simply meaningless. Let's +consider two approaches to incrementing the version number in a file: before and after the release. The first one means that +the first commit of the release is the commit that increments the version number. Now, between this commit and the next +release, the file is out of sync, because with each change the state only becomes more and more different from the version +that is described by the file. 
The second approach is: create the release - deploy the application and whatnot - and then +increment the version number to what is expected to be the next release. This requires strict management of what changes +will get merged or good fortune-telling skills; otherwise that predicted number is meaningless, as you won't be able to +ensure that the next release really is the major, minor, or patch release you predicted. +<p>Happily for us, most VCSes have built-in functionality to help us control version numbers that are +meaningful in the context of distribution and deployment. They are called <em>tags</em> or, rarely, <em>baselines</em>. With +them we can mark arbitrary repository states with arbitrary strings that can be referenced later. Usually, they are used +to mark the commits that a certain release originated from. Sometimes it might even be the same commit that incremented +the version number in a file. We're not doing the last part. Instead, we want to tag a commit and read the tag from our build, +distribution, deployment, or packaging system. +<p>Some of them support it better, some worse. Luckily, a good chunk of them allow arbitrary logic to be executed, so +we can implement it ourselves. Additionally, there is a good chance that the VCS provides some kind of helper for +getting a human-readable, tag-driven version description. In the case of Git there is <b>git-describe</b>(1), which fits +this use case directly. We can just call it from CMake or setup.py and read its output. +<p>Of course, full-fledged support would be way nicer and more stable, but surprisingly, we're not there yet. +<p>Well, some of us are. To contrast with the list of bad examples, consider the Go programming language, which recommends this +method of versioning. Interestingly, it also uses code repositories as a form of package distribution, so any arguments +saying that a file with a version is needed because the repository is used as a means of distribution are baseless. +<p>What to take away from this post? 
Next time you start a project, consider keeping the meaningful version only +in the VCS. If your building/distribution/whatever-else system does not fully support it out of the box, try implementing +it. Once you have a working implementation, push it upstream. For instance, there are CMake modules that handle it, +but they are external and not part of CMake itself for whatever reason. It would be much better to have it standardized +and part of the tool. Who knows, maybe in time we will have consistent support all across the +ecosystem. For now, back to experimenting, and until next time! +</article> +<script src="https://stats.ignore.pl/track.js"></script>