How to Generate Files From Templates in Shell
Published on 2023-03-26 18:05:00+02:00
This is a total rewrite of an old article. I no longer liked it and my methods have changed.
Generating or "configuring" files from a template is a common task. A prime example among the tools I usually use is
configure_file in CMake. Another example would be service configuration files sitting in a staging location before
getting deployed (e.g., version-controlled configs for an e-mail or web server). Today let's focus on these kinds of
uses and not on verbose template engines for, e.g., HTML.
Common notations to mark replacement spots in templates are: $VARIABLE, ${VARIABLE}, and @VARIABLE@. The first two
obviously come directly from the shell notation for variable substitution. The latter exists precisely to be different
from them, for when we want to create a template whose output may contain $ as part of its natural syntax. In other
words: when the syntax of the generated file uses $ itself.
Using shell itself
POSIX-compliant shells support a mechanism called heredoc. We can use it in combination with cat(1):
#!/bin/sh
cat <<CONTENT
server {
listen 80;
server_name $USER.$DOMAIN;
root /srv/http/$USER/public;
}
CONTENT
This case has an obvious problem: it isn't really the template I promised. Instead, it is a script that
generates the intended output. Someone could also argue that this is a useless use of cat.
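One more thing to keep in mind with this approach: an unquoted heredoc expands every $ it sees, so a literal dollar in the output has to be escaped as \$. Here is a minimal sketch, assuming a hypothetical render_vhost function and an extra try_files line (using nginx's own $uri) that is not part of the example above:

```shell
#!/bin/sh
# render_vhost USER DOMAIN: hypothetical helper that wraps the heredoc
# in a function. $1 and $2 are expanded by the shell, while \$uri is
# escaped so that it reaches the output as a literal $uri.
render_vhost() {
cat <<CONTENT
server {
    listen 80;
    server_name $1.$2;
    root /srv/http/$1/public;
    try_files \$uri \$uri/ =404;
}
CONTENT
}
```

Calling render_vhost aki example.tld prints the filled-in block. Quoting the delimiter (<<'CONTENT') would go the other way and disable expansion entirely.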
Using cat and heredocs gives us a lot of flexibility. We can wrap some content with a common header and footer
if we want to:
#!/bin/sh
cat /dev/fd/3 "$@" /dev/fd/4 3<<HEAD 4<<FOOT
<!doctype html>
<html lang="en">
HEAD
<script src="script.js"></script>
FOOT
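The same wrapping can be done without relying on /dev/fd, which not every system provides. This is only a sketch of one possible alternative, with a hypothetical wrap function that prints the header, then the files given as arguments, then the footer:

```shell
#!/bin/sh
# wrap FILE...: hypothetical helper; prints a fixed header, the content
# of every file passed as an argument, and a fixed footer.
# The delimiters are quoted, so nothing inside the heredocs is expanded.
wrap() {
	cat <<'HEAD'
<!doctype html>
<html lang="en">
HEAD
	cat -- "$@"
	cat <<'FOOT'
<script src="script.js"></script>
FOOT
}
```

Used as wrap body.html >index.html, where both file names are just examples. With no arguments the middle cat reads standard input instead.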
Using envsubst
If we want a real template instead of an executable script, we can use envsubst(1). This tool is extremely
straightforward to use: feed the template to standard input and get the substituted text on standard output.
Given a template saved as nginx.conf.in:
server {
listen 80;
server_name $USER.$DOMAIN;
root /srv/http/$USER/public;
}
Then:
$ export DOMAIN=example.tld
$ envsubst <nginx.conf.in
server {
listen 80;
server_name aki.example.tld;
root /srv/http/aki/public;
}
envsubst supports ${VARIABLE}, too.
A major potential problem with envsubst is that it substitutes everything as it goes. It doesn't matter whether
the variable exists in the environment or not: an unset variable is simply replaced with an empty string. This is
the usual expected behaviour from a shell, but it is not well suited for output that uses $ in a meaningful way.
We can partially work around it with the SHELL-FORMAT argument:
$ envsubst '$USER, $DOMAIN' <nginx.conf.in >nginx.conf
This limits the substitutions to the selected variables. The exact format of this argument is not important. Any
conformant variable reference in it will work: '$USER$DOMAIN', '$USER $DOMAIN', '$USER,$DOMAIN', and the first
example are all equivalent. Just remember to pass it as a single argument and to keep the shell from expanding the
variables before envsubst sees them (hence the single quotes).
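To see the effect of SHELL-FORMAT, here is a small sketch. The template content, the try_files line and the render_vhost name are assumptions made up for this illustration. Only $USER and $DOMAIN are listed, so nginx's own $uri should pass through untouched:

```shell
#!/bin/sh
# Hypothetical template that mixes our placeholders with nginx's own $uri.
template=$(mktemp)
trap 'rm -f "$template"' EXIT
cat >"$template" <<'EOF'
server_name $USER.$DOMAIN;
try_files $uri $uri/ =404;
EOF

# render_vhost USER DOMAIN: substitute only $USER and $DOMAIN.
# The assignments in front of the command apply to envsubst alone.
render_vhost() {
	USER=$1 DOMAIN=$2 envsubst '$USER $DOMAIN' <"$template"
}
```

Without the SHELL-FORMAT argument, $uri would be treated like any other variable and most likely replaced with nothing.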
Using sed
Finally, we can use sed(1) to gain even more control over what happens to our templates. This comes at a
cost: sed does not have access to environment variables on its own. Usually, we find it used like this:
$ sed "s/@VERSION@/$VERSION/g" <version.h.in >version.h
The shell will substitute $VERSION there with the variable's value, and every occurrence of @VERSION@ in the
template file will be replaced. Note that sed can replace anything it wants; the @ characters are used here to make
the match stricter and the template more readable.
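One caveat with this one-liner: the value lands in the middle of a sed expression, so a / or & inside it will break or change the substitution. A possible workaround, sketched here with two hypothetical helpers (sed_escape and render_version), is to escape the value first:

```shell
#!/bin/sh
# sed_escape VALUE: hypothetical helper; prefixes /, & and \ with a
# backslash so VALUE is taken literally on the replacement side of s///.
sed_escape() {
	printf '%s\n' "$1" | sed 's|[/&\\]|\\&|g'
}

# render_version VALUE: replaces @VERSION@ on standard input with VALUE.
render_version() {
	sed "s/@VERSION@/$(sed_escape "$1")/g"
}
```

For example, render_version 'release/1.0' <version.h.in >version.h keeps working even though the value contains a slash. Values with newlines are not handled by this sketch.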
If we feel like over-engineering, we can generate a script for sed and use that instead. The variable
values may come from anywhere we want at that point. Let's use shell:
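#!/bin/sh
# Prints a sed script. Values containing "/" (e.g., some branch names)
# would break the expressions below.
if tag=$(git describe --tags --exact-match); then
	echo "s/@VERSION@/$tag/g"
else
	# No tag on this exact commit: fall back to branch and hash
	echo "s/@VERSION@/@BRANCH@-@HASH@/g"
fi
echo "s/@HASH@/$(git rev-parse --short HEAD)/g"
echo "s/@BRANCH@/$(git symbolic-ref --short HEAD || echo detached)/g"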
Here, rather than using cat, I used echo. Depending on the state of the repository it is run in, it
may output something similar to:
s/@VERSION@/@BRANCH@-@HASH@/g
s/@HASH@/4242424/g
s/@BRANCH@/nightly/g
We can then feed it into sed:
$ sed -f subst.sed <version.h.in >version.h
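Since the values may come from anywhere, the same kind of sed script could also be built from a plain list of NAME=value lines. This is only a sketch under that assumption; the vars_to_sed name and the input format are made up for the example:

```shell
#!/bin/sh
# vars_to_sed: hypothetical helper; reads NAME=value lines from standard
# input and prints one s/@NAME@/value/g expression per line.
vars_to_sed() {
	while IFS='=' read -r name value; do
		# Skip blank lines and comments
		case $name in ''|'#'*) continue ;; esac
		# Escape /, & and \ so the value is safe as a sed replacement
		escaped=$(printf '%s\n' "$value" | sed 's|[/&\\]|\\&|g')
		printf 's/@%s@/%s/g\n' "$name" "$escaped"
	done
}
```

Something like vars_to_sed <version.env >subst.sed would then produce a script that sed -f can consume, with version.env being an example name only.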
Now, if we want to over-engineer it for real, let's put it into a Makefile:
subst.sed: subst.sed.sh
	./$< >$@
%.h: %.h.in subst.sed
	sed -f subst.sed <$< >$@
Other Alternatives
Otherwise one could potentially use: perl(1), python(1), awk(1), or maybe the shell's eval if feeling adventurous
(and malicious, I guess). CMake's configure_file is very nice but is limited to CMake. I'm starting to feel like it
could be a nice weekend project to make a utility after a beer or two.