It was that time of year again and I was asked to prepare a definition for
clang-format. I had a style guideline document to work with,
so the task was rather straightforward. Much like Google's C++ Style Guide, it asked for headers to be grouped into: C
system and standard headers, C++ standard library headers, other library headers, and project headers. And so, my first
thought was to use the default Regroup behaviour.
That was until I noticed that they use angle brackets together with an .h extension for other libraries. Of course,
those includes matched the default regex for the C group. The strictest and the second easiest solution here is to make
the regex contain a list of alternatives with all of the headers. This requires some maintenance, so I decided to have a
bit of fun with it.
There is a somewhat common practice for serializing data for later use in Lua scripts that looks similar to this:
return {
    name = "Henry",
    position = {x=0, y=0},
}
This makes use of how modules are imported. In short, the module script is interpreted and the value of its final
return is used as the value of the "module". In this case the script is a lone return statement with no logic involved.
This somewhat declarative style is purely conventional. More commonly the returned value is a table of functions,
exactly what we would consider a "normal module", or a class.
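To make the mechanism concrete, here is a minimal sketch, assuming Lua 5.2 or newer. It uses load on a string instead of require on a file so that it stays self-contained; the chunk name "henry" is only a label I made up.

```lua
-- Stand-in for the contents of a data file; normally this would live on disk.
local source = [[
return {
    name = "Henry",
    position = {x=0, y=0},
}
]]

-- load compiles the text into a function (the chunk). Calling the chunk runs
-- it and hands back whatever its final return statement produced.
local chunk = assert(load(source, "henry"))
local henry = chunk()

print(henry.name, henry.position.x, henry.position.y)
```

With a real file, require or dofile would play the role of load plus the call.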
In the example above we are not really sure what kind of thing we are dealing with. To stay consistent we could add a
type = "slime", to the table. Now the reader would know what they are dealing with. Of course, the scheme would
remain an assumption on the user's side. Building an object from there could be a straightforward table lookup.
Alternatively, we could rely on the function call syntactic sugar and prefix the definitions with types:
return Slime{
    name = "Henry",
    position = Vec2{x=0, y=0},
}
This too is a table lookup, but the responsibility shifts a bit from the user implementation to the execution
environment. Loading becomes trickier this way, but the increased preparation complexity can be desirable, e.g., to
cause errors in an unknown or otherwise unintended environment. If we simply want to make it run, setting the global
callables Slime and Vec2 is enough here.
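A rough sketch of that, assuming Lua 5.2 or newer. Instead of touching real globals it passes a table of constructors as the chunk's environment, which has the same effect for the loaded chunk. The constructor bodies and the kind field are made up for illustration.

```lua
-- Hypothetical constructors; the names mirror the example, the bodies are invented.
local constructors = {
    Vec2 = function(t)
        return {kind = "Vec2", x = t.x, y = t.y}
    end,
    Slime = function(t)
        return {kind = "Slime", name = t.name, position = t.position}
    end,
}

local source = [[
return Slime{
    name = "Henry",
    position = Vec2{x=0, y=0},
}
]]

-- The fourth argument of load is the environment of the chunk, so the free
-- names Slime and Vec2 are looked up in the constructors table.
local chunk = assert(load(source, "henry", "t", constructors))
local henry = chunk()

print(henry.kind, henry.name, henry.position.kind)
```

Any type name missing from the environment resolves to nil, so calling it fails, which is the kind of error in an unintended environment mentioned above.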
It is somewhat similar to some use cases from the history of Lua. The main
common part is the syntactic sugar, but this feels like a stretch as it can be observed quite often in "regular" Lua
code. Let's go one step further and remove the return:
Slime {
    name = "Henry",
    position = Vec2{x=0, y=0},
}
It now looks like a generic markup language, but the module loading mechanism will no longer work for us, since the
chunk no longer returns anything. Instead, the reader needs to load it and execute it. The functions handling data
types are now expected to cause side effects in order to record the entries. This may be coupled with a way of detecting
top-level statements. Load conveniently has an option to specify the execution environment.
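Here is a minimal sketch of such a reader, assuming Lua 5.2 or newer. It sidesteps the detection of top-level statements by hard-coding which function records entries (Slime) and which only builds a value (Vec2). The second slime is made-up sample data.

```lua
local source = [[
Slime {
    name = "Henry",
    position = Vec2{x=0, y=0},
}
Slime {
    name = "Greta",
    position = Vec2{x=3, y=4},
}
]]

local entries = {}  -- reader state, filled in as a side effect of running the chunk

local env = {
    -- Vec2 only appears nested, so it just builds and returns a value.
    Vec2 = function(t)
        return {x = t.x, y = t.y}
    end,
    -- Slime is used as a statement, so it records its entry instead of returning it.
    Slime = function(t)
        entries[#entries + 1] = {type = "Slime", name = t.name, position = t.position}
    end,
}

-- Compile with env as the execution environment, then run for the side effects.
local chunk = assert(load(source, "slimes", "t", env))
chunk()

for i, entry in ipairs(entries) do
    print(i, entry.type, entry.name, entry.position.x, entry.position.y)
end
```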
This approach is somewhat similar to what Premake does. And, surprisingly,
it is also pretty close to the regular register-event-callback approach that some plugin systems use (e.g.,
vis). How so? Here, the point is to modify the loader's state as a side effect. In
the callback approach, a plugin that is being loaded can register itself for certain events, modifying the state of the
plugin or event system. Additionally, the API is usually exposed through the environment (and not, e.g., passed as an
argument to a user function, or with the plugin returned as a module that is never able to interact with the API directly).
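The parallel can be sketched as follows, again assuming Lua 5.2 or newer. The on function, the "open" event, and the emit helper are hypothetical names of mine and do not reflect the API of vis or any other real plugin system.

```lua
local handlers = {}  -- event system state, modified while plugins are loading

local api = {
    -- Hypothetical registration function exposed to plugins via the environment.
    on = function(event, callback)
        handlers[event] = handlers[event] or {}
        table.insert(handlers[event], callback)
    end,
}

local plugin_source = [[
on("open", function(name) return "opened " .. name end)
]]

-- Loading the plugin runs it; its only lasting effect is the registration.
assert(load(plugin_source, "plugin", "t", api))()

-- Calls every callback registered for the event and collects the results.
local function emit(event, ...)
    local results = {}
    for _, callback in ipairs(handlers[event] or {}) do
        results[#results + 1] = callback(...)
    end
    return results
end

print(emit("open", "notes.txt")[1])
```

Just like Slime in the data files, on returns nothing useful; what matters is the state it leaves behind.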
For my standards definitions I settled on the last style. It allows for multiple items without indentation and I liked
the idea at the time. It allowed me to play around and neatly layer the parser, environment, and model. The final
definitions looked like this:
scheme "headers/1"
aliases "ANSI C" {"ANSI X3.159-1989", "C89", "C90", "ISO/IEC 9899:1990"}
headers "ANSI C" {
    "assert.h",
    "ctype.h",
    ...
}
It allowed for more complex structures with include and remove, for example:
headers "C++20" {
    include "C++17",
    remove "ciso646",
    "concepts",
    ...
}
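To give an idea of how an environment could support this syntax, here is a minimal sketch, assuming Lua 5.2 or newer. It is not the actual implementation: the sets table, the marker tables returned by include and remove, and the tiny header lists are all made up for illustration.

```lua
local sets = {}  -- model: set name -> {header name -> true}

-- include and remove return marker tables so that headers can tell them
-- apart from plain header names.
local function include(name) return {op = "include", name = name} end
local function remove(name) return {op = "remove", name = name} end

-- headers "NAME" {...} is two calls in a row: headers("NAME") returns a
-- function, which then receives the table and records the resulting set.
local function headers(name)
    return function(items)
        local set = {}
        for _, item in ipairs(items) do  -- items are applied in order
            if type(item) == "string" then
                set[item] = true
            elseif item.op == "include" then
                local other = assert(sets[item.name], "unknown set: " .. item.name)
                for header in pairs(other) do set[header] = true end
            elseif item.op == "remove" then
                set[item.name] = nil
            end
        end
        sets[name] = set
    end
end

-- Heavily abbreviated definitions, only for this sketch.
local source = [[
headers "C++17" {
    "ciso646",
    "cstddef",
}
headers "C++20" {
    include "C++17",
    remove "ciso646",
    "concepts",
}
]]

local env = {headers = headers, include = include, remove = remove}
assert(load(source, "definitions", "t", env))()

local result = {}
for header in pairs(sets["C++20"]) do result[#result + 1] = header end
table.sort(result)
print(table.concat(result, " "))
```

Because the items are applied in order, a remove only affects what was added before it within the same definition.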
See headers for the full source code. The command allowed me to get the list of
headers, ready to be joined into a regex:
$ headers C11 POSIX
aio.h
arpa/inet.h
assert.h
...
Except that I never joined them into a regex. After weighing everything up and some discussion, we decided to use
Preserve instead of Regroup, so that we wouldn't need to bother with any of the costs of grouping includes
automatically.
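For the record, the join itself would have been the easy part. Below is a rough sketch of what it might have looked like; the escape helper is a hypothetical name, the input is just the first three names from the output above, and the surrounding angle brackets assume the regex is matched against the include name together with its brackets.

```lua
-- Abbreviated input; the real list would come from the command output.
local names = {"aio.h", "arpa/inet.h", "assert.h"}

-- Prefix every regex metacharacter with a backslash so that, for example,
-- the dot in "aio.h" only matches a literal dot.
local function escape(name)
    return (name:gsub("[%.%+%*%?%(%)%[%]%{%}%^%$%|\\]", "\\%0"))
end

local escaped = {}
for i, name in ipairs(names) do
    escaped[i] = escape(name)
end

-- One anchored alternative per header.
local regex = "^<(" .. table.concat(escaped, "|") .. ")>$"
print(regex)
```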
I feel like re-implementing this in SQL.