author Aki <please@ignore.pl> 2024-08-28 13:09:26 +0200
committer Aki <please@ignore.pl> 2024-08-28 13:20:20 +0200
commit cce90246757b4567888de0889663e3e3c53dca40
tree 1ca885367ab46d5d4daeaf6965d52e9b1115fa6d /lua_as_human_readable_serialization_format.html
parent 899ea7ddd0fb1f976996c0c0678c387b61ff3f63
Published Lua serialization thing
Diffstat (limited to 'lua_as_human_readable_serialization_format.html')
-rw-r--r-- lua_as_human_readable_serialization_format.html | 111
1 files changed, 111 insertions, 0 deletions
diff --git a/lua_as_human_readable_serialization_format.html b/lua_as_human_readable_serialization_format.html
new file mode 100644
index 0000000..1802aa1
--- /dev/null
+++ b/lua_as_human_readable_serialization_format.html
@@ -0,0 +1,111 @@
+<!doctype html>
+<html lang="en">
+<meta charset="utf-8">
+<meta name="viewport" content="width=device-width, initial-scale=1">
+<meta name="author" content="aki">
+<meta name="tags" content="Lua, serialization, markup">
+<meta name="published-on" content="2024-08-28T13:09:26+02:00">
+<link rel="icon" type="image/png" href="favicon.png">
+<link rel="stylesheet" href="style.css">
+
+<title>Lua as Human-Readable Serialization Format</title>
+
+<header>
+<nav><a href="https://ignore.pl">ignore.pl</a></nav>
+<time>28 August 2024</time>
+<h1>Lua as Human-Readable Serialization Format</h1>
+</header>
+
+<article>
+<p>It was that time of year again and I was asked to prepare a definition for
+<a href="https://clang.llvm.org/docs/ClangFormat.html">clang-format</a>. I had a style guideline document to work with,
+so the task was rather straightforward. Similarly to Google's C++ Style Guide, it asked for headers to be grouped into:
+C system and standard headers, C++ standard library headers, other library headers, and project headers. And so, my
+first thought was to use the default <i>Regroup</i> behaviour.
+<p>Until I noticed that they use angle brackets for other libraries together with a <i>.h</i> extension. Of course,
+these matched the default C-group regex. The strictest and the second easiest solution here is to make the regex
+contain a list of alternatives with all of the headers. This requires some maintenance and I decided to have a bit of
+fun with it.
+<p>There is a somewhat common practice for serializing data for later use in Lua scripts that looks similar to this:
+<pre>
+return {
+ name = "Henry",
+ position = {x=0, y=0},
+}
+</pre>
+<p>This makes use of how <a href="https://www.lua.org/manual/5.4/manual.html#6.3">modules</a> and importing them work.
+In short, the module script is run and the value of its final <code>return</code> becomes the value of the "module".
+In this case the script is a lone return statement with no logic involved. This somewhat declarative style is
+purely conventional. More commonly the returned value is a table of functions, exactly what we would consider a "normal
+module", or a class.
+<p>In the example above we are not really sure what kind of thing we are dealing with. To stay consistent we could add a
+<code>type = "slime",</code> to the table. Now the reader would know what they are dealing with. Of course, the scheme
+would remain assumed on the user side. Building an object from there could be a straightforward table lookup.
+Alternatively, we could rely on <a href="https://www.lua.org/manual/5.4/manual.html#3.4.10">function call syntactic
+sugar</a> and prefix the definitions with types:
+<pre>
+return Slime{
+ name = "Henry",
+ position = Vec2{x=0, y=0},
+}
+</pre>
+<p>This too is a table lookup, but the responsibility shifts a bit from the user implementation to the execution
+environment. Loading becomes trickier this way, but the extra preparation can be desirable, e.g., to cause errors in
+an unknown or otherwise unintended environment. If we simply want to make it run, defining global callables
+<code>Slime</code> and <code>Vec2</code> is enough here.</p>
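+<p>A minimal sketch of that last option, assuming Lua 5.2 or newer. The two constructors are hypothetical; here they
+only tag the table with a <code>type</code> field, which is one of many things they could do. The string stands in for
+the contents of a serialized file:
+<pre>
+-- Hypothetical constructors: each receives the table literal and tags it.
+function Slime(t)
+ t.type = "slime"
+ return t
+end
+
+function Vec2(t)
+ t.type = "vec2"
+ return t
+end
+
+-- Stand-in for the contents of a serialized file.
+local source = [[
+return Slime{
+ name = "Henry",
+ position = Vec2{x=0, y=0},
+}
+]]
+
+-- load compiles the text into a function; calling it runs the chunk
+-- and hands back the value of its return statement.
+local henry = assert(load(source))()
+print(henry.type, henry.name, henry.position.type)
+</pre>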
+<img src="lua_as_human_readable_serialization_format-1.png" alt="henry the slime">
+<p>It is somewhat similar to some use cases from the <a href="https://www.lua.org/history.html">history of Lua</a>. The
+main common part is the syntactic sugar, but this feels like a stretch as it can be observed quite often in "regular"
+Lua code. Let's go one step further and remove the <code>return</code>:
+<pre>
+Slime {
+ name = "Henry",
+ position = Vec2{x=0, y=0},
+}
+</pre>
+<p>It now looks like some generic markup language. But the module loading mechanism will no longer work for us, because
+the script no longer returns a value. Instead, the reader needs to use
+<a href="https://www.lua.org/manual/5.4/manual.html#pdf-load">load</a>. It conveniently has an option to specify the
+execution environment. Additionally, some mechanism for tracking top-level statements is needed.
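+<p>One way to sketch both requirements, assuming Lua 5.2 or newer: pass a custom environment as the fourth argument of
+<code>load</code> and let the callables in it record each top-level statement. The function <code>read</code>, the
+<code>items</code> list, and the second slime are made up for illustration only:
+<pre>
+-- Hypothetical reader: runs source text in a restricted environment and
+-- returns every item declared at the top level, in order.
+local function read(source)
+ local items = {}
+ local env = {}
+ function env.Vec2(t)
+  t.type = "vec2"
+  return t
+ end
+ function env.Slime(t)
+  t.type = "slime"
+  table.insert(items, t) -- the tracking side effect
+  return t
+ end
+ -- Mode "t" rejects precompiled chunks; env replaces the globals, so an
+ -- unknown name such as print is nil inside the chunk.
+ local chunk = assert(load(source, "=definitions", "t", env))
+ chunk()
+ return items
+end
+
+local items = read([[
+Slime {
+ name = "Henry",
+ position = Vec2{x=0, y=0},
+}
+Slime {
+ name = "Greta",
+ position = Vec2{x=1, y=2},
+}
+]])
+print(#items, items[1].name, items[2].name)
+</pre>
+<p>Because the environment contains nothing but the two constructors, a definition file that tries to call anything
+else fails with an error instead of silently doing something unintended.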
+<p>This approach is somewhat similar to what <a href="https://premake.github.io/">Premake</a> does. Surprisingly, this
+is also pretty close to the regular register-event-callback approach for plugin systems (e.g., in
+<a href="https://github.com/martanne/vis">vis</a>). How so? The "tracking top-level statements" will result in a
+side effect in some global or loader state. In the callback approach, it's the event dispatcher or other plugin system
+state that fulfils a similar role. Additionally, the API is usually exposed through the environment (and not, e.g.,
+passed as a user function argument, or withheld from a plugin that is merely returned as a module).
+<p>For my standards definitions I settled on the last style. It allows for multiple items without indentation and I
+liked the idea at the time. It allowed me to play around and neatly layer parser, environment, and model. The final
+definitions looked like this:
+<pre>
+scheme "headers/1"
+aliases "ANSI C" {"ANSI X3.159-1989", "C89", "C90", "ISO/IEC 9899:1990"}
+headers "ANSI C" {
+ "assert.h",
+ "ctype.h",
+ ...
+}
+</pre>
+<p>It allowed for more complex structures with <code>include</code> and <code>remove</code>, for example:
+<pre>
+headers "C++20" {
+ include "C++17",
+ remove "ciso646",
+ "concepts",
+ ...
+}
+</pre>
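+<p>The sketch below is not the actual implementation and makes no claim to match it. It only illustrates, assuming Lua
+5.2 or newer, how such syntax could be evaluated: <code>headers "name"</code> returns a function that consumes the
+following table, while <code>include</code> and <code>remove</code> return marker tables that are applied in order.
+The <code>sets</code> table, the <code>op</code> field, and the shortened "C++17" list are assumptions:
+<pre>
+local sets = {} -- hypothetical model: set name to a set of header names
+local env = {}
+
+function env.include(name) return {op="include", name=name} end
+function env.remove(name) return {op="remove", name=name} end
+
+function env.headers(name)
+ -- The second call in the chain receives the table literal.
+ return function(entries)
+  local set = {}
+  for _, entry in ipairs(entries) do
+   if type(entry) == "string" then
+    set[entry] = true
+   elseif entry.op == "include" then
+    for header in pairs(assert(sets[entry.name], "unknown set")) do
+     set[header] = true
+    end
+   elseif entry.op == "remove" then
+    set[entry.name] = nil
+   end
+  end
+  sets[name] = set
+ end
+end
+
+-- The "C++17" list is shortened for the sake of the example.
+local source = [[
+headers "C++17" {
+ "ciso646",
+ "vector",
+}
+headers "C++20" {
+ include "C++17",
+ remove "ciso646",
+ "concepts",
+}
+]]
+assert(load(source, "=definitions", "t", env))()
+</pre>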
+<p>See <a href="https://git.ignore.pl/headers/">headers</a> for the full source code. The command allowed me to get the
+list of headers and join them into a regex:
+<pre>
+$ headers C11 POSIX
+aio.h
+arpa/inet.h
+assert.h
+...
+</pre>
+<p>Except that I never joined them into a regex. After all considerations and some discussions we decided to use
+<i>Preserve</i> instead of <i>Regroup</i>, so that we wouldn't need to bother with any of the costs of grouping includes
+automatically.
+<p>I feel like re-implementing this in SQL.
+</article>
+<script src="https://stats.ignore.pl/track.js"></script>