From 58d63ace4c170f17ab79dde1e363b147b09ed712 Mon Sep 17 00:00:00 2001
From: Aki
Date: Sun, 19 Jun 2022 17:33:21 +0200
Subject: Published using pacman to manage emscripten packages

---
 using_pacman_to_manage_emscripten_packages.html | 255 ++++++++++++++++++++++++
 1 file changed, 255 insertions(+)
 create mode 100644 using_pacman_to_manage_emscripten_packages.html

diff --git a/using_pacman_to_manage_emscripten_packages.html b/using_pacman_to_manage_emscripten_packages.html
new file mode 100644
index 0000000..e128f72
--- /dev/null
+++ b/using_pacman_to_manage_emscripten_packages.html
@@ -0,0 +1,255 @@
+

Using pacman to Manage Emscripten Packages

+

Published on 2022-06-19 17:35:00+02:00

C was created for use with Unix and quite quickly became one of the most used programming languages of all time. Some years later it made its way into the Linux kernel and the operating systems built around it. To this day it is the primary language used to interface with the kernel or to write all sorts of utilities, if not directly, then through various bindings.

To allow the use of external libraries, C has a mechanism for including header files in your own source code. Then, during the linking stage, the compiled implementation behind these headers is linked along with your code into the final executable (or loaded later by the dynamic linker, with some additional steps). What is visible for including and linking is configured through several PATH-like variables with some defaults, and sometimes (if you were having a good day and it had to be ruined) through hidden or undocumented behaviour.

Management of the available packages that contain headers and libraries is usually offloaded to the system-wide package manager. Considering the relationship between C and the system that hosts it, that isn't a bad choice. It's not perfect, but with well-maintained upstream and local ecosystems it'll be just right.

Problems may appear when we change or take away one of the parts: the operating system, the C toolchain, or the package manager. The most prominent examples of when this happens are Windows, cross-compiling, and porting software between distros. Such cases, especially the first one, resulted in the creation of external package managers, e.g., vcpkg and Conan. In other cases they pushed people toward build generators such as CMake.

[Image: emscripten logo]

Recently, I've been playing around with Emscripten. I built some things here and there, and now, you guessed it, I'm trying out different approaches to handling libraries and decided to explore pacman(1) as a means to that end. I hope you enjoy this little experiment.

Without going into internals, pacman is the package manager used by Arch Linux, a distribution that describes itself as lightweight, flexible, and simple. The distribution focuses on bleeding-edge packages. I picked pacman because I happen to use it on a daily basis.

Packages are distributed in binary form and come from remote repositories. A package is an archive that contains files meant for installation and some meta information, all built by makepkg(8). A repository is really just a set of files managed by repo-add(8).

Building Sample Package

+

I started by creating a sample package that provides raylib. To do that, I wrote a rather simple PKGBUILD file:

+pkgname=raylib
+pkgver=4.0.0
+pkgrel=1
+arch=(wasm32)
+license=(zlib)
+makedepends=(cmake emscripten)
+source=("${pkgname}.tar.gz::https://github.com/raysan5/raylib/archive/refs/tags/${pkgver}.tar.gz")
+sha256sums=("11f6087dc7bedf9efb3f69c0c872f637e421d914e5ecea99bbe7781f173dc38c")
+
+ +

Stop, right now! If you are a seasoned package maintainer, or maybe you've just cross-compiled enough software, you will notice that something is not right here. Yeah, arch is wrong. It's a little bit counter-intuitive, so take a look at another example, aarch64-linux-gnu-glibc, the GNU C Library for ARM64 targets:

+$ asp checkout aarch64-linux-gnu-glibc
+$ cd aarch64-linux-gnu-glibc/trunk
+$ grep arch= PKGBUILD
+arch=(any)
+
+ +

This is different for a good reason: none of it is going to be run on the host system. Only the compiler and binutils run there, and they are packaged for the architecture of the build host: x86_64 in this case.

Then why am I specifying wasm32 for my package?

Emscripten uses cache directories that contain a copy of the sysroot. A host system may contain several caches, and each will have its own sysroot. I'm not entirely sure what the reasoning behind it is, but that's how it looks at the time of writing.

The glibc package specifies the any architecture because it is intended to be installed in /usr/aarch64-linux-gnu, and that's where the compiler expects to see it. I could technically try to make my package operate in a similar manner and install to /usr/lib/emscripten/system, which acts as a base for caches and is provided by the emscripten package from the Arch Linux repositories. I didn't do that because I wanted installed packages to be immediately available in my cache. To accomplish that, I decided to use pacman similarly to how you bootstrap a new system installation, and because the package is technically targeted at wasm32, I wrote that in the PKGBUILD.

I think the normal way is also worth exploring, assuming that I first figure out how to deal with caches, why the emscripten package does not install to the usual /usr/wasm32-emscripten, and how to handle propagation of packages.

[Image: pacman, heh]

Anyway, I went the other way and had to hack my way through. Let's continue with the PKGBUILD:

+build() {
+  cd "${pkgname}-${pkgver}"
+  emcmake cmake . -B build \
+    -DPLATFORM=Web \
+    -DBUILD_EXAMPLES=OFF \
+    -DCMAKE_INSTALL_PREFIX=/usr
+  cd build
+  make
+}
+
+package() {
+  cd "${pkgname}-${pkgver}/build"
+  make DESTDIR="${pkgdir}" install
+  cd ..
+  install -Dm644 LICENSE "${pkgdir}/usr/share/licenses/${pkgname}/LICENSE"
+}
+
+ +

I use the CMake wrapper from the Emscripten tools. The only part worth noting is that, by default, CMake with Emscripten would set CMAKE_INSTALL_PREFIX to the path of the currently used cache directory. That's not feasible for staging packages meant for distribution, so I use plain /usr instead. The thing is, Emscripten uses include and lib directories located directly in the sysroot and not in /usr, so I will need to adjust that somehow at a later stage. Not now, because raylib uses GNUInstallDirs, which expands a / prefix to /usr anyway.
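
To make the staging part concrete: with make DESTDIR=... install, the DESTDIR value is prepended to the install prefix, so files land inside the package directory instead of the real /usr. A minimal sketch of that path arithmetic follows; staged_path is a made-up helper, and the example paths are only illustrative.

```shell
# Hypothetical helper: where does a file end up inside the staging directory?
# DESTDIR is prepended to the install prefix, so the real /usr is never touched.
staged_path() {
    destdir=$1   # e.g. "${pkgdir}" inside package()
    prefix=$2    # CMAKE_INSTALL_PREFIX, /usr in this PKGBUILD
    relpath=$3   # path relative to the prefix
    # Strip one trailing slash from each part to avoid doubled separators.
    printf '%s%s/%s\n' "${destdir%/}" "${prefix%/}" "$relpath"
}

staged_path /tmp/pkg/raylib /usr lib/libraylib.a
# prints /tmp/pkg/raylib/usr/lib/libraylib.a
```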

The package is ready to be built:

+$ makepkg --printsrcinfo >.SRCINFO
+$ CFLAGS='' CARCH=wasm32 makepkg
+==> Making package: raylib 4.0.0-1
+==> ...
+==> Finished making: raylib 4.0.0-1
+$ ls *.pkg.tar.zst
+raylib-4.0.0-1-wasm32.pkg.tar.zst
+
+ +

First off, I clear CFLAGS to keep the default options from /etc/makepkg.conf from causing problems. I also need to set CARCH to inform makepkg that I'm cross-compiling to wasm32.
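
The reason CARCH matters is that makepkg refuses to build when the requested architecture is not listed in the arch array, unless the array says any. A rough sketch of that check as I understand it follows. It is not makepkg's actual code, and arch_supported is a made-up name.

```shell
# Sketch of the architecture check (an approximation, not makepkg's source).
# Succeeds when the requested architecture is listed in the arch values
# that follow it, or when one of those values is "any".
arch_supported() {
    carch=$1
    shift
    for a in "$@"; do
        if [ "$a" = "any" ] || [ "$a" = "$carch" ]; then
            return 0
        fi
    done
    return 1
}

arch_supported wasm32 wasm32 && echo "arch=(wasm32) builds with CARCH=wasm32"
arch_supported x86_64 wasm32 || echo "arch=(wasm32) is refused with CARCH=x86_64"
arch_supported x86_64 any && echo "arch=(any) builds everywhere"
```

This is also why the glibc example above gets away without any CARCH tricks: any matches whatever the host happens to be.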

Setting up Repository

+

Now that I had the package, I needed to "distribute" it. Repositories used by pacman are dead simple. They can be served over HTTP or FTP, or even from local files. The structure is the same for all methods and relies on the file system, paths, and a central database file. The whole setup was:

+$ mkdir -p repo_path/wasm32/core
+$ cd repo_path/wasm32/core
+$ mv package_path/raylib-4.0.0-1-wasm32.pkg.tar.zst .
+$ repo-add core.db.tar.gz *.pkg.tar.zst
+
+ +

Yeah, that's it. First, create a directory for the repository and move into it. The path contains both the architecture and the name of the repository. After that, move the built package to the same directory, and finally add it to the database, which has the same name as the repository. Now it's a matter of making pacman use it.
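
If the repository needs to be rebuilt more than once, the same steps can be wrapped in a function. This is only a sketch under a few assumptions: setup_repo is a made-up name, it copies packages instead of moving them, and it skips the repo-add call when the tool is not installed or no packages were given.

```shell
# Hypothetical helper: lay out <base>/<arch>/<repo> and register packages.
# Usage: setup_repo <base> <arch> <repo> [package files...]
setup_repo() {
    base=$1
    arch=$2
    repo=$3
    shift 3
    dir="$base/$arch/$repo"
    mkdir -p "$dir" || return 1
    for pkg in "$@"; do
        cp "$pkg" "$dir/" || return 1
    done
    # repo-add ships with pacman; the database is named after the repository.
    if [ "$#" -gt 0 ] && command -v repo-add >/dev/null 2>&1; then
        (cd "$dir" && repo-add "$repo.db.tar.gz" ./*.pkg.tar.zst >&2) || return 1
    fi
    # Print the repository directory so callers can reuse it.
    printf '%s\n' "$dir"
}

# Example (paths are placeholders):
#   setup_repo repo_path wasm32 core package_path/raylib-4.0.0-1-wasm32.pkg.tar.zst
```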

Installing the Package

+

This section may contain wrong uses of tools for the sake of experimentation. If you are faint-hearted or feel the need to say "this is not how to do it" or "this is not how you use it" without elaborating or suggesting another direction, then it's probably better for you not to continue, or to have a drink first.

Before doing anything, I fixed the directory structure of the cache to match the one that pacman expects:

+$ cd cache/sysroot
+$ mkdir usr
+$ mv include lib bin usr
+$ ln -s usr/{include,lib,bin} .
+
+ +

Symlinks should make everyone happy for now.
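
The commands above only work once, because the second run would try to move the symlinks. A slightly more defensive variant could look like the sketch below. usrify_sysroot is a made-up name, and the function assumes usr/ does not already contain include, lib, or bin.

```shell
# Move include/lib/bin under usr/ and leave relative symlinks behind.
# Safe to re-run: entries that are already symlinks are skipped.
usrify_sysroot() {
    sysroot=$1
    mkdir -p "$sysroot/usr" || return 1
    for d in include lib bin; do
        if [ -d "$sysroot/$d" ] && [ ! -L "$sysroot/$d" ]; then
            mv "$sysroot/$d" "$sysroot/usr/$d" || return 1
            ln -s "usr/$d" "$sysroot/$d" || return 1
        fi
    done
}

# Example (path is a placeholder):
#   usrify_sysroot cache/sysroot
```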

The next step was to create the directories used directly by pacman:

+$ mkdir -p etc/pacman.d/{gnupg,hooks} var/{cache/pacman,lib/pacman,log}
+
+ +

And finally, the first thing that's worth attention: the config file located at etc/pacman.conf. The plan was to use pacman in a bootstrap fashion for the sysroot located in the cache directory, so I needed to express that in config terms:

+[options]
+RootDir = cache/sysroot/
+CacheDir = cache/sysroot/var/cache/pacman
+HookDir = cache/sysroot/etc/pacman.d/hooks
+GPGDir = cache/sysroot/etc/pacman.d/gnupg
+Architecture = wasm32
+CheckSpace
+SigLevel = TrustAll
+
+ +

Some directories were automatically re-rooted and some weren't. I simply experimented with the -v option to see what was used and adjusted the config until I ended up with this version. I don't need to mention TrustAll. Don't do it.
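
Since the paths in the config depend on where the cache lives, generating the file is less error-prone than editing it by hand. A minimal sketch, assuming the sysroot is passed as an absolute path; write_pacman_conf is a made-up helper that also creates the directories from the previous step and writes exactly the options shown above.

```shell
# Hypothetical helper: prepare pacman's directories inside a sysroot and
# write etc/pacman.conf with the options from the article.
# SigLevel = TrustAll is kept only because this mirrors the experiment.
write_pacman_conf() {
    sysroot=${1%/}   # absolute path to the sysroot, trailing slash removed
    arch=$2          # value for Architecture, e.g. wasm32
    mkdir -p "$sysroot/etc/pacman.d/gnupg" "$sysroot/etc/pacman.d/hooks" \
        "$sysroot/var/cache/pacman" "$sysroot/var/lib/pacman" \
        "$sysroot/var/log" || return 1
    cat > "$sysroot/etc/pacman.conf" <<EOF
[options]
RootDir = $sysroot/
CacheDir = $sysroot/var/cache/pacman
HookDir = $sysroot/etc/pacman.d/hooks
GPGDir = $sysroot/etc/pacman.d/gnupg
Architecture = $arch
CheckSpace
SigLevel = TrustAll
EOF
}

# Example (path is a placeholder):
#   write_pacman_conf "$HOME/cache/sysroot" wasm32
```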

That's not all; repositories also reside in the config file:

+[core]
+Server = file:///repo_path/$arch/$repo
+
+ +
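
pacman replaces $arch in the Server value with the configured Architecture and $repo with the section name, which is why the repository directory was laid out as wasm32/core. A tiny sketch that mimics the substitution; expand_server is a made-up name, and this is not pacman code.

```shell
# Mimic how $arch and $repo in a Server value get filled in.
# $1: Server value, $2: Architecture, $3: repository (section) name
expand_server() {
    printf '%s\n' "$1" | sed -e 's|\$arch|'"$2"'|g' -e 's|\$repo|'"$3"'|g'
}

expand_server 'file:///repo_path/$arch/$repo' wasm32 core
# prints file:///repo_path/wasm32/core
```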

What's left is to sync the database and install the package. pacman assumes that it needs to be run as the root user, but because I'm working with a user-owned cache as my root directory, I'd prefer not to raise its privileges, especially considering that a misconfiguration could break packages on the host system. Let's try it out:

+$ fakeroot pacman --config cache/sysroot/etc/pacman.conf -Sy
+:: Synchronising package databases...
+ core   418.0   B   408 KiB/s 00:00 [###################################] 100%
+$ fakeroot pacman --config cache/sysroot/etc/pacman.conf -S raylib
+resolving dependencies...
+looking for conflicting packages...
+
+Packages (1) raylib-4.0.0-1
+
+Total Download Size:   2.04 MiB
+Total Installed Size:  4.70 MiB
+
+:: Proceed with installation? [Y/n]
+:: ...
+(1/1) installing raylib             [###################################] 100%
+
+ +

Looks like the installation process succeeded. Time to try it out.
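
Typing the --config path every time gets old quickly, so a small wrapper helps. A sketch under assumptions: SYSROOT and DRY_RUN are variable names I made up, and sysroot_pacman is a hypothetical function. With DRY_RUN=1 it only prints the command, which is handy for checking what would run against which config.

```shell
# Hypothetical wrapper around the fakeroot + pacman invocation used above.
# SYSROOT must point at the cache's sysroot; DRY_RUN=1 prints instead of running.
sysroot_pacman() {
    conf="${SYSROOT:?set SYSROOT to the sysroot path first}/etc/pacman.conf"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "fakeroot pacman --config $conf $*"
    else
        fakeroot pacman --config "$conf" "$@"
    fi
}

# Example:
#   SYSROOT=cache/sysroot
#   sysroot_pacman -Sy
#   sysroot_pacman -S raylib
```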

Trying It Out and Adjusting pkg-config

+

Turns out it doesn't work just yet. Some samples would work but not this one.

The raylib CMake module has a very peculiar way of defining its target. First it asks pkg-config for hints, and then it uses them in a slightly inconsistent way. Long story short, the CMake target will have its linker options set based on the output of pkg-config --libs --static, disregarding any attempts to remove the -L options.

Since I built my package with CMAKE_INSTALL_PREFIX set to /usr, the prefix variable in the installed raylib.pc will be set to /usr. This results in -L/usr/lib appearing in the public linker options of the raylib target, which breaks the entire build process, since that directory belongs to the host system and not to the wasm32 sysroot.

The problem here is the prefix=/usr in the module definition file. It should point to the actual root, which is located in the cache directory.
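
For illustration, pointing the prefix at the real location by hand is a one-line rewrite. A sketch, where rewrite_pc_prefix is a made-up helper and the paths in the example are placeholders:

```shell
# Hypothetical helper: replace the prefix= line of a pkg-config file.
# Writes to a temporary file first so a failed sed leaves the original intact.
rewrite_pc_prefix() {
    pcfile=$1
    newprefix=$2
    sed "s|^prefix=.*|prefix=$newprefix|" "$pcfile" > "$pcfile.tmp" &&
        mv "$pcfile.tmp" "$pcfile"
}

# Example (paths are placeholders):
#   rewrite_pc_prefix cache/sysroot/usr/lib/pkgconfig/raylib.pc /path/to/cache/sysroot/usr
```

Other variables in the file, such as libdir=${prefix}/lib, pick up the new value on their own because they are defined relative to the prefix.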

There are several ways to address it. My favourite was to simply rewrite the prefix as part of an install hook run by pacman. Sadly, it failed because hooks are run in a chroot. There are ways to fake it, but I didn't find them worth exploring at that moment. The other way was PKG_CONFIG_SYSROOT_DIR, and that's what I did. I had tried to avoid it due to the uncertain situation between pkgconf and pkg-config.
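
Roughly speaking, PKG_CONFIG_SYSROOT_DIR makes the tool prepend the sysroot to the -I and -L paths it prints. The sketch below only emulates that rewrite to show the intended effect. apply_sysroot is a made-up function, not part of pkg-config or pkgconf, and the two implementations may differ in the details.

```shell
# Rough emulation of the sysroot rewrite applied to -I and -L flags.
# $1: sysroot directory, remaining arguments: flags as pkg-config would print them
apply_sysroot() {
    sysroot=${1%/}
    shift
    out=
    for flag in "$@"; do
        case $flag in
            -I/*) flag="-I$sysroot${flag#-I}" ;;
            -L/*) flag="-L$sysroot${flag#-L}" ;;
        esac
        # Join the flags with single spaces.
        out="${out:+$out }$flag"
    done
    printf '%s\n' "$out"
}

apply_sysroot /home/user/cache/sysroot -L/usr/lib -lraylib
# prints -L/home/user/cache/sysroot/usr/lib -lraylib

# The real invocation might look like this (an assumption, not tested here):
#   PKG_CONFIG_SYSROOT_DIR=/home/user/cache/sysroot pkg-config --libs --static raylib
```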

Luckily, it turned out good enough for me to wrap up the whole experiment. I patched the Emscripten.cmake toolchain file and was able to build a sample project that used the installed sample package.

Should I show something here? Nah.

[Image: tool... chain?]

Final Notes

+

This was a fun experiment. For some reason I really enjoyed using fakeroot like that.

Package management, or rather dependency management, in a cross-compilation context sounds like a good next direction to explore. I found various takes on it. GNU is a little bit more standardized, and there are projects like crosstool-NG that at the very least ease the configuration of toolchains. I couldn't find many examples of installable binary packages for the target, with the exception of the standard library. Instead, it seems that the usual approach is compiling ports yourself (which is fine) from, e.g., incredibly complex CMake trees (which is fine, but with flames in the background). Otherwise, it's using vcpkg or a similar manager. Or doing something wild.

As for anything else worth noting... I hope I pointed out everything that I wanted in the article itself. If not, well, it happens.

+ -- cgit v1.1