Diffstat (limited to 'plumbing_your_own_browser.html')
-rw-r--r-- plumbing_your_own_browser.html 99
1 files changed, 99 insertions, 0 deletions
diff --git a/plumbing_your_own_browser.html b/plumbing_your_own_browser.html
new file mode 100644
index 0000000..4f9b999
--- /dev/null
+++ b/plumbing_your_own_browser.html
@@ -0,0 +1,99 @@
+<!doctype html>
+<html lang="en">
+<meta charset="utf-8">
+<meta name="viewport" content="width=device-width, initial-scale=1">
+<meta name="author" content="aki">
+<meta name="tags" content="web, web browser, linux, shell">
+<link rel="icon" type="image/png" href="cylo.png">
+<link rel="stylesheet" type="text/css" href="style.css">
+
+<title>Plumbing Your Own Browser</title>
+
+<nav><p><a href="https://ignore.pl">ignore.pl</a></p></nav>
+
+<article>
+<h1>Plumbing Your Own Browser</h1>
+<p class="subtitle">Published on 2020-08-01 21:38:00+02:00</p>
+<img src="plumbing_your_own_browser-1.png" alt="plumbing">
+<p>In the spirit of the previous post about <a href="web_browsers_are_no_more.html">web browsers</a>, how about a little
+experiment? Let's write a simple tool that implements downloading, history management and displaying the content. It
+is meant to be trivial and fun.
+<p>Ideally, I think the architecture would divide into: protocol daemon, navigator, opener and view engines. However,
+even with this setup some of them would have wide responsibilities. I don't really like that, but I'll leave it for the
+future to deal with. Anyway, what do they do?</p>
+<dl>
+ <dt>protocol daemon<dd>Responsible for data acquisition and caching. For instance, an HTTP protocol daemon.
+ <dt>navigator<dd>The quickest way to explain it: the address bar. It handles history, probably sessions, windows, and
+ initial requests from the user to the protocol daemon. This one would need some attention to properly integrate it with
+ the environment and make sure that its responsibilities don't go too far.
+ <dt>opener<dd>Not really xdg-open or rifle, but something of this sort. Gets data marked for display from the
+ protocol daemon and acts as a demux for view engines.
+ <dt>view engine<dd>Your usual browser excluding things that already appeared earlier. It may also be something else,
+ like a completely normal image viewer, a hyperlinked markdown viewer or even less. Or more, like a sandboxed application
+ environment that is not a web application.
+</dl>
+<p>Sounds like a complex system, but we can do it easily in a short shell script. I won't bother with view engines, as
+right now it's rather time consuming to get them to work, especially since browsers weren't written with this use case
+in mind. Even the minimal ones won't do. Generally, they would need to communicate with the protocol daemon to retrieve
+secondary data (like stylesheets or images) and communicate with the navigator when the user clicks some kind of link.
+<p>Anyway, let's start with the protocol daemon! Our target is a web browser, so we need something to handle HTTP for us.
+What else could we use if not curl? Frankly speaking, just curl could be sufficient to view things:</p>
+<pre>
+$ curl -sL https://ignore.pl/plumbing_your_own_browser.html
+...
+...
+...
+</pre>
+<p>Yeah, if you use st as your terminal emulator like I do, then you need to add <code>| less</code> at the end, so that
+you can read it. Honestly, for documents that are written in a way that allows people to read them as plain text, that's
+enough (posts on this website can be read as plain text).
+<p>However, although it's tempting not to, I'll do more than that. Now that we have a protocol daemon that is not a
+daemon, the next one is the opener. Why not the navigator? For now, the interactive shell will be the navigator. You'll
+see how.
+<p>It's possible that you already have something that could act as an opener (like rifle from ranger file manager).
+There are plenty of similar programs, including xdg-open. I believe that they could be configured to work nicely in this
+setup, but let's write our own:</p>
+<pre>
+#!/bin/sh
+TMP=$(mktemp -p /dev/shm) &&
+ { TYPE=$(curl -sLw "%{content_type}\n" "$@" -o "$TMP") &&
+ case "$TYPE" in
+ application/pdf) zathura "$TMP";;
+ image/*) sxiv "$TMP";;
+ text/*) less "$TMP";;
+ *) hexdump "$TMP";;
+ esac }
+rm -f "$TMP"
+</pre>
+<p>That's a lot of things to explain! The first two commands, up to <code>case "$TYPE" in</code>, are actually the
+protocol daemon. The <code>$@</code> is what comes from the navigator. In our case, it's the arguments from the shell
+that runs our command. Next up, the case statement is the opener. Based on the output of curl's write-out, the script
+selects a program to open the temporary file from the web. After that, the file is removed; in other words, caching is
+not supported yet.
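+<p>A minimal sketch of how caching could be bolted on, assuming a cache directory under <code>/dev/shm</code> and file
+names derived from a checksum of the URL. <code>CACHE_DIR</code>, <code>cache_path</code> and <code>fetch_cached</code>
+are hypothetical names, and nothing here handles expiry or checksum collisions:</p>
+<pre>
+# Hypothetical cache location, not part of the script above.
+CACHE_DIR=${CACHE_DIR:-/dev/shm/browser-cache}
+
+# Map a URL to a stable file name inside the cache directory.
+cache_path() {
+ printf '%s/%s\n' "$CACHE_DIR" "$(printf '%s' "$1" | cksum | cut -d' ' -f1)"
+}
+
+# Download the URL unless it is already cached, then print the path of the body.
+# The content type is stored next to the body in a ".type" file.
+fetch_cached() {
+ BODY=$(cache_path "$1")
+ mkdir -p "$CACHE_DIR" || return 1
+ if [ ! -f "$BODY.type" ]; then
+  curl -sLw "%{content_type}\n" "$1" -o "$BODY" > "$BODY.type" ||
+   { rm -f "$BODY" "$BODY.type"; return 1; }
+ fi
+ printf '%s\n' "$BODY"
+}
+</pre>
+<p>With something like this, the opener could read the type back with <code>TYPE=$(cat "$BODY.type")</code>, where
+<code>BODY</code> is the path printed by <code>fetch_cached</code>, and the final <code>rm</code> would go away.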
+<p>Surprisingly, that's it: one hell of a minimal browser. It works nicely with PDF files, images and text formats that
+are not extremely bloated. Possibly, with some tinkering around xdg-open and X default applications, some hyperlinks
+between the formats could be made to work (e.g. a PDF linking to an external image).
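+<p>Since the interactive shell plays the navigator, its own history already covers the basics. If a separate record of
+visited addresses is wanted, a thin wrapper is one possible sketch. It assumes the script above is installed as
+<code>browse</code> somewhere in <code>PATH</code>; <code>HISTORY_FILE</code>, <code>visit</code> and
+<code>last_visited</code> are made-up names:</p>
+<pre>
+# Hypothetical history file; the location is an assumption.
+HISTORY_FILE=${HISTORY_FILE:-$HOME/.browse_history}
+
+# Record a timestamp and the URL, then hand the URL over to the opener script.
+visit() {
+ printf '%s\t%s\n' "$(date +%Y-%m-%dT%H:%M:%S)" "$1" >> "$HISTORY_FILE" &&
+ browse "$1"
+}
+
+# Print the most recently visited URL.
+last_visited() {
+ tail -n 1 "$HISTORY_FILE" | cut -f2
+}
+</pre>
+<p>After sourcing it, <code>visit https://ignore.pl</code> opens the page and <code>visit "$(last_visited)"</code> goes
+back to it later.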
+<p>Now, I could go further and suggest an option like this:</p>
+<pre>
+application/lua) lua_gui_sandbox "$TMP";;
+</pre>
+<p>I find it interesting and worth looking into. I'll leave it as an open thing to try out.
+<p>There are some more things to consider. For instance, the views should know the base directory the file comes from,
+as some hyperlinks are relative. In other words, programs used as views should allow stating the base of the address in
+some way:</p>
+<pre>
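+{ curl -sLw "%{content_type}\n%{url_effective}\n" "$@" -o "$TMP" | {
+ # The case stays inside this group: variables set by read are lost once the pipeline ends.
+ read -r TYPE &&
+ read -r URL &&
+ BASE_URL=$(strip_filename_from_url "$URL") &&
+ case "$TYPE" in
+ # The trailing * lets types with parameters match, e.g. "text/html; charset=utf-8".
+ text/html*) html_view --base "$BASE_URL" "$TMP";;
+ text/markdown*) markdown --base "$BASE_URL" "$TMP";;
+ # ...
+ esac; }; }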
+</pre>
+<p>This way, the <code>markdown</code> view would know that if the user clicks a hyperlink with a relative path, it
+should resolve that path against the base. The base could also provide information that matters in e.g. CORS.
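+<p>One possible take on the <code>strip_filename_from_url</code> helper used above, together with a hypothetical
+<code>resolve_link</code> that a view could call. Both are simplified assumptions: there is no <code>../</code>
+normalization and protocol-relative links are not handled:</p>
+<pre>
+# Drop the query and fragment, then everything after the last slash of the path.
+strip_filename_from_url() {
+ u=${1%%[?#]*}
+ case "$u" in
+ *://*/*) printf '%s/\n' "${u%/*}";;
+ *) printf '%s/\n' "$u";;
+ esac
+}
+
+# Resolve a link ($2) against a base ($1) that ends with a slash.
+resolve_link() {
+ case "$2" in
+ *://*) printf '%s\n' "$2";;
+ /*) printf '%s%s\n' "$(printf '%s\n' "$1" | cut -d/ -f1-3)" "$2";;
+ *) printf '%s%s\n' "$1" "$2";;
+ esac
+}
+</pre>
+<p>Absolute links pass through untouched, links starting with a slash are attached to the scheme and host of the base,
+and anything else is attached to the base directory.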
+<p>For now, that's it. The ideas are still unrefined, but at least they are moving somewhere. Hopefully, I will get
+myself to write something that could act as a view and respect this concept. My priority should be an HTML view, but I
+feel like starting with simplified Markdown (one without HTML).
+</article>
+<script src="https://stats.ignore.pl/track.js"></script>