Diffstat (limited to 'deconstructing_web_browsers.html')
-rw-r--r-- deconstructing_web_browsers.html 134
1 files changed, 134 insertions, 0 deletions
diff --git a/deconstructing_web_browsers.html b/deconstructing_web_browsers.html
new file mode 100644
index 0000000..855ee39
--- /dev/null
+++ b/deconstructing_web_browsers.html
@@ -0,0 +1,134 @@
+<!doctype html>
+<html lang="en">
+<meta charset="utf-8">
+<meta name="viewport" content="width=device-width, initial-scale=1">
+<meta name="author" content="aki">
+<meta name="tags" content="web, browser">
+<link rel="icon" type="image/png" href="cylo.png">
+<link rel="stylesheet" href="style.css">
+
+<title>Deconstructing Web Browsers</title>
+
+<nav><p><a href="https://ignore.pl">ignore.pl</a></p></nav>
+
+<article>
+<h1>Deconstructing Web Browsers</h1>
+<p class="subtitle">Published on 2021-07-25 14:53:00+02:00
+<p>Welcome to one of my little experiments! The theme is simple: to deconstruct a web browser and create several
+utilities with distinct and clear responsibilities in its stead. This is not your regular blog post. Honestly, I'm not
+yet sure what it is, but I'll figure it out at some point. Expect this page to be updated and extended. (Well, just like
+my regular posts, for some reason, I need to rethink it one more time...)
+
+<h2>Motivation and History</h2>
+<p>The idea started to sprout in my mind a few years ago. In its early stages it wasn't really directed at web browsers
+but instead <a href="markdown_is_bad_for_you.html">it focused on markdown</a>. After giving it some more thought, it
+changed target to <a href="web_browsers_are_no_more.html">web browsers</a>. Now, it also draws from the Unix
+philosophy and my general aversion towards IDEs, or rather, this kind of modularity has become visible as the
+real core of the motivation that drives this idea. I haven't really touched or explored it yet. I haven't tried to
+discredit it either. Hopefully, once I reach that point, it will stand its ground.</p>
+
+<img src="deconstructing_web_browsers-1.png" alt="scroll with history">
+
+<p>Last year, I explored this idea a bit in a two-part text series and within a small project called
+<a href="https://git.ignore.pl/browse/">browse</a>. I naively split the responsibilities between programs and had some
+fun writing very simple scripts that did the work. And they did the work surprisingly well, but functionality
+constraints had to be extremely strict. Recently, I came back to it, read my own stuff, looked at my own code, and I
+still could relate to it. Instead of removing everything like I sometimes do, I decided to develop
+<a href="https://git.ignore.pl/markdown/">a new utility</a> and write this new summary and project status.
+
+<h2>Experimenting From a Terminal</h2>
+<p>Rather than jumping into design or development work straight away, let's see how far we can get using only
+a shell and some usual utilities you can find in every shed. To access a webpage, one could potentially eat it raw:
+
+<pre>
+$ curl -sL https://ignore.pl/ | less
+...
+</pre>
+
+<p>Now, that's raw! With a page like this one, it's possible. I write them by hand and comply with my own rules that make
+it possible for the reader to consume them as plain text. However, it's not very useful considering how astoundingly
+obfuscated modern HTML pages can get.
+
+<p>It's not only extremely complex HTML hierarchies that we need to deal with. Other great opponents are web
+applications that pretend to be webpages. Separating those two will prove to be useful. Not only that, it will
+also open us up to new possibilities. Consider a dead simple script that acts similarly to a regular opener:
+
+<pre>
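+#!/bin/sh
+# Download the requested address into a temporary file, capture its
+# content type, and open the file with a viewer chosen by that type.
+# The temporary file is removed once the viewer exits.
+TMP=$(mktemp -p /dev/shm) &&
+ { TYPE=$(curl -sLw "%{content_type}\n" "$@" -o "$TMP") &&
+ case "$TYPE" in
+ application/pdf) zathura "$TMP";;
+ image/*) sxiv "$TMP";;
+ text/html*) html_viewer "$TMP";;
+ text/markdown*) markdown_viewer "$TMP";;
+ text/*) less "$TMP";;
+ *) echo "$TMP";;
+ esac }
+rm -f "$TMP"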
+</pre>
+
+<p>You use it like this:
+
+<pre>
+$ ./script https://ignore.pl/
+</pre>
+
+<p>It shows the requested content using a program that's selected based on its MIME type. Here, the difference between
+webpage and web application is blurred. Hypothetically, using the MIME type or some other means, we could add switch
+cases like these:
+
+<pre>
+web-application/html+js) fork_of_chromium_or_something "$TMP";;
+web-application/lua) lua_gui_sandbox "$TMP";;
+</pre>
+
+<p>The ability to support multiple competing frameworks that are meant to run seamlessly loaded, sandboxed applications
+(so, web applications) really interests me.
+
+<p>That's not the only thing though. As you can see, in this example markdown and HTML are completely separated.
+Markdown is no longer a format that's supposed to generate HTML but instead it becomes a stand-alone hypertext format.
+Because the content requests are meant to run through such a demultiplexer, the hyperlinks can lead from one hypertext
+format to another. <b>This allows new formats and new ways of expression to grow and compete</b>, hopefully breathing
+some life into an ecosystem that's currently driven by monolithic giants.</p>
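+<p>To make the demultiplexer idea a bit more concrete, here is a minimal sketch of the selection step on its own. The
+function name <code>viewer_for</code> is made up for illustration, and the viewer names are simply the ones from the
+script above. It assumes the content type may carry parameters (e.g., a charset) and may vary in letter case:

```shell
#!/bin/sh
# Sketch only: viewer_for is a hypothetical name, and the viewers are the
# ones used in the opener script earlier in this post.

# Print the name of the viewer for the Content-Type value given in $1.
# Returns non-zero when no viewer matches.
viewer_for() {
	# Drop parameters such as "; charset=utf-8" and fold to lower case.
	mime=$(printf '%s\n' "$1" | sed 's/[[:space:]]*;.*$//' | tr '[:upper:]' '[:lower:]')
	case "$mime" in
	application/pdf) echo zathura;;
	image/*) echo sxiv;;
	text/html) echo html_viewer;;
	text/markdown) echo markdown_viewer;;
	text/*) echo less;;
	*) return 1;;
	esac
}
```

+<p>Since the function only prints a name, a viewer that follows a hyperlink doesn't need to know anything about the
+format on the other end. It could hand the address back to the opener and let this lookup decide.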
+
+<img src="deconstructing_web_browsers-2.png" alt="bacteria or something, dunno">
+
+<h2>Browser That’s Part of Your Environment</h2>
+<p>Of course, a single script like the example above is not the way to go, but it's a good start as it gives insight
+into data flow and responsibilities. At first, just by looking at it, I decided to naively distinguish four components:
+
+<dl>
+<dt>navigator
+<dd>Takes the address of the request from the user and forwards it to a <i>protocol daemon</i>. Retrieved content is
+then pushed to <i>opener</i>.
+<dt>protocol daemon
+<dd>Acquires and caches data using a single protocol, e.g., HTTP.
+<dt>opener
+<dd>Chooses viewers based on content type.
+<dt>viewer
+<dd>Presents content to the user and allows them to interact with it.
+</dl>
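+<p>The data flow between these roles can be sketched as a handful of shell functions. This is only an illustration under
+stated assumptions: the function names, their arguments and the <code>DAEMON</code> variable are invented here and are
+not how <i>browse</i> is actually put together. Caching is left out entirely.

```shell
#!/bin/sh
# Sketch only: names and calling conventions are assumptions made for
# illustration, not the interfaces of the real project.

# protocol daemon: fetch address $1 into file $2 and print its content type.
http_daemon() {
	curl -sL -w '%{content_type}\n' -o "$2" "$1"
}

# opener: print the name of the viewer chosen for content type $1.
opener() {
	case "$1" in
	text/html*) echo html_viewer;;
	text/markdown*) echo markdown_viewer;;
	text/*) echo less;;
	*) echo cat;;
	esac
}

# navigator: take an address from the user, ask the protocol daemon for
# the content, then run whichever viewer the opener picks.
navigator() {
	tmp=$(mktemp) || return 1
	if ctype=$("${DAEMON:-http_daemon}" "$1" "$tmp"); then
		"$(opener "$ctype")" "$tmp"
		rc=$?
	else
		rc=1
	fi
	rm -f "$tmp"
	return "$rc"
}
```

+<p>Calling <code>navigator https://ignore.pl/</code> would then walk through all four roles. Setting
+<code>DAEMON</code> to the name of another function or program swaps the <i>protocol daemon</i> without touching the
+rest, which is the kind of interchangeability this split is after.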
+
+<p>I found it to be a decent starting point and played around with it, getting encouraging results. All predicted
+obstacles made their appearances and, thanks to working prototypes, the shortcomings of each role became visible. In
+the second iteration I wanted to divide <i>navigator</i> into several stand-alone parts, but in the end I never
+committed to it.
+
+<p>Based on the description above, it doesn't seem as if <i>navigator</i> would require such division. Actually, the
+<a href="https://git.ignore.pl/browse/tree/browse?id=9dca05999d355deb225938ba4f57858ca27ca130">current
+implementation</a> doesn't clearly show such a need either. The only hints are the <code>-f</code> option in
+<i>navigator</i> and <i>opener</i>, and direct calls to <i>protocol daemon</i> by <i>viewers</i> to retrieve auxiliary
+content (e.g., a stylesheet or an embedded image). This means that <i>navigator</i> is hiding a plumbing-capable
+<i>request resolver</i> below the porcelain interface that's dedicated to the user.
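+<p>One way to picture such a <i>request resolver</i> is as a function that only fetches and reports, without launching
+any viewer. The sketch below is a guess at the shape of that plumbing, not something taken from <i>browse</i>: the
+names <code>daemon_for</code>, <code>resolve</code> and <code>http_daemon</code> are hypothetical, and picking the
+daemon by address scheme is an assumption.

```shell
#!/bin/sh
# Sketch only: every name here is hypothetical and not taken from browse.

# Stand-in protocol daemon: fetch $1 into file $2, print the content type.
http_daemon() {
	curl -sL -w '%{content_type}\n' -o "$2" "$1"
}

# Print the name of the protocol daemon responsible for address $1.
# Only HTTP(S) is covered; other schemes would add their own cases.
daemon_for() {
	case "$1" in
	http://*|https://*) echo http_daemon;;
	*) return 1;;
	esac
}

# Plumbing: fetch $1 into file $2 and print the content type.
# No viewer is started, so viewers can call this for auxiliary content.
resolve() {
	daemon=$(daemon_for "$1") || { echo "no daemon for: $1" >&2; return 1; }
	"$daemon" "$1" "$2"
}
```

+<p>A <i>viewer</i> could call something like <code>resolve https://ignore.pl/style.css "$file"</code> for a stylesheet,
+while the user-facing porcelain would wrap the same call and pass the result on to <i>opener</i>.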
+
+<p>More than that, <i>navigator</i> may also be hiding functionality meant to support browsing history that I haven't
+explored at all yet. Combining it with graphical interfaces, session management or tabs are all question marks.
+
+<p>Obviously, the responsibilities of the components are not the only matter to think about. Interfaces in every form
+are also important. I'm talking here about: communication between the components of the browser, interchangeability,
+communication between the browser and the rest of the environment it runs in, and integration with graphical user
+interfaces and window managers.
+
+<p>For now, I plan to split <i>navigator</i> and look into an equivalent of an address bar.
+</article>
+<script src="https://stats.ignore.pl/track.js"></script>