-rw-r--r-- | deconstructing_web_browsers-1.png | bin | 0 -> 1638 bytes
-rw-r--r-- | deconstructing_web_browsers-2.png | bin | 0 -> 2027 bytes
-rw-r--r-- | deconstructing_web_browsers.html | 134
-rw-r--r-- | graveyard_of_the_drawings-10.png (renamed from plumbing_your_own_browser-1.png) | bin | 1809 -> 1809 bytes
-rw-r--r-- | graveyard_of_the_drawings-11.png (renamed from integrating_browser_into_your_environment-1.png) | bin | 1707 -> 1707 bytes
-rw-r--r-- | graveyard_of_the_drawings.html | 4
-rw-r--r-- | index.html | 6
-rw-r--r-- | integrating_browser_into_your_environment.html | 81
-rw-r--r-- | plumbing_your_own_browser.html | 99
9 files changed, 141 insertions, 183 deletions
diff --git a/deconstructing_web_browsers-1.png b/deconstructing_web_browsers-1.png
Binary files differ
new file mode 100644
index 0000000..e4b5d59
--- /dev/null
+++ b/deconstructing_web_browsers-1.png
diff --git a/deconstructing_web_browsers-2.png b/deconstructing_web_browsers-2.png
Binary files differ
new file mode 100644
index 0000000..0fc3dc6
--- /dev/null
+++ b/deconstructing_web_browsers-2.png
diff --git a/deconstructing_web_browsers.html b/deconstructing_web_browsers.html
new file mode 100644
index 0000000..855ee39
--- /dev/null
+++ b/deconstructing_web_browsers.html
@@ -0,0 +1,134 @@
+<!doctype html>
+<html lang="en">
+<meta charset="utf-8">
+<meta name="viewport" content="width=device-width, initial-scale=1">
+<meta name="author" content="aki">
+<meta name="tags" content="web, browser">
+<link rel="icon" type="image/png" href="cylo.png">
+<link rel="stylesheet" href="style.css">
+
+<title>Deconstructing Web Browsers</title>
+
+<nav><p><a href="https://ignore.pl">ignore.pl</a></p></nav>
+
+<article>
+<h1>Deconstructing Web Browsers</h1>
+<p class="subtitle">Published on 2021-07-25 14:53:00+02:00
+<p>Welcome to one of my little experiments! The theme is simple: to deconstruct a web browser and create several
+utilities with distinct and clear responsibilities in its stead. This is not your regular blog post. Honestly, I'm not
+yet sure what it is, but I'll figure it out at some point. Expect this page to be updated and extended. (Well, just like
+my regular posts, for some reason, I need to rethink it one more time...)
+
+<h2>Motivation and History</h2>
+<p>The idea started to sprout in my mind a few years ago. In its early stages it wasn't really directed at web browsers
+but instead <a href="markdown_is_bad_for_you.html">it focused on markdown</a>. After giving it some more thinking, it
+changed target to <a href="web_browsers_are_no_more.html">web browsers</a>. Now, it also started to draw from Unix
+philosophy and my general aversion towards IDEs, or rather that this kind of modularity started to be visible as the
+real core of the motivation that drives this idea. I haven't really touched or explored it yet. I didn't try to discredit
+it either. Hopefully, once I reach that point, it will stand its ground.</p>
+
+<img src="deconstructing_web_browsers-1.png" alt="scroll with history">
+
+<p>Last year, I explored this idea a bit in a two-part text series and within a small project called
+<a href="https://git.ignore.pl/browse/">browse</a>. I naively split the responsibilities between programs and had some
+fun writing very simple scripts that did the work. And they did the work surprisingly well, but functionality
+constraints had to be extremely strict. Recently, I came back to it, read my own stuff, looked at my own code, and I
+still could relate to it. Instead of removing everything like I sometimes do, I decided to develop
+<a href="https://git.ignore.pl/markdown/">a new utility</a> and write this new summary and project status.
+
+<h2>Experimenting From a Terminal</h2>
+<p>Rather than jumping into design or development work straight away, let's see how far we can get while using only
+the shell and some usual utilities you can find in every shed. To access a webpage, one could potentially eat it raw:
+
+<pre>
+$ curl -sL https://ignore.pl/ | less
+...
+</pre>
+
+<p>Now, that's raw! With a page like this one, it's possible. I write them by hand and comply with my own rules that make
+it possible for the reader to consume them as plain text. However, it's not very useful considering how astoundingly
+obfuscated modern HTML pages can get.
+
+<p>It's not only extremely complex HTML hierarchies that we need to deal with. Other great opponents are web
+applications that pretend to be webpages. Separating those two will prove itself to be useful. Not only that, it will
+also open us to new possibilities. Consider a dead simple script that acts similarly to a regular opener:
+
+<pre>
+#!/bin/sh
+TMP=$(mktemp -p /dev/shm) &&
+    { TYPE=$(curl -sLw "%{content_type}\n" "$@" -o "$TMP") &&
+        case "$TYPE" in
+        application/pdf) zathura "$TMP";;
+        image/*) sxiv "$TMP";;
+        text/html*) html_viewer "$TMP";;
+        text/markdown*) markdown_viewer "$TMP";;
+        text/*) less "$TMP";;
+        *) echo "$TMP";;
+        esac }
+rm -f "$TMP"
+</pre>
+
+<p>You use it like this:
+
+<pre>
+$ ./script https://ignore.pl/
+</pre>
+
+<p>It shows the requested content using a program that's selected based on its mime type. Here, the difference between
+webpage and web application is blurred. Hypothetically, using mime or some other means we could add switch cases like
+these:
+
+<pre>
+web-application/html+js) fork_of_chromium_or_something "$TMP";;
+web-application/lua) lua_gui_sandbox "$TMP";;
+</pre>
+
+<p>The ability to support multiple competing frameworks that are meant to run seamlessly loading sandboxed applications
+(so, web applications) is really making me interested.
+
+<p>That's not the only thing though. As you can see, in this example markdown and HTML are completely separated.
+Markdown is no longer a format that's supposed to generate HTML but instead it becomes a stand-alone hypertext format.
+Because the content requests are meant to run through such a demultiplexer, the hyperlinks can lead from one hypertext
+format to another. <b>This allows new formats and new ways of expression to grow and compete</b>, hopefully breathing
+some life into an ecosystem that's currently driven by monolithic giants.</p>
+
+<img src="deconstructing_web_browsers-2.png" alt="bacteria or something, dunno">
+
+<h2>Browser That’s Part of Your Environment</h2>
+<p>Of course, a single script like the example above is not the way to go, but it's a good start as it gives insight
+into data flow and responsibilities. At first, just by looking at it, I decided to naively distinguish four components:
+
+<dl>
+<dt>navigator
+<dd>Takes the address of the request from the user and forwards it to a <i>protocol daemon</i>. Retrieved content is then pushed
+to <i>opener</i>.
+<dt>protocol daemon
+<dd>Acquires and caches data using a single protocol, e.g., HTTP.
+<dt>opener
+<dd>Chooses viewers based on content type.
+<dt>viewer
+<dd>Presents content to the user and allows them to interact with it.
+</dl>
+
+<p>I found it to be a decent starting point and played around with it getting encouraging results. All predicted
+obstacles made their appearances and thanks to working prototypes shortcomings of each role were shown. In the second
+iteration I wanted to divide <i>navigator</i> into several stand-alone parts but in the end I never committed to it.
+
+<p>Based on the description above, it doesn't seem as if <i>navigator</i> would require such division. Actually,
+<a href="https://git.ignore.pl/browse/tree/browse?id=9dca05999d355deb225938ba4f57858ca27ca130">current
+implementation</a> doesn't clearly show such need either. The only hints are the <code>-f</code> option in <i>navigator</i>
+and <i>opener</i>, and direct calls to <i>protocol daemon</i> by <i>viewers</i> to retrieve auxiliary content (e.g.,
+stylesheet or embedded image). Meaning <i>navigator</i> is hiding a plumbing-capable <i>request resolver</i> below the
+porcelain interface that's dedicated to the user.
+
+<p>More than that, <i>navigator</i> may also be hiding functionality meant to support browsing history that I haven't
+explored yet at all. Combining it with graphic interfaces, sessions management or tabs are all question marks.
+
+<p>Obviously, responsibilities of the components are not the only matter to think about. Interfaces in every form are
+also important. I'm talking here about: communication between the components of the browser, interchangeability, communication
+between the browser and the rest of the environment it runs in, and integration with graphical user interfaces and
+window managers.
+
+<p>For now, I plan to split <i>navigator</i> and look into an equivalent of an address bar.
+</article>
+<script src="https://stats.ignore.pl/track.js"></script>
diff --git a/plumbing_your_own_browser-1.png b/graveyard_of_the_drawings-10.png
Binary files differ
index bbfebec..bbfebec 100644
--- a/plumbing_your_own_browser-1.png
+++ b/graveyard_of_the_drawings-10.png
diff --git a/integrating_browser_into_your_environment-1.png b/graveyard_of_the_drawings-11.png
Binary files differ
index 4c2d87a..4c2d87a 100644
--- a/integrating_browser_into_your_environment-1.png
+++ b/graveyard_of_the_drawings-11.png
diff --git a/graveyard_of_the_drawings.html b/graveyard_of_the_drawings.html
index 2e67da6..7c3ac65 100644
--- a/graveyard_of_the_drawings.html
+++ b/graveyard_of_the_drawings.html
@@ -13,7 +13,7 @@
 <article>
 <h1>Graveyard of the Drawings</h1>
-<p class="subtitle">Last modified on 2021-03-19 19:53+01:00
+<p class="subtitle">Last modified on 2021-07-25 19:21+02:00
 <p>Here are the drawings I made for articles that I decided to remove. No context, no nothing. Just images. Despite the
 style, I still think that it'd be a little bit of waste to just remove them along the texts and reusing them in
 different articles is just lazy.</p>
@@ -26,4 +26,6 @@ different articles is just lazy.</p>
 <img src="graveyard_of_the_drawings-7.png">
 <img src="graveyard_of_the_drawings-8.png">
 <img src="graveyard_of_the_drawings-9.png">
+<img src="graveyard_of_the_drawings-10.png">
+<img src="graveyard_of_the_drawings-11.png">
 </article>
@@ -31,8 +31,7 @@ completely discard the concept of a keyboard.
 <li>Rebuilding Web Browsing
 <ol>
     <li><a href="web_browsers_are_no_more.html">Web Browsers Are No More</a>
-    <li><a href="plumbing_your_own_browser.html">Plumbing Your Own Browser</a>
-    <li><a href="integrating_browser_into_your_environment.html">Integrating Browser Into Your Environment</a>
+    <li><a href="deconstructing_web_browsers.html">Deconstructing Web Browsers</a>
 </ol>
 <li><a href="of_privacy_and_traffic_tracking.html">Of Privacy and Traffic Tracking</a>
 <li><a href="how_to_write_a_minimal_html5_document.html">How to Write a Minimal HTML5 Document</a>
@@ -69,6 +68,9 @@ completely discard the concept of a keyboard.
 <section id="news">
 <h2>News</h2>
 <p><strong><time>2021-07-25</time></strong>
+Published <a href="deconstructing_web_browsers.html">Deconstructing Web Browsers</a>, summary of now-removed <i>Plumbing
+Your Own Browser</i> and <i>Integrating Browser into Your Environment</i>.
+<p><strong><time>2021-07-25</time></strong>
 Initialized website as git repository. Let's see if it will be useful.
 <p><strong><time>2021-07-25</time></strong>
 Rewritten parts of and updated <a href="web_browsers_are_no_more.html">Web Browsers Are No More</a>.
diff --git a/integrating_browser_into_your_environment.html b/integrating_browser_into_your_environment.html
deleted file mode 100644
index e67bfea..0000000
--- a/integrating_browser_into_your_environment.html
+++ /dev/null
@@ -1,81 +0,0 @@
-<!doctype html>
-<html lang="en">
-<meta charset="utf-8">
-<meta name="viewport" content="width=device-width, initial-scale=1">
-<meta name="author" content="aki">
-<meta name="tags" content="web, browser, unix philosophy">
-<link rel="icon" type="image/png" href="cylo.png">
-<link rel="stylesheet" href="style.css">
-
-<title>Integrating Browser Into Your Environment</title>
-
-<nav><p><a href="https://ignore.pl">ignore.pl</a></p></nav>
-
-<article>
-<h1>Integrating Browser Into Your Environment</h1>
-<p class="subtitle">Published on 2020-08-12 23:15:00+02:00
-<p>Not so long ago I finally started to play around with a little idea I had when I was writing
-<a href="markdown_is_bad_for_you.html">the rant about markdown</a>. That little idea was to split web browser into
-possibly several smaller utilities with distinct responsibilities. In other words, to apply Unix-ish philosophy in a
-web browser. I've touched this idea in <a href="web_browsers_are_no_more.html">Web browsers are no more</a> and then
-did some initial tinkering in <a href="plumbing_your_own_browser.html">Plumbing your own browser</a>. Now time has come
-to draw conclusions. Think of this post as a direct update to the plumbing one.
-<p>I don't like IDEs. I have hand-crafted environments that I "live in" when I'm working on any of my computers. Window
-manager that I tinkered to my liking, my preferred utilities, my text editor, my shortcuts. Whole operating system is
-configured with one thing kept in mind: it belongs to me. IDEs invade this personal space of mine. And so do web
-browsers. Of course, you can configure both web browsers and IDEs to some extent. You can even integrate them closer to
-your normal environment, but in my experience sooner or later you'll run into limitations. Or you will end up with IDE
-consuming your entire operating system (hello, emacs!). I didn't like that.
-<p>Thanks to the amount of alternatives I can happily avoid using IDEs. I can't say that about browsers. Moreover modern
-browsers are enormous and hermetic. Usually the only utility you have to interface with them is <code>browse</code>
-which in turn is usually just a symbolic link to <code>xdg-open</code>. Not only that, but they only open links in
-their rendering engine and may allow to save a file, so that user can use it once he leaves the browser alone.
-<p>Because of that, and because of other reasons I described in before-mentioned articles, I decided to see if splitting
-browser into smaller utilities is a viable option, and just play around with this idea.
-<p>For now, I've split it into four parts, but I can see more utilities emerging:
-<dl>
-<dt>request solver
-<dd>Previously, I referred to it as "browse" utility. But the way I have "browse" implemented now implies more than just
-one responsibility. On the other hand, the request solver is meant to only oversee a request. It means it has all the pieces
-of information and passes them to utilities in order to complete the request. It interacts with most of other programs
-and may interact with user.<br>
-It's one of the most important parts of this system. Due to nature of more verbose media like websites it should support
-more than just "get this URI and show it in a view". For instance, it should be able to allow user (or view) to open the
-resource in currently used active window or just retrieve files without opening them (in case of e.g. stylesheets). I
-believe that there is enough room in here to separate even more utilities.
-<dt>protocol demultiplexer
-<dd>This one is also a part of the "browse" as of now, just because at this stage it can be a simple switch case or even
-non-existent, assuming I plan to support only one protocol (e.g. http). One could pass this responsibility to the file
-system, if protocols were to be implemented at this level (the Hurd-ish way).
-<dt>protocol daemon
-<dd>Not really a daemon (but it can be one!). Retrieves and points to data needed by the request solver.
-<dt>opener/view demultiplexer
-<dd>Your usual <code>xdg-open</code> clone. A more verbose switch case that opens the resources in appropriate views.
-<dt>view/view engine
-<dd>Displays the retrieved resource to a user. It's aware of its content and may request secondary files through request
-solver (again, e.g. stylesheet or an image). Displays hyperlinks and redirects them to request solver. It's almost
-completely agnostic to how they should be handled. It may suggest request solver to open the link in current view, if
-the resource type is supported and the view is desired to handle this type of resource.
-</dl>
-<p>Now then, implementation currently has request solver and protocol demultiplexer in one utility called "browse". I
-see quite a lot of opportunities to split the request solver a little bit more, or at least move some of the tasks to
-already existing programs. Nonetheless, they're way more separated than most modern browsers.</p>
-<img src="integrating_browser_into_your_environment-1.png" alt="demux, I really like this word">
-<p>The biggest pain in all of this is an HTML engine. The more verbose ones were never intended to be used like this.
-On the other hand the limited one that I wrote just for this experiment is... Well, way too limited. It allows me to
-browse simpler websites like my own, but has problems in those that have CSS that's longer than the website content.
-Of course, I don't even mention modern web applications, obviously they won't work without Javascript.
-<p>Surprisingly, despite the enormity of problems mostly related to HTML, CSS or Javascript, I'm staying positive. It
-works, it can be integrated in the environment and it's an interesting idea to explore. For some reason it feels like
-I took <code>xdg-open</code> to extremes (that's why I keep mentioning it), but I think it's just because I am yet to
-polish the concept.
-<p>For now, <a href="https://git.ignore.pl/browse/">the utilities</a> are available publicly. You can use them to try
-out the idea. I've left there one simple example that uses <code>dmenu</code> for opening a URI either from list of
-bookmarks or one entered by hand. Moving base address and some mime type to command line options should give the
-utilities enough flexibility to use e.g. opener to open local files as well. Then it can be used with <code>lf</code> or
-any file manager of your choice, and you'll have single utility to handle all kinds of openings.
-<p>I'll move now to other ideas that I left without any conclusion. However, I'm looking forward to seeing if this one
-can bring more in the future and most certainly I'll return to it with full focus.
-
-</article>
-<script src="https://stats.ignore.pl/track.js"></script>
diff --git a/plumbing_your_own_browser.html b/plumbing_your_own_browser.html
deleted file mode 100644
index 4f9b999..0000000
--- a/plumbing_your_own_browser.html
+++ /dev/null
@@ -1,99 +0,0 @@
-<!doctype html>
-<html lang="en">
-<meta charset="utf-8">
-<meta name="viewport" content="width=device-width, initial-scale=1">
-<meta name="author" content="aki">
-<meta name="tags" content="web, web browser, linux, shell">
-<link rel="icon" type="image/png" href="cylo.png">
-<link rel="stylesheet" type="text/css" href="style.css">
-
-<title>Plumbing Your Own Browser</title>
-
-<nav><p><a href="https://ignore.pl">ignore.pl</a></p></nav>
-
-<article>
-<h1>Plumbing Your Own Browser</h1>
-<p class="subtitle">Published on 2020-08-01 21:38:00+02:00</p>
-<img src="plumbing_your_own_browser-1.png" alt="plumbing">
-<p>In spirit of the previous post about <a href="web_browsers_are_no_more.html">web browsers</a>, how about a little
-experiment? Let's write a simple tool that implements downloading, history management and displaying the content. This
-is intended as a trivial and fun experiment.
-<p>Ideally, I think the architecture would divide into: protocol daemon, navigator, opener and view engines. However,
-even with this setup some of them would have wide responsibilities. I don't really like that, but I leave it to future
-to deal with. Anyway, what do they do?</p>
-<dl>
-    <dt>protocol daemon<dd>Responsible for data acquisition and caching. For instance HTTP protocol daemon.
-    <dt>navigator<dd>The quickest way to explain it: the address bar. It handles history, probably sessions, windows,
-    initial requests to protocol daemon from the user. This one would need some attention to properly integrate it with
-    the environment and make sure that its responsibilities don't go too far.
-    <dt>opener<dd>Not really xdg-open or rifle, but something of this sort. Gets data marked for display from the
-    protocol server and acts as a demux for view engines.
-    <dt>view engine<dd>Your usual browser excluding things that already appeared earlier. It may also be something else,
-    like completely normal image viewer, hyperlinked markdown viewer or even less. Or more like sandboxed application
-    environment that is not a web application.
-</dl>
-<p>Sounds like a complex system, but we can do it easily in a short shell script. I won't bother with view engines, as
-right now, that's rather time consuming to get them to work, especially that browsers weren't written with this use case in
-mind. Even those minimal ones can't do. Generally, they would need to communicate with protocol server to retrieve
-secondary data (like stylesheet or images) and communicate with navigator when user clicked some kind of link.
-<p>Anyway, let's start with protocol daemon! Our target is web browser, so we need something to handle HTTP for us. What
-else could we use if not curl? Frankly speaking, just curl could be sufficient to view things:</p>
-<pre>
-$ curl -sL https://ignore.pl/plumbing_your_own_browser.html
-...
-...
-...
-</pre>
-<p>Yeah, if you use st as terminal emulator like I do, then you need to add <code>| less</code> at the end, so that you
-can read it. Honestly, with documents that are written in a way that allows people to read them as plain text, that's
-enough (posts on this website can be read in plain text).
-<p>However, although it's tempting to not, I'll do more than that. Now that we have a protocol daemon that is not a
-daemon, the next one is the opener. Why not navigator? For now interactive shell will be the navigator. You'll see how.
-<p>It's possible that you already have something that could act as an opener (like rifle from ranger file manager).
-There are plenty of similar programs, including xdg-open. I believe that they could be configured to work nicely in this
-setup, but let's write our own:</p>
-<pre>
-#!/bin/sh
-TMP=$(mktemp -p /dev/shm) &&
-    { TYPE=$(curl -sLw "%{content_type}\n" "$@" -o "$TMP") &&
-        case "$TYPE" in
-        application/pdf) zathura "$TMP";;
-        image/*) sxiv "$TMP";;
-        text/*) less "$TMP";;
-        *) hexdump "$TMP";;
-        esac }
-rm -f "$TMP"
-</pre>
-<p>That's a lot of things to explain! First two, up to <code>case "$TYPE" in</code> are actually protocol daemon. The
-<code>$@</code> is what comes from the navigator. In our case, it's the arguments from the shell that runs our command.
-Next up, the case statement is the opener. Based on the output of curl's write-out the script selects program to open
-the temporary file from the web. After that, the file is removed, in other words caching is not supported yet.
-<p>Surprisingly, that's it, hell of a minimal browser. Works nicely with pdf files, images and text formats that are not
-extremely bloated. Possibly with some tinkering around xdg-open and x default applications some hyperlinks between the
-formats could be made (e.g. a pdf links to an external image).
-<p>Now, I could go further and suggest an option like this:</p>
-<pre>
-application/lua) lua_gui_sandbox "$TMP";;
-</pre>
-<p>I find it interesting and worth looking into. I'll leave it as an open thing to try out.
-<p>There are some more things to consider. For instance, the views should know the base directory the file comes from as
-some hyperlinks are relative. In other words, programs used as views should allow to state base of the address in some
-way:</p>
-<pre>
-{ curl -sLw "%{content_type}\n%{url_effective}\n" "$@" -o "$TMP" | {
-    read TYPE
-    read URL
-    BASE_URL=$(strip_filename_from_url "$URL") } &&
-    case "$TYPE" in
-    text/html) html_view --base "$BASE_URL" "$TMP";;
-    text/markdown) markdown --base "$BASE_URL" "$TMP";;
-    # ...
-    esac }
-</pre>
-<p>By then, the <code>markdown</code> would know that if the user clicks some hyperlink with a relative path, then it
-should append the base path to it. It could also provide information that matters in e.g. CORS.
-<p>For now, that's it. The ideas are still unrefined, but at least they are moving somewhere. Hopefully, I will get
-myself to write something that could act as a view and respect this concept. My priority should be HTML view but I feel
-like starting with simplified Markdown (one without HTML).
-</article>
-<script src="https://stats.ignore.pl/track.js"></script>