From c0b3870dde1d355de40515376ffd5bc87442e21f Mon Sep 17 00:00:00 2001 From: Aki Date: Sun, 25 Jul 2021 19:24:21 +0200 Subject: Published Deconstructing Web Browsers --- deconstructing_web_browsers-1.png | Bin 0 -> 1638 bytes deconstructing_web_browsers-2.png | Bin 0 -> 2027 bytes deconstructing_web_browsers.html | 134 ++++++++++++++++++++++++ graveyard_of_the_drawings-10.png | Bin 0 -> 1809 bytes graveyard_of_the_drawings-11.png | Bin 0 -> 1707 bytes graveyard_of_the_drawings.html | 4 +- index.html | 6 +- integrating_browser_into_your_environment-1.png | Bin 1707 -> 0 bytes integrating_browser_into_your_environment.html | 81 -------------- plumbing_your_own_browser-1.png | Bin 1809 -> 0 bytes plumbing_your_own_browser.html | 99 ----------------- 11 files changed, 141 insertions(+), 183 deletions(-) create mode 100644 deconstructing_web_browsers-1.png create mode 100644 deconstructing_web_browsers-2.png create mode 100644 deconstructing_web_browsers.html create mode 100644 graveyard_of_the_drawings-10.png create mode 100644 graveyard_of_the_drawings-11.png delete mode 100644 integrating_browser_into_your_environment-1.png delete mode 100644 integrating_browser_into_your_environment.html delete mode 100644 plumbing_your_own_browser-1.png delete mode 100644 plumbing_your_own_browser.html diff --git a/deconstructing_web_browsers-1.png b/deconstructing_web_browsers-1.png new file mode 100644 index 0000000..e4b5d59 Binary files /dev/null and b/deconstructing_web_browsers-1.png differ diff --git a/deconstructing_web_browsers-2.png b/deconstructing_web_browsers-2.png new file mode 100644 index 0000000..0fc3dc6 Binary files /dev/null and b/deconstructing_web_browsers-2.png differ diff --git a/deconstructing_web_browsers.html b/deconstructing_web_browsers.html new file mode 100644 index 0000000..855ee39 --- /dev/null +++ b/deconstructing_web_browsers.html @@ -0,0 +1,134 @@ + + + + + + + + + +Deconstructing Web Browsers + + + +
+

Deconstructing Web Browsers

+

Published on 2021-07-25 14:53:00+02:00 +

Welcome to one of my little experiments! The theme is simple: to deconstruct a web browser and create several +utilities with distinct and clear responsibilities in its stead. This is not your regular blog post. Honestly, I'm not +yet sure what it is, but I'll figure it out at some point. Expect this page to be updated and extended. (Well, just like +my regular posts, for some reason, I need to rethink it one more time...) + 

Motivation and History

+

The idea started to sprout in my mind a few years ago. In its early stages it wasn't really directed at web browsers +but instead focused on markdown. After giving it some more thought, it +changed its target to web browsers. Later, it also started to draw from Unix +philosophy and my general aversion towards IDEs; or rather, this kind of modularity turned out to be the +real core of the motivation that drives this idea. I haven't really touched or explored it yet. I haven't tried to discredit +it either. Hopefully, once I reach that point, it will stand its ground.

+ +scroll with history + +

Last year, I explored this idea a bit in a two-part text series and within a small project called +browse. I naively split the responsibilities between programs and had some +fun writing very simple scripts that did the work. And they did it surprisingly well, but the functionality +constraints had to be extremely strict. Recently, I came back to it, read my own stuff, looked at my own code, and I +could still relate to it. Instead of removing everything like I sometimes do, I decided to develop +a new utility and write this new summary and project status. + 

Experimenting From a Terminal

+

Rather than jumping into design or development work straight away, let's see how far we can get using only the +shell and the usual utilities you can find in every shed. To access a webpage, one could potentially eat it raw: + 

+$ curl -sL https://ignore.pl/ | less
+...
+
+ +

Now, that's raw! With a page like this one, it's possible. I write them by hand and comply with my own rules that make +it possible for the reader to consume them as plain text. However, it's not very useful considering how astoundingly +obfuscated modern HTML pages can get. + 

It's not only extremely complex HTML hierarchies that we need to deal with. Other great opponents are web +applications that pretend to be webpages. Separating the two will prove useful. Not only that, it will +also open up new possibilities. Consider a dead simple script that acts similarly to a regular opener: + 

+#!/bin/sh
+TMP=$(mktemp -p /dev/shm) &&
+	{ TYPE=$(curl -sLw "%{content_type}\n" "$@" -o "$TMP") &&
+		case "$TYPE" in
+			application/pdf) zathura "$TMP";;
+			image/*) sxiv "$TMP";;
+			text/html*) html_viewer "$TMP";;
+			text/markdown*) markdown_viewer "$TMP";;
+			text/*) less "$TMP";;
+			*) echo "$TMP";;
+		esac; }
+rm -f "$TMP"
+
+ +

You use it like this: + +

+$ ./script https://ignore.pl/
+
+ +

It shows the requested content using a program that's selected based on its mime type. Here, the difference between +a webpage and a web application is blurred. Hypothetically, using mime types or some other means, we could add switch cases like +these: + 

+web-application/html+js) fork_of_chromium_or_something "$TMP";;
+web-application/lua) lua_gui_sandbox "$TMP";;
+
+ +

The ability to support multiple competing frameworks meant to seamlessly load sandboxed applications +(so, web applications) really interests me. + 
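The dispatch table doesn't have to stay hardcoded in the script, either. A mailcap-like handler table would let competing viewers and frameworks register themselves without touching the opener. Here is a minimal sketch; the `pattern command` file format and the `lookup_handler` name are assumptions made up for illustration, not part of the browse project:

```shell
#!/bin/sh
# Look up a handler command for a mime type in a simple table file.
# Assumed table format: one "pattern command" pair per line,
# e.g. "image/* sxiv". First matching pattern wins.
lookup_handler() {
	# $1 = mime type, $2 = path to the handler table
	while read -r pattern command; do
		case "$1" in
			$pattern) printf '%s\n' "$command"; return 0;;
		esac
	done < "$2"
	return 1
}
```

With something like this, the case statement in the opener collapses to a single lookup, and adding a new viewer (or a whole web-application sandbox) becomes a one-line edit to the table.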

That's not the only thing though. As you can see, in this example markdown and HTML are completely separated. +Markdown is no longer a format that's supposed to generate HTML; instead, it becomes a stand-alone hypertext format. +Because content requests are meant to run through such a demultiplexer, hyperlinks can lead from one hypertext +format to another. This allows new formats and new ways of expression to grow and compete, hopefully breathing +some life into an ecosystem that's currently driven by monolithic giants.

+ +bacteria or something, dunno + +

Browser That’s Part of Your Environment

+

Of course, a single script like the example above is not the way to go, but it's a good start as it gives insight +into data flow and responsibilities. At first, just by looking at it, I decided to naively distinguish four components: + +

+
navigator +
Takes the address of the request from the user and forwards it to a protocol daemon. The retrieved content is then pushed +to the opener. 
protocol daemon +
Acquires and caches data using a single protocol e.g., HTTP. +
opener +
Chooses viewers based on content type. +
viewer +
Presents the content to the user and allows them to interact with it. 
+ +
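To make the division concrete, the hand-off between those four roles can be sketched with shell functions standing in for the future programs. Everything below, the function names and the two-line "type, then path" report format, is an assumption for illustration; none of these utilities exist in this shape yet:

```shell
#!/bin/sh
# Sketch of the four roles as separate cooperating programs,
# with plain shell functions standing in for them.

protocol_daemon() {
	# Would fetch "$1" over a single protocol and cache it; here it
	# only reports a fake mime type and cache path on two lines.
	printf 'text/html\n/tmp/cache/page.html\n'
}

opener() {
	# Reads the daemon's report and names the viewer it would launch.
	read -r type
	read -r path
	case "$type" in
		text/html*) printf 'html_viewer %s\n' "$path";;
		image/*)    printf 'sxiv %s\n' "$path";;
		*)          printf 'less %s\n' "$path";;
	esac
}

navigator() {
	# Takes an address from the user and wires the other roles together.
	protocol_daemon "$1" | opener
}
```

The viewer is left out here, since it would sit at the end of the chain and only receive the cached file from the opener.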

I found it to be a decent starting point and played around with it, getting encouraging results. All of the predicted +obstacles made their appearance, and the working prototypes exposed the shortcomings of each role. In the second +iteration I wanted to divide the navigator into several stand-alone parts, but in the end I never committed to it. + 

Based on the description above, it doesn't seem as if navigator would require such a division. Actually, the +current +implementation doesn't clearly show such a need either. The only hints are the -f option in navigator +and opener, and the direct calls to the protocol daemon made by viewers to retrieve auxiliary content (e.g., a +stylesheet or an embedded image). Meaning that navigator hides a plumbing-capable request resolver below a +porcelain interface dedicated to the user. + 
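What such a split could look like: the hypothetical plumbing command below only resolves an address to a cached file, and the porcelain path decides whether to open it. Only the idea of the -f flag comes from the current implementation; the resolve name, its interface, and the stub helpers are all made up for this sketch:

```shell
#!/bin/sh
# Stubs standing in for the real protocol daemon and opener.
fetch_to_cache() { printf '/tmp/cache/%s\n' "${1##*/}"; }
open_with_viewer() { printf 'opening %s\n' "$1"; }

# resolve [-f] ADDRESS
# Plumbing layer of a would-be navigator: with -f it only fetches and
# prints the cached path (what a viewer needs for stylesheets or embedded
# images); the default porcelain behaviour hands the file to a viewer.
resolve() {
	fetch_only=no
	[ "$1" = -f ] && { fetch_only=yes; shift; }
	file=$(fetch_to_cache "$1") || return 1
	if [ "$fetch_only" = yes ]; then
		printf '%s\n' "$file"
	else
		open_with_viewer "$file"
	fi
}
```

A viewer asking for auxiliary content would call `resolve -f https://ignore.pl/style.css` and get back a path, never spawning another viewer.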

More than that, navigator may also be hiding functionality meant to support browsing history, which I haven't +explored at all yet. How it combines with graphical interfaces, session management or tabs is still a question mark. + 

Obviously, the responsibilities of the components are not the only matter to think about. Interfaces in every form are +also important. I mean here: communication between the components of the browser, interchangeability, communication +between the browser and the rest of the environment it runs in, and integration with graphical user interfaces and +window managers. + 

For now, I plan to split navigator and look into an equivalent of an address bar. 

+ diff --git a/graveyard_of_the_drawings-10.png b/graveyard_of_the_drawings-10.png new file mode 100644 index 0000000..bbfebec Binary files /dev/null and b/graveyard_of_the_drawings-10.png differ diff --git a/graveyard_of_the_drawings-11.png b/graveyard_of_the_drawings-11.png new file mode 100644 index 0000000..4c2d87a Binary files /dev/null and b/graveyard_of_the_drawings-11.png differ diff --git a/graveyard_of_the_drawings.html b/graveyard_of_the_drawings.html index 2e67da6..7c3ac65 100644 --- a/graveyard_of_the_drawings.html +++ b/graveyard_of_the_drawings.html @@ -13,7 +13,7 @@

Graveyard of the Drawings

-

Last modified on 2021-03-19 19:53+01:00 +

Last modified on 2021-07-25 19:21+02:00

Here are the drawings I made for articles that I decided to remove. No context, no nothing. Just images. Despite the style, I still think that it'd be a little bit of waste to just remove them along the texts and reusing them in different articles is just lazy.

@@ -26,4 +26,6 @@ different articles is just lazy.

+ +
diff --git a/index.html b/index.html index a89d9a2..5a9b5e1 100644 --- a/index.html +++ b/index.html @@ -31,8 +31,7 @@ completely discard the concept of a keyboard.
  • Rebuilding Web Browsing
    1. Web Browsers Are No More -
    2. Plumbing Your Own Browser -
    3. Integrating Browser Into Your Environment +
    4. Deconstructing Web Browsers
  • Of Privacy and Traffic Tracking
  • How to Write a Minimal HTML5 Document @@ -69,6 +68,9 @@ completely discard the concept of a keyboard.

    News

    +Published Deconstructing Web Browsers, a summary of the now-removed Plumbing +Your Own Browser and Integrating Browser into Your Environment. +

    Initialized website as git repository. Let's see if it will be useful.

    Rewritten parts of and updated We Browsers Are No More. diff --git a/integrating_browser_into_your_environment-1.png b/integrating_browser_into_your_environment-1.png deleted file mode 100644 index 4c2d87a..0000000 Binary files a/integrating_browser_into_your_environment-1.png and /dev/null differ diff --git a/integrating_browser_into_your_environment.html b/integrating_browser_into_your_environment.html deleted file mode 100644 index e67bfea..0000000 --- a/integrating_browser_into_your_environment.html +++ /dev/null @@ -1,81 +0,0 @@ - - - - - - - - - -Integrating Browser Into Your Environment - -

    - -
    -

    Integrating Browser Into Your Environment

    -

    Published on 2020-08-12 23:15:00+02:00 -

    Not so long ago I've finally started to play around with a little idea I had when I was writing -the rant about markdown. That little idea was to split web browser into -possibly several smaller utilities with a distinct responsibilities. In other words, to apply Unix-ish philosophy in a -web browser. I've touched this idea in Web browsers are no more and then -did some initial tinkering in Plumbing your own browser. Now time has come -to draw conclusions. Think of this post as a direct update to the plumbing one. -

    I don't like IDEs. I have hand-crafted environments that I "live in" when I'm working on any of my computers. Window -manager that I tinkered to my liking, my preferred utilities, my text editor, my shortcuts. Whole operating system is -configured with one thing kept in mind: it belongs to me. IDEs invade this personal space of mine. And so do web -browsers. Of course, you can configure both web browsers and IDEs to some extent. You can even integrate them closer to -your normal environment, but in my experience sooner or later you'll run into limitations. Or you will end up with IDE -consuming your entire operating system (hello, emacs!). I didn't like that. -

    Thanks to the amount of alternatives I can happily avoid using IDEs. I can't say that about browsers. Moreover modern -browsers are enormous and hermetic. Usually the only utility you have to interface with them is browse -which in turn is usually just a symbolic link to xdg-open. Not only that, but they only to open links in -their rendering engine and may allow to save a file, so that user can use it once he leaves the browser alone. -

    Because of that, and because of other reasons I described in before-mentioned articles, I decided to try if splitting -browser into smaller utilities is a viable option, and just play around this idea. -

    For now, I've split it into four parts, but I can see more utilities emerging: -

    -
    request solver -
    Previously, I referred to it as "browse" utility. But the way I have "browse" implemented now implies more than just -one responsibility. On the other, the request solver is meant to only oversee a request. It means it has all the pieces -of information and passes them to utilities in order to complete the request. It interacts with most of other programs -and may interact with user.
    -It's one of the most important parts of this system. Due to nature of more verbose media like websites it should support -more than just "get this URI and show it in a view". For instance, it should be able to allow user (or view) to open the -resource in currently used active window or just retrieve files without opening them (in case of e.g. stylesheets). I -believe that there is enough room in here to separate even more utilities. -
    protocol demulitplexer -
    This one is also a part of the "browse" as of now, just because at this stage it can be a simple switch case or even -non-existent, assuming I plan to support only one protocol (e.g. http). One could pass this responsibility to the file -system, if protocols were to be implemented at this level (the Hurd-ish way). -
    protocol daemon -
    Not really a daemon (but it can be one!). Retrieves and points to data needed by the request solver. -
    opener/view demultiplexer -
    Your usual xdg-open clone. A more verbose switch case that opens the resources in appropriate views. -
    view/view engine -
    Displays the retrieved resource to a user. It's aware of its content and may request secondary files through request -solver (again, e.g. stylesheet or an image). Displays hyperlinks and redirects them to request solver. It's almost -completely agnostic to how they should be handled. It may suggest request solver to open the link in current view, if -the resource type is supported and the view is desired to handle this type of resource. -
    -

    Now then, implementation currently have request solver and protocol demultiplexer in one utility called "browse". I -see quite a lot of opportunities to split the request solver a little bit more, or at least move some of the tasks to -already existing programs. Nonetheless, they're way more separated than most modern browsers.

    -demux, I really like this word -

    The biggest pain in all of this is an HTML engine. The more verbose ones were never intended to be used like this. -On the other hand the limited one that I wrote just for this experiment is... Well, way too limited. It allows me to -browse simpler websites like my own, but has problems in those that have CSS that's longer than the website content. -Of course, I don't even mention modern web applications, obviously they won't work without Javascript. -

    Surprisingly, despite the enormity of problems mostly related to HTML, CSS or Javascript, I'm staying positive. It -works, it can be integrated in the environment and it's an interesting idea to explore. For some reason it feels like -I took xdg-open to extremes (that's why I keep mentioning it), but I think it's just because I am yet to -polish the concept. -

    For now, the utilities are available publicly. You can use them to try -out the idea. I've left there one simple example that uses dmenu for opening an URI either from list of -bookmarks or one entered by hand. Moving base address and some mime type to command line options, should give the -utilities enough flexibility to use e.g. opener to open local files as well. Then it can be used with lf or -any file manager of your choice, and you'll have single utility to handle all kinds of openings. -

    I'll move now to other ideas that I left without any conclusion. However, I'm looking forward to seeing if this one -can bring more in the future and most certainly I'll return to it with full focus. - -

    - diff --git a/plumbing_your_own_browser-1.png b/plumbing_your_own_browser-1.png deleted file mode 100644 index bbfebec..0000000 Binary files a/plumbing_your_own_browser-1.png and /dev/null differ diff --git a/plumbing_your_own_browser.html b/plumbing_your_own_browser.html deleted file mode 100644 index 4f9b999..0000000 --- a/plumbing_your_own_browser.html +++ /dev/null @@ -1,99 +0,0 @@ - - - - - - - - - -Plumbing Your Own Browser - - - -
    -

    Plumbing Your Own Browser

    -

    Published on 2020-08-01 21:38:00+02:00

    -plumbing -

    In spirit of the previous post about web browsers, how about a little -experiment? Let's write a simple tool that implements downloading, history management and displaying the content. This -is intended as a trivial and fun experiment. -

    Ideally, I think the architecture would divide into: protocol daemon, navigator, opener and view engines. However, -even with this setup some of them would have wide responsibilities. I don't really like that, but I leave it to future -to deal with. Anyway, what do they do?

    -
    -
    protocol daemon
    Responsible for data acquisition and caching. For instance HTTP protocol daemon. -
    navigator
    The quickest way to explain it: the address bar. It handles history, probably sessions, windows, - initial requests to protocol daemon from the user. This one would need some attention to properly integrate it with - the environment and make sure that its responsibilities don't go too far. -
    opener
    Not really xdg-open or rifle, but something of this sort. Gets data marked for display from the - protocol server and acts as a demux for view engines. -
    view engine
    Your usual browser excluding things that already appeared earlier. It may also be something else, - like completely normal image viewer, hyperlinked markdown viewer or even less. Or more like sandboxed application - environment that is not a web application. -
    -

    Sounds like a complex system, but we can do it easily in a short shell script. I won't bother with view engines, as -right now, that's rather time consuming to get them work, especially that browsers weren't written with this use case in -mind. Even those minimal ones can't do. Generally, they would need to communicate with protocol server to retrieve -secondary data (like stylesheet or images) and communicate with navigator when user clicked some kind of link. -

    Anyway, let's start with protocol daemon! Our target is web browser, so we need something to handle HTTP for us. What -else could we use if not curl? Frankly speaking, just curl could be sufficient to view things:

    -
    -$ curl -sL https://ignore.pl/plumbing_your_own_browser.html
    -...
    -...
    -...
    -
    -

    Yeah, if you use st as terminal emulator like I do, then you need to add | less at the end, so that you -can read it. Honestly, with documents that are written in a way that allows people to read them as plain text, that's -enough (posts in this websites can be read in plain text). -

    However, although it's tempting to not, I'll do more than that. Now that we have a protocol daemon that is not a -daemon, the next one is the opener. Why not navigator? For now interactive shell will be the navigator. You'll see how. -

    It's possible that you already have something that could act as an opener (like rifle from ranger file manager). -There are plenty of similar programs, including xdg-open. I believe that they could be configured to work nicely in this -setup, but let's write our own:

    -
    -#!/bin/sh
    -TMP=$(mktemp -p /dev/shm) &&
    -	{ TYPE=$(curl -sLw "%{content_type}\n" $@ -o "$TMP") &&
    -		case "$TYPE" in
    -			application/pdf) zathura "$TMP";;
    -			image/*) sxiv "$TMP";;
    -			text/*) less "$TMP";;
    -			*) hexdump "$TMP";;
    -		esac }
    -rm -f "$TMP"
    -
    -

    That's a lot of things to explain! First two, up to case "$TYPE" in are actually protocol daemon. The -$@ is what comes from the navigator. In our case, it's the arguments from the shell that run our command. -Next up, the case statement is the opener. Based on the output of curl's write-out the script selects program to open -the temporary file from the web. After that, the file is removed, in other words caching is not supported yet. -

    Surprisingly, that's it, hell of a minimal browser. Works nicely with pdf files, images and text formats that are not -extremely bloated. Possibly with some tinkering around xdg-open and x default applications some hyperlinks between the -formats could be made (e.g. a pdf links to an external image). -

    Now, I could go further and suggest something an option like this:

    -
    -application/lua) lua_gui_sandbox "$TMP";;
    -
    -

    I find it interesting and worth looking into. I'll leave it as an open thing to try out. -

    The are some more things to consider. For instance, the views should know the base directory the file comes from as -some hyperlinks are relative. In other words, programs used as views should allow to state base of the address in some -way:

    -
    -{ curl -sLw "%{content_type}\n${url_effective}\n" $@ -o "$TMP" | {
    -	read TYPE
    -	read URL
    -	BASE_URL=$(strip_filename_from_url "$URL") } &&
    -		case "$TYPE" in
    -			text/html) html_view --base "$BASE_URL" "$TMP";;
    -			text/markdown) markdown --base "$BASE_URL" "$TMP";;
    -			# ...
    -		esac }
    -
    -

    By then, the markdown would know that if the user clicks some hyperlink with a relative path, then it -should append the base path to it. It could also provide information that matters in e.g. CORS. -

    For now, that's it. The ideas are still unrefined, but at least they are moving somewhere. Hopefully, I will get -myself to write something that could act as a view and respect this concept. My priority should be HTML view but I feel -like starting with simplified Markdown (one without HTML). -

    - -- cgit v1.1