Do we need browsers?
What do you do if you have a house and you need a car? Do you attempt to turn the house into a car? Of course not! You keep the house and get a car. Otherwise, you would probably end up with something that’s terrible to live in and doesn’t work for transportation either.
What do you do if you have a document viewer platform and you want to execute arbitrary applications?
Historically, web browsers were created to display and navigate hyperlinked documents. This is clearly reflected by the composition of the “web stack”:
- HTML was designed to describe the structure of documents
- CSS was later developed to describe the styling of the documents
- JavaScript was added to provide some interactivity (and supposedly be used even by non-programmers)
Fast forward ~20 years and this very same platform (with some version upgrades along the way) is being used for all kinds of highly interactive, dynamic applications that try to recreate a native experience. The extremes of this shift can be best observed in so-called single-page applications (SPAs):
- Instead of structural document elements, SPAs logically consist of “application components”
- Instead of navigating via hyperlinks, these components are dynamically created, modified, moved and removed
- Instead of styling static documents, dynamic components need to be laid out on various screen sizes
- Instead of scripting some interactivity, the entire behavior of the components and their orchestration is programmed
The tools and frameworks utilized to build such applications try really hard to hide the original intent and static nature of the underlying technologies by piling up abstractions: preprocessing and transpiling languages, constructing and diffing virtual DOMs and other techniques. Such attempts not only add a significant resource overhead to the applications and end up providing a terrible user experience, but also present a mental overhead for the developers.
Of course, web browsers themselves have also changed quite a bit in an attempt to fulfill such a different role. HTML, CSS and JavaScript have all gone through several backwards-compatible iterations, adding more and more features. Web APIs have been added to provide some access to OS and device capabilities. Security features have been introduced. Browsers have added support for plugins and extensions, and have integrated various tools (e.g. developer tools, synchronization features and even proprietary tools). They manage and prioritize the resources for multiple tabs. And the list goes on…
With all these features adding up over the years, web browsers have become resource-hungry beasts with millions of lines of code. It’s not uncommon for a browser to become unresponsive for seconds and/or use several gigabytes of RAM, even when just displaying document-like content. And even now, the web as a platform is nowhere near providing enough features for arbitrary applications. Access to OS/hardware features, arbitrary protocols and sockets, daemon-like applications, IPC, threading capabilities and a lot more are either limited or non-existent. Could a smartwatch run a web browser efficiently today? Could an application built on this “web platform” make use of the features of a smartwatch? What about future devices? Innovation never stops and browsers will never catch up.
How much longer is it possible to keep adding features to both browsers and application frameworks within the browsers to support use-cases that keep drifting away from the original intentions of the technology? Why do we assume that the same set of abstractions will work for every use-case, ignoring all evidence to the contrary? And how do we know browser vendors will even implement features that may threaten their business interests, such as functionality that would enable decentralization and p2p applications?
Even if it all works out by some miracle, we’ll be left with an extremely inefficient platform that reimplements many of the features of an operating system on top of a real operating system, provides a limited set of abstractions that cannot possibly suit most applications, and is implemented and controlled by a handful of vendors whose implementations may diverge at any time. Something that’s terrible to live in and doesn’t work for transportation either.
Okay, so web browsers are awful for applications. What can be done? Is there a better way?
Rather than piling up abstractions within a decades-old framework, maybe we should take a step back and revisit our options knowing what we know today. What if we could take the best ideas from web browsers and ditch the awkward parts?
The most desirable features of browsers are probably:
- URLs - connect resources and provide convenient access to them
- Security - isolate execution of application code
- Platform independence - abstract over devices and operating systems
How could we have these without the whole package? Here’s a sketch of one possibility:
URLs
Imagine something like xdg-open. A URI would be passed to the tool, which would analyze it. It could potentially consult the package manager of the operating system and/or inspect the resource itself to discover some metadata, such as dependencies. Finally, it would dispatch the URI or the resource to the correct/preferred handler, or execute executable resources. As any application could be registered to handle specific URIs and this tool could be invoked by any application, it would serve to both interlink resources and provide a way to easily access and share them.
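To make the dispatch step a bit more concrete, here is a minimal Python sketch of the idea, not a real tool. The `Dispatcher` class, its method names and the example handlers are all hypothetical; it only looks at the URI scheme and skips the package manager lookup and metadata discovery mentioned above.

```python
from urllib.parse import urlsplit


class Dispatcher:
    """Hypothetical registry that maps URI schemes to handlers."""

    def __init__(self):
        self._handlers = {}

    def register(self, scheme, handler):
        # Any application could register itself for a scheme.
        self._handlers[scheme.lower()] = handler

    def dispatch(self, uri):
        # urlsplit() returns the scheme in lowercase.
        scheme = urlsplit(uri).scheme
        if not scheme:
            raise ValueError(f"not an absolute URI: {uri!r}")
        handler = self._handlers.get(scheme)
        if handler is None:
            raise LookupError(f"no handler registered for {scheme!r}")
        # Hand the URI over to the preferred handler.
        return handler(uri)


if __name__ == "__main__":
    dispatcher = Dispatcher()
    # Assumed handlers; a real tool would launch applications instead.
    dispatcher.register("https", lambda uri: print("open in browser:", uri))
    dispatcher.register("mailto", lambda uri: print("open in mail client:", uri))
    dispatcher.dispatch("https://example.org/")
    dispatcher.dispatch("mailto:someone@example.org")
```

Because the registry is just a mapping from schemes to handlers, any application can both register for URIs and invoke the dispatcher, which is what lets it interlink resources.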
Security
Applications should be sandboxed using kernel facilities. I’m no expert in security (or any other area for that matter), but looking at cgroups, namespaces, seccomp-bpf or even complete sandbox solutions built on similar features, native application isolation and syscall filtering/interception don’t seem completely unrealistic. With all the hype around containerization lately, we can probably expect some interesting activity in this space from most vendors. Ideally, both preconfigured policies and asking the user for permissions at runtime would be supported.
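As a rough illustration of how preconfigured policies and runtime prompts could fit together, here is a small Python sketch of the decision logic only. The `SandboxPolicy` class and the capability names are made up for this example, and nothing here touches the kernel; an actual sandbox would enforce these decisions with facilities like seccomp-bpf or namespaces.

```python
class SandboxPolicy:
    """Hypothetical permission policy: preconfigured rules plus a runtime prompt."""

    def __init__(self, allowed=(), denied=(), ask_user=None):
        self.allowed = set(allowed)  # capabilities granted up front
        self.denied = set(denied)    # capabilities refused up front
        # Callback that asks the user; denies by default if none is given.
        self.ask_user = ask_user or (lambda capability: False)

    def check(self, capability):
        # Preconfigured rules take priority, and denials win over grants.
        if capability in self.denied:
            return False
        if capability in self.allowed:
            return True
        # Unknown capability: ask once and remember the answer.
        granted = bool(self.ask_user(capability))
        (self.allowed if granted else self.denied).add(capability)
        return granted


if __name__ == "__main__":
    policy = SandboxPolicy(
        allowed={"filesystem:read"},
        denied={"camera"},
        # Stand-in for a real permission dialog (assumed behavior).
        ask_user=lambda capability: capability == "network",
    )
    for capability in ("filesystem:read", "camera", "network"):
        print(capability, policy.check(capability))
```

Remembering the user's answer keeps the prompt from reappearing on every request, at the cost of needing some way to revoke a decision later.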
Platform independence
Platform independence is just a very high-level abstraction. And of course, as with any abstraction, there’s a tradeoff between flexibility and performance. Different use-cases require different abstractions; there’s no one-size-fits-all solution. With a system like this, developers could choose the abstractions best suited to their needs: languages, libraries, frameworks, interpreters, virtual machines and so on. The building blocks of the system, such as convenient resource addressing and security features, would be available independently of your choice of abstractions.
Instead of trying to shoehorn everything into browsers, such a system would effectively bring the advantages of the web to arbitrary applications by separating the overly broad range of concerns of web browsers.
Incidentally, this set of tools would even be compatible with today’s web: web browsers could simply be registered as the handler for current web URLs.
To reach the future where your desktop, tablet, phone and other devices act as the building blocks of your interconnected computing platform rather than individual devices, we need an ecosystem that naturally promotes open platforms and operating systems, not isolated app stores and SDKs. But, unlike web browsers, this ecosystem must be lightweight and empower developers to create their own abstractions and utilize all capabilities of our computing devices.
I’m aware that this is a rather idealistic sketch and not a trivial goal to achieve. I’m not even saying this is the right solution. But one thing is certain: the web platform we have today is already bloated, does not suit our needs and severely limits innovation. I’m confident that the way forward is to climb, one way or another, out of this hole called a web browser that we’ve been digging for way too long.