> The specification must contain a non-ambiguous formal grammar that can be parsed easily. A page can then be tested against the standard and reject or accept as compliant. Pages that don't conform with the specification won't be rendered. It is explicitly forbidden for clients to accept any page that doesn't conform with the specification.
This is what XHTML was, and it was a complete disaster. There's a reason almost nobody serves XHTML with the application/xhtml+xml MIME type, and that reason is that getting a “parser error” (this is what browsers still do! try it!) is always worse than getting a page that 99% works.[0] I strongly believe that rejecting the robustness principle is a fatal mistake for a web-replacement project. The fact that horribly broken old sites can stay online and stay readable is a huge part of the web's value. Without that, it's not really “the web”, spiritually or otherwise.
[0] It's particularly “cool” how they simply do not work in the Internet Archive's Wayback machine. The page can be retrieved, but nobody can read it.
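To make the contrast concrete, here is a minimal sketch using only Python's standard library as a stand-in for the two browser behaviours: `xml.etree.ElementTree` plays the role of a draconian XML parser, and `html.parser` plays the role of a forgiving one. The sample fragment and function names are made up for illustration; real browser parsers are far more involved.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# A fragment with an unclosed <b>, typical of hand-written markup.
BROKEN = "<p>Scores: <b>3-1</p>"


def strict_parse(markup: str) -> str:
    """Draconian handling: any well-formedness error rejects the whole page."""
    try:
        ET.fromstring(markup)
        return "rendered"
    except ET.ParseError as err:
        return f"parser error: {err}"


class TextCollector(HTMLParser):
    """Lenient handling: keep whatever text can be recovered."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: list[str] = []

    def handle_data(self, data: str) -> None:
        self.chunks.append(data)


def lenient_parse(markup: str) -> str:
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.chunks)


print(strict_parse(BROKEN))   # reports a parse error instead of any content
print(lenient_parse(BROKEN))  # the readable text survives
```

The reader of the strict version gets an error message and nothing else; the reader of the lenient version still gets the score.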
Agreed. There may be some situations where I want to ensure 100% correctness. I'm thinking of life-or-death scenarios (which, if so, maybe should use a different protocol). However, checking the sports score or looking at cat memes isn't that.
Yes, this is what you'd want. It doesn't have to be as complicated as the HTML5 algorithm either. That's complicated because it was a harmonization of at least 3 browsers' multi-decade heuristics and untold terabytes of existing HTML practice. An algorithm unconcerned with backwards compatibility could be much simpler, but still clearly define error behavior that is much easier to use than "scream and die".
And it's still unambiguous. You can cringe at what some people do, but it would be strictly a taste issue rather than a technical one, as the parse would still be unambiguous. And if you think you can fix taste issues with technical specification, well, you've already lost anyhow.
I mean, the linked page and the comment above say it is:
> It is explicitly forbidden for clients to accept any page that doesn't conform with the specification. This prevents the standardized diabolic rules that one must implement in order to correct a
No scripting is a tell, it's about wanting other people to accommodate their concerns about running a complex browser, not about solving a real problem.
If it did somehow happen that a good deal of interesting content was published using the standard, the most popular client would probably be nonconforming, ignoring the rule to not render ambiguous content.
Every modern alternative web protocol is about accommodating the author's concerns and pet peeves about the modern web (and usually gatekeeping it from capitalists and normies.)
Protocols used to be limited by technology, now they're defined by ideology.
Author here. I agree that you cannot go from HTML to XHTML because users and UA devs will always go towards "it mostly works".
However, it's not clear to me that this cannot be done from the start, so that the expectations are right from the beginning. For example, I don't see the same problem in other formats like JPEG or PNG where you expect the image to work perfectly or fail with a decoding error.
Other than implementing it and seeing how it goes, can you propose a feasible experiment to see how a new strict spec will measurably fail?
browsers will display invalid/corrupt images (best effort)
tried it right now - took a PNG and a JPEG, opened them in a text editor, literally deleted the second half of the file, saved, and dragged them into both Firefox and Chrome - they are displayed instead of erroring out.
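A rough way to reproduce the idea without a browser: PNG stores its pixel data as a zlib stream, so the sketch below truncates a zlib stream in half and compares an all-or-nothing decoder with a streaming one. This only illustrates one layer of image decoding, the payload is made-up stand-in data, and it is not how any particular browser is implemented.

```python
import zlib

# Stand-in for image data; PNG compresses its pixel rows with zlib/deflate.
original = " ".join(str(i * i) for i in range(5000)).encode("ascii")
compressed = zlib.compress(original)

# Simulate the experiment above: throw away the second half of the file.
truncated = compressed[: len(compressed) // 2]


def strict_decode(data: bytes) -> bytes:
    """All-or-nothing: raises zlib.error on an incomplete stream."""
    return zlib.decompress(data)


def best_effort_decode(data: bytes) -> bytes:
    """Streaming: return whatever could be decoded before the data ran out."""
    decoder = zlib.decompressobj()
    return decoder.decompress(data)


partial = best_effort_decode(truncated)
print(f"recovered {len(partial)} of {len(original)} bytes")

try:
    strict_decode(truncated)
except zlib.error as err:
    print(f"strict decoder gave up: {err}")
```

The streaming decoder hands back a usable prefix of the data, which is roughly why a half-deleted image still shows its top portion.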
there is a classic article on why a minimal version of the web with features removed will fail - you removed 80% of the features that YOU think are not important. that's a classic fatal mistake
search the web for different proposals for a minimal web and you will understand - they will have removed some feature they think is bloat but which you kept in your proposal because you consider it critical. which is why you created a new proposal - their minimal proposal is not the right one for you
I think what is lost on many people, ironically even the ones who want to retvrn the web to its former glory, is that the browser tries to display broken, half transmitted content because it happened so frequently due to circumstances completely out of the website operator or the user's control. And in most cases showing a half transmitted web page with half of the closing tags missing is almost certainly better than just outright refusing to show anything.
> I agree that you cannot go from HTML to XHTML because users and UA devs will always go towards "it mostly works".
That... is not how anything happened.
> I don't see the same problem in other formats like JPEG or PNG where you expect the image to work perfectly or fail with a decoding error.
Browsers absolutely decode as much as they can, and if the file is corrupted halfway through you generally get garbling, not the entire image being replaced by "fuck off". The only case where that is so is if the browser can't parse anything at all, or can't retrieve the file.
> Other than implementing it and see how it goes, can you propose a feasible experiment to see how an new strict spec will measurably fail?
> Browsers absolutely decode as much as they can, and if the file is corrupted halfway through you generally get garbling, not the entire image being replaced by "fuck off". The only case where that is so is if the browser can't parse anything at all, or can't retrieve the file.
What I meant is that you don't expect PNG or JPEG images to be created in a way that the parser needs to run a complex process to reconstruct the bits that are broken and interpret what you meant to say. Like this one:
Perhaps a better example is a C program being compiled into an executable. You don't expect the compiler to guess what you meant while parsing.
The current expectation is that a web browser must load any broken HTML and still display what it can, and it is this expectation that I would like to change.
I don't propose that humans write this format directly (although it should be human readable), but that they compile it from something that is easy to write, like Markdown or a similar language. The objective is to require that the tools doing the transformation produce a strictly conformant document.
Having a context-free grammar allows simple and fast parsing tools that can process your document, in a similar way that you can query or manipulate a JSON file with tools like jq because the grammar is simple and strict.
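As a small illustration of that point, here is a sketch with Python's `json` module: a strict grammar gives a one-line query and an accept/reject check that reports where the first error is. The document shape (`title`, `links`, `href`) is invented for the example and is not part of any proposed spec.

```python
import json

# Hypothetical document layout, used only for this example.
VALID = '{"title": "Notes", "links": [{"href": "/a"}, {"href": "/b"}]}'
INVALID = '{"title": "Notes", "links": [],}'  # trailing comma is not JSON


def hrefs(document: str) -> list[str]:
    """Tiny jq-style query: every link target in the document."""
    data = json.loads(document)
    return [link["href"] for link in data["links"]]


def check(document: str) -> str:
    """Accept or reject, reporting where the first error is."""
    try:
        json.loads(document)
        return "ok"
    except json.JSONDecodeError as err:
        return f"rejected at line {err.lineno}, column {err.colno}: {err.msg}"


print(hrefs(VALID))
print(check(INVALID))
```

The exact wording of the error message varies between Python versions, but the line and column are always available to the tool that produced the document.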
> What I meant is that you don't expect PNG or JPEG images to be created in a way that the parser needs to run a complex process to reconstruct the bits that are broken and interpret what you meant to say.
So what you meant is neither what you wrote, nor what you advocate for?
because in case you have forgotten here is what you advocate for:
> Pages that don't conform with the specification won't be rendered.
That is not how image rendering in browsers works. That is how XHTML does not work.
> Perhaps a better example is a C program being compiled into an executable.
It's not a better example, because it's a completely different and unrelated use case: C programs are usually not dynamically generated, and even when they are the person who compiles the code is usually either the person who wrote it or a person who has ways to fix it or report errors.
Not so when trying to read a random page on the web.
> I don't propose humans to write this format directly (although it should be human readable)
Approximately nobody wrote xhtml by hand, and that didn't save it.
This is also a nonsensical constraint-set on its face, there is no point to a human readable format which is not human writeable.
> The objective is to enforce tools that make the transformation to produce a strictly conformant document.
Ah, an open and non-monopolizable format which can only be written via an official toolchain.
> Having a context-free grammar allows simple and fast parsing tools that can process your document in a similar way that you can query or manipulate a JSON^H^H^H^HXML file with tools like jq^H^Hxmlstarlet because the grammar is simple and strict.
None of which seems of any use to a format which pretends to human production and consumption. JSON is an interchange format between machines.
> Pages that don't conform with the specification won't be rendered.
I agree on what I wrote here. They will fail with an error indicating where the mistake is so you can correct it (more likely the tool that produced it).
>> The objective is to enforce tools that make the transformation to produce a strictly conformant document.
>
>Ah, an open and non-monopolizable format which can only be written via an official toolchain.
??
The objective is that when you make a tool like markdown-to-foo, the output follows the spec. There is no mention of any "official toolchain".
> xmlstarlet
XML is strict. Try to find the same tool for HTML5, especially for transformations.
> JSON is an interchange format between machines.
That is pretty much what the specification would try to cover.
People didn't go towards "it mostly works", people go towards "it works at all". A lot of people tried to use xhtml, and it didn't work, broken content was pervasive and the experience when facing broken content was irredeemable.
XHTML failed in an era when writers (even normies) were writing some HTML of their own and they couldn't be trusted to close their tags properly. XHTML also assumed writers would be personally invested in semantic markup like distinguishing e.g. the italics of book titles from the italics of emphasis.
Today, when writers are using visual editors (or Markdown), few are writing their own HTML any more. A web standard requiring compliance would work differently today.
Markdown sux and so do visual editors. I think visual editors were just invented to make it so cut-and-paste never quite works right. There's been some conceptual problem with the whole idea ever since MS Word and the industry has never dealt with it.
> XHTML failed in an era when writers (even normies) were writing some HTML of their own
I'd say it was a minority of writers that were handcrafting XHTML. And everyone, whether handcrafting or using tools, could validate their compliance using a browser, which made it very easy to adjust your tools or your handcrafted code. We are now in a situation where there is no schema for HTML.
I, for one, am very much in favor of forking the web with a document format with a schema. It really seems like a small and simple change to me.
Note that when I say "writing their own HTML", I don't mean handcrafting a whole webpage. I mean that people were writing i or b tags in their Wordpress editors or in online comment boxes, because back then such text fields did not have visual editors and would accept raw tags. Under XHTML, if the writer did not close tags properly, such input would have broken the whole page, so obviously back then such a standard was DOA.
Those cases were easy to fix by using eg htmltidy on the UGC.
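For a sense of what such a cleanup step involves, here is a minimal sketch built on Python's `html.parser`. It is not htmltidy and does not follow its rules; the tag whitelist and the recovery rules are assumptions made for this example. Unknown tags are dropped, stray closing tags are ignored, and anything left open is closed at the end, so the result can be embedded in a strictly parsed page.

```python
from html import escape
from html.parser import HTMLParser

ALLOWED = {"b", "i", "em", "strong"}  # hypothetical whitelist for comment boxes


class CommentTidier(HTMLParser):
    """Re-emit a user comment as well-formed markup.

    Recovery rules (assumptions for this sketch, not htmltidy's actual ones):
    unknown tags are dropped, stray closing tags are ignored, and anything
    still open at the end is closed in reverse order.
    """

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.out: list[str] = []
        self.open_tags: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED:
            self.open_tags.append(tag)
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray closer: ignore it
        # Close inner tags first so the output nests properly.
        while self.open_tags:
            current = self.open_tags.pop()
            self.out.append(f"</{current}>")
            if current == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

    def result(self) -> str:
        self.close()
        while self.open_tags:
            self.out.append(f"</{self.open_tags.pop()}>")
        return "".join(self.out)


def tidy_comment(raw: str) -> str:
    tidier = CommentTidier()
    tidier.feed(raw)
    return tidier.result()


print(tidy_comment("great <i>book title</b> and <b>bold"))
```

Note that this repairs well-formedness only; it cannot recover what the writer meant, which is the part a strict page parser never had to guess at either.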
Honestly I don't think it was killed by one thing, or by anything. Just no platform really cared and it wasn't a win for anyone and occasionally a loss.
> There's a reason almost nobody serves XHTML with the application/xhtml+xml MIME type, and that reason is that getting a “parser error” (this is what browsers still do! try it!) is always worse than getting a page that 99% works.
That’s not the reason almost nobody serves XHTML.
The real reason is Internet Explorer. Okay, it’s a little more nuanced than that, but I think it’s accurate enough. Microsoft killed XHTML by inaction.
It’s 2004. XHTML is now a few years old, and all the rage. You decide to use it for your new project which you’re developing. At the start, you serve pages as application/xhtml+xml, and that works well in Firefox; but you know that won’t work because Internet Explorer still doesn’t support XHTML, and 90% of your viewers will be using that. So, a little frustrated, you serve your nice XHTML as text/html. You still validate it manually for a while, but then that habit disappears. Eventually you make one or two small mistakes that would have been caught easily if it were parsed as XML—but it’s not, because of Internet Explorer. Over time this disparity grows.
People have been complaining of the inefficacy of XHTML for this exact reason for two or three years by this point.
It’s 2006. XHTML is acknowledged to have failed. Everything else supports it, but as long as IE doesn’t, you can’t serve as application/xhtml+xml, and so you can’t get the advantages of XML syntax.
Seriously, early failure is good—so long as you’re working with it from the start. The problems only occur when you try to add strictness later.
Just look at typing in code bases. Adding strictness to existing JavaScript or Python or Ruby? Nightmare. Starting with static types? Somewhere between fine and extremely desirable.
(I might be overselling strictness’s popularity at the time—people don’t always like what’s good for them. We’ve largely realised now that unfettered dynamic typing is a bad idea, but ten years ago that was not settled. People get used to things. If IE had permitted XHTML early on, people would have got used to the idea of XHTML’s strictness and, I think, got to mostly like it.)
XHTML did not fail because of XML’s catastrophic parse failure mode. It failed because HTML already worked, and Internet Explorer took way too long to accept XHTML. If you’re forking the web and compatibility with existing documents is not a goal, you can’t use XHTML’s failure as an argument: it failed because of compatibility issues.
Well, Internet Explorer did eventually support application/xhtml+xml: in 2011, IE9. Way too late to matter. And so only by around 2015 or 2016 could you finally serve with XML syntax. And now why would you? For your system is big and has tiny errors here and there and your CMS just drops markup in and never got round to validating it and and and and so on. By that time, HTML had given up on the XML path, and although it worked, the momentum was entirely gone, so you’d run into difficulties due to inadequate documentation, inferior tooling (ironic), and various more.
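The workaround described in the 2004 scenario above, serving the same markup under a different MIME type depending on the client, can be sketched roughly as follows. This is a simplified take on `Accept` header handling (no wildcard matching or ranking by preference), and the two sample header strings are illustrative rather than captured from real browsers.

```python
XHTML = "application/xhtml+xml"
HTML = "text/html"


def accepts_xhtml(accept_header: str) -> bool:
    """True if the Accept header lists application/xhtml+xml with q > 0."""
    for item in accept_header.split(","):
        media_type, _, params = item.strip().partition(";")
        if media_type.strip().lower() != XHTML:
            continue
        quality = 1.0
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        return quality > 0
    return False


def choose_content_type(accept_header: str) -> str:
    """Pick the MIME type for the same XHTML markup, per client."""
    return XHTML if accepts_xhtml(accept_header) else HTML


# Illustrative headers: one client advertises XHTML support, one does not.
xhtml_capable = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
xhtml_unaware = "image/gif, image/jpeg, */*"

print(choose_content_type(xhtml_capable))  # application/xhtml+xml
print(choose_content_type(xhtml_unaware))  # text/html
```

The catch is the one described above: once most visitors get the `text/html` branch, nothing forces the markup to stay well-formed.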
Web browsers turned into application engines because it was a path to get useable software on PCs without having to deal with Microsoft. IE6 stayed broken forever for a reason.
Now, they enable applications to exist without going through app store gateways.
A new document-only protocol aligned with the Web's original intention would be very useful simply for security reasons. I liked Gemini because, by design, a Gemini document is not executable in any way; there are no popups, plugins, or even cookies; all this is out of the box without having to manage settings, and Gemini documents are very readable without an app at all.
But replacing the modern browser rather than being another option will actually lock people in further than they already are: open protocols require apps, which are all behind a gateway now on the primary computing device of users: phones.
It probably won't matter in a few years as the Web will likely be equally locked down soon, though.
> Web browsers turned into application engines because it was a path to get useable software on PCs without having to deal with Microsoft. [...] Now, they enable applications to exist without going through app store gateways.
What? You could deploy software without dealing with Microsoft back then and you still can today. Unless you meant avoiding building for Windows natively.
>Web browsers turned into application engines because it was a path to get useable software on PCs without having to deal with Microsoft. IE6 stayed broken forever for a reason.
Nonsense, lots of software was just local. I've even seen MSN clones written in Tcl/Tk, Lazarus is still used in some places, and there was tons of VB6/C# software. Back in the day, except for intranet turds (which in the end caused disasters like Iloveyou.VBS, "thanks" to IE/Outlook being deeply tied into Windows 9x), everyone serious about programming security and correctness fled from the web model for good. It was all about Java (and applets) and later C#. The web had an overgrowth and languages which shouldn't be part of the desktop.
I think at least part of the reason for this is acknowledging that the web isn't much of a web any longer. You've got three or four vendors that serve the vast majority of all internet traffic. And it's not happenstance that those same vendors now control something which was originally meant to be democratic.
Most of this document reads to me like that's the problem they're trying to solve, not just chrome's huge marketshare, so simply not targeting it doesn't serve their purpose.
Hacker News is obviously a very corporate-centred website, so most of the posts in this thread are about profitability and economic value. If that's the lens through which you see things, forking the web seems like a waste of time. It's obviously not profitable.
I don't care about any of that, I just want to have fun on the internet. By that metric, most of the criticisms in this thread are irrelevant. It doesn't need to make money, it doesn't need to be used by more than a few nerds, and it doesn't need a zillion bells and whistles. Whether rdg (the author of the blog post) shares this goal, I don't know.
Yeah, I avoided sharing it here because I could see that it would immediately draw backlash. I also didn't even consider adding a more elaborate "introduction" section because these are my quick notes on what I had in mind at that moment.
Working on the development of Dillo for a few years makes you see things from a different perspective. I also think that it should be fun to make your own tools from scratch and be able to understand the specs in a couple of weekends. Pretty much what happened with Gemini and the explosion of clients and servers:
> A page can then be tested against the standard and reject or accept as compliant. Pages that don't conform with the specification won't be rendered. It is explicitly forbidden for clients to accept any page that doesn't conform with the specification.
it's as if nothing was learned from the XHTML debacle
I think XHTML failed because it didn't give web devs any new capabilities, so most didn't feel the need to learn it and do the extra work of getting their tags correct.
Then html5 came along, providing all kinds of shiny goodies and saying not to bother with the tags. In the end, a more rigid standard would have been nice.. (Though this is mostly about the skin deep part of the standards.)
That is not how I remember it (for a data point of one shop in New England during the time): we embraced it because of the binary validation under multiple theories. There was a strong suggestion valid html did better from an SEO perspective, so we could sell that, a suggestion browsers would be less buggy with properly formed xhtml and a number of theories about what the future held for bots and scrapers to be able to easily ingest and parse your content (seen as a good thing then).
It failed because the smallest error by a client after the fact was like a server crash. Plus it would have created a mild barrier to entry when learning html at all.
> I think XHTML failed because it didn't give web devs any new capabilities, so most didn't feel the need to learn it and do the extra work of getting their tags correct.
xhtml was entirely opt-in, people opted into it, then served broken content. xhtml failed because that broken content (from people who, again, had specifically opted into serving xhtml) was an utterly terrible experience for everyone involved, as the user would get a big fuck off page devoid of any content, information, or means of redress, and there was no way for administrators or authors to get notified that their content was broken.
Meanwhile HTML would usually let you do the things you wanted to, and if you noticed something was broken you'd usually be able to hunt down a contact form and send a notice.
HTML5 is not what killed xhtml, xhtml is what did that, because it was a dreadful experience all around and had absolutely no redeeming quality.
Hell, the W3C was so into xml at the time there was an xhtml5 serialisation for html5. Technically it's still there (https://html.spec.whatwg.org/multipage/xhtml.html). That was of great use to the nobody whatsoever who was interested.
I’ve been surfing the web for a month with a ‘push to enable JavaScript’ button and it’s going pretty well. Very few sites are worth my time to enable JS for them, and they tend to lose the privilege immediately after I’m done extracting whatever my value is from them. Don’t have to charge my phone as often, so if nothing else that’s a win.
So... I think scripting is actually really important -- otherwise not only are you stuck with the lowest common denominator of all browsers, but the browsers need to implement a billion bug-prone views -- that map view link mentioned? Now you need a map viewer!
What you want is to have scripting with capabilities -- preferably on top of WebAssembly (JS is a sin).
The best part is this improves the experience of noscript users -- rather than nice graphical widgets being broken, instead, they can just run scripts without any "network" capability -- which should forbid the scripts not only from accessing the network, but make it so anything they modify becomes "tainted" and is not allowed to show up on a network call (so e.g. if they encode some data in a form, trying to later submit that form somewhere else on the app will give a warning).
Now -- most people don't care and don't want this. And that's a good thing -- capabilities put the power in the hands of the user agent where they belong.
More interestingly-- capabilities can be shimmed! Rather than "you are not allowed to access my GPS", it should be a first-class feature to feed the WASM a GPS stream of your choice.
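A toy model of the capability idea, in plain Python rather than WASM: the "script" receives only the capabilities the user agent chooses to hand it, and the GPS capability can be swapped for a stream of the user's choosing. Everything here (names, coordinates, the URL) is hypothetical, Python functions are not a real sandbox, and the taint-tracking part is not modelled.

```python
from typing import Callable, Iterator

Position = tuple[float, float]
Capabilities = dict[str, Callable]


def real_gps() -> Iterator[Position]:
    """Placeholder for a device GPS feed (hypothetical fixed coordinates)."""
    yield (52.52, 13.40)


def shimmed_gps() -> Iterator[Position]:
    """A user-chosen stream handed to the script instead of the real one."""
    yield (0.0, 0.0)


def nearest_store_widget(caps: Capabilities) -> str:
    """An untrusted 'script': it can only use what it was handed."""
    lat, lon = next(caps["gps"]())
    return f"showing stores near {lat:.2f},{lon:.2f}"


def phone_home_widget(caps: Capabilities) -> str:
    """A script that wants the network, which the user agent may not grant."""
    return caps["network"]("https://example.invalid/track")


def run(script: Callable[[Capabilities], str], caps: Capabilities) -> str:
    """The user agent decides the capability set; missing ones fail cleanly."""
    try:
        return script(caps)
    except KeyError as missing:
        return f"denied: script asked for {missing.args[0]!r} capability"


print(run(nearest_store_widget, {"gps": real_gps}))
print(run(nearest_store_widget, {"gps": shimmed_gps}))
print(run(phone_home_widget, {"gps": shimmed_gps}))
```

The widget cannot tell the shimmed stream from the real one, which is the point: the choice sits with the user agent, not the page.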
> So... I think scripting is actually really important -- otherwise not only are you stuck with the lowest common denominator of all browsers, but the browsers need to implement a billion bug-prone views -- that map view link mentioned? Now you need a map viewer!
In the browser? The map viewer could just be a separate programme entirely, like a PDF viewer, etc. I remember watching rdg (the current main Dillo developer) demonstrating this with a separate map programme.
Most of your post seems to assume this "everything must be in the browser" approach, which is actively not what Dillo is about. (I would know, I use Dillo regularly.) It adheres to the Unix philosophy.
EDIT: Looking at it closely, did I just respond to an LLM post?
I think original web standards were solving a completely different problem: sharing information.
Modern Internet is 45% appearances and 50% search traffic optimizations. For better or worse we lost all usable registries of websites, we lost appearance-less and traffic considerations-less websites. Information-focused Web is pretty much dead.
Maybe these ideas did not scale and did not monetize that well, but we will never really know what information-focused version of Internet would have looked like because evolution took it elsewhere. Unless we try building another one with different principles and limitations at the core.
The current web supports flat information delivery, and it's there if you want it. Wikipedia can be presented in pure text. If you write a story or an essay you can post it in many places, including your own web site.
Perhaps what's needed is an alternative search engine. Assert that you will only index a site that meets some strict set of limits. If that's what people want they will use that engine. If it's popular, sites will have to find ways to get listed, e.g. "simple.amazon.com" which supports that standard.
I agree. Even where blogging and sharing information is still around, it is strongly linked with brand-building, monetization, and engagement-maxxing. Look at all the old Wordpress bloggers who switch to Substack in order to have some eyeballs on their posts, and then inevitably begin conforming to its ethos willingly or unwillingly.
For me, the information-sharing part of the internet now is the shadow libraries. I can get access to all (well, still not quite all) journals and university-press publications from the last century? Awesome. Vastly more informative than some blogger who nowadays is probably trying to monetize my attention.
The only sort of problem this might solve is the insanely low barrier of entry that the Web has in 2026. The Web was arguably a better (albeit imperfect) place when it was dominated by geeks and kids who could learn to use it faster than their elders. It was a club in a sense. Today it's a club where everyone on the planet is invited, meaning it's no longer a club. I know that sounds great to a lot of people, but I don't agree that systems become better with more participation and fewer criteria for that participation.
Even so, those who want to share and access information can already do that via the Web. Nobody has to use scripting. Nobody has to use The Google as their search. Nobody has to rely on an LLM. If there is demand for simple webpages that are free of scripting, they can be built and shared today. Because of this, the proposal comes off as very out of touch and deep within the HN bubble. Strict grammar for declaring documents is merely a fetish. If there's no scripting, then there's no reason for a document to break for some silly reason.
I don’t see how this helps someone who wants to create a website. You don’t have to use JavaScript on your website if you don’t want to and you can use a different format for your text files that translates to HTML. (Markdown is a popular alternative, but you can invent something different.) What’s the upside in requiring your audience to use a different browser than they normally use?
I have been thinking about this problem for a couple of years now, but I also needed something censorship-free with built-in authentication and authorization. I ended up creating a protocol called Mau that hasn't been implemented or used yet, so it's been waiting around here. Check it out if you think it can help
https://mau.social/
Why not try to define a strict subset of the current specs, that would target ease of implementation & graceful degradation? I'd rather have many different clients compatible with a "web-lite" spec that is enough to navigate on 95% of websites, which would have an incentive to officially support that subset if it becomes popular enough.
History explains why HTML is now a living standard: https://whatwg.org/faq (Ctrl+F Living and keep reading).
> A published version of the standard NEVER, EVER, EVER, EVER changes.
WhatWG does have per-commit snapshots of the standard. They're just not semantically versioned because it is a living standard.
I think what the author wants is something like Gemini instead of HTML, but that has its own set of problems. My plea for Dillo would be to instead just support a text/markdown mime-type natively and we can try for adoption in more browsers.
> The objective is not to create a feature-by-feature clone of the Web, but to create an specification that allows humans to exchange knowledge, notes, and other forms of information without the imposed requirement of having to run a full blown VM to read it.
Markdown in browsers fits your objective! Only gotcha is commonmark extensions, and they can work with sub-type declarations in the mimetype.
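On the mimetype point: `text/markdown` is registered in RFC 7763 with an optional `variant` parameter, so a client could branch on it. Below is a minimal sketch of that dispatch using Python's `email.message` for header parsing; the renderer names and the fallback policy are placeholders, not anything Dillo or another browser actually does.

```python
from email.message import Message
from typing import Optional


def markdown_variant(header_value: str) -> Optional[str]:
    """Return the variant parameter of a text/markdown Content-Type.

    Returns None when the media type is not text/markdown, and an empty
    string when no variant is declared.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    if msg.get_content_type() != "text/markdown":
        return None
    return msg.get_param("variant", failobj="").lower()


def pick_renderer(header_value: str) -> str:
    """Hypothetical dispatch: the renderer names here are placeholders."""
    variant = markdown_variant(header_value)
    if variant is None:
        return "not markdown"
    if variant in ("", "commonmark"):
        return "commonmark renderer"
    return f"plain text fallback (unknown variant {variant!r})"


print(pick_renderer("text/markdown; charset=UTF-8; variant=CommonMark"))
print(pick_renderer("text/markdown; charset=UTF-8; variant=GFM"))
print(pick_renderer("text/html"))
```

Falling back to plain text for unknown variants keeps the document readable, which is one advantage Markdown has over a format that must be parsed before it can be read.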
> I think what the author wants is something like Gemini instead of HTML, but that has its own set of problems.
Yes and no. I want it to be simpler than HTML (which implies fewer features) but easy to parse. The problem with Markdown and other "text-like" formats is that they are designed to be written by humans (which is good), but that complicates the parsing. I guess it is more similar to the device-independent format used by groff/troff before layout.
>My plea for Dillo would be to instead just support a text/markdown mime-type natively and we can try for adoption in more browsers.
Dillo only supports a subset of HTML. Other formats like markdown are converted to HTML with plugins or read as plain text.
This sentence highlights the reason why these efforts fail despite any original good intentions:
"as soon as a monopolistic entity can build a mechanism to extract revenue from it, there will be an incentive to capture the standard and change it to for their own benefit"
Personally I'd love a simple semantic versioned subset of the web. The required traction and buy-in from existing key players (browser vendors, web hosting platforms etc) makes it largely a non-starter though. I'd love to be wrong though.
Instead of "forking", it may be more prudent to extend or revive something more like Gopher, so you don't constantly get barraged by incompatible sites (like you would in a forked web)
I mostly agree with the article - I believe the differentiation should be between documents and applications.
While HTML serves its purpose, especially for documents, the modern web is a giant mess of that legacy, combined with unfriendly ergonomics and glue/hacks built on top just so we as developers can have better DX for creating complex software on top of it.
Building a browser means having to deal with all that legacy, whether we like it or not, so most of the browser market got captured by the big players who have enough manpower to cover all those edge cases. That also means we have to deal with whatever technical choices or bloat they make, causing an infinite stream of issues, from memory usage, to size, to limitations that don't make sense in 2026 but are still there because someone 20 years ago decided to write them like that. As I deal with mobile webviews a lot in my daily work, I unfortunately had to get familiar with quite a few gotchas and edge cases, and some are just... absurd in this day and age.
I believe we need a separation between an application layer and the document layer, and especially between the UI language and the actual application code - script tags serve their purpose, but again, they are a hacky solution with its own bag of tricks, and those tricks impact all of the software built upon it.
Now, a bit of a shameless plug: I've been working on something to fill that gap, at least for myself and hopefully for others who encounter the same issue. It's called Hypen (https://hypen.space) and it's a DSL for building apps that work natively on all platforms, with strict separation of code/UI/state, and support for as many languages and platforms as I can maintain, not "just javascript". While currently it's focused on streaming UI, it's built with Rust and WASM at its core and will soon allow fully "compileable" apps.
While it may not be the future of software, once you get into building something like that, it becomes obvious that the way we are building now is at least wrong, and at best kafkaesque.
> Adding scripting capabilities was a mistake, so we can avoid it now.
> Instead, you can provide a Geo link to open the location in any client that supports the protocol.
Sorry but as someone old enough to remember when the web was mostly non-interactive I vastly prefer the current situation despite its many shortcomings. I want to keep a minimal amount of software on my computer. I don't want to give a hundred "clients" access to my computer when I can just run JavaScript sandboxed in my browser. If someone sends me a link and tells me it's a cool game he found online I will open it in my browser and have a look but I will not just run random binaries on my computer. Oh, and I like being able to access any website just from my browser on my Linux, instead of hoping that there is a Linux client that isn't 5 years out of date or fiddling with wine to figure out why the windows binary wouldn't run.
I understand why people dislike the web sandbox or having to run a full blown VM for everything, but please understand that this is also what makes the web great. You can run everything and fear nothing.
You've misunderstood. The blog post is not talking about running random binaries. It's talking about opening links and files using different programmes, like PDF viewers, video players, etc. There's a video of a talk that the developer gave, which I can't find the link to at the moment, where he demonstrates running a map programme (already installed on the machine, not just fetched from a random website) to open a link with lat/lon coordinates with an interactive map.
In general, Dillo follows the Unix philosophy. You use separate programmes to handle things that Dillo can't itself, like watching videos.
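To make the hand-off concrete, here is a rough sketch of how a client might turn a geo link into a command for a separate map programme. It only handles the basic `geo:<lat>,<lon>` form of the geo URI scheme (RFC 5870) and ignores any parameters. The viewer name `my-map-viewer` and its `--lat`/`--lon` flags are made up for illustration; this is not how Dillo does it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoPoint:
    lat: float
    lon: float


def parse_geo_link(link: str) -> GeoPoint:
    """Parse a minimal `geo:<lat>,<lon>` link; anything after ';' is ignored."""
    scheme, sep, rest = link.partition(":")
    if not sep or scheme.lower() != "geo":
        raise ValueError(f"not a geo link: {link!r}")
    coords = rest.split(";", 1)[0].split(",")
    if len(coords) < 2:
        raise ValueError(f"missing coordinates: {link!r}")
    lat, lon = float(coords[0]), float(coords[1])
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError(f"coordinates out of range: {link!r}")
    return GeoPoint(lat, lon)


def viewer_command(link: str, viewer: str = "my-map-viewer") -> list[str]:
    """Build the argv a browser could hand to an already-installed map program.

    The viewer name and flags are hypothetical placeholders.
    """
    point = parse_geo_link(link)
    return [viewer, f"--lat={point.lat}", f"--lon={point.lon}"]
```

The browser never renders the map itself; it only validates the link and delegates, the same way it would hand a PDF to a PDF viewer.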
I completely agree. The situation before the "Web Platform" was the "Windows Platform" where you had to give money to Microsoft to use a computer because few developers wanted to make cross-platform software, and almost nobody wanted to make good cross-platform software. As a Mac user it was miserable.
> One of the problems with the Web is that as soon as a monopolistic entity can build a mechanism to extract revenue from it, there will be an incentive to capture the standard and change it to for their own benefit. In the particular case of the Web, this has resulted in a standard that grows out of control in complexity so it increases the barrier of entry for new browsers and reduces the competition.
Maybe I'm just stupid, but I don't really know what the author is talking about here. What parts of the standard? HTTP? HTML? DOM APIs? What?
The purpose should also be defined. It should answer the question why. Also, what's broken with scripting and what alternatives are proposed? What's the end state (with an example usage of the new web).
Can't say I hate the HTML 5 spec. It resolves the ambiguities that made previous HTML specs insufficient to make a working web browser.
The standards that make my life miserable at times are the secondary standards like GDPR and WCAG as well as the de facto "standard" systems we are forced to participate in such as Cloudflare, the advertising economy, etc.
It's easy to say "WebUSB is bloat" and I'd certainly say PWA is something that could only come out of the mind that brought us Kubernetes, but lately I've been building biosignals applications and what should my choice be: write fragile GUI applications for the desktop that look like they came out of a lab and crash from memory leaks or spend 1/5 the time to make web applications that look like they belong in the cockpit of a Gundam and "just work"?
>I'd certainly say PWA is something that could only come out of the mind that brought us Kubernetes
How so? PWAs are awesome! Democratizing for users. Democratizing for developers. They work well for the right class of apps. They would go much further if there weren't forces actively resisting them. Think of all the electron type-apps out there. Now imagine if the average Joe could just install them from the web with 2 clicks.
(Regular ole bookmarks get you a decent percent of the way but clearly something extra than that was needed.)
I am generally interested in approaches to cut down complexities of fundamental web technologies. Creating a browser from scratch shouldn't be impossible or a trillion dollar experiment. But...
> No scripting
How will it be possible to go back? The average ecom presence usually relies heavily on JS. I haven't checked in a long time whether any relevant sites work without JS. I think going back to more basic approaches could even improve user experience, as many usage patterns would probably converge and simply look and function as intended. But considering that the whole web world is so fixated on solving everything with JS, this seems like targeting the highest-resistance target you can find. Don't get me wrong, I hate this situation and we must not have a single language that dominates everything.
I also don't believe that enthusiasts will create a significant shift. They can surely provide the fundamentals, but if there isn't a huge mainstream impact, it will not change anything.
Good idea, we absolutely should replace the Web, but I have some issues with this proposal:
- We don't want multiple versions (1.1.1, 1.2.1), but we also don't want constant churn (the current dev/product fad). What we want is one thing that works well indefinitely, is backwards compatible, changes infrequently, and can be expanded if necessary. In order to achieve that, we have to abandon the idea of monolithic web browsers.
"The Web" is not a hypertext document viewer, as much as some people (myself, and Dillo probably) would like it to be. It is an application platform. So you must consider the needs of an application platform if you want a "new Web". The browser interfaces with the entire OS + a slew of protocols and libraries. It's Android in userland. It will change as constantly as OSes and tech changes, which is constant. So to get away from churn, we need to break up the application platform into layers. Those layers need to have simple, well-defined backwards-compatible interfaces, with extensions. The model for that has been around for decades; network protocols last 60+ years without needing to be replaced, but add features over time, without getting feature creep, and remain backwards compatible. There aren't a ton of versions of common internet network protocols. And importantly, you don't have to use one implementation, the way people get stuck on one browser.
The standard should follow this extremely well established pattern of layers of independent components which aren't built into a monolith. It can still have a version (initially), but we shouldn't need to change the version, we add feature flags and handshakes, the way network protocols do. The end result should be a combination of a "web POSIX" + "layered protocols/specs".
- "Pages that don't conform with the specification won't be rendered" - this simply is never going to happen. The history of software development is littered with examples of having to work around implementations of specifications. Your client can try to render strictly, but it will inevitably break on someone's implementation, and you will be forced to deal with it, or lose your customers/users.
"Having a strict grammar will likely cause humans to migrate to a language that is easy to write and is more forgiving ... The objective is that parsers can be simplified and the cost of creating tools that can manipulate the content is lowered" - This sounds like you're saying, programming is hard, so let's make the user have to work around our inability to solve hard problems. Easy is not always better.
- "Resistance to standard capture" - I think this goes back to the layers. Remember you are building an entire Application Platform. Think about Linux and Open Source. How does it resist capture? Independent organizations and authors, loose associations, cobbled together components. There is nobody in control, so you can't capture it. This is actually the same with network protocols (other than HTTP, we all know Google controls the spec). We can take ideas from many places. As just one random example: MCP is a simple yet powerful way for independent entities to add functionality to an application both locally and remotely, yet is independent of both the client and the server. Another example is Plan9, where you can support anything in the world and use it as a file (both locally and remotely), as long as you make and run the driver for it.
- "Text first" - You just lost the room. If you want text only, stick to Gopher. An application platform requires multimedia. You would do well to craft the spec so that it can convert application presentation into a text structure. Sell it as accessibility.
- "No scripting" - Now your proposal is dead. Again, Application Platform!! People want a way to cheaply deliver and run application code in real time. I think this needs a lot of careful attention, because you don't want to continue the status quo of requiring a single monolith to interpret and execute logic for the entire application platform.
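On the "feature flags and handshakes" point above, a rough sketch of what that negotiation could look like: each side advertises the features it supports, and the session uses the intersection instead of bumping a version number. The feature names and the `negotiate` helper are invented for illustration, not taken from any real protocol.

```python
def negotiate(client_features: set[str], server_features: set[str],
              required: frozenset[str] = frozenset()) -> set[str]:
    """Return the features both sides support.

    Fails only if a feature listed in `required` is missing, so old and new
    implementations keep interoperating on their common subset.
    """
    agreed = client_features & server_features
    missing = required - agreed
    if missing:
        raise ValueError(f"required features unavailable: {sorted(missing)}")
    return agreed


# Hypothetical feature names: an old client talking to a newer server.
old_client = {"text", "links"}
new_server = {"text", "links", "images", "forms"}
session = negotiate(old_client, new_server, required=frozenset({"text"}))
```

Here `session` ends up as the common subset, so the newer server simply does not use `images` or `forms` with this client.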
Seems like somebody is not accepting that every successful project will grow and become unwieldy like this. This is all legacy backwards compatibility of all iterated ideas that now you have to support.
Ah yes, another "If I Were King" blog post. For an example of how it will turn out, look at how many JavaScript frameworks have been built to replace an overly complicated, unwieldy previous one.
> The specification must contain a non-ambiguous formal grammar that can be parsed easily. A page can then be tested against the standard and reject or accept as compliant. Pages that don't conform with the specification won't be rendered. It is explicitly forbidden for clients to accept any page that doesn't conform with the specification.
This is what XHTML was, and it was a complete disaster. There's a reason almost nobody serves XHTML with the application/xhtml+xml MIME type, and that reason is that getting a “parser error” (this is what browsers still do! try it!) is always worse than getting a page that 99% works.[0] I strongly believe that rejecting the robustness principle is a fatal mistake for a web-replacement project. The fact that horribly broken old sites can stay online and stay readable is a huge part of the web's value. Without that, it's not really “the web”, spiritually or otherwise.
[0] It's particularly “cool” how they simply do not work in the Internet Archive's Wayback machine. The page can be retrieved, but nobody can read it.
What if you don't output invalid XML? If you can manage a valid HTTP response then you can manage valid XML, can't you?
Agreed. There may be some situations where I may want to ensure 100% correctness. I'm thinking life or death scenarios, (which if so, maybe should use a different protocol). However, checking the sports score or looking at cat memes isn't that.
To be fair, HTML5 also has a defined parsing algorithm. It just happens to always work on any input to produce a webpage
Yes, this is what you'd want. It doesn't have to be as complicated as the HTML5 algorithm either. That's complicated because it was a harmonization of at least 3 browsers' multi-decade heuristics and untold terabytes of existing HTML practice. An algorithm unconcerned with backwards compatibility could be much simpler, but still clearly define error behavior that is much easier to use than "scream and die".
And it's still unambiguous. You can cringe at what some people do, but it would be strictly a taste issue rather than a technical one, as the parse would still be unambiguous. And if you think you can fix taste issues with technical specification, well, you've already lost anyhow.
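For anyone who wants to see the two failure modes side by side, here is a small sketch using only Python's standard library. `xml.etree.ElementTree` stands in for an XHTML-style strict parser and `html.parser.HTMLParser` for a forgiving one. Note that `HTMLParser` is just a tolerant tokenizer, not the HTML5 parsing algorithm, and the sample markup is made up.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

BROKEN = "<p>Scores: <b>3-1</p>"  # <b> is never closed


def strict_parse(markup: str) -> str:
    """XML-style: any well-formedness error rejects the whole document."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError as err:
        return f"parser error: {err}"
    return "".join(root.itertext())


class TextCollector(HTMLParser):
    """Collects text content and ignores tag structure entirely."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: list[str] = []

    def handle_data(self, data: str) -> None:
        self.chunks.append(data)


def lenient_parse(markup: str) -> str:
    """Forgiving: the reader still gets the text despite the missing </b>."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.chunks)
```

`strict_parse(BROKEN)` reports a parser error and yields no content, while `lenient_parse(BROKEN)` still returns the score. Both behaviors are deterministic; the difference is only in what the reader ends up seeing.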
I don't get this reply. GP didn't say anything about parsing algorithms, they said (correct) things about hard errors on the web.
why for? the reply is about factual historical experience with webpage hard errors.
Would you like to have a law that forbids you, under penalty of fine, to read any book you buy or borrow that is lacking or has damaged pages?
I thought they were just bolstering the refutation of TFA's assertion that XHTML is strictly better because of its parsing algorithm.
I think the GP has an issue not with the specification part, but with the part where it's forbidden for clients to render a noncompliant page.
It's not forbidden. They just don't render certain noncompliant pages. Namely the ones with gross syntax errors.
Why are we okay with formats like PDF that have similarly catastrophic error handling?
I mean, the linked page and the comment above say it is:
> It is explicitly forbidden for clients to accept any page that doesn't conform with the specification. This prevents the standardized diabolic rules that one must implement in order to correct a
No scripting is a tell, it's about wanting other people to accommodate their concerns about running a complex browser, not about solving a real problem.
If it did somehow happen that a good deal of interesting content was published using the standard, the most popular client would probably be nonconforming, ignoring the rule to not render ambiguous content.
Every modern alternative web protocol is about accommodating the author's concerns and pet peeves about the modern web (and usually gatekeeping it from capitalists and normies.)
Protocols used to be limited by technology, now they're defined by ideology.
Author here. I agree that you cannot go from HTML to XHTML because users and UA devs will always go towards "it mostly works".
However, I don't see it that clearly that this cannot be done since the start so that the expectations are right since the beginning. For example, I don't see the same problem in other formats like JPEG or PNG where you expect the image to work perfectly or fail with a decoding error.
Other than implementing it and see how it goes, can you propose a feasible experiment to see how a new strict spec will measurably fail?
browsers will display invalid/corrupt images (best effort)
tried it right now - took a PNG and a JPEG, opened them in a text editor, literally deleted the second half of the file, saved, and dragged them into both Firefox and Chrome - they are displayed instead of erroring out.
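The same effect can be reproduced without a browser. PNG image data is zlib-compressed, so a rough analogy (not how any browser actually implements decoding) is to compare an all-or-nothing zlib call with a streaming one on a stream whose second half has been cut off. The sample data is made up.

```python
import zlib

original = "\n".join(f"scanline {i}: {i * i}" for i in range(5000)).encode()
compressed = zlib.compress(original)
truncated = compressed[: len(compressed) // 2]  # "delete the second half"


def strict_decode(data: bytes) -> bytes:
    """All-or-nothing: raises zlib.error on a truncated stream."""
    return zlib.decompress(data)


def best_effort_decode(data: bytes) -> bytes:
    """Streaming: return whatever could be decoded before the data ran out."""
    return zlib.decompressobj().decompress(data)
```

`strict_decode(truncated)` raises, while `best_effort_decode(truncated)` returns a correct prefix of the original data, which is roughly the "top part of the image" you see in the browser.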
there is a classic article why a minimal version of the web with features removed will fail - you removed 80% of the features that YOU think are not important. thats a classic fatal mistake
search the web for different proposals for a minimal web and you will understand - they will have removed some feature they think is bloat but which you kept in your proposal because you consider it critical. which is why you created a new proposal - their minimal proposal is not the right one for you
https://www.joelonsoftware.com/2001/03/23/strategy-letter-iv...
> they are displayed instead of erroring out.
I think what is lost on many people, ironically even the ones who want to retvrn the web to its former glory, is that the browser tries to display broken, half transmitted content because it happened so frequently due to circumstances completely out of the website operator or the user's control. And in most cases showing a half transmitted web page with half of the closing tags missing is almost certainly better than just outright refusing to show anything.
> I agree that you cannot go from HTML to XHTML because users and UA devs will always go towards "it mostly works".
That... is not how anything happened.
> I don't see the same problem in other formats like JPEG or PNG where you expect the image to work perfectly or fail with a decoding error.
Browsers absolutely decode as much as they can, and if the file is corrupted halfway through you generally get garbling, not the entire image being replaced by "fuck off". The only case where that is so is if the browser can't parse anything at all, or can't retrieve the file.
> Other than implementing it and see how it goes, can you propose a feasible experiment to see how a new strict spec will measurably fail?
We already did that and saw where it went.
> Browsers absolutely decode as much as they can, and if the file is corrupted halfway through you generally get garbling, not the entire image being replaced by "fuck off". The only case where that is so is if the browser can't parse anything at all, or can't retrieve the file.
What I meant is that you don't expect PNG or JPEG images to be created in a way that the parser needs to run a complex process to reconstruct the bits that are broken and interpret what you meant to say. Like this one:
https://html.spec.whatwg.org/multipage/parsing.html#adoption...
Perhaps a better example is a C program being compiled into an executable. You don't expect the compiler to guess what you meant while parsing.
The current expectation is that a web browser must load any broken HTML and still display what it can, and it is this expectation that I would like to change.
I don't propose humans to write this format directly (although it should be human readable), but compile it from something that is easy to write, like Markdown or a similar language. The objective is to enforce tools that make the transformation to produce a strictly conformant document.
Having a context-free grammar allows simple and fast parsing tools that can process your document, in a similar way that you can query or manipulate a JSON file with tools like jq because the grammar is simple and strict.
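As an illustration of the JSON comparison, here is a minimal sketch with Python's `json` module: a strict parser either hands back a structure you can query directly, or fails and says where the first error is. The `load_strict` wrapper and the sample documents are invented for this example.

```python
import json


def load_strict(text: str):
    """Parse JSON; on failure, report where the first error is instead of guessing."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        raise ValueError(
            f"line {err.lineno}, column {err.colno}: {err.msg}"
        ) from None


good = '{"title": "Dillo", "links": [1, 2]}'
bad = '{"title": "Dillo",}'  # trailing comma is not valid JSON

links = load_strict(good)["links"]  # trivial to query once parsed
```

Calling `load_strict(bad)` raises a `ValueError` that points at line 1, so the producing tool can be fixed. Whether that trade-off suits documents meant for human readers is exactly what is being argued here.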
> What I meant is that you don't expect PNG or JPEG images to be created in a way that the parser needs to run a complex process to reconstruct the bits that are broken and interpret what you meant to say.
So what you meant is neither what you wrote, nor what you advocate for?
because in case you have forgotten here is what you advocate for:
> Pages that don't conform with the specification won't be rendered.
That is not how image rendering in browsers works. That is how XHTML does not work.
> Perhaps a better example is a C program being compiled into an executable.
It's not a better example, because it's a completely different and unrelated use case: C programs are usually not dynamically generated, and even when they are the person who compiles the code is usually either the person who wrote it or a person who has ways to fix it or report errors.
Not so when trying to read a random page on the web.
> I don't propose humans to write this format directly (although it should be human readable)
Approximately nobody wrote xhtml by hand; that didn't save it.
This is also a nonsensical constraint-set on its face, there is no point to a human readable format which is not human writeable.
> The objective is to enforce tools that make the transformation to produce a strictly conformant document.
Ah, an open and non-monopolizable format which can only be written via an official toolchain.
> Having a context-free grammar allows simple and fast parsing tools that can process your document in a similar way that you can query or manipulate a JSON^H^H^H^HXML file with tools like jq^H^Hxmlstarlet because the grammar is simple and strict.
None of which seems of any use to a format which pretends to human production and consumption. JSON is an interchange format between machines.
> Pages that don't conform with the specification won't be rendered.
I agree on what I wrote here. They will fail with an error indicating where the mistake is so you can correct it (more likely the tool that produced it).
>> The objective is to enforce tools that make the transformation to produce a strictly conformant document.
> Ah, an open and non-monopolizable format which can only be written via an official toolchain.
??
The objective is that when you make a tool like markdown-to-foo, the output follows the spec. There is no mention of any "official toolchain".
> xmlstarlet
XML is strict. Try to find the same tool for HTML5, especially for transformations.
> JSON is an interchange format between machines.
That is pretty much what the specification would try to cover.
> That... is not how anything happened.
What the heck are you talking about? User agent devs and users did indeed always go toward it mostly works.
People didn't go towards "it mostly works", people go towards "it works at all". A lot of people tried to use xhtml, and it didn't work, broken content was pervasive and the experience when facing broken content was irredeemable.
What was the exact nature of how devs found themselves unable to emit valid XML in all scenarios? What kind of bugs did they run into?
XHTML failed in an era when writers (even normies) were writing some HTML of their own and they couldn't be trusted to close their tags properly. XHTML also assumed writers would be personally invested in semantic markup like distinguishing e.g. the italics of book titles from the italics of emphasis.
Today, when writers are using visual editors (or Markdown), few are writing their own HTML any more. A web standard requiring compliance would work differently today.
Markdown sux and so do visual editors. I think visual editors were just invented to make it so cut-and-paste never quite works right. There's been some conceptual problem with the whole idea ever since MS Word and the industry has never dealt with it.
> XHTML failed in an era when writers (even normies) were writing some HTML of their own
I'd say it was a minority of writers that were handcrafting XHTML. And it was the case that everyone, whether handcrafting or using tools, could validate their compliance using a browser, which made it very easy to adjust your tools or your handcrafted code. We are now in a situation where there is no schema for HTML.
I, for one, am very much in favor of forking the web with a document format with a schema. It really seems like a small and simple change to me.
Note that when I say "writing their own HTML", I don't mean handcrafting a whole webpage. I mean that people were writing i or b tags in their Wordpress editors or in online comment boxes, because back then such text fields did not have visual editors and would accept raw tags. Under XHTML, if the writer did not close tags properly, such input would have broken the whole page, so obviously back then such a standard was DOA.
Those cases were easy to fix by using eg htmltidy on the UGC.
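A rough idea of what that cleanup step does, sketched with Python's standard `html.parser` instead of htmltidy itself: keep a small set of inline tags, close whatever the writer left open, drop stray closing tags, and escape the text. The `ALLOWED` set and the `balance` helper are hypothetical; a real site would want a proper sanitizer.

```python
from html import escape
from html.parser import HTMLParser

ALLOWED = {"i", "b", "em", "strong"}  # hypothetical whitelist for comment boxes


class Balancer(HTMLParser):
    """Re-emits a fragment with only ALLOWED tags, properly nested."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.out: list[str] = []
        self.open_tags: list[str] = []

    def handle_starttag(self, tag, attrs):
        # Attributes are dropped; tags outside the whitelist are ignored.
        if tag in ALLOWED:
            self.open_tags.append(tag)
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Close everything opened after `tag`, then `tag` itself.
            while self.open_tags:
                top = self.open_tags.pop()
                self.out.append(f"</{top}>")
                if top == tag:
                    break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def balance(fragment: str) -> str:
    parser = Balancer()
    parser.feed(fragment)
    parser.close()
    # Close anything the writer left open, innermost first.
    while parser.open_tags:
        parser.out.append(f"</{parser.open_tags.pop()}>")
    return "".join(parser.out)
```

With this, a comment like `I <i>really <b>mean it` comes out with both tags closed, so one sloppy comment no longer takes the whole page down.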
Honestly I don't think it was killed by one thing, or by anything. Just no platform really cared and it wasn't a win for anyone and occasionally a loss.
> There's a reason almost nobody serves XHTML with the application/xhtml+xml MIME type, and that reason is that getting a “parser error” (this is what browsers still do! try it!) is always worse than getting a page that 99% works.
That’s not the reason almost nobody serves XHTML.
The real reason is Internet Explorer. Okay, it’s a little more nuanced than that, but I think it’s accurate enough. Microsoft killed XHTML by inaction.
It’s 2004. XHTML is now a few years old, and all the rage. You decide to use it for your new project which you’re developing. At the start, you serve pages as application/xhtml+xml, and that works well in Firefox; but you know that won’t work because Internet Explorer still doesn’t support XHTML, and 90% of your viewers will be using that. So, a little frustrated, you serve your nice XHTML as text/html. You still validate it manually for a while, but then that habit disappears. Eventually you make one or two small mistakes that would have been caught easily if it were parsed as XML—but it’s not, because of Internet Explorer. Over time this disparity grows.
People have been complaining of the inefficacy of XHTML for this exact reason for two or three years by this point.
It’s 2006. XHTML is acknowledged to have failed. Everything else supports it, but as long as IE doesn’t, you can’t serve as application/xhtml+xml, and so you can’t get the advantages of XML syntax.
Seriously, early failure is good—so long as you’re working with it from the start. The problems only occur when you try to add strictness later.
Just look at typing in code bases. Adding strictness to existing JavaScript or Python or Ruby? Nightmare. Starting with static types? Somewhere between fine and extremely desirable.
(I might be overselling strictness’s popularity at the time—people don’t always like what’s good for them. We’ve largely realised now that unfettered dynamic typing is a bad idea, but ten years ago that was not settled. People get used to things. If IE had permitted XHTML early on, people would have got used to the idea of XHTML’s strictness and, I think, got to mostly like it.)
XHTML did not fail because of XML’s catastrophic parse failure mode. It failed because HTML already worked, and Internet Explorer took way too long to accept XHTML. If you’re forking the web and compatibility with existing documents is not a goal, you can’t use XHTML’s failure as an argument: it failed because of compatibility issues.
Well, Internet Explorer did eventually support application/xhtml+xml: in 2011, IE9. Way too late to matter. And so only by around 2015 or 2016 could you finally serve with XML syntax. And now why would you? For your system is big and has tiny errors here and there and your CMS just drops markup in and never got round to validating it and and and and so on. By that time, HTML had given up on the XML path, and although it worked, the momentum was entirely gone, so you’d run into difficulties due to inadequate documentation, inferior tooling (ironic), and various more.
Web browsers turned into application engines because it was a path to get useable software on PCs without having to deal with Microsoft. IE6 stayed broken forever for a reason.
Now, they enable applications to exist without going through app store gateways.
A new document-only protocol aligned with the Web's original intention would be very useful simply for security reasons. I liked Gemini because, by design, a Gemini document is not executable in any way; there are no popups, plugins, or even cookies; all this is out of the box without having to manage settings, and Gemini documents are very readable without an app at all.
But replacing the modern browser rather than being another option will actually lock people in further than they already are. Open protocols require apps, which are all behind a gateway now on users' primary computing device: phones.
It probably won't matter in a few years as the Web will likely be equally locked down soon, though.
> Web browsers turned into application engines because it was a path to get useable software on PCs without having to deal with Microsoft. [...] Now, they enable applications to exist without going through app store gateways.
What? You could deploy software without dealing with Microsoft back then and you still can today. Unless you meant avoiding building for Windows natively.
>Web browsers turned into application engines because it was a path to get useable software on PCs without having to deal with Microsoft. IE6 stayed broken forever for a reason.
Nonsense, lots of software was just local. I've even seen MSN clones written in TCL/Tk, and Lazarus still used in some places, and tons of VB6/C# software. Back in the day, except for Intranet turds (which in the end caused disasters like Iloveyou.VBS "thanks" to IE/Outlook being deeply tied to Windows 9x software), everyone serious about programming security and correctness fled from the web model for good. It was all about Java (and applets) and later C#. The web had an overgrowth and languages which shouldn't be part of the desktop.
Developers would rather fork the Web than admit Chrome is the new IE6 and stop targeting it.
I think at least part of the reason for this is acknowledging that the web isn't much of a web any longer. You've got three or four vendors that serve the vast majority of all internet traffic. And it's not happenstance that those same vendors now control something which was originally meant to be democratic.
Most of this document reads to me like that's the problem they're trying to solve, not just chrome's huge marketshare, so simply not targeting it doesn't serve their purpose.
how is the web not democratic?
in real democracies the populists (facebook, tiktok, chrome) always win. because that's what the masses want
> in real democracies the populists (facebook, tiktok, chrome) always win. because that's what the masses want
Is Friedrich Merz a populist? Was Angela Merkel a populist? This theory seems to have considerable limits.
All I can say is if OP's name were "xhtmlenjoyye" they'd respond like so:
The context is real democracies, not messy extant nation-state governments. Please delete your comment so no one can read it.
Whoever has the most compute controls the narrative. It's AI's biggest contribution to the internet.
Google drop the mask
If I could, I would post an Amen gif.
Perhaps if we fork the site …
There are already 1,800 forks: https://github.com/search?type=repositories&q=hackernews+clo...
All we need is one fork that merges all of them into a single standard!
Hacker News is obviously a very corporate-centred website, so most of the posts in this thread are about profitability and economic value. If that's the lens through which you see things, forking the web seems like a waste of time. It's obviously not profitable.
I don't care about any of that, I just want to have fun on the internet. By that metric, most of the criticisms in this thread are irrelevant. It doesn't need to make money, it doesn't need to be used by more than a few nerds, and it doesn't need a zillion bells and whistles. Whether rdg (the author of the blog post) shares this goal, I don't know.
Yeah, I avoided sharing it here because I could see that it would immediately cause a backlash. I also didn't even consider adding a more elaborate "introduction" section because these are my quick notes on what I had in mind at that moment.
Working on the development of Dillo for a few years makes you see things from a different perspective. I also think that it should be fun to make your own tools from scratch and be able to understand the specs in a couple of weekends. Pretty much what happened with Gemini and the explosion of clients and servers:
https://geminiprotocol.net/software/
> A page can then be tested against the standard and reject or accept as compliant. Pages that don't conform with the specification won't be rendered. It is explicitly forbidden for clients to accept any page that doesn't conform with the specification.
it's as if nothing was learned from the XHTML debacle
I think XHTML failed because it didn't give web devs any new capabilities, so most didn't feel the need to learn it and do the extra work of getting their tags correct.
Then html5 came along, providing all kinds of shiny goodies and saying not to bother with the tags. In the end, a more rigid standard would have been nice.. (Though this is mostly about the skin deep part of the standards.)
That is not how I remember it (for a data point of one shop in New England during the time): we embraced it because of the binary validation under multiple theories. There was a strong suggestion valid html did better from an SEO perspective, so we could sell that, a suggestion browsers would be less buggy with properly formed xhtml and a number of theories about what the future held for bots and scrapers to be able to easily ingest and parse your content (seen as a good thing then).
It failed because the smallest error by a client after the fact was like a server crash. Plus it would have created a mild barrier to entry when learning html at all.
> I think XHTML failed because it didn't give web devs any new capabilities, so most didn't feel the need to learn it and do the extra work of getting their tags correct.
xhtml was entirely opt-in, people opted into it, then served broken content. xhtml failed because that broken content (from people who, again, had specifically opted into serving xhtml) was an utterly terrible experience for everyone involved, as the user would get a big fuck off page devoid of any content, information, or means of redress, and there was no way for administrators or authors to get notified that their content was broken.
Meanwhile HTML would usually let you do the things you wanted to, and if you noticed something was broken you'd usually be able to hunt down a contact form and send a notice.
HTML5 is not what killed xhtml, xhtml is what did that, because it was a dreadful experience all around and had absolutely no redeeming quality.
Hell, the W3C was so into xml at the time there was an xhtml5 serialisation for html5. Technically it's still there (https://html.spec.whatwg.org/multipage/xhtml.html). That was of great use to the nobody whatsoever who was interested.
> I think XHTML failed because it didn't give web devs any new capabilities,
and what new capabilities does this new proposal provide?
It's as if nothing was learned from the AI winter[]! It's obviously a technology dead-end.
[] https://en.wikipedia.org/wiki/AI_winter
I’ve been surfing the web for a month with a ‘push to enable JavaScript’ button and it’s going pretty well. Very few sites are worth my time to enable JS for them, and they tend to lose the privilege immediately after I’m done extracting whatever value I wanted from them. Don’t have to charge my phone as often, so if nothing else that’s a win.
So... I think scripting is actually really important -- otherwise not only are you stuck with the lowest common denominator of all browsers, but the browsers need to implement a billion bug-prone views -- that map view link mentioned? Now you need a map viewer!
What you want is to have scripting with capabilities -- preferably on top of WebAssembly (JS is a sin).
The best part is this improves the experience of noscript users -- rather than nice graphical widgets being broken, instead, they can just run scripts without any "network" capability -- which should forbid the scripts not only from accessing the network, but make it so anything they modify becomes "tainted" and is not allowed to show up on a network call (so e.g. if they encode some data in a form, trying to later submit that form somewhere else on the app will give a warning).
Now -- most people don't care and don't want this. And that's a good thing -- capabilities put the power in the hands of the user agent where they belong.
More interestingly-- capabilities can be shimmed! Rather than "you are not allowed to access my GPS", it should be a first-class feature to feed the WASM a GPS stream of your choice.
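A rough Python sketch of what "shimmed capabilities" could look like, purely as an illustration: the `Capabilities` record, `fixed_location`, and `map_widget` are hypothetical names, not any real browser or WASM API. The point is only that the script sees whatever the user agent hands it and cannot tell a real GPS from a substituted one.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, Optional, Tuple

Fix = Tuple[float, float]  # (latitude, longitude)


@dataclass(frozen=True)
class Capabilities:
    """What the user agent chooses to hand to a script (hypothetical)."""
    gps: Optional[Callable[[], Iterator[Fix]]] = None


def fixed_location(lat: float, lon: float) -> Callable[[], Iterator[Fix]]:
    """A shim: a GPS stream of the user's choice instead of the real device."""
    def stream() -> Iterator[Fix]:
        while True:
            yield (lat, lon)
    return stream


def map_widget(caps: Capabilities) -> str:
    """Stand-in for an untrusted script; it can only use what it was given."""
    if caps.gps is None:
        return "location unavailable"
    lat, lon = next(caps.gps())
    return f"offline map centred on {lat:.2f},{lon:.2f}"
```

Calling `map_widget(Capabilities())` degrades gracefully, while `map_widget(Capabilities(gps=fixed_location(0.0, 0.0)))` runs normally on a location the user picked.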
> So... I think scripting is actually really important -- otherwise not only are you stuck with the lowest common denominator of all browsers, but the browsers need to implement a billion bug-prone views -- that map view link mentioned? Now you need a map viewer!
In the browser? The map viewer could just be a separate programme entirely, like a PDF viewer, etc. I remember watching rdg (the current main Dillo developer) demonstrating this with a separate map programme.
Most of your post seems to assume this "everything must be in the browser" approach, which is actively not what Dillo is about. (I would know, I use Dillo regularly.) It adheres to the Unix philosophy.
EDIT: Looking at it closely, did I just respond to an LLM post?
I feel like that's not solving any of the problems I think of the Web as having.
You can certainly make something with it, but I can't imagine most people finding a use for it.
I think original web standards were solving a completely different problem: sharing information.
Modern Internet is 45% appearances and 50% search traffic optimizations. For better or worse we lost all usable registries of websites, and we lost websites built without appearance or traffic considerations. Information-focused Web is pretty much dead.
Maybe these ideas did not scale and did not monetize that well, but we will never really know what an information-focused version of the Internet would have looked like because evolution took it elsewhere. Unless we try building another one with different principles and limitations at the core.
The current web supports flat information delivery, and it's there if you want it. Wikipedia can be presented in pure text. If you write a story or an essay you can post it in many places, including your own web site.
Perhaps what's needed is an alternative search engine. Assert that you will only index a site that meets some strict set of limits. If that's what people want they will use that engine. If it's popular, sites will have to find ways to get listed, e.g. "simple.amazon.com" which supports that standard.
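As a sketch of what "a strict set of limits" might mean for such an index, here is a small Python checker. The specific rules (no script elements, no inline `on*` handlers, a size cap) and the `MAX_BYTES` value are assumptions chosen for illustration; a real engine would pick its own.

```python
from html.parser import HTMLParser

MAX_BYTES = 100_000  # hypothetical size limit for an indexable page


class RuleChecker(HTMLParser):
    """Records markup that the hypothetical index refuses to accept."""

    def __init__(self):
        super().__init__()
        self.violations = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.violations.append("script element")
        for name, _value in attrs:
            if name.startswith("on"):
                self.violations.append(f"inline handler {name}")


def violations(page: str) -> list:
    """Return the list of rule violations; an empty list means indexable."""
    found = []
    if len(page.encode("utf-8")) > MAX_BYTES:
        found.append("page too large")
    checker = RuleChecker()
    checker.feed(page)
    checker.close()
    return found + checker.violations
```

A crawler would call `violations(page)` and skip anything that returns a non-empty list.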
I agree. Even where blogging and sharing information is still around, it is strongly linked with brand-building, monetization, and engagement-maxxing. Look at all the old Wordpress bloggers who switch to Substack in order to have some eyeballs on their posts, and then inevitably begin conforming to its ethos willingly or unwillingly.
For me, the information-sharing part of the internet now is the shadow libraries. I can get access to all (well, still not quite all) journals and university-press publications from the last century? Awesome. Vastly more informative than some blogger who nowadays is probably trying to monetize my attention.
The only sort of problem this might solve is the insanely low barrier of entry that the Web has in 2026. The Web was arguably a better (albeit imperfect) place when it was dominated by geeks and kids who could learn to use it faster than their elders. It was a club in a sense. Today it's a club where everyone on the planet is invited, meaning it's no longer a club. I know that sounds great to a lot of people, but I don't agree that systems become better with more participation and fewer criteria for that participation.
Even so, those who want to share and access information can already do that via the Web. Nobody has to use scripting. Nobody has to use The Google as their search. Nobody has to rely on an LLM. If there is demand for simple webpages that are free of scripting, they can be built and shared today. Because of this, the proposal comes off as very out of touch and deep within the HN bubble. Strict grammar for declaring documents is merely a fetish. If there's no scripting, then there's no reason for a document to break for some silly reason.
I don’t see how this helps someone who wants to create a website. You don’t have to use JavaScript on your website if you don’t want to and you can use a different format for your text files that translates to HTML. (Markdown is a popular alternative, but you can invent something different.) What’s the upside in requiring your audience to use a different browser than they normally use?
This is what you want: https://en.wikipedia.org/wiki/Gemini_(protocol)
I have been thinking about this problem for a couple of years so far, but I also needed something censorship-free with built-in authentication and authorization. I ended up creating a protocol called Mau that hasn't been implemented or used yet, so it's been waiting around here. Check it out if you think it can help: https://mau.social/
Why not try to define a strict subset of the current specs, that would target ease of implementation & graceful degradation? I'd rather have many different clients compatible with a "web-lite" spec that is enough to navigate on 95% of websites, which would have an incentive to officially support that subset if it becomes popular enough.
History explains why HTML is now a living standard: https://whatwg.org/faq (Ctrl+F Living and keep reading).
> A published version of the standard NEVER, EVER, EVER, EVER changes.
WHATWG does have per-commit snapshots of the standard. They're just not semantically versioned because it is a living standard.
I think what the author wants is something like Gemini instead of HTML, but that has its own set of problems. My plea for Dillo would be to instead just support a text/markdown mime-type natively and we can try for adoption in more browsers.
> The objective is not to create a feature-by-feature clone of the Web, but to create an specification that allows humans to exchange knowledge, notes, and other forms of information without the imposed requirement of having to run a full blown VM to read it.
Markdown in browsers fits your objective! The only gotcha is CommonMark extensions, and they can work with sub-type declarations in the MIME type.
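One way the sub-type declaration could work, as far as I understand it, is the optional `variant` parameter that RFC 7763 registers for `text/markdown`. A minimal Python sketch of a client reading it follows; the function name is made up, and what a browser would do with an unknown variant is left open.

```python
from email.message import Message


def parse_markdown_type(content_type_header: str):
    """Return the Markdown parameters of a Content-Type header, or None.

    Uses the standard library's header parser; a missing parameter
    comes back as None.
    """
    msg = Message()
    msg["Content-Type"] = content_type_header
    if msg.get_content_type() != "text/markdown":
        return None  # not Markdown, let another handler deal with it
    return {
        "variant": msg.get_param("variant"),
        "charset": msg.get_param("charset"),
    }
```

For example, `parse_markdown_type("text/markdown; charset=UTF-8; variant=CommonMark")` yields the variant name a renderer could switch on.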
> I think what the author wants is something like Gemini instead of HTML, but that has its own set of problems.
Yes and no. I want it to be simpler than HTML (which implies fewer features) but easy to parse. The problem with Markdown and other "text-like" formats is that they are designed to be written by humans (which is good), but that complicates the parsing. I guess it is more similar to the device-independent format used by groff/troff before layout.
>My plea for Dillo would be to instead just support a text/markdown mime-type natively and we can try for adoption in more browsers.
Dillo only supports a subset of HTML. Other formats like markdown are converted to HTML with plugins or read as plain text.
We should start from a single, sane specification. That is not a descriptor for the Markdown ecosystem.
This sentence highlights the reason why these efforts fail despite any original good intentions:
"as soon as a monopolistic entity can build a mechanism to extract revenue from it, there will be an incentive to capture the standard and change it to for their own benefit"
Personally I'd love a simple semantic versioned subset of the web. The required traction and buy-in from existing key players (browser vendors, web hosting platforms etc) makes it largely a non-starter though. I'd love to be wrong though.
Instead of "forking", it may be more prudent to extend or revive something more like Gopher, so you don't constantly get barraged by incompatible sites (like you would in a forked web).
I mostly agree with the article - I believe the differentiation should be between documents and applications.
While HTML serves its purpose, especially for documents, the modern web is a giant mess of that legacy, combined with unfriendly ergonomics and glue/hacks built on top just so we as developers can have better DX for creating complex software on top of it.
Building a browser means having to deal with all that legacy, whether we like it or not, so most of the browser market got captured by the big players who have enough manpower to cover all those edge cases. That also means we have to deal with whatever technical choices or bloat they make, causing an infinite stream of issues, from memory usage, to size, to limitations that don't make sense in 2026 but are still there because someone 20 years ago decided to write them like that. As I deal with mobile webviews a lot in my daily work, I unfortunately had to get familiar with quite a few gotchas and edge cases, and some are just... absurd in this day and age.
I believe we need a separation between an application layer and the document layer, and especially between the UI language and the actual application code - script tags serve their purpose, but again, they are a hacky solution with its own bag of tricks, and those tricks impact all of the software built upon it.
Now, a bit of a shameless plug: I've been working on something to fill that gap, at least for myself and hopefully for others who encounter the same issue. It's called Hypen (https://hypen.space) and it's a DSL for building apps that work natively on all platforms, with strict separation of code/UI/state, and support for as many languages and platforms as I can maintain, not "just javascript". While currently it's focused on streaming UI, it's built with Rust and WASM at its core and will soon allow fully "compilable" apps.
While it may not be the future of software, once you get into building something like that, it becomes obvious that the way we are building now is at best wrong, and at worst kafkaesque.
All documents eventually become applications if they're useful enough. For anything that doesn't match that description, we have PDF.
> Adding scripting capabilities was a mistake, so we can avoid it now.
> Instead, you can provide a Geo link to open the location in any client that supports the protocol.
Sorry, but as someone old enough to remember when the web was mostly non-interactive I vastly prefer the current situation despite its many shortcomings. I want to keep a minimal amount of software on my computer. I don't want to give a hundred "clients" access to my computer when I can just run JavaScript sandboxed in my browser. If someone sends me a link and tells me it's a cool game he found online I will open it in my browser and have a look, but I will not just run random binaries on my computer. Oh, and I like being able to access any website just from my browser on my Linux, instead of hoping that there is a Linux client that isn't 5 years out of date or fiddling with wine to figure out why the windows binary wouldn't run.
I understand why people dislike the web sandbox or having to run a full blown VM for everything, but please understand that this is also what makes the web great. You can run everything and fear nothing.
You've misunderstood. The blog post is not talking about running random binaries. It's talking about opening links and files using different programmes, like PDF viewers, video players, etc. There's a video of a talk that the developer gave, which I can't find the link to at the moment, where he demonstrates running a map programme (already installed on the machine, not just fetched from a random website) to open a link with lat/lon coordinates with an interactive map.
In general, Dillo follows the Unix philosophy. You use separate programmes to handle things that Dillo can't itself, like watching videos.
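To illustrate the kind of hand-off described here, a small Python sketch that parses a `geo:` link (the URI scheme from RFC 5870) and picks an external programme for it. This is not Dillo's actual mechanism; the handler table and the programme name `map-viewer` are hypothetical, and only the basic `geo:lat,lon` form with optional parameters is handled.

```python
def parse_geo_uri(uri: str):
    """Return (lat, lon) from a geo: URI, ignoring altitude and parameters."""
    scheme, sep, rest = uri.partition(":")
    if sep != ":" or scheme.lower() != "geo":
        raise ValueError(f"not a geo URI: {uri!r}")
    coords = rest.split(";", 1)[0]  # drop parameters such as ;u=35
    parts = coords.split(",")
    if len(parts) not in (2, 3):
        raise ValueError(f"expected lat,lon[,alt]: {uri!r}")
    lat, lon = float(parts[0]), float(parts[1])
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError(f"coordinates out of range: {uri!r}")
    return lat, lon


def command_for(uri: str, handlers: dict):
    """Pick the already-installed programme registered for the URI scheme."""
    scheme = uri.partition(":")[0].lower()
    program = handlers.get(scheme)
    return None if program is None else [program, uri]
```

The browser would only build the command list, for example `command_for("geo:48.2010,16.3695", {"geo": "map-viewer"})`, and leave it to the user's own map programme to do the interactive part.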
no, parent understood correctly.
i use 50 different interactive web apps, i do not want to install 50 different apps
most of them do not have a "protocol" - what is the desktop equivalent of ExcaliDraw?
I completely agree. The situation before the "Web Platform" was the "Windows Platform" where you had to give money to Microsoft to use a computer because few developers wanted to make cross-platform software, and almost nobody wanted to make good cross-platform software. As a Mac user it was miserable.
> One of the problems with the Web is that as soon as a monopolistic entity can build a mechanism to extract revenue from it, there will be an incentive to capture the standard and change it to for their own benefit. In the particular case of the Web, this has resulted in a standard that grows out of control in complexity so it increases the barrier of entry for new browsers and reduces the competition.
Maybe I'm just stupid, but I don't really know what the author is talking about here. What parts of the standard? HTTP? HTML? DOM APIs? What?
>Adding scripting capabilities was a mistake, so we can avoid it now
Gemini protocol?
The purpose should also be defined. It should answer the question why. Also, what's broken with scripting and what alternatives are proposed? What's the end state (with an example usage of the new web)?
"Dillo Browser" was not what I first read, and I wondered if clicking the link was even a good idea... xD
Why? Dillo has been around forever, as long as w3m I think.
Edit: actually it looks like w3m was ‘95 and Dillo was ‘99.
dillo is close to dildo
From HN TOS:
> If you are under 13 years of age, you are not authorized to register to use the Site.
(By the way, are you aware that the largest bakery company in the US is named “Bimbo”? Tee hee! You should tell them to change their name!)
My moustache filter says I'm above 13, thankyouverymuch.
Can't say I hate the HTML 5 spec. It resolves the ambiguities that made previous HTML specs insufficient to make a working web browser.
The standards that make my life miserable at times are the secondary standards like GDPR and WCAG as well as the de facto "standard" systems we are forced to participate in such as Cloudflare, the advertising economy, etc.
It's easy to say "WebUSB is bloat" and I'd certainly say PWA is something that could only come out of the mind that brought us Kubernetes, but lately I've been building biosignals applications and what should my choice be: write fragile GUI applications for the desktop that look like they came out of a lab and crash from memory leaks or spend 1/5 the time to make web applications that look like they belong in the cockpit of a Gundam and "just work"?
>I'd certainly say PWA is something that could only come out of the mind that brought us Kubernetes
How so? PWAs are awesome! Democratizing for users. Democratizing for developers. They work well for the right class of apps. They would go much further if there weren't forces actively resisting them. Think of all the electron type-apps out there. Now imagine if the average Joe could just install them from the web with 2 clicks.
(Regular ole bookmarks get you a decent percent of the way but clearly something extra than that was needed.)
PWAs are it. They would go well with PDAs, but the existing frameworks are terrible. Next, React powering the future internet? Don't make me laugh.
Just use HTML and CSS, ignore javascript.
There are technical limitations. What I want to do now requires JavaScript. I dislike JS, but I have maxed out the abilities of the HTTP POST method.
I am generally interested in approaches to cut down complexities of fundamental web technologies. Creating a browser from scratch shouldn't be impossible or a trillion dollar experiment. But...
> No scripting
How will it be possible to go back? The average ecom presence usually relies heavily on JS. I haven't checked in a long time whether any relevant sites work without JS. I think going back to more basic approaches could even improve user experience, as many usage patterns probably would converge and simply look and function as intended. But considering that the whole web world is so fixated on solving everything with JS, this seems like picking the target with the highest resistance you can find. Don't get me wrong, I hate this situation and we must not have a single language that dominates everything.
I also don't believe that enthusiasts will create a significant shift. They can surely provide the fundamentals, but if there isn't a huge mainstream impact, it will not change anything.
JavaScript isn't inherently bad. It has earned that reputation because of bloat.
No one writes plain vanilla JavaScript any more, so it's always a framework, and the existing ones are failures.
It's shocking how 2 MB of JavaScript could be done in 2 KB, and that JavaScript was never designed for use at that scale.
Isn’t the web forked enough already?
Isn't there already smolweb?
Dillo. This is a hot take of hot takes. But, I think it's correct. Let me know how I can help?
Good idea, we absolutely should replace the Web, but I have some issues with this proposal:
- We don't want multiple versions (1.1.1, 1.2.1), but we also don't want constant churn (the current dev/product fad). What we want is one thing that works well indefinitely, is backwards compatible, changes infrequently, and can be expanded if necessary. In order to achieve that, we have to abandon the idea of monolithic web browsers.
"The Web" is not a hypertext document viewer, as much as some people (myself, and Dillo probably) would like it to be. It is an application platform. So you must consider the needs of an application platform if you want a "new Web". The browser interfaces with the entire OS + a slew of protocols and libraries. It's Android in userland. It will change as constantly as OSes and tech changes, which is constant. So to get away from churn, we need to break up the application platform into layers. Those layers need to have simple, well-defined backwards-compatible interfaces, with extensions. The model for that has been around for decades; network protocols last 60+ years without needing to be replaced, but add features over time, without getting feature creep, and remain backwards compatible. There aren't a ton of versions of common internet network protocols. And importantly, you don't have to use one implementation, the way people get stuck on one browser.
The standard should follow this extremely well established pattern of layers of independent components which aren't built into a monolith. It can still have a version (initially), but we shouldn't need to change the version, we add feature flags and handshakes, the way network protocols do. The end result should be a combination of a "web POSIX" + "layered protocols/specs".
- "Pages that don't conform with the specification won't be rendered" - this simply is never going to happen. The history of software development is littered with examples of having to work around implementations of specifications. Your client can try to render strictly, but it will inevitably break on someone's implementation, and you will be forced to deal with it, or lose your customers/users.
"Having a strict grammar will likely cause humans to migrate to a language that is easy to write and is more forgiving ... The objective is that parsers can be simplified and the cost of creating tools that can manipulate the content is lowered" - This sounds like you're saying, programming is hard, so let's make the user have to work around our inability to solve hard problems. Easy is not always better.
- "Resistance to standard capture" - I think this goes back to the layers. Remember you are building an entire Application Platform. Think about Linux and Open Source. How does it resist capture? Independent organizations and authors, loose associations, cobbled together components. There is nobody in control, so you can't capture it. This is actually the same with network protocols (other than HTTP, we all know Google controls the spec). We can take ideas from many places. As just one random example: MCP is a simple yet powerful way for independent entities to add functionality to an application both locally and remotely, yet is independent of both the client and the server. Another example is Plan9, where you can support anything in the world and use it as a file (both locally and remotely), as long as you make and run the driver for it.
- "Text first" - You just lost the room. If you want text only, stick to Gopher. An application platform requires multimedia. You would do well to craft the spec so that it can convert application presentation into a text structure. Sell it as accessibility.
- "No scripting" - Now your proposal is dead. Again, Application Platform!! People want a way to cheaply deliver and run application code in real time. I think this needs a lot of careful attention, because you don't want to continue the status quo of requiring a single monolith to interpret and execute logic for the entire application platform.
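The "feature flags and handshakes" idea from the first point above can be sketched in a few lines of Python. This is only an illustration of the pattern, not any existing protocol: the feature names and the `negotiate` function are hypothetical.

```python
def negotiate(client_features, server_features, required=()):
    """Return the features both sides support.

    Unknown features are simply ignored, which is what lets old and new
    implementations keep talking without a version bump. Only features
    the caller marks as required can make the handshake fail.
    """
    common = set(client_features) & set(server_features)
    missing = set(required) - common
    if missing:
        raise ValueError(f"peer lacks required features: {sorted(missing)}")
    return common


# Hypothetical feature names, for illustration only.
OLD_CLIENT = {"text", "links"}
NEW_SERVER = {"text", "links", "images", "forms"}
```

Here `negotiate(OLD_CLIENT, NEW_SERVER)` settles on text and links, and the server's newer features never break the older client.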
Seems like somebody is not accepting that every successful project will grow and become unwieldy like this. This is all legacy backwards compatibility of all iterated ideas that now you have to support.
"No scripting" is essentially setting the watch back ~30 years to Mosaic.
It would be great to differentiate between "static" and "dynamic" pages based upon scripting, IMO.
At this point we need a fork of not just the web but the entire internet, one built for privacy.
I support forking the web, into the simple information web services that the web started with. This is a magnificent idea.
Ah yes, another "If I Were King" blog post. For an example of how it will turn out, look at how many JavaScript frameworks have been built to replace an overly complicated, unwieldy previous one.
oh and also https://xkcd.com/927/
Dillo's author (and users) don't give a crap about JS.