So I just tried this with a bunch of medium-complex documents and it's wildly wrong. I suspect the authors have never seen an actually complicated Word document?
Do you have any actual examples of documents where it did not work?
My examples are all internal/confidential, but if someone from the project wants an example I could probably do some search/replace redaction. It would be a lot of work though because there's photographs and such too, and indexes, and tables, and documents inserted by reference, cross references, conditional fields, bibliography fields, formula fields, etc etc.
vibe coded, it seems from the commit history (and readme lol)
Office XML is surprisingly complex under the hood. The format packages multiple XML streams, relationships, and content types into a ZIP — making debugging without specialized tooling painful.
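To make the packaging point concrete, here is a minimal sketch using only the Python standard library. It opens a package as a ZIP and maps each part to the content type declared in `[Content_Types].xml`. The helper names `list_parts` and `make_demo_package` are made up for illustration, and the demo package is a toy with just enough structure to show the idea, not a valid Word file.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# Namespace used by [Content_Types].xml in OOXML packages.
CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"


def list_parts(data: bytes) -> dict[str, str]:
    """Map each part name in the package to its declared content type."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        root = ET.fromstring(zf.read("[Content_Types].xml"))
        # Default entries apply by file extension, Override entries by part name.
        defaults = {
            d.get("Extension").lower(): d.get("ContentType")
            for d in root.iter(CT_NS + "Default")
        }
        overrides = {
            o.get("PartName"): o.get("ContentType")
            for o in root.iter(CT_NS + "Override")
        }
        parts = {}
        for name in zf.namelist():
            if name == "[Content_Types].xml" or name.endswith("/"):
                continue
            part = "/" + name
            ext = name.rsplit(".", 1)[-1].lower() if "." in name else ""
            parts[part] = overrides.get(part) or defaults.get(ext, "unknown")
        return parts


def make_demo_package() -> bytes:
    """Build a toy in-memory package (hypothetical, NOT a valid .docx)."""
    content_types = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
        '<Default Extension="rels" '
        'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
        '<Override PartName="/word/document.xml" '
        'ContentType="application/vnd.openxmlformats-officedocument.'
        'wordprocessingml.document.main+xml"/>'
        "</Types>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", content_types)
        zf.writestr("_rels/.rels", "<Relationships/>")
        zf.writestr("word/document.xml", "<document/>")
    return buf.getvalue()


if __name__ == "__main__":
    for part, content_type in list_parts(make_demo_package()).items():
        print(part, "->", content_type)
```

Pointing `list_parts` at the bytes of a real file is usually the quickest way to see how many separate streams a "single document" actually contains.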
Rendering to HTML Canvas is a pragmatic choice. We work with legal documents daily and the fidelity gap between native Office rendering and HTML-based viewers is one of those "last 10%" problems that takes 90% of the effort. Things like tracked changes formatting, table layout inheritance, and nested content controls rarely render correctly in lightweight viewers.
For document-heavy workflows (legal, compliance, procurement), having a viewer that preserves structural fidelity — especially revision marks and annotations — is table stakes. Most web-based solutions we tested lost formatting on documents with complex nested structures.
Interesting approach. Does the Canvas rendering handle tracked changes and inline comments? That is where most viewers break down.
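For anyone wanting to check whether a given file exercises this at all, a rough sketch: tracked insertions and deletions show up in `word/document.xml` as `w:ins` and `w:del` elements, and comment anchors as `w:commentReference`. The function name `summarize_revisions` and the sample XML are made up for illustration; this only counts markup and says nothing about how any viewer renders it.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def summarize_revisions(document_xml: str) -> dict[str, int]:
    """Count tracked-change and comment markers in a document.xml string."""
    root = ET.fromstring(document_xml)
    return {
        "insertions": sum(1 for _ in root.iter(W + "ins")),
        "deletions": sum(1 for _ in root.iter(W + "del")),
        "comment_refs": sum(1 for _ in root.iter(W + "commentReference")),
    }


# Hand-written sample (hypothetical): one insertion, one deletion, one comment anchor.
SAMPLE = (
    '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    "<w:body><w:p>"
    '<w:ins w:id="1" w:author="A"><w:r><w:t>added</w:t></w:r></w:ins>'
    '<w:del w:id="2" w:author="A"><w:r><w:delText>removed</w:delText></w:r></w:del>'
    '<w:r><w:commentReference w:id="0"/></w:r>'
    "</w:p></w:body></w:document>"
)

if __name__ == "__main__":
    print(summarize_revisions(SAMPLE))
```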
I don't know why this was flagged, but you are right.
Google Docs [1] and OnlyOffice [2] also employ the canvas method to render office documents, and have found it reliable and consistent among different browsers.
[1]: https://workspaceupdates.googleblog.com/2021/05/Google-Docs-...
[2]: https://helpcenter.onlyoffice.com/faq/technology.aspx
The account is flagged because it's a bot account.
For those interested here * is a similar project that I believe is *not* vibe coded posted a few weeks ago **.
* https://github.com/eigenpal/docx-editor
** https://news.ycombinator.com/item?id=48228411
quite a few "... and claude" commits lately at least.
There's a genuine difference between pure vibe coding and just using Claude to polish things a bit.
Fair, though for people avoiding LLM-using projects entirely it doesn't make a huge difference.
If someone actually got "pixel-faithful" Office documents rendering correctly, MS would be screwed. That's actually really important for a lot of companies that carry around decades-old templates that never look exactly right in LibreOffice or any other software that attempted to replicate it.
The slightest misalignment of a paragraph means a line on page 27 of 120 now moved down by 2 pixels, screwing everything else out of alignment. Yes, plenty of companies pay Microsoft 365 subscriptions because of exactly this reason; it sounds ludicrous when you think they could just pay someone to replicate the formatting in a different suite a lot less than the subscription costs, but that's not how it works...
Sadly, Microsoft 365 is not “pixel perfect” compared to Word. I often run into headaches where line numbers are different between the two and content ends up on different pages.
If Microsoft can’t get consistent rendering of Word docs between Word for Windows, Word for macOS and Office 365, I don’t like anyone else’s chances.
Can the same version of Word now produce the same rendering on two PCs? In the past (I didn't really check recently, I'm thinking more about 10 years ago) the same file might have had page breaks in different positions and things like that. I never understood if it came from slightly different versions of the fonts, from some info derived from the default printing or anything else...
I've heard that the same file can render differently in MS Word across different machines and OSs. So, that won’t help either.
I would be very surprised if that's the case. I heard from a buddy that used to work in Office back when SDETs were a thing. They had labs of random machines rendering the same files (out of a library of thousands) and comparing the actual pixels for regressions.
Of course, that was a decade ago, so who knows.
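The pixel-comparison idea is simple to sketch. Assuming you already have two renders of the same page as raw RGBA byte buffers of equal size (how you get them is up to your renderer), something like the following reports what fraction of pixels differ. `diff_ratio` and its `tolerance` parameter are illustrative names, not anything from Office's actual test labs.

```python
def diff_ratio(a: bytes, b: bytes, width: int, height: int, tolerance: int = 0) -> float:
    """Return the fraction of pixels that differ between two RGBA buffers.

    A pixel counts as different if any channel differs by more than `tolerance`.
    """
    expected = width * height * 4
    if len(a) != expected or len(b) != expected:
        raise ValueError("buffers must both be width * height * 4 bytes")
    changed = 0
    for i in range(0, expected, 4):
        if any(abs(a[i + c] - b[i + c]) > tolerance for c in range(4)):
            changed += 1
    return changed / (width * height)


if __name__ == "__main__":
    white = bytes([255, 255, 255, 255])
    black = bytes([0, 0, 0, 255])
    baseline = white * 4               # 2x2 all-white render
    candidate = white * 3 + black      # same render with one pixel changed
    print(diff_ratio(baseline, candidate, 2, 2))
```

A small nonzero tolerance is a common way to avoid flagging anti-aliasing noise, at the cost of missing very subtle color shifts.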
Interesting because I'm building ooxml-cli right now, for editing pptx, docx, xlsx. At work I had to adapt a pptx to a corporate template and tried via agent. It kept failing, so I started building the tool, and then the agent was able to do what I needed relatively quickly and accurately. Then I needed it to make tables and add pictures. Recently I wanted to get data from an xlsx and replace text in a presentation, etc.
So the tool is growing and maybe this would be interesting to have as the non LibreOffice dependent viewer...
I have a bunch of goodies at https://rcarmo.github.io/projects/go-ooxml and https://rcarmo.github.io/projects/python-office-mcp-server you might enjoy then.
I ran it against some of our internal test files, and it failed on all of them. How was this project even tested for proper compliance?
No human-written application code exists in this repository.
It's 100% hallucinated.
Misread that as OpenOffice XML, not Office Open XML. I wish the standards had more distinct names. They are too easy to confuse.
Microsoft did that deliberately.
Very nice. The rendered demos for all the file types appear to render flawlessly and load instantly on page load. Looking in DevTools, the parsers are split into separate Wasm bundles for each file type (xlsx, docx and pptx):
docx 458kb raw 217kb gzipped
pptx 574kb raw 253kb gzipped
xlsx 601kb raw 269kb gzipped
I expected the Wasm bundles to be a lot bigger than that for some reason.
ChatGPT.com could benefit from using this library (or one like it) to render a preview of the file in a side panel on the right, instead of just giving me a download link to the outputted/transformed docx/pptx/xlsx file.
I tried a few PPTX files from consulting firms that are available online. The rendering does not seem to be truly pixel-perfect, but all of them were quite readable and had a good layout, which is already an impressive feat.
It's kind of sad that the first thing in the repo is a mention that no human was involved in the programming.
As others here have already mentioned, it doesn't work all that well either, proving that AI can't replace humans completely.
Total "damned if I do, damned if I don't" situation. I put a similar disclaimer on my AI stuff too. It would be much easier if they didn't mention it. But if they're like me, they want to give people the means for an informed decision. We can respect that and move on. Do we even know if actual big software companies aren't doing it?
"LLMs are amazing, I'm so much more productive now"
"oh yeah? Show me what you made, you can't, nobody can, it's all just AI psychosis"
"I made a pixel perfect Office document viewer"
"well... I wish you hadn't"
“If you use LLMs, you’re not a real developer, you’re lazy.”
The best developers are lazy.
Obligatory: https://xkcd.com/378/
Would author be able to do it otherwise? Is particular tool choice making result worse?
Bit identical/pixel-faithful reproductions are easy to verify…
> Is particular tool choice making result worse?
Well, yes, because it doesn't work.
> Bit identical/pixel-faithful reproductions are easy to verify…
And yet the prompter put so little effort in they couldn't even verify the software they prompted for does what it's supposed to.
Why do you say it doesn't work? It seems to work fine for the feature parity they claim.
For “pixel-faithful” documents it doesn’t seem to be replicating documents all too well based on the many other testimonies in these comments
>doesn’t seem to be replicating documents all too well based on the many other testimonies in these comments
Ironic, that you are agreeing with a post saying they put in little effort for the implementation when you have put in absolutely no effort in saying that it doesn't produce pixel-faithful documents, such as producing a single concrete example.
I'm fine with that, even as someone who hates AI.
Would this project exist otherwise? I doubt it.
Which means it probably gets all the hallucinated assets right and any real-world documents wrong.
Still, looks pretty; if it actually has proper testing, could close the gap. Code not being the hard part is a major impediment to good software coming out of these things.
Pretty cool, rendering PowerPoint files to an image is probably the only way for LLMs to make sense of them.
Does this work in Cloudflare’s workerd environment? Would be nice to have a cheap serverless render -> LLM (GLM-OCR / PaddleOCR) -> Markdown pipeline for the various MS Office formats.
This code creates a JSON intermediate representation that LLMs could probably consume. You might want to simplify it to focus on content and reduce token usage.
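As a rough illustration of the "simplify it" suggestion: the sketch below assumes a made-up node shape with `type`, `text`, `children` and `style` keys. That shape is purely hypothetical and is not the project's actual intermediate representation, so the key names would need adapting to whatever the real JSON looks like.

```python
import json

# Keys worth keeping for an LLM; everything else (styles, geometry) is dropped.
# These key names are assumptions for illustration only.
KEEP = ("type", "text")


def strip_presentation(node: dict) -> dict:
    """Recursively keep only content-bearing keys from a hypothetical IR node."""
    slim = {key: node[key] for key in KEEP if key in node}
    children = [strip_presentation(child) for child in node.get("children", [])]
    if children:
        slim["children"] = children
    return slim


if __name__ == "__main__":
    example = {
        "type": "paragraph",
        "style": {"font": "Calibri", "size": 11},
        "children": [
            {"type": "run", "text": "Hello", "style": {"bold": True}},
        ],
    }
    full = json.dumps(example)
    slim = json.dumps(strip_presentation(example))
    print(len(full), "->", len(slim), "characters")
```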
I literally had to solve the "preview Office files in the browser" problem last week. I couldn't find a decent solution, so I ended up making an endpoint that ran the files through headless LibreOffice on the server to convert them to PDF.
For PPTX and DOCX, this solution is slightly worse than LibreOffice conversion (this does not appear to output highlightable text, while PDF conversion does).
However, the XLSX preview BLEW my mind considering this was AI coded. Really good, even interactive!
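For reference, the headless LibreOffice route can be sketched roughly like this. It assumes a `soffice` binary is on the server's PATH; the function names and the timeout value are illustrative, and this is not the commenter's actual endpoint code.

```python
import subprocess
from pathlib import Path


def build_convert_command(src: Path, out_dir: Path, binary: str = "soffice") -> list[str]:
    """Build the LibreOffice command line for a headless PDF conversion."""
    return [
        binary,
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(src),
    ]


def convert_to_pdf(src: Path, out_dir: Path, timeout: int = 120) -> Path:
    """Run the conversion and return the expected output path.

    Raises CalledProcessError on a non-zero exit and TimeoutExpired on a hang.
    """
    subprocess.run(build_convert_command(src, out_dir), check=True, timeout=timeout)
    # LibreOffice writes <input stem>.pdf into the output directory.
    return out_dir / (src.stem + ".pdf")
```

A call would look like `convert_to_pdf(Path("deck.pptx"), Path("/tmp/out"))`. The timeout matters in practice, since a malformed upload can otherwise leave a conversion process hanging on the server.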
> ...this does not appear to output highlightable text
Yeah, it does.
https://ooxml.silurus.dev/storybook/?path=/story/docxviewer-...
I can't highlight text in Safari or Firefox on my iPhone (iOS 26.5), at least on that first page.
I'm not familiar with this application, so perhaps I'm missing a step or an editing mode.
[flagged]
I have been working on a similar implementation. I wanted to understand whether you were able to make changes to the rendered content and then download the updated file again. I will go through your code in detail. Thanks.
From my experience, the team that's the closest to open-source pixel-perfect Word support is https://github.com/superdoc-dev/superdoc (unaffiliated)
Rendering Office XML directly to Canvas is clever — avoids the heavy DOM tree for large documents. Does it handle embedded media (images/charts) or just text + formatting for now?
> office-open-xml-viewer
The post title about it being "pixel-faithful" is a bit strange. I don't see that claim in the repo, and they don't seem to even claim full feature support at the moment. And for the features marked as supported in .pptx's, it does seem that at least slide image backgrounds and bullet point images aren't actually working, and some text objects have inverted text colors. Seems quite far away from being pixel-faithful in fact.
That phrasing was taken from their website: https://ooxml.silurus.dev/
Seemingly vibe-coded
Both the GitHub repository and the website explicitly claim the project is entirely generated by Claude.
no wonder it doesn't work well
[dead]
[flagged]
If you want to delete your account you can just email dang.
Mention you're the guy who called him an idiot publicly and he'll still be happy to help - guy has the patience of a saint.
"Built entirely by Claude through iterative prompting."
Holy cow!!