Henri Sivonen's Website!

Check out all the cool stuff I've done!

Article Title Description
encoding_rs: a Web-Compatible Character Encoding Library in Rust encoding_rs is a high-decode-performance, low-legacy-encode-footprint and high-correctness implementation of the WHATWG Encoding Standard written in Rust.
chardetng: A More Compact Character Encoding Detector for the Legacy Web chardetng is a new small-binary-footprint character encoding detector for Firefox written in Rust.
How I Wrote a Modern C++ Library in Rust Patterns that I used to make encoding_rs appear as a modern C++ library to C++ code.
It’s Not Wrong that "🤦🏼‍♂️".length == 7 In this post, I will try to convince you that ridiculing JavaScript for this is less insightful than it first appears.
IME Smoke Testing In early 2019, I found myself in a situation where I needed to check that I hadn’t broken IME integration code.
Why Supporting Unlabeled UTF-8 in HTML on the Web Would Be Problematic UTF-8 has won. Yet, Web authors have to opt in to having browsers treat HTML as UTF-8 instead of the browsers Just Doing the Right Thing by default. Why?
Always Use UTF-8 & Always Label Your HTML Saying So To avoid having to deal with escapes, always encode your HTML as UTF-8 and label it as such.
Activating Browser Modes with Doctype A document about the essentials of the layout modes of newer browsers.
HOWTO Avoid Being Called a Bozo When Producing XML Dos and don’ts about producing XML programmatically.
The Sad Story of PNG Gamma “Correction” Why you might not want to use PNG images when you want image colors and CSS colors to match.
An HTML5 Conformance Checker My master’s thesis
Assembling Web Pages Using Document Trees A paper about a template engine that operates on XML document trees. (Source code available.)
Tag Soup: How Mac IE 5 and Safari handle <x> <y> </x> </y> What happens with the DOM in Safari and Mac IE 5 when the nesting of the markup is broken?
Thoughts About a Print UI for Mozilla Some thoughts about printing from a Web browser.
Digitaalisesta arkistoinnista Documents about archiving digital documents (in Finnish)
Can Anti-DRM Clauses in Content Licenses be Free? Are anti-DRM clauses a good idea?
Älä käytä Creative Commons 1.0 -lisenssejä – käytä 2.5-sarjaa The Finland version of the Creative Commons suite of licenses is still at 1.0. (in Finnish)
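As a small illustration of the counts behind the "🤦🏼‍♂️".length == 7 article listed above, the following Python sketch measures the same string in three units. The variable names are illustrative assumptions, and the snippet only shows the arithmetic; it is not taken from the article itself.

```python
# The facepalm emoji from the article title, spelled out as Unicode scalar values:
# U+1F926 FACE PALM, U+1F3FC skin tone modifier, U+200D ZERO WIDTH JOINER,
# U+2642 MALE SIGN, U+FE0F VARIATION SELECTOR-16.
facepalm = "\U0001F926\U0001F3FC\u200D\u2642\uFE0F"

# UTF-16 code units: the unit that JavaScript's .length counts.
# "utf-16-le" is used so that no byte order mark is added to the output.
utf16_code_units = len(facepalm.encode("utf-16-le")) // 2

# UTF-8 code units (bytes): the unit that Rust's str::len() counts.
utf8_bytes = len(facepalm.encode("utf-8"))

# Unicode scalar values: the unit that Python 3's len() counts.
scalar_values = len(facepalm)

print(utf16_code_units, utf8_bytes, scalar_values)  # prints: 7 17 5
```

All three numbers describe the same single user-perceived character; they differ only in the unit of measurement.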
Software: encoding_rs A Web-Compatible Character Encoding Library in Rust. (Used in Firefox.)
Validator.nu Validation 2.0.
The Validator.nu HTML Parser An implementation of the HTML5 parsing algorithm in Java. (Used in Firefox by the means of automated translation to C++.)
SaxCompiler SaxCompiler is a tool for recording SAX ContentHandler events as Java code that can play back the events without parsing XML.
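The record-and-play-back idea behind SaxCompiler can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not SaxCompiler's implementation: SaxCompiler emits Java source code, whereas this sketch uses Python's `xml.sax` and keeps the recorded events in an in-memory list. The names `Recorder`, `play_back` and `TextCollector` are hypothetical, and only three event types are handled.

```python
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


class Recorder(ContentHandler):
    """Records a subset of SAX events as (method name, arguments) tuples."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Copy the attributes into a plain dict so the record is self-contained.
        self.events.append(("startElement", (name, dict(attrs.items()))))

    def endElement(self, name):
        self.events.append(("endElement", (name,)))

    def characters(self, content):
        self.events.append(("characters", (content,)))


def play_back(events, handler):
    """Replays recorded events into another ContentHandler without parsing XML."""
    for method, args in events:
        if method == "startElement":
            name, attrs = args
            handler.startElement(name, AttributesImpl(attrs))
        else:
            getattr(handler, method)(*args)


class TextCollector(ContentHandler):
    """Example consumer: gathers element names and character data."""

    def __init__(self):
        super().__init__()
        self.names = []
        self.text = []

    def startElement(self, name, attrs):
        self.names.append(name)

    def characters(self, content):
        self.text.append(content)


# Parse once while recording, then replay the events with no parser involved.
recorder = Recorder()
xml.sax.parseString(b'<doc id="1"><p>Hello</p></doc>', recorder)

collector = TextCollector()
play_back(recorder.events, collector)
print(collector.names, "".join(collector.text))
```

Generating source code instead of a list, as SaxCompiler does, means the playback side needs neither the original XML nor a parser at run time.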
Blogish Notes Articles about various technical topics.
Installing Linux on 2009 Mac Mini in 2024 Boot from BIOS-only Debian CD.
Ubuntu Custom Partitioning: Failed to start D-Bus System Message Bus Rerun the Ubuntu install with the boxes checked to make the installer reformat the file systems you are installing onto.
The Text Encoding Submenu Is Gone Firefox 91 is the first release that does not have a Text Encoding submenu and instead has a single menu item called Repair Text Encoding.
Bogo-XML Declaration Returns to Gecko Firefox 89 was released today. This release (again!) honors a character encoding declaration made via syntax that looks like an XML declaration used in text/html (if there are no other character encoding declarations).
A Look at Encoding Detection and Encoding Menu Telemetry from Firefox 86 Concluding that chardetng should tolerate Big5 byte sequences that the Encoding Standard treats as unmapped.
Rust Target Names Aren’t Passed to LLVM Rust’s i686-unknown-linux-gnu target requires SSE2 and, therefore, does not mean the same as GCC’s -march=i686. It is the responsibility of Linux distributions to use a target configuration that matches what they intend to support.
Text Encoding Menu in 2021 This post is about a UI feature that I wish no one would have to use. Happily, it is indeed almost unused. Still, I made it more usable in the case when it is used.
Rust 2021 It is again the time of year when the Rust team is calling for blog posts as input to the next annual roadmap. This is my contribution.
Rust 2020 It’s again the time of year when the Rust Core Team calls for blog posts for input into the next year’s roadmap. This is my contribution.
It’s Time to Stop Adding New Features for Non-Unicode Execution Encodings in C++ I think the C++ standard should adopt the approach of “Unicode-only internally” for new text processing facilities and should not support non-Unicode execution encodings in newly-introduced features.
Rust 2019 The Rust team encouraged people to write blog posts reflecting on Rust in 2018 and proposing goals and directions for 2019. Here’s mine.
Using cargo-fuzz to Transfer Code Review of Simple Safe Code to Complex Code that Uses unsafe I used model-based testing with coverage-guided fuzzing to gain confidence in the correctness of encoding_rs::mem.
#Rust2018 The Rust team encouraged people to write blog posts reflecting on Rust in 2017 and proposing goals and directions for 2018. Here’s mine.
No Namespaces in JSON, Please I think that experience from Namespaces in XML should lead to the conclusion not to repeat the same (or almost same) thing with JSON.
Julkisesti luotettu varmenne ikidomainille TLS:ää (SSL:ää) varten Previously it was impractical to get a publicly trusted TLS certificate for an iki domain (e.g. hsivonen.iki.fi). Thanks to Let’s Encrypt performing validation on a per-hostname basis, it’s now practical to get a publicly trusted certificate for an iki domain. (in Finnish)
If You Want Software Freedom on Phones, You Should Work on Firefox OS, Custom Hardware and Web App Self-Hostability To achieve full-stack Software Freedom on mobile phones, I think it makes sense to focus on Firefox OS, commission custom hardware and develop self-hostable Free Software Web apps and an easy deployment platform for them.
Character Encoding Menu in 2014 This post is about a UI feature that I wish no one would have to use. Happily, it is indeed almost unused. Still, I made it more usable in the case when it is used.
Thoughts on HTML5 Becoming a W3C Recommendation Since I’ve participated in the development of HTML5 for a decade now (since before it was commonly called “HTML5”), I’ve been asked for my thoughts about HTML5 becoming a W3C Recommendation. Hence, I figured I’d post something here.
Four Finnish Banks Training Users to Give Banking Credentials to Another Site A person who turns to me for technical advice was logging in to a government service using banking credentials for a bank called Handelsbanken. However, the page that was asking for the Handelsbanken login credentials was not served from https://*.handelsbanken.fi/! After investigating what was going on, I decided to review how other banks in Finland handle this. Here are my findings.
What is EME? It was suggested at the Mozilla Summit that there isn’t good information around about what Encrypted Media Extensions (EME) actually is. Since I’m on the HTML working group and have been reading the email threads about EME there, I thought that I could provide an introduction that explains things that may not be apparent from the specification itself.
Accept-Charset Is No More Now that Firefox 10 has been released, only Chrome among the major browsers sends the Accept-Charset HTTP header.
WebM-Enabled Browser Usage Share Exceeds H.264-Enabled Browser Usage Share on Desktop (in StatCounter Numbers) Looking at StatCounter stats, it occurred to me that they might not match the common narrative about H.264 market share. I decided to run some numbers using StatCounter stats.
Vendor Prefixes Are Hurting the Web I think vendor prefixes are hurting the Web. I think we (people developing browsers and Web standards) should stop hurting the Web.
HTML5 Parser-Based View Source Syntax Highlighting A new implementation of the View Source HTML and XML syntax highlighting has landed in Firefox.
The html5.parser.enablePref is Gone Just a quick note to Firefox nightly testers and bug triagers: I pushed a patch that makes Firefox no longer honor the html5.parser.enablePref.
Windows 8 App Support Matrix Over the last few days, there’s been quite a bit of speculation about whether Windows 8 on ARM will ship the desktop environment and allow recompiled code written to the legacy Win32 APIs to run.
The Old HTML Fragment Parser is Gone Just a quick note to Firefox nightly testers and bug triagers.
Schema.org and Pre-Existing Communities I have been reading tweets and blog posts expressing various levels of disappointment and unhappiness about schema.org not using RDFa, not using Microformats or not having been developed in the open with the community. Since other people’s perspectives differ from mine, I feel compelled to write down my take.
What Could Microsoft Do about IE6? Microsoft has started a campaign to drive down the market share of IE6. Getting rid of IE6 is a righteous goal. Microsoft’s proposed solution isn’t righteous, though.
About about:blank about:blank is probably the hardest Web page to load. In fact, it is so hard that in order to turn the HTML5 parser on by default in Firefox last year, we decided to special-case about:blank to use the old parser in Firefox 4.
Sergeant Semantics So the W3C launched a logo for HTML5. And not just for HTML5-the-spec but for HTML5-the-buzzword. Regardless of the logo itself or what it stands for, I find the choice of the ancillary visual elements weird.
Vihreiden tekijänoikeuslinja ja teosten tekijöiden eläketurva The Greens recently published a copyright policy paper. It is positive that the party pays so much attention to the topic that a separate policy paper gets published about it. However, I am bothered by the paper’s attitude towards the pension security of the authors of works. (in Finnish)
HTML5 Script Execution Changes in Firefox 4 Beta 7 In Firefox 4 beta 7, script execution changed to be more HTML5-compliant than before. This means that in some cases sites that sniff for Firefox or Gecko may break. If your site/app works cross-browser without browser sniffing, you don’t need to read further.
The <spacer> Element Is Gone Today, I landed a patch that made the HTML5 parser in Gecko unaware of the HTML spacer element.
-webkit-HTML5 Apple took some of their Safari Technology Demos from their developer site and published them at http://www.apple.com/html5/ as an “HTML5 Showcase”. Christopher Blizzard's blog post about the subject says almost everything I'd have to say, so please read Blizzard's post. I'm posting just my diffs here.
SVG and MathML in text/html in Firefox and Validator.nu I enabled SVG and MathML-related stuff recently on both mozilla-central and on Validator.nu.
HTML5 Parser Improvements As mentioned earlier, there is an ongoing project for replacing Gecko’s old HTML parser with an HTML5 parser. Significant improvements have landed lately, so if you’ve previously tried the HTML5 parser and turned it off due to crashiness or Web compatibility issues, now is a good time to turn it back on.
Thou Shalt Not Spec a Feature that Might Inadvertently Compete with RDF when Used Contrary to How It Is Designed to Be Used From the minutes of the TAG meeting on November 2nd 2009.
Speculative HTML5 Parsing Landed As mentioned earlier, there is an ongoing project for replacing Gecko’s old HTML parser with an HTML5 parser. Today, a significant milestone landed: off-the-main-thread speculative HTML5 parsing.
Help Test HTML5 Parsing in Gecko The HTML5 parsing algorithm is meant to demystify HTML parsing and make it uniform across implementations in a backwards-compatible way. The algorithm has had “in the lab” testing, but so far it hasn’t been tested inside a browser by a large number of people. You can help change that now!
An Unofficial Q&A about the Discontinuation of the XHTML2 WG Many of the comments on Zeldman’s post indicate that there are people who are badly misinformed about the matters surrounding this announcement. To help remedy that, here’s some quick Q&A for getting informed.
Browser Technology Stack I took a quick attempt at drawing a stack for Web browsing.
The Last of the Parsing Quirks I implemented a single quirk for HTML5 parsing yesterday.
Testing HTML5 Parsing I have been using a browser with an HTML5 parser for both my work and leisure browsing for a bit over a week now. I think in-browser HTML5 parsing is now ready to be tested by others as well.
Extended Uncertainty I use myVidoop as my OpenID delegate. They used to have an EV certificate. Yesterday, they didn’t.
Out of Context Last week on W3C mailing lists.
A Lecture about HTML5 I was invited to give a lecture about HTML5 on a course titled WWW Applications at the Department of Media Technology of Helsinki University of Technology.
SVG Filter Effects in HTML without External References The project of putting an HTML5 parser inside Gecko has progressed. I merged in code from the trunk in order to experiment with cool new stuff such as SVG filter effects for HTML.
HTML5 Parsing in Gecko: A Build The effort of putting an HTML5 parser inside Gecko takes a step out of the vaporware land.
I Want an Affordable Snapshot-Saving Crypto-Backupping RAID NAS This week, I lost over one potential work day to HFS+. And it wasn’t the first time I’ve lost time to HFS+. I want to make arrangements to avoid losing time to HFS+ in the future.
Access Blocked I followed a link from a message to a spec in the /TR/ space on www.w3.org.
Not Part of the Technology Stack At XTech 2006, I got a W3C brochure entitled Leading the Web to its Full Potential that had a diagram visualizing the W3C technology stack(s).
Browser Sniffing History in the Chrome UA String Google Chrome has the following cruft in the HTTP User-Agent header.
Introducing SAX Tree I chose to write yet another XML tree package.
Lowering memory requirements by replacing Schematron For a long time, I’ve said that the Schematron schema in the HTML5 facet of Validator.nu was merely a rapid prototype that should be replaced with custom Java code.
The Performance Cost of the HTML Tree Builder I’ve been thinking about the performance gap between the Validator.nu HTML Parser and Xerces. What can be attributed to the “extra fix-ups” that an HTML parser has to do and what can be attributed to my code being worse than the Xerces code?
Performance Mistake In the spirit of documenting one’s mistakes…
Validator.nu Gets Out of the Java Trap This week, I upgraded the operating system on the Xen virtual machine that powers validator.nu and html5.validator.nu to Ubuntu Hardy.
Validator.nu Downtime Validator.nu was down last week.
NVDL Support in Validator.nu I enabled NVDL today.
ARIA in HTML5 Integration: Document Conformance (Draft, Take Two) Now a runnable suggestion.
Security Quote of the Day Cluelessness and incompetence of epic proportions.
ARIA in HTML5 Integration: Document Conformance (Draft) This is not a spec and has not been endorsed by anyone.
Reality Distortion Fields Where Joel Spolsky’s analysis of the IE version targeting issue goes wrong.
Almost Precedent Why the Gecko Almost Standards Mode shouldn’t be used to justify IE engine version targeting.
Regular Expressions, Computer Science and Practice Disregard of computer science can crash your app.
Unimpressed by Leopard Sadly, Leopard is not a clear improvement over Tiger.
Built-in Accessibility Roles in HTML5 A quick table of WAI-ARIA roles and what HTML 5 provides natively for each role as of July 2007.
Printing Web Apps 1.0 This is a quick guide for getting a dead-tree version of the Web Applications 1.0 spec.
Speaking at XTech I’ll be speaking at XTech.
IM Logs Quote of the week.
EFFI’s Day in Court As mentioned earlier, Electronic Frontier Finland (EFFI) was suspected of illegal fundraising. The case was tried today. I went to the court house to observe the proceedings.
XHTML and Mobile Devices Simon Pieters’ mobile XHTML test results need more publicity.
Social Media Impression Management I asked if they had researched the image formation of social media sites. They hadn’t.
DTDs Don’t Work on the Web Last weekend, Slashdot linked to an article that observed that Netscape had removed the RSS 0.91 DTD. I hope this episode has a silver lining and helps in making people realize that DTDs don’t belong on the Web.
Thesis Defense on XForms On Friday 2007-01-12, I went to listen to the thesis defense of Mikko Honkala.
Maemo Source Code To save others the trouble of requesting the source, here are the contents of the package called “2.2006.39-14-srcs”.
Validator Web Service Interface Ideas I am just writing this down so I don’t forget it.
Three Styles Well, four styles if you count the original.
Charmod Norm Checking Charmod Norm is still in the Working Draft state, but if it were to become a normative part of (X)HTML5, it would belong to the area of the conformance checking service that I am working on now, so I prototyped Charmod Norm enforcement as well.
Charmod Checking Here’s how I have addressed the requirements of Charmod that apply to content (marked as [C] in Charmod).
Table Integrity Checker The first non-schema checker prototype is a table integrity checker.
Openmind 2006 I attended Openmind 2006 last week. Here are some notes.
ISO Opens Up a Little It turns out that ISO now has some standards on the Web. That’s good, but putting all of them there in a Web-friendly format would be even better.
Natural Hazards Again Looking across the street, I can see that there’s something extra in the air between where I sit and the house on the other side of the street.
The Scientific Method According to Hixie Quote of the week from the topic of #developers on irc.mozilla.org
What to Do