Making HTML Legal Documents (Like MPL) Look Good

A few months ago I bought “Typography for Lawyers” (TFL), an excellent book that I would recommend to all lawyers. And since the biggest document I was working on at the time was, of course, published in HTML, I started spending a few minutes here and there on learning enough CSS to make the license look better. (Understandably, the book’s very pragmatic advice is focused on Word and Pages, not HTML.)

[Image: Fine Print by CJ Sorg, used under CC-BY 2.0]

I’ve published the experiment (compare with the plain-jane HTML MPL 1.1). [Update: the experiment is no longer available, but the official, final version of the license incorporates many lessons from it.] This is just an experiment and a personal hack, but I’m happy to hear more suggestions and improvements, and if the final result works, I’ll suggest we use it instead of the traditional plain HTML version. Some notes on the process, including links to the (abbreviated) blog posts at the TFL website (for much more thought and detail, buy the book):

  • Fonts: This was hard. The author of Typography for Lawyers is himself a former font designer, and (correctly, I think) a font snob. Given that I was not going to splurge for fancy webfonts just for this project, I spent a lot of time browsing and playing with Google Web Fonts. This is clearly not ideal (for example, it has only one monospace option for Exhibits A and B, and I’d like a slightly more subtle serif for the titles) but I think I found a decent combination of fonts – or at least an improvement on the system fonts. Presumably, if this became Official, some better font choices could be used.
  • Small Caps: I’d never given small caps much thought until reading TFL. The book is quite positive on them under certain circumstances, and that led me to experiment and eventually to use them as headers. Unfortunately, Google Web Fonts is limited here – none of the Google Web Fonts appear to have true small caps. TFL recommends strongly against fake small caps, but I think these look decent so I’ve used them; I’d be happy to replace them with true small caps if a suitable font were available.
  • Hyphenation and Justification: TFL does not necessarily recommend full justification, but demands hyphenation if you do fully justify. Since Mozilla just added support for hyphenation, I turned on hyphenation and justification, and I think it looks good (if you’re using a recent build). In production, it would probably be better either to use something like Hyphenator, which works cross-browser, or to use some sort of browser feature detection to turn off justification for browsers that don’t also support hyphenation.
  • Line length: TFL recommends something between 45 and 90 characters per line. (The Supreme Court’s well-designed documents are about 65 characters per line.) Unfortunately, as best as this CSS newbie can tell, there is no good way to do this simply in CSS. I ended up with a total hack, using the “alphabet trick” described in TFL to estimate the right width.
  • ALL CAPS: TFL IS AGAINST ALL CAPS FOR ENTIRE PARAGRAPHS. My experimental HTML version uses some gross CSS to create a highlighting box around the two traditionally all-caps blocks in the text.
  • “smart” quotes: We know that people copy and paste from the HTML version of the license into plain text files, even though (with MPL 2) we’re going to provide a very nicely formatted plain text version of the license. And of course, copying and pasting curly quotes into plain text gets… messy. And so, I am conflicted about this. The HTML linked above uses smart quotes, while the plain text uses straight quotes. Inevitably that will lead to some problems; suggestions on how best to fix (use javascript to modify what is copy-and-pasted if someone does that?) are welcome.
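
A minimal sketch of how the hyphenation, line-length, and all-caps notes above might look in CSS. This is not the stylesheet from the experiment, which is not reproduced here. It assumes a browser that supports `@supports` and the `ch` unit (neither could be relied on at the time of writing), and the `.conspicuous` class name and the colors are hypothetical.

```css
/* Line length: cap the text column at roughly 65 characters.
   1ch is the width of the "0" glyph in the current font, so this is
   an approximation of characters per line, not an exact count. */
body {
  max-width: 65ch;
  margin: 0 auto;
}

/* Default: ragged right, so browsers without hyphenation
   never get justified text with large gaps between words. */
p {
  text-align: left;
}

/* Feature detection: justify only where hyphenation is available.
   Automatic hyphenation also needs a lang attribute on the document,
   e.g. <html lang="en">, so the browser can pick a dictionary. */
@supports (hyphens: auto) or (-webkit-hyphens: auto) or (-moz-hyphens: auto) {
  p {
    text-align: justify;
    -webkit-hyphens: auto;
    -moz-hyphens: auto;
    hyphens: auto;
  }
}

/* Headings stay unjustified and unhyphenated. */
h1, h2, h3 {
  text-align: left;
  -webkit-hyphens: manual;
  -moz-hyphens: manual;
  hyphens: manual;
}

/* Hypothetical class for the two traditionally all-caps blocks:
   a highlighted box in normal case instead of capitals. The border
   is there so the box still shows when printed in black and white. */
.conspicuous {
  background-color: #fff8c4;
  border: 1px solid #bbaa44;
  padding: 0.75em 1em;
}
```

Putting the justification inside the `@supports` block means a browser that cannot hyphenate simply falls back to left-aligned text, which is the fallback TFL’s advice implies.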

Of course, I’m still very bad with CSS and HTML, so I’m sure this document can be improved, and I’m happy to take suggestions and fixes. Regardless, it has been an educational experience for me and I’m glad I toyed with it!

43 thoughts on “Making HTML Legal Documents (Like MPL) Look Good”

  1. Luis, thank you for this link and book recommendation!

    It’s great to see there’s more font lovers amongst lawyers :)

    I completely agree that legal texts have to use proper typography, not (only) for æsthetics, but mainly legibility and clarity. Legalese is hard enough to understand as it is, and we’re doing no-one a favour by obfuscating it even further with ugly fonts and messed up kerning.

    I’m really happy how HTML5 is making this simpler …so I can have the same liberty with styling on my blog as in my documents with ConTeXt.

    P.S. Yes, someday, when I have some time free, I will have to rework my website — hardware-, software-, layout- and font-wise.

  2. Okay, I seriously never expected such a post in my whole life! Also, I am not very into typography, but I enjoy fonts. And you made a wonderful decision there.
    Just one thing, you probably want to remove the second tag.

  3. For smart quotes, the Q element exists. Note that MDC says there may be compatibility problems with (older versions of?) IE, so you may need to consult with someone, or do some verification.

    (Someone might argue that that’s not the type of quotation the HTML Q element is specced for.[1] Well, when the alternatives up for discussion include other transgressions like JS hacks, I think this infraction isn’t much of a problem.)

    For line length, the ch unit exists, but that’s CSS3, so again, compatibility comes up.

    [1] I’m speculating, here. I don’t know if this is an issue that comes up, actually.

  4. Colby: at least here, foo doesn’t seem to actually do the visually right thing- it just gives straight quotes. But I suppose I can style that with CSS?

    [I’d really love to not care about earlier IEs, but unfortunately lots of lawyers are stuck with old IEs still, so that is also a consideration.]

  5. I’m interested in your opinions on the legal ramifications of publishing legal documents in such a way as to rely on external technology to render the document visibly and legibly to somebody. That is, using any sort of dynamic encoding system (here, HTML plus CSS; to extrapolate, possibly absurdly, consider JavaScript enhancements) for a legal document implicitly means that, in order for a person to read the document and make a judgement as to whether or not to accept its terms, some instrument must interpret the encoding, act upon its directions, and produce the readable, interpretable form of the document.

    Implicit in that is the presumption that when all parties prospectively bound under an agreement codified in such a document actually see the document, they all see it the same way.

    Now, clearly, within manageable bounds of reason mere typographical variations involving change of letter forms in a document, or change of font weight or size, are probably not legally interesting. In other words, it doesn’t seem likely that one party to an agreement might claim that they were duped by such variation in rendered output.

    However, in a case where the formatting directives are more involved, there’s always a chance that some browser bug, or browser limitation, could result in the presented form of a hypothetical document being quite incorrect: sentences or paragraphs missing or incorrect, or text overlaid on other text to create an illegible mess.

    I guess what I’m really wondering about is the degree to which the “mechanization” of legal documents taken from print media into the online world has impacted, or may impact, the assumptions around the way a prospective party to an agreement “consumes” the document.

  6. I’m surprised to see an admonition against all-caps paragraphs from a lawyer. It’s so common in legalese, and so silly, that I was pretty sure it was just the way lawyers are… I can only assume that he’s not a real lawyer.

  7. Pierce: he’s a real lawyer, but that’s one of the examples where the website is really lacking when compared to the book; the book goes into the nitty-gritty on ALL CAPS, including citations to relevant cases and statutes in several places.

  8. Hi, Luis,

    I like it very much and yes being a former lawyer (or as much as it is possible to be former lawyer; isn’t it similar to being a former alcoholic?) and typosnob I enjoy that somebody tries to make legal texts look good.

    There is one thing which excites me even more … disclaimers of warranty in ALL CAPS have been my personal pet peeve for a long time. I don’t know if there is an actual precedent where some judge decided that all caps make things more obvious, but in my opinion there is no better way to make people skip over a paragraph than to make it all caps. I am just not able (and I am quite sure I am not the only one) to read a paragraph of legal text in all caps. Kudos on doing a much better job than that.



  9. Well, I’m really impressed! Even though I am not quite knowledgeable about typo and fonts but that’s one domain I do love and your work is just great!

    Thank you for avoiding all caps paragraphs!
    By the way, I may dare some suggestions (not worth much).

    Titles such as “2.4. Subsequent Licenses.” seem strange to me as I would expect “2.4 Subsequent Licenses” without trailing dots after text and numbering but I may be mistaken here.

    Usually, I prefer a non-breaking space between a Section and its numbering (eg “Section 10.2” in your Section 2.4).

    I think I would have used a monospace font regarding filenames such as LICENSE but, well, I do like monospace fonts.

    Oh, and your text’s width, fully justified with hyphenation looks perfect from my point of view.

  10. Colby: at least here, “foo” doesn’t seem to actually do the visually right thing- it just gives straight quotes. But I suppose I can style that with CSS?

    Aw c’mon, Chromium. FWIW, I wrote that comment after checking with the latest Firefox, which displays the text wrapped inline with open and close quotes, and pastes with dumb quotes into input boxes, vim, et cetera. Checking again, pasting into LibreOffice gives the unquoted text, so I’m guessing the libeditor (or whatever) clipboard is munging to dumb quotes when the destination is plaintext.

    What’s interesting is that Chromium uses CSS3 open-quote and close-quote to implement the Q element, but they look like dumb quotes. (That it’s generated content is also the reason behind it not appearing in the clipboard.)

    Taking these things into account, as well as the fact that even if everything behaved just like Firefox here, this would be a misuse of Q anyway, what I’m really trying to say is this is probably not the way to go. Sorry.

  11. No need to apologize… given that 1/2 the point is to educate me, this discussion satisfies the bill even if I can’t actually use it in practice :)

  12. If the purpose of the line length is to make the text easy to read, and you know the font size, then I don’t see anything wrong with choosing a proportionate pt or px width. If you don’t know the font size, then you should still be able to get a pretty good approximation using ems.

  13. Looks good.

    The only thing I find not to my taste is justification and hyphenation on headings. This is noticeable in the Exhibit A/B headings.

  14. Tom: Fixed, thanks.

    Romain: Wow, some nice catches there. For what it is worth, this was done by generating HTML with pandoc (from the markdown master document) and then changing only <head>. They’ll require manual munging for the final version. Unfortunately, not sure I’ll keep the LICENSE change because the Android Mono font is just… not very good.

    Jeremy, Trevor: Fixed (turned off justification and hyphenation in h2/h3), thanks for the catch.

    roc: Unfortunately, using ems for max-width does not seem to be working quite right; a max-width of 30ems seems to be creating a 60+character-wide body. Still, seems cleaner than px so I’ve switched to that.

    These changes are all uploaded.

  15. On different font styles: I remember reading (was it Robert Bringhurst in The Elements of Typographic Style?) that you should generally not use variation on multiple axes. Eg. if you have a heading font that is larger in size, there is no need to make it bold as well. Or, if you are using small caps: eg. I would have “1. Definitions.” have the same font size and weight as the “1.1. Contributor” because it’s differentiated with small caps.

    While your current rendering is beautiful compared to the original plain-text, I think getting rid of the bold will do nothing but improve it further. Try it out and let it sink in for a bit. :) (Fake small caps might look worse, though, but proper small caps would be just fine)

    Another thing that strikes me as wrong is that the lining up of the headings is kind of arbitrary. It’d look much better if you aligned the section numbers and actual section titles separately. Eg. “1.”, “1.1.”, “1.2.”… would be lined up (like they are now), and then “Definitions”, “Contributor”, “Contributor Version”… would be lined up among themselves. This might not be trivial to achieve in HTML (other than using tables or fixed widths like “3em” for the number span). Though, maybe it’s just me. :)

    Also, I love your use of whitespace: while often ignored, it is the most critical element of all good designs, especially when type is involved.

  16. Also, don’t forget to switch to using actual em-dash in the Exhibit headings (“—” vs hyphen “-” is a big difference). :)

  17. That does look much easier on the eye (font and vertical spacing being the main improvements).

    Personally, I think the line length is too short; it makes my eyes feel like they’re doing too much shunting left/right per sentence.

    Also I dislike the large proportion of hyphenated words caused by the full justification, words with just two letters before the hyphen are particularly jarring, as are multiple consecutive lines ending with hyphens.

  18. Fun thing is that when I look at your new version and the old MPL 1.1 document, I actually like the look of the old one better.
    First, the font you use in headlines in the Definitions section looks bad here on my Linux machine, while the default font in the old doc looks nice.
    Second, the old doc nicely uses the width my screen has and therefore makes it easy to grasp the parts I need, while with your version, most of my screen ends up being useless whitespace.
    Third, the yellow boxes in the middle of the thing just look strange to me, while the all-caps text in the old one tells me “this is that strange legalese a normal person doesn’t need to care about”, which my brain learned through training in looking at Terms of Service docs.
    I’m personally not a friend of small caps (or any full-paragraph-or-caption caps) at all, but those small caps look acceptable to me.
    Also, the serif font used in the text is less clearly rendered on my screen than the sans-serif default font I’ve set in my browser and which the old doc is rendered with, so even if the serifs themselves should strengthen readability, the less clear rendering makes the serif font harder to read for me in the end.

  19. Somebody’s gotta say it…if you want to learn a lot about how typography and text formatting should be done, read through the TeXbook. It’s Knuth, what more do you need to know? :)

  20. @ Mike McNally:

    I can understand your concern. But this can be said for any digital format — be it HTML (+CSS), PDF, ODF, DjVu, ePub, OOXML, .doc, even plaintext or e-mail. In all cases you are depending on the render at the end of the recipient and therefore technology outside your control.

    I would even dare to argue that similar issues are present when using physical formats. We just assume that our addressees can read, are capable of reading small print and are not dyslectic¹.

    The best solution that we can muster is to strictly adopt open standards on both the sender and recipient ends: use open standards when creating the text, and promote standards-compliant software for reading it. On websites this could be done, e.g., with an automatic technology test² that checks whether the browser is compliant enough to render the text reliably and shows this information in a small box next to the text itself.

    Thank you for an interesting question. I hope it will spur some more debate :)

    P.S. Sorry for the short reply, I had a longer one but my browser crashed in between.

    [1] Serifs — a firm favourite between us lawyers — are as a rule very hard to read for a dyslectic.
    [2] Such tests already exist. e.g.

  21. The biggest improvement, in my opinion, is the way you treat clauses that need to be “conspicuous” [as defined in UCC § 1–201(10)]. I really hate the common practice of printing conspicuous paragraphs in all caps.

    The yellow background is a great way to display those sections on screen.

    However, the yellow background might not print or copy well in black-and-white contexts. Black-and-white printers and copiers are still very common, after all. Maybe you could also make the font of these sections bold. In the original version, the paragraphs were bold as well as written in all caps. (And yes, Danilo, I know some typographers frown on simultaneous variation on multiple “axes”, but I think that rule could do with some leeway.)

  22. I was going to recommend “em”s, but additionally point out that the “em” space is the width of an “m” (em), which as the widest (or nearly so) character in the font is quite a bit wider than the width of the “average” character. So if you pick the lower recommended bound (e.g. 45) for your max-width in ems, you should always fit 45 or more characters on a line, and probably around 150% of that (e.g. ~67) on average. That 30ems is giving 60+ chars on average is surprising, but not necessarily broken, depending on the metrics of the font you’re using and extra details like kerning hints.

    As for smart quotes, I’ve generally not had a problem with that in quite some time. About the only time I have is when I’ve got a document which has passed through a Windows system, which contains smart quotes in encoding windows-1252 (or any other character in the 0x80-0x9F range), but which claims to be encoded in iso-8859-1. Now I’m just a bit more wary when dealing with people who use Windows, and even that doesn’t affect me too much any more.

  23. “point out that the “em” space is the width of an “m” (em)”

    This is incorrect. The em is a distance equal to the point size of the font. (In the days of metal type, it was also the distance from the top to the bottom of each piece of type.) So in a 12 point font, an em space and an em dash are 12 points wide. The lowercase letter m may or may not be close to this measure.

  24. A quick peek reveals at least four monospace fonts provided by Google, though no easy way to list all such fonts.

    The Inconsolata font by Raph Levien (a member of the Google Web Fonts team) has long been a favourite of mine, and is intended for code listings, though I don’t know if it would work in this setting.

    Finally, I am no expert or aesthete but it seems to me the larger gap above a paragraph than below makes the heading appear to belong to the text above it, which makes it seem subtly off-kilter.

    1. How did you do that quick peek? I just searched for “mono” (turning off display fonts) and got only the one result. Inconsolata doesn’t show up, but it’s probably better; I’ll take a look at it tonight.

  25. I was wondering the same thing as Mike McNally… The CSS hyphenation is done with dictionaries (and perhaps some algorithm magic?), so I’d wonder if there are plausible cases where hyphenating at the wrong point could lead to confusion, or if it’s possible for “foo-\nbar” and “foobar” to read differently.

    I’ve seen (mainstream) news articles now and then about how a misplaced comma or other grammar mistake leads to the parties going to court…

  26. I just searched for “mono” in the name which revealed three mono fonts (and one false positive) and then looked for inconsolata which I already knew was there.

    Nova Mono disappears if you filter out “display fonts” but I’m fairly certain it was designed for programmers to use, so that’s perhaps just a mis-categorization. Or maybe it reflects that it’s not ready for prime-time status due to lack of hints (just a guess; I don’t know for certain).

    Checking for other free monospaces I’d heard of there’s also Anonymous Pro:

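  27. […] This is a difficult goal that requires long deliberation and cooperation with other license stakeholders. There are many other goals for version 4.0 of the CC licenses as well. We expect that the effects accumulated along the way will feed a suite of licenses considerably better than version 3.0, and that this spirit will carry on (for example, CC Attribution-ShareAlike will continue to be used). Put plainly, we are trying to weigh everything against everything else. The MPL 2.0 announcement describes how many people made wonderful contributions to creating the license. Perhaps for the first time for a software license, this included work on its design, and it matters to give people with different skills the opportunity to make a license easier to use. And as the range of projects using CC licenses grows, so does the need for feedback from the whole community. Please take this opportunity to share your views on CC license 4.0. *GPL: General Public License. A free software license based on the principles of the Free Software Foundation (FSF). Its main aim is to give users freedoms such as using, copying, and redistributing the software. Original: Mozilla Public License 2.0. Published: January 3, 2012, by Mike Linksvayer (Vice President, Creative Commons) […]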
