software for massive document collaboration?

As part of my new role at work I’m going to be working on writing and editing some legal documents that I’d like to get both public and private feedback on.[1]

real text is edited in black and green (picture: Zenith Z-19 Terminal, by ajmexico, used under CC-BY)

I’m trying to wrap my head around the available options, and none of them seem quite ideal. Some thoughts, first, on my requirements:

  • ease of use: I’m going to be collaborating with (among other people) lawyers, managers, etc.- i.e., non-technical people. So the solution should be easy to use, or at least have one face that is easy to use.
  • large-scale collaboration: this has to scale to input from lots of people (at least for commenting- editing will be a smaller group.)
  • maintaining the canonical version: somewhere other than my laptop should hold the canonical version of the text, including revision history.
  • commenting: it should be possible to open up a version of the document to the public, and to have them be able to comment on specific sections of the text- ‘I don’t like this paragraph’, ‘I suggest replacing A with B’, etc.
  • editing: I don’t need a massive multi-user text editor; we want feedback from many people but only a few people will be empowered to actually do edits. Ideally, though, I’d love to be able to review public comments, delete (or respond to) the bad ones, and integrate the good ones, all within the same tool. It should also be possible to do private revisions.
  • diffs/versioning: I need to be able to show the differences between two versions of a document; ideally with commentary on the reasons for the change, and with output that looks less like diff and more like an editor’s redline.

So what options do I have? These are the tools I’ve thought about so far:

  • a markup language + revision control: this would give me a lot of what I want, but it totally fails the ease of use test, and it isn’t clear that it handles the commenting role terribly well. Potentially great for canonical versions and diffs, though, especially if word-level diffs are an option and if I could figure out a way to produce good-looking diffs. With a distributed RCS this approach has the bonus of allowing for some work to exist in a non-canonical branch when changes are still being discussed/debated.
  • traditional word processors: traditional word processors can be great at diffs/versioning, and obviously they exist to edit, but they aren’t very good at scalable commenting and collaboration- things break down very quickly when you’re emailing around files, and expecting someone to merge them all together. odf-svn seems like it deals with some of these problems, at least conceptually, but development seems very stalled. I will also look at abicollab, but many of my collaborators will be on Mac- which AFAICT is not supported for newish versions of Abi. :/
  • stet and its successor: Stet was great at handling mass commenting; its successor seems to be similarly good. But they don’t really allow you to do diffs between versions, so at best they could be only part of the solution.
  • wiki: no wiki that I know of can handle commenting the way stet and its successor can. This is a shame, since they are great for showing revisions and (small-scale) collaborative editing. Also, doing ‘branches’ to propose changes that may get rejected is not possible in any wiki I’m aware of. Would love to be proven wrong on this one.
  • etherpad: etherpad is even slicker than wikis for showing revisions, and obviously superior for collaborative editing, but no facility for commenting on texts. Also lots of uncertainty about the maintainability/supportability of the code base.
  • bespin: this is so code-focused that it may not pass the ‘user friendly’ test, but hg integration is nice, and it may be sufficient for collaboration on plain text.
  • wave: this is almost exactly the kind of problem wave seems designed for, but it is such a constantly evolving product (not to mention a ‘run on someone else’s server’ problem) that I’m a little reluctant to use it. And of course since it is in semi-private beta it can’t do public commenting.

So far, I’m leaning towards gathering comments via an instance of stet’s successor, using hg + markup (or even plain text?) to store the canonical version and generate revisions, and using etherpad, bespin, or a wiki for collaborative editing when necessary. But that still feels like a pretty fragile solution to me- lots of file transitions where things could go wrong, especially between hg and etherpad/wiki. I’d need to find a markup which can transparently/reliably go in and out of the editing tool from hg (or just admit defeat and use plain text), and the diffs from hg would almost certainly need some processing to make them look good.
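To make that last point a bit more concrete, here is a rough sketch of the kind of processing I have in mind: a word-level diff that marks deletions and insertions inline instead of showing whole changed lines. It uses only Python’s standard difflib. The function name redline is made up for illustration, and the [-…-] / {+…+} markers are borrowed from wdiff’s default output style. It is not part of hg or any of the tools above, just an assumption about what a post-processing step could look like.

```python
import difflib


def redline(old: str, new: str) -> str:
    """Return a word-level redline of two versions of a text.

    Deleted words are wrapped in [-...-] and inserted words in {+...+}.
    `redline` is a hypothetical helper for illustration only.
    """
    old_words = old.split()
    new_words = new.split()
    # autojunk=False so frequent words in long texts are not treated as junk.
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)

    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(old_words[i1:i2])
            continue
        # A "replace" is shown as a deletion followed by an insertion.
        if tag in ("delete", "replace"):
            out.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            out.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(out)


print(redline("the quick brown fox", "the slow brown fox"))
# the [-quick-] {+slow+} brown fox
```

Note that splitting on whitespace collapses line and paragraph breaks, so a real version would need to diff paragraph by paragraph (and probably emit HTML or ODF strikethrough/underline rather than plain-text markers) before it looked like something a lawyer would recognize as a redline.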

So does anyone have suggestions on other tools, or specific suggestions on how to make this toolchain more robust and/or powerful?

  1. Sorry, no details quite yet on what the project is, and no prizes for guessing…

15 thoughts on “software for massive document collaboration?”

  1. I suppose you thought about using the code that the FSF used for the comment pages of their GPL3 drafts?

  2. You mention Wave, but what about Google Documents? It would seem far superior to Wave for this. It has all of the features you mention: excellent version control, commenting, easy to share publicly, etc.

  3. I think MDC uses DekiWiki which is a MediaWiki hacked up beyond recognition to support all kinds of interesting use cases. Songbird for example uses it for internal documentation, public (static, but with comments) documentation and more general public wiki space. The developers also seem keen to support interesting use-cases. You could talk to the MDC folks and see if they can set something like that up for you.

  4. Benjamin: ‘stet’ (mentioned above) is that GPLv3 software, now maintained as

    bochecha, fraggle, test: very interesting, I will look into them.

    Matthew: same problem- not self-hosted; can’t do public comment without editing.

    Ian: oh, interesting. I will definitely talk to them. (They should see this, it is on their planet after all ;)

  5. I think you are on the right track. Use to gather feedback on milestone versions, and source control on plain (or gently wiki-marked-up) text for the internal development process. You don’t need fancy formatting until the last minute; or, if you do, you can have a one-way export process to PDF.


  6. Unfortunately, I do need at least some intermediate fancy formatting; I want to be able to generate legible redlines (aka ‘lawyers diffs’) fairly regularly. I believe wdiff can probably be coaxed into this, though.

    (For anyone who has read this far, I’m leaning towards markdown as the markup, but other suggestions are welcome.)

  7. […] So does anyone have suggestions on other tools, or specific suggestions on how to make this toolchain more robust and/or powerful? Sorry, no details quite yet on what the project is, and no prizes for guessing…Syndicated 2009-12-29 15:06:25 from Luis Villa's Internet Home » Blog Posts […]

  8. I do all my docs with ReST (plain text with minimal formatting) and then do rst2pdf or rst2odt or rst2html (or even rst2s5 for presentations) if I have to send them to others.

  9. Hello, my company publishes Calenco, an XML CCMS (Component Content Management System) whose base is Free Software (AGPL). Concerning your requirements:
    * ease of use: it is a Web Based interface, aimed at technical writers, and managers, not geeks.
    * large-scale collaboration: that’s the goal of Calenco
    * maintaining the canonical version: yes it’s stored on a server.
    * commenting: this is planned for soon
    * editing: this is done through a Web Based WYSIWYM(ean) XML editor of our own
    * diffs/versioning: All content is versioned, and we connected Calenco to a commercial XML Diff tool that allows us to generate nice HTML reports with redlining.

    We have no public demonstration available yet, but I’ll be glad to show the tool to you. Just drop us a note at

    Best wishes for 2010!

  10. Have you thought about commercial tools like WorkShare? It’s for lawyers, is Word centric, but rather than using the god-awful track changes feature, it does real red-lining deltas of documents and has an excellent UI for merging people’s changes to documents. Not the ideal, definitely, but it’s really an indispensable tool for an attorney these days.

  11. Hi Luis,
    We’d be really happy to read your thoughts on the suitability of for your needs as your project is pretty much at the center of our aims.

    There are a few extra features that we plan to roll out to make the site more useful, notably “copy”, invite, and various premium services.

    Unfortunately we do not have an OSX binary that can talk to We were 98% of the way with a native OSX client when our developer had to leave us. It appears hard to attract free software developers to with OSX expertise.

    Maybe we should just bite the bullet and package up our GTK client to run under X11 on OSX.
