On the Importance of Per-File License Information

After the release of MPL 2, the first request for MPL 2.1 came from someone who didn’t want to put copyright headers in individual files. The issue has recently reared its head in Apache too, and a GPL user recently asked me related questions.

The main reasons given for not using per-file headers are two-fold:

  1. They’re awkward in short files. Many programming frameworks these days (most notably Rails) encourage the creation of many short files, so this is becoming a bigger problem than it was when per-file headers were first created.
  2. They’re not considered very relevant in languages or frameworks that are library-centric, especially those with heavily used package managers (like Ruby’s gems). Again, many modern language frameworks include tooling that encourages this approach, so some developers are thinking “why can’t I just express the license once per library?”

The case for per-file copyright headers is put well, and succinctly, by Larry Rosen:

“[O]ur goal is to pass on any important IP information that might be useful … in the place(s) [downstream licensees] are most likely to find it.”

Larry’s comment makes two assumptions that I want to flesh out and support.

First, Larry assumes that the place where people are “most likely to find” licensing information is in per-file headers. It is true that in the best-case scenario in many modern languages/frameworks, library-level is a great place to put licenses – in normal use, they’ll get seen and understood. But lots of coding in the wild is not “normal use.” I review a lot of different codebases these days, and files get separated from their parent projects and directories all the time. And then you have to use fairly complex (and often expensive) tools to do what should be a simple task: figuring out what the license is. So, yes, modern frameworks should in theory reduce the need for per-file licensing information – but in practice, that is often not the case.
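To make the contrast concrete, here is a minimal sketch of how simple the task becomes when a file carries its own license information. It assumes the header uses an SPDX short-form tag (`SPDX-License-Identifier:`), which is one common convention and not something this post prescribes; the function name `license_from_header` and the ten-line search window are likewise just illustrative choices.

```python
import re
from typing import Optional

# Assumption: the per-file header uses an SPDX short-form tag.
# Other header styles (full license notices, etc.) would need other patterns.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")


def license_from_header(source_text: str, max_lines: int = 10) -> Optional[str]:
    """Return the license named in the first few lines of a file, if any."""
    for line in source_text.splitlines()[:max_lines]:
        match = SPDX_TAG.search(line)
        if match:
            # Drop surrounding whitespace and a trailing "*/" comment closer.
            return match.group(1).strip().rstrip("*/").strip()
    return None


# A file that has been copied out of its original project still answers the question.
orphaned_file = "# SPDX-License-Identifier: MPL-2.0\ndef helper():\n    return 42\n"
print(license_from_header(orphaned_file))

# Without a header, the file alone tells you nothing.
print(license_from_header("def helper():\n    return 42\n"))
```

A file with a header can be identified with a few lines of code and no knowledge of where it came from; a file without one sends you back to the complex tools described above.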

Second, Larry assumes that you actually want people to use your code. Lots of publishers of open source code seem surprisingly unconcerned by this, unfortunately. The functional, practical benefits of open source all start with someone else reusing your code, so if you’re publishing open source code at all, you should be concerned about making it easy for people to use the code you publish. Again, putting licensing information in each file helps, because it makes it easier for people to figure out their rights and responsibilities. (This is particularly true if you want commercial uptake, since so many commercial users of open source are getting more conservative about using source code that is not properly labeled and licensed.) (Larry also perhaps assumes you want people to respect your license when using your code; that is a surprisingly complex topic that I will try to address some other day.)

So, yes: if you want people to find your licensing information, and to use your code, per-file headers are the way to go. They may not be ideal, but they really are worth the effort.