On the Importance of Per-File License Information

After the release of MPL 2.0, the first request for MPL 2.1 came from someone who didn’t want to put copyright headers in individual files. The issue has recently reared its head in Apache as well, and a GPL user recently asked me related questions.

The main reasons given for not using per-file headers are two-fold:

  1. They’re awkward in short files. Many programming frameworks these days (most notably Rails) encourage the creation of many short files, so this is a bigger problem than it was when per-file headers were first created.
  2. They’re not considered very relevant in languages or frameworks that are library-centric, especially those with heavily used package managers (like Ruby’s gems). Again, many modern language frameworks include tooling that encourages this approach, so some developers are thinking “why can’t I just express the license once per library?”

The case for per-file copyright headers is put well, and succinctly, by Larry Rosen:

“[O]ur goal is to pass on any important IP information that might be useful … in the place(s) [downstream licensees] are most likely to find it.”

Larry’s comment makes two assumptions that I want to flesh out and support.

First, Larry assumes that the place where people are “most likely to find” licensing information is in per-file headers. It is true that, in the best-case scenario, the library level is a great place to put licenses in many modern languages and frameworks: in normal use, they’ll get seen and understood. But lots of coding in the wild is not “normal use.” I review a lot of different codebases these days, and files get separated from their parent projects and directories all the time. When that happens, you have to use fairly complex (and often expensive) tools to do what should be a simple task: figure out what the license is. So, yes, modern frameworks should in theory reduce the need for per-file licensing information, but in practice that is often not the case.
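To make the “simple task” concrete, here is a minimal sketch of what a per-file header makes possible: guessing a stray file’s license from its first few lines alone. The function name `guess_license` and the tiny pattern table are hypothetical, chosen only for illustration; real license scanners recognize far more licenses and variants than this.

```python
import re
from typing import Optional

# Hypothetical, deliberately tiny pattern table (an assumption for illustration).
LICENSE_PATTERNS = {
    "MPL-2.0": re.compile(r"Mozilla Public License, v\. 2\.0", re.IGNORECASE),
    "Apache-2.0": re.compile(r"Licensed under the Apache License, Version 2\.0", re.IGNORECASE),
    "GPL": re.compile(r"GNU General Public License", re.IGNORECASE),
}

# Leading comment markers (#, //, /*, *, --, ;) that wrap a header in most languages.
COMMENT_PREFIX = re.compile(r"^\s*(?:#+|//+|/\*+|\*+/?|--+|;+)\s*")


def guess_license(text: str, max_lines: int = 30) -> Optional[str]:
    """Return a license name if a known notice appears near the top of the file."""
    head_lines = text.splitlines()[:max_lines]
    # Strip comment markers and re-join, so notices wrapped across lines still match.
    stripped = (COMMENT_PREFIX.sub("", line) for line in head_lines)
    head = " ".join(" ".join(stripped).split())
    for name, pattern in LICENSE_PATTERNS.items():
        if pattern.search(head):
            return name
    return None  # No recognizable header: the file alone does not say.
```

A file with no header returns `None`, which is exactly the situation described above: once the file has left its project, nothing in it says what the license is.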

Second, Larry assumes that you actually want people to use your code. Lots of publishers of open source code seem surprisingly unconcerned by this, unfortunately. The functional, practical benefits of open source all start with someone else reusing your code, so if you’re publishing open source code at all, you should care about making it easy for people to use the code you publish. Again, putting licensing information in each file helps, because it lets people figure out their rights and responsibilities more easily. (This is particularly true if you want commercial uptake, since so many commercial users of open source are getting more conservative about using source code that is not properly labeled and licensed.) (Larry also perhaps assumes you want people to respect your license when using your code; that is a surprisingly complex topic that I will try to address some other day.)

So, yes: if you want people to find your licensing information, and to use your code, per-file headers are the way to go. They may not be ideal, but they really are worth the effort.
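Much of that effort can be automated. As a rough sketch, assuming a language that uses `#` comments, the helper below prepends a short notice to a file’s text only if it is not already there. The function `ensure_header` is hypothetical, and the notice is the three-line MPL 2.0 text as I understand it from the license’s Exhibit A; check the wording against the license itself before relying on it.

```python
# Notice text as given in MPL 2.0, Exhibit A (verify against the license itself).
MPL_NOTICE = (
    "This Source Code Form is subject to the terms of the Mozilla Public\n"
    "License, v. 2.0. If a copy of the MPL was not distributed with this\n"
    "file, You can obtain one at http://mozilla.org/MPL/2.0/."
)


def ensure_header(source: str, notice: str = MPL_NOTICE, comment: str = "#") -> str:
    """Return source with the notice prepended as comments, unless already present."""
    header = "\n".join(f"{comment} {line}" for line in notice.splitlines()) + "\n"
    if header in source:
        return source  # Already labeled; leave the file untouched.
    lines = source.splitlines(keepends=True)
    if lines and lines[0].startswith("#!"):
        # Keep a shebang on the first line so scripts remain executable.
        first = lines[0] if lines[0].endswith("\n") else lines[0] + "\n"
        return first + header + "".join(lines[1:])
    return header + source
```

Because the function returns the input unchanged when the header is present, it can be run repeatedly over a tree of files without stacking duplicate notices.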

20 thoughts on “On the Importance of Per-File License Information”

  1. I have a related question. I’m happy to put licensing information in each file, but the FSF (e.g.) recommends an obtrusive 14-line statement along with the copyright line. Is that really necessary, or does a simple statement of “Copyright 2012 me ; licensed under the fooPL” do the job?

  2. Another con is that Rosen’s goal ends up being self-defeating in practice. People *suck* at keeping those preamble bits up to date: they’ll include some BSD-licensed code via cut-and-paste into their GPL project, but the preamble on that file will still say the whole file is GPL, for example.

    Of course, that happens with a LICENSE (urgh) file as well, where you infrequently see more than a single licensing statement, despite inclusion of code from many other places. But at least then it’s only one place that needs correcting.

  3. zac: The URLs frequently change to present a different document, or no document at all (404, or “please update your links”, or worse).

    The grant of license is an action at one point in time. The time and place and context when recipients need that information is often distant, in years or countries or VCS revisions or project splits or re-incorporations or any of a number of other barriers.

    The point of putting an explicit statement of what permissions are granted *in the file*, and what the exact license terms are, is so that every recipient has the best possible chance of knowing what the intent of the copyright holder is.

  4. Peter: One of the Red Hat lawyers has given that particular issue a lot of thought; I’ve asked him to weigh in.

    Malcolm: That’s true of any licensing information, anywhere, so I don’t think it’s a strike against keeping information in specific files.

    Zac: I think you really need name and version number, and some indication that it is the license. See MPL 2.0’s header for a fairly-close-to-minimal example.

    Ben: I think, as long as you’re using a modern, common, standard license, using the full name, version number, and canonical URL gets you a long way. Yes, URLs are not ideal, but the hosts for the major licenses (FSF, Mozilla, Apache) are pretty good about maintaining robust, permanent URLs, and if you have the name and version number, you can get the text from OSI if the canonical URL fails. [This is yet another strike against BSD/MIT, by the way.]

  5. I can’t agree enough. Making code do unexpected things is a large part of free software’s appeal. Clear license notices on each file prevent legal friction from building up in the process over time. Thanks for taking the time to hash this out.

  6. I can’t see where the MPL requires per-file headers. It specifically says in EXHIBIT A for the license notice:

    “If it is not possible or desirable to put the notice in a particular file, then You may include the notice in a location (such as a LICENSE file in a relevant directory) where a recipient would be likely to look for such a notice.”

    And for the copyright notice itself (without the license notice), it only says that you may not remove or alter them.

    And per-file headers cannot work anyway. You cannot really attach a notice to an image in a sensible way, or in any other form not allowing for comments or license statements. And you most likely do not want to put the notice into README files and its friends either, but those also require a license (and at least for Debian, that license must be free as well, in order to be shipped).

    1. Hi, Julian: That wasn’t what I intended to imply with my post, sorry it was unclear. Because it has come up in other places, we actually wrote a new FAQ entry to address this question. The permanent version is here; the text follows:

      Q22: Does MPL 2.0 require that the MPL 2.0 license notice header be included in every file?

      The license notice must be in some way “attached” to each file. (Sec. 1.4.) In cases where putting it in the file is impossible or impractical, that requirement can be fulfilled by putting the notice somewhere that a recipient “would be likely to look for such a notice,” such as a LICENSE file in the same directory as the file. (Ex. A.) The license notice is as short as possible (3 lines) to make it easy to put in as many different types of files as possible.

      While the license permits putting the header somewhere other than the file itself, individual files often end up being distributed on their own, without the rest of the software they were authored with. As a result, putting the license notice in the file is the surest way to ensure that recipients are always notified.

  7. […] On the Importance of Per-File License Information (tieguy.org) 2 points by martey 1 hour ago […]

  8. [Deleting the comment under this blog’s no asshole policy. Feel free to contribute by actually addressing the arguments in the post, instead of just calling the proposed policy “brain damaged.”]

  9. […] On the Importance of Per-File License Information. Martin Michlmayr's OSI News Items, Mon, 2012-03-19 14:53 […]

  10. […] Please place these statements at the top of the file, not in the middle or at the end. This is important for various reasons. Email addresses or company contacts for copyright holders should be listed either in the […]
