I’m curious what the community thinks about licenses for packages submitted via onboarding.
We do currently have one question about licenses: whether the license is an accepted CRAN license. (P.S. This is something we could check automatically as part of our planned increase in automated checks.)
In addition, we may add a question to the submission template along the lines of:
Why did you choose the license you’re using?
If the answer is along the lines of:
Just sorta picked one
Then I’d vote we suggest MIT. Anyone disagree?
It’s possible the submitter could have a good reason for the license they have chosen, in which case they can likely keep what they already have.
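For what it's worth, the automated "is this an accepted CRAN license?" check mentioned above could look roughly like the sketch below. This is only an illustration, not an existing rOpenSci tool: the function names `license_field` and `is_accepted` are made up, and the `ACCEPTED` set is a small illustrative subset that I am assuming for the example (the authoritative list is the one CRAN maintains). It reads the `License:` field from a package's DESCRIPTION text and handles the common `X + file LICENSE` and `A | B` forms, but not version constraints like `GPL (>= 2)`.

```python
import re

# Illustrative subset only (an assumption for this sketch); the authoritative
# list of accepted license specifications is maintained by CRAN.
ACCEPTED = {
    "MIT", "GPL-2", "GPL-3", "LGPL-3",
    "Apache License 2.0", "BSD_2_clause", "BSD_3_clause", "CC0",
}


def license_field(description_text):
    """Return the value of the License field from DESCRIPTION text, or None.

    Simplification: continuation lines (fields wrapped onto indented
    following lines) are not handled.
    """
    for line in description_text.splitlines():
        if line.startswith("License:"):
            return line[len("License:"):].strip()
    return None


def is_accepted(license_value, accepted=ACCEPTED):
    """True if at least one '|'-separated alternative is in the accepted set."""
    if not license_value:
        return False
    for alternative in license_value.split("|"):
        # Drop a trailing "+ file LICENSE" (or "LICENCE") before comparing.
        name = re.sub(r"\s*\+\s*file\s+LICEN[CS]E\s*$", "", alternative.strip())
        if name.strip() in accepted:
            return True
    return False


if __name__ == "__main__":
    desc = "Package: demo\nVersion: 0.1.0\nLicense: MIT + file LICENSE\n"
    print(is_accepted(license_field(desc)))
```

A check like this only tells us whether the declared license is on the list; it says nothing about why the author picked it, which is what the proposed template question is for.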
A suggestion would be great. But why MIT? We make our default suggestion Apache2 because of the included patent clause, which is missing from other licenses.
I recall Neil pointing out that MIT license was deemed too vague to provide proper open license protection in a UK study or legal test, but we’d have to bug him to get the reference (even the license name MIT is a bit ambiguous – is it the same as Expat & X11?).
I think it is worthwhile asking why a given license was chosen as part of our onboarding process, both just for fun and to see that the author has given it appropriate thought (in particular, we wouldn’t want authors to include GPL’d code while declaring a different license). Matt pointed me to drat (no, not that drat) http://www.isi.edu/~gil/papers/mattmann-etal-softmine2015.pdf as a tool for identifying these issues automatically.
Me, I’ve been using the BSD 2-clause myself recently, because it is permissive, approved by my institution, and has my institution’s name in it ;-). I think it’s safest that rOpenSci just leaves the details of recommending licenses to the lawyers though.
It’s pretty common for employers to put that they retain some level of IP rights to work related to job duties in employment contracts. This includes faculty positions, AFAIK (though I of course have never held one, my grad student contract did say that). I remember that Duncan (who is on R-core) had to explicitly refuse to sign that part when he joined Davis and that they had to hire him anyway. Robert Gentleman had to do the same thing when he joined Genentech (though I think the scope of the exception was narrower than what Duncan got from the UC).
That’s why all my recent packages are Author: me, Copyright: Genentech, Inc.
Faculty are treated differently at UC than staff wrt scholarly works. A good summary of the situation is on the UCOP site, which says:
According to UC’s policy, ownership of copyrights to scholarly and aesthetic works generally reside with the faculty creator, with certain exceptions. If, for example, the work is sponsored or contracted, or is part of a project that has special provisions on copyright ownership, then copyright ownership is generally retained by the university.
So, Carl probably owns his copyrights on scholarly works, but there may be other factors considered at Berkeley. Not sure if code fits here – the policy was originally written to deal with scholarly articles. Carl could probably fight the UCB policy either way. If he in fact is the copyright holder, then UC license policy does not apply – he can make his own decisions.
Interesting discussion. @cboettig, who authored that UC Berkeley OSS license chart? I’ve never seen it before. A lot of the code I’ve seen released by Berkeley staff and students is GPL-3 and Apache 2.0, so clearly most academic software authors have no idea it exists, or don’t believe it pertains to the code they create.
@sckott, the patent clause with Apache 2.0 is effectively the following, “the original author of the code grants you rights for the patented stuff that might be included in the code.” (source) Apache 2.0 is typically considered the most business-friendly license because companies who use your OSS don’t have to worry about the original author owning a patent on something that they produce with it. That’s why we use Apache 2.0 at H2O.ai. Which I guess is probably the exact reason that UCB doesn’t like it – UC is known to be pretty aggressive on patent enforcement.
@ledell That is actually a draft document, which was all that the campus Licensing Specialist in our Office of Technology Licensing, Division of Intellectual Property & Industry Research Alliances (IPIRA) could provide at the time I made my inquiry when I started at Berkeley last Fall, so it’s not actually a public document. From what I understood of the Specialist’s explanation, the exclusion of Apache 2.0 was precisely due to the patent clause you identify (though it sounded like it wasn’t as simple as the UC wanting to retain ownership of a possible patent, but more to avoid entanglement in a patent dispute if someone else pursued patenting – IANAL and can’t promise I really followed the details). I believe this policy is supposed to be UC-wide and not specific to one campus, but Matt can probably weigh in more since he’s dealt with this for longer.
Also not clear how collaboration on software ends up being viewed when UC (and many other universities) treat copyright ownership differently between staff and faculty (and students, though I believe we treat them like faculty as copyright owners)?
I agree it’s pretty confusing when you try and see what others are doing – like you say, some of the most trumpeted software produced by campus faculty and students uses the Apache 2.0 license, including Apache Mesos & Apache Spark, though I understand they were originally developed under a different license (I think BSD) and re-licensed when they were transferred to Apache.
Anyway, I’d love greater clarity on all this myself, hence my hesitation to make recommendations in the face of all this stuff that’s over my head.
tl;dr I’m increasingly of the opinion that the previous concerns about the MIT (and to a lesser extent the BSD) license are no longer an important issue. However I find it odd that UC Berkeley is against the Apache 2 license - most organisations choose to use Apache because it provides better patent protection than MIT and BSD.
MIT license
The original concern was with the way the MIT license is written. The license has a clause which includes a disclaimer of all liability. In certain jurisdictions, e.g. England and Wales under English Law, it is not allowed to disclaim liability for death arising from proper use. Any invalid clauses have the potential to be struck out in their entirety, thus potentially leaving you open to unlimited liability under English Law.
However I’ve now had the chance to talk to some IT law academics, as well as some actual software contract lawyers. And whilst none have gone on the record to provide legal advice, their comments are that it is likely that any judge would seek to remove the smallest possible part of the clause that kept the spirit of the contract. Therefore you would still retain most of the disclaimer.
Apache License and patents
I’d be interested if the issue UC has is with the patent grant clause or the patent retaliation clause in Apache. From the document, it appears to be the patent grant clause.
It sounds like the thing you need to think about is how you do (or don’t) take ownership of contributions. Are packages completely separate? Are there any other license incompatibility issues? Who is contributing?
Thanks @npch for joining us here - and good news regarding MIT, since most of my software uses MIT.
This is a great question that I don’t know the answer to. We accept contributed packages, and have people transfer their GitHub repo to our GitHub org account. However, we almost never become authors/contributors on their software unless we actually contribute some code. There is some movement now to allow authors of packages contributed to rOpenSci to list reviewers in their package: see "rnaturalearth · Issue #22 · ropensci/software-review" on GitHub. So if a software package is in our GitHub org account, but none of us are authors, do UC Berkeley or our funder(s) have any say over licensing? Don’t know. In addition, there are package contributors from lots of different countries - does that confuse things at all? I imagine it does. Seems like these are issues other groups have probably dealt with, right? Curious what they’ve done. I wonder how Jupyter handles contributions, and what licensing they use.
In general, if you do not have a formal contribution policy that transfers copyright and/or the right to relicense, then no. The copyright of the contributed software package remains with the authors, and the license is the one they have chosen. rOpenSci is simply hosting the package.
Note that some permissive licenses give permission to relicense code under a different license, which would mean that you would be able to have a say over licensing of modified versions, but I don’t recall any that transfer copyright (which is one reason why you can’t just strip copyright / author statements out of code, apart from the obvious reason of it’s not the done thing).
Generally, international contributions don’t add much confusion apart from two potential areas: export controls (e.g. for some crypto-related code) and - potentially - differing views on copyright (Russia’s copyright laws spring to mind). However, this rarely causes issues for non-sensitive open source code.
I’d take a look at this guidance on Contributor Licensing Agreements from OSS Watch - useful stuff, with links to some examples at the bottom: http://oss-watch.ac.uk/resources/cla
Just to pitch in, I participated in some discussions about this on campus, and it seems that the way this policy is applied doesn’t just distinguish faculty/staff, but actually where the source of the funds comes from: basically if it’s grant-funded work and the grant said the license would be X (say Apache), then that’s ok.
A perfect example is the AMPLab’s work: all they do is Apache licensed, and they have a mix of faculty, postdocs, students and staff who all contribute to the same codebase.
What is not clear is what happens if staff from other divisions want to collaborate with existing projects… I really think this division between academically funded work and staff work is very ill thought out and counterproductive, and I hope we might make inroads in improving the situation.
And just to make sure it’s clear (it’s hinted above but not super-explicit): these policies are UC-wide, not Berkeley specific. In fact, my understanding of the concerns with patents stems in part from the fact that any one UC campus has no authority to issue a patent grant impacting a different campus, yet if the overall license refers to the UC Regents at large, that could be interpreted as being the case (I’m not defending that argument, just paraphrasing what someone explained to me). It’s quite confusing and not ideal, IMO.