Hi all,
It was a pleasure to meet you at FOSDEM, and I hope you had safe trips home.
In this e-mail I share the minutes of the meeting we had, though calling them "minutes" might be overselling it. I'm summarising the points raised. If I missed any, then my mind is fallible and my notes incomplete. I aim to keep the e-mail short, but I'm notoriously bad at that.
- It was raised repeatedly that adding headers to big projects is a non-trivial task that very few people want to do manually. If this can somehow be automated, and if REUSE can facilitate doing this automatically, then it might increase the number of files with headers.
+ Important to note, however: This may result in false headers. Automatically adding headers to old files skips manual legal review. This is something to be cautious about, and to convey to users of such a tool.
+ This would require one or more templates, which would be put in a "[.]reuse" directory.
+ Provide links to tools that might already do this, e.g. Maven plugin.
- The utility of the bill of materials (third step of REUSE) was debated.
+ There was some misconception that this file could be used to declare the copyright and licensing of a file like in debian/copyright. This is not so. Rather, it is a list of ALL files in a project and the copyright info that was discovered.
+ The bill of materials must ALWAYS be generated by a computer, and possibly curated by hand.
+ As such, the bill of materials may not need to exist within the repository. It is build output, which customarily never exists in the codebase. But if one wanted to, they could keep it in the repo and "bump" the file with every release.
+ Moreover, the bill of materials is not an actionable STEP for developers (though more on the target audience of REUSE later). Developers can run `reuse compile` to generate the BOM, but then there is nothing they can do with the output. The output is for legal teams. As such, being able to generate the BOM should be a _goal_ rather than a step, i.e. once you can generate a complete BOM, you have succeeded in adopting REUSE.
- The specification is not a good tutorial. A new tutorial must be created.
+ Link to existing tutorials.
+ One tutorial cannot cover the complexity of the spec unless the tutorial has an interactive decision tree.
+ If the decision tree is not implemented, then a tutorial ideally only covers _one_ path through the spec, and this should be clearly marked. e.g., "this tutorial is one easy way to be REUSE compliant, but it is not the only way".
+ The tutorial should be as short and sweet as possible, and have a limited scope. People do not like reading long tutorials, so a long tutorial would hamper adoption of REUSE.
- The specification is not a good specification.
+ It would be nice if the spec had identifiable bullet points that can be referenced, instead of being a wall of text.
+ The spec contains an error regarding SPDX Exceptions, which is approved for fixing.
+ The spec contains a lot of silly edge cases. Rewriting the spec will fix this.
+ Matija and Carmen will co-operate on drafting a new spec.
- The recommendation that Git/VCS could be used to record copyright and authorship information instead of recording this information in the files themselves will be SCRAPPED. People _could_ do that if they really wanted to, but it would be out of the scope of the REUSE spec.
- The tool is currently hosted on gitlab.com. The website/spec is currently hosted on github.com. There are possibly still some bits and pieces on git.fsfe.org. It would be good if this were cleaned up.
+ Preference for GitHub and/or GitLab, because git.fsfe.org has a non-trivial barrier for entry. GitHub probably has the lowest barrier, but is the least in line with FSFE's ideals.
+ Consult Matthias about this. Gabriel?
- Programmers really like automation. The tool could do some extra things:
+ Possibly reduce Python version from 3.5 to 3.3/3.4 for teams that are stuck on old Python versions. Python 2 is out of the question.
+ Provide a Docker container for people who really hate dealing with Python.
+ `reuse init` to automatically set up some simple stuff.
+ Download licences from SPDX or elsewhere. Fill in the templates automatically.
+ Set up the DEP-5 file.
+ Auto-generate headers (see above).
- People don't want to mark trivial or config files as having copyright, and want to exclude them from the linter.
+ Excluding files from the linter is a can of worms best left closed.
+ Make a clear recommendation to license those files under CC0.
- Sites such as GitHub do not recognise the LICENSES directory. This leads to an awkward situation where you have to put your "main" licence in LICEN[CS]E, COPYRIGHT or COPYING for it to be recognised. This is not ideal, because now LICENSES does not contain ALL licences.
+ If you elect to put everything in the LICENSES directory, the site will erroneously say that you have no licence at all.
+ This also messes with multi-licence projects.
+ Contact GitHub/GitLab/Bitbucket about collaboration.
- "debian/copyright" is not a great file path. It is incompatible with projects that legitimately include that path, and it incorrectly invokes a relation to the Debian project.
+ Allow the user to specify their own path.
+ Change the default to "[.]reuse/dep5" (or something similar).
- MIT and BSD are tricky licences. They include the copyright holder within the licence---which means that no two such licences are identical---and mandate an exact reproduction of the licence. REUSE currently deals with this by recommending the developer to create a separate licence for every copyright holder, but this is frankly very ugly.
+ Solve this by not making any recommendations about this in the spec. Instead, include an explanation of the problem in an FAQ.
- Speaking of an FAQ: Turn the flight rules into an FAQ.
- There was some confusion regarding the goal of REUSE and its target audience, and I am not certain it is solved entirely. What follows is a summary mixed with personal reflection.
+ The argued goals are (1.) making sure that copyright information can be parsed by computers, (2.) making sure that developers know how to license their stuff, (3.) making sure that lawyers can generate SPDX files, and probably some more goals that have slipped my mind.
+ After some reflection, it appears to me that they are all perfectly valid and tangential goals, but the fact that there was some (slight) confusion and disagreement here re-ignites a pain point I should have written down, but did not: REUSE does not have an elevator pitch, and it desperately needs one. When I talk to a developer and tell them about REUSE, I do not have a simple, quick explanation at hand, even though I've been working on this for a while. Any attempt at an elevator pitch quickly becomes convoluted:
"The REUSE Project presents a set of recommendations to developers that they can implement so that their project has full coverage of copyright and licence information, in such a way that a computer can verify this". This elevator pitch is correct, but more than a mouthful.
I hope that this has covered most of the talking points (and action points). Please correct the things I got wrong, and add things I forgot. I am only human.
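On the first point (automating headers from a template), a minimal sketch of what such a helper might look like follows. Everything in it is an assumption for illustration: the template wording, the `add_header` name and the comment style are hypothetical and are not part of the current tool. As noted above, headers added this way skip manual legal review, so the sketch refuses to touch files that already declare a licence.

```python
from string import Template

# Hypothetical header template. In the proposal it would live in a
# "[.]reuse" directory; the wording here is an assumption.
HEADER_TEMPLATE = Template(
    "# Copyright (c) $year $holder\n"
    "#\n"
    "# SPDX-License-Identifier: $licence\n"
)


def add_header(source: str, year: str, holder: str, licence: str) -> str:
    """Return `source` with a header prepended, unless one is already there."""
    if "SPDX-License-Identifier:" in source:
        # Never overwrite an existing declaration.
        return source
    header = HEADER_TEMPLATE.substitute(year=year, holder=holder, licence=licence)
    if source.startswith("#!"):
        # Keep a shebang on the first line so scripts stay executable.
        first, _, rest = source.partition("\n")
        return first + "\n" + header + "\n" + rest
    return header + "\n" + source
```

A real tool would also need to pick the comment syntax per file type, which this sketch ignores.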
With kindest regards, Carmen
Dear Carmen,
thanks for writing this up. Looks like proper minutes to me and I don’t remember anything that you didn’t write down already :)
On 5 February 2019 at 15:10, Carmen Bianca Bakker wrote:
> - Possibly reduce Python version from 3.5 to 3.3/3.4 for teams that are stuck on old Python versions. Python 2 is out of the question.
I checked with our guys and it seems the problem is that the second-to-last macOS version (what in Debian terms would be “oldstable”) has Python 2.7 by default, but the latest (“stable”) already has 3.7.
So IMHO 3.5 is quite acceptable, and people stuck on “oldstable” can simply install a newer version of Python (perhaps we can list a few options for doing that).
cheers, Matija
On 5 February 2019 at 15:10, Carmen Bianca Bakker wrote:
> - MIT and BSD are tricky licences. They include the copyright holder within the licence---which means that no two such licences are identical---and mandate an exact reproduction of the licence. REUSE currently deals with this by recommending the developer to create a separate licence for every copyright holder, but this is frankly very ugly.
> - Solve this by not making any recommendations about this in the spec. Instead, include an explanation of the problem in an FAQ.
I would suggest that we stick to the SPDX templates and that if SPDX recognises the text as vanilla MIT or BSD, so should we.
As you can see from the MIT and BSD-4-Clause (the most wordy one) examples linked below, everything that is in red is variable text and SPDX would treat different strings there as still conformant with the template:
https://spdx.org/licenses/MIT.html https://spdx.org/licenses/BSD-4-Clause.html
See the <alt> and <optional> tags in the XML sources of the SPDX license templates here:
https://github.com/spdx/license-list-XML/blob/master/src/BSD-4-Clause.xml https://github.com/spdx/license-list-XML/blob/master/src/MIT.xml
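To illustrate the idea of treating variable text as still conformant, here is a rough sketch of template matching. It does not parse the real SPDX XML: the `<<var>>` marker, the abbreviated template text and both function names are made up for this example, and the actual SPDX matching guidelines cover more cases than whitespace and one variable span.

```python
import re

# Simplified stand-in for a licence template. "<<var>>" marks variable text.
# This is NOT the real SPDX format, and the licence text is abbreviated.
TEMPLATE = (
    "Copyright (c) <<var>>\n"
    "Permission is hereby granted, free of charge, to any person obtaining a copy"
)


def template_to_regex(template: str) -> "re.Pattern":
    """Build a regex in which fixed text must match and variable text may differ."""
    fixed_parts = []
    for part in template.split("<<var>>"):
        # Escape the fixed words and let the whitespace between them vary.
        fixed_parts.append(r"\s+".join(re.escape(word) for word in part.split()))
    # Anything non-empty on one line may stand where the template is variable.
    return re.compile(r"\s+.+?\s+".join(fixed_parts))


def matches_template(text: str, template: str = TEMPLATE) -> bool:
    """Return True if `text` conforms to `template` apart from its variable parts."""
    return template_to_regex(template).fullmatch(text.strip()) is not None
```

With this, two texts that differ only in the copyright line and in line wrapping both match, while a text whose fixed wording differs does not.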
One idea for how we could tackle this (and it would be wonderful to sync up with SPDX for this!) would be to:
• store the SPDX template text in plain-text form¹ in `LICENSES/MIT.txt`
• include the following in the headers:
file_a.txt:
```
© 2010 Some Hacker <some1@hack.me>
SPDX-License-Identifier: MIT
```
file_b.txt:
```
© 2015 Corporation X <x@example.com>
SPDX-License-Identifier: MIT
```
• this would mean all the needed info is there. If you look at the license text (or rather its SPDX template), you don’t need anything else but to insert the copyright year(s) and holder in the license template. The other parts are optional to change, e.g. “copyright holder” is perfectly sufficient and does not *need* changing; it is just that several companies/projects chose to do so.
• in turn, a BoM tool would then gather the copyright and license info anyway, so all’s well.
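As a rough illustration of the last point, a BoM-style tool could collect the two header lines from each file along these lines. The regular expressions and the `extract_info` and `bill_of_materials` names are assumptions made for this sketch, not the behaviour of any existing tool.

```python
import re

# Patterns for the two kinds of header line used in the examples above.
# The syntax a real tool accepts may differ; these are assumptions.
COPYRIGHT_RE = re.compile(r"(?:©|Copyright(?: \(c\))?)\s+(.+)")
LICENCE_RE = re.compile(r"SPDX-License-Identifier:\s*(.+)")


def extract_info(text: str) -> dict:
    """Collect copyright statements and licence identifiers from one file."""
    info = {"copyright": [], "licences": []}
    for line in text.splitlines():
        copyright_match = COPYRIGHT_RE.search(line)
        if copyright_match:
            info["copyright"].append(copyright_match.group(1).strip())
        licence_match = LICENCE_RE.search(line)
        if licence_match:
            info["licences"].append(licence_match.group(1).strip())
    return info


def bill_of_materials(files: dict) -> dict:
    """Map each file name to the information found in its contents."""
    return {name: extract_info(text) for name, text in files.items()}
```

Here `files` maps file names to their text. The year(s) and holder gathered this way are exactly what would be needed to fill in the licence template for each file.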
cheers, Matija
¹ https://github.com/spdx/license-list-data/blob/master/text/MIT.txt