Hi, me again
So I am going to respond to multiple comments in one go:
I had a look at Julia Reda's post, and as far as I can make out, she only focuses on the fact that individual snippets are very short - but doesn't mention that inserting *lots* of snippets algorithmically is *all* that Copilot does...
In a way her position is understandable - I believe it is consistent with that of the Pirate Party - they are copyright minimalists, and would prefer a world with no copyright, as far as I can tell. But until IP lawyers call themselves TSOALGGM lawyers (that expands to "temporary stewards of a limited government-granted monopoly" rather than "intellectual property") I am not sure her view is representative of the current situation.
Another commenter said that the codebase used to generate the model is just somehow the "input" and not actually *in* the model - but I am not sure the distinction is that clear. If I ROT13 a Metallica mp3, then there is an algorithmic transformation and the new file is clearly different, but it is still possible to recover the original. In the same way it could be argued that the Copilot model encodes the input code in its weights. I suppose there are some losses, but if I were to downsample and ROT13 a Metallica CD (I don't, I have decided not to like their music), I'd still be in trouble if I claimed it as my own work, right? And if I XOR it with a Rick Astley mp3, would that suddenly be fair use?
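To make the ROT13/XOR point a bit more concrete, here is a minimal Python sketch. The helper names `rot13_bytes` and `xor_bytes` and the stand-in byte strings are my own inventions for illustration, not real audio data. It only shows that such a transformation yields different bytes from which the original can be recovered exactly; it says nothing about how much of its training input a model like Copilot actually retains.

```python
def rot13_bytes(data: bytes) -> bytes:
    """Rotate ASCII letters by 13 places; leave every other byte unchanged."""
    out = bytearray()
    for b in data:
        if 65 <= b <= 90:        # 'A'..'Z'
            out.append((b - 65 + 13) % 26 + 65)
        elif 97 <= b <= 122:     # 'a'..'z'
            out.append((b - 97 + 13) % 26 + 97)
        else:
            out.append(b)
    return bytes(out)


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating, non-empty key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Hypothetical stand-ins for the two files.
original = b"ID3 pretend this is a Metallica mp3 \x00\xff\x10"
other = b"pretend this is a Rick Astley mp3"

scrambled = rot13_bytes(original)
assert scrambled != original               # the file looks different...
assert rot13_bytes(scrambled) == original  # ...but ROT13 twice restores it

mixed = xor_bytes(original, other)
assert mixed != original
assert xor_bytes(mixed, other) == original  # XOR with the same key undoes it
```

Both transformations are their own inverse, so the "new" file still carries everything that was in the original.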
Finally, for more amusement value: a different conversation pointed out to me that I should have made my mail more click-baity: "Does Copilot mean that Microsoft has lost its license to distribute the Linux kernel?" I am still not sure - but maybe this bit of sensationalism makes it clearer what is at stake?
regards
marc