When Publishers Want to License Your Book to Generative AI Firms (updated)


Is it part of your book contract that your publisher may license your book to the makers of ChatGPT or other large language models or forms of generative AI? If given the choice, should you go along with it?

Those are the questions raised in a recent email from a philosophy professor. He writes:

Last year, I published a book with Cambridge University Press. Now, I have received an email from them requesting that I sign an addendum to my contract agreeing to them licensing my book to providers of generative AI. They note that such technologies “offer opportunities and risks.” I imagine other philosophers have been getting similar requests.

I see some of the potential opportunities here (remuneration; getting my arguments read by other philosophers in the form of supposedly original student essays), but I’m having a harder time thinking through the potential risks—in part because I have little familiarity with the work philosophers are doing on AI. Given this, I was thinking that more-informed Daily Nous readers might be able to weigh in on the ethics and prudence of permitting such licensing.

Discussion welcome, especially from those who’ve faced similar offers, as well as from readers with expertise in legal and ethical matters related to AI, intellectual property, and publication ethics.


UPDATE: Via James Klagge (Virginia Tech), I learned that MIT Press is surveying its authors about this topic. The questions are below. If other publishers are doing this, or are announcing policies, please let us know in the comments. Thanks.

MIT Press Survey Questions:

Do you believe that works you have authored should be used to train generative AI systems? How do publisher practices in this area impact your own choice of publishing partners? 

Please share your perspectives on whether academic publishers like the MIT Press should enter into paid licensing arrangements with AI companies seeking to train on textual works under copyright, and under what conditions. For example, should it be a pre-requisite for any potential LLM partner to provide reliable attribution to your work if that work significantly informs a chatbot’s query response?

If the MIT Press enters into any such licensing deals that have the potential to include your own work(s), would you like to be provided with an “opt in” notification? 

If the MIT Press enters into any such licensing deals that have the potential to include your own work(s), would you expect to be compensated, as per other digital licensing partnerships according to your publishing contract? 

Anything else on LLM training practices and licenses that you would like to share?

 

15 Comments
Gorm
1 year ago

I am very curious: what is the remuneration? That is, what is the publisher promising the book author for signing the addendum to the contract? Here is where my imagination fails me.

Milan
Reply to  Gorm
1 year ago

I’m just guessing, but I suppose that the publisher is hoping to get some payment from the AI companies for licensing the book to them, and that the author is promised a share of those payments similar to the share of profits from book sales, i.e. probably around one percent. At the moment, it seems that most AI companies rely on content that is available for free online and don’t bother with licensing. (It’s unclear how legal that is.) So, the whole situation the addendum is meant for is hypothetical.

Kenny Easwaran
Reply to  Milan
1 year ago

> At the moment, it seems that most AI companies rely on content that is available for free online and don’t bother about licensing. (It’s unclear how legal that is.)

I think this situation is changing quickly. Some of the holders of the most valuable intellectual property for training AI are themselves training AI, and thus want to legally bar competitors from using their data for free. (I’m thinking here primarily of Google, which owns YouTube, and doesn’t want Anthropic or OpenAI training their models on this data. I suspect Microsoft has a similar amount of valuable data they are sharing with OpenAI that they don’t want Anthropic or Google using, and it’s possible that Amazon has data they are sharing with Anthropic that they don’t want OpenAI and Google to use.)

I don’t know where any of this has landed yet, but it seems likely that the major players here are becoming a lot more interested in enforcing copyright law against their competitors, even if it means they have to live with it themselves.

A year or two ago, though, they were definitely all just claiming that reading anything on the internet is fair use.

Milan
Reply to  Kenny Easwaran
1 year ago

Thanks. This is interesting. I suppose this is part of the reason why the email writer’s publisher wants to create legal certainty around the AI rights.

Michel
1 year ago

FWIW, Taylor & Francis (which owns Routledge, among others) has already signed away your/our work in return for a large sum, with no warnings given, consent asked, or remuneration offered, contracts be damned.

ikj
Reply to  Michel
1 year ago

the market value of your intellectual labor is as words strung together in coherent sentences to be vacuumed up by machines in order that they can also string together more or less coherent words.

Kenny Easwaran
Reply to  ikj
1 year ago

I think it’s important that the sentences aren’t just coherent, but in fact contain some sort of representation of a world outside the text (either a physical world or a world of ideas). They want to be able to make more and more coherent strings of words that contain within them representations of these worlds (even if the reference only comes through deference to more expert members of the language community).

ikj
Reply to  Kenny Easwaran
1 year ago

while i understand the desire, i remain skeptical that worlds of any kind other than syntactic are a relevant concept for llms. but i’m no expert!

Anco
Reply to  Kenny Easwaran
1 year ago

“some sort of representation” is doing quite some heavy lifting there. It’s more like the shadows of the shadows in Plato’s cave: there is the world, the words we use to describe it*, and then there is the vectorized probability distribution of associated word clusters (which might be referred to as ‘syntax’ in the narrowest of terms).

There is likely enough space to introduce more degrees of separation too.

* Apologies to all the Platonists and philosophers of language here.

AGT
Reply to  Michel
1 year ago

I don’t understand how this is legally possible. Surely, they will be sued? Or is this just a joke?

Michel
Reply to  AGT
1 year ago

Not a joke, it really happened: https://www.thebookseller.com/news/academic-authors-shocked-after-taylor--francis-sells-access-to-their-research-to-microsoft-ai

They may be sued, but lawsuits take time, energy, and money. And authors will be entitled to what, $2 each in royalties? Maybe a few more in punitive damages?

martin peterson
1 year ago

CUP offers a 20% royalty, which is much more than I receive for printed copies. I like that they let me decide what I want to do; other publishers do not seem to do that. So thanks CUP!

Gorm
Reply to  martin peterson
1 year ago

Thanks, Martin. This is helpful. The one worry is that your book (or mine) may be sold in a bundle such that you effectively make very little money from this. It might thus be valued at $10, instead of $36 (the approx. value of the PB), and thus yield you a mere $2.00 at a 20% royalty. But, obviously, I do not know.

AGT
1 year ago

This isn’t officially on topic, but it connects to the broader issue. Routledge wants us to sign a contract with a ROFR clause in it: they would have right of first refusal, before we go to any other publisher with future work. I have never come across this in an academic publishing contract before and I don’t want to sign it. Does anyone have any experience with this?

Two-time remover
Reply to  AGT
1 year ago

This is pretty standard boilerplate for book publishers. It is normally negotiable, so you can ask to have it removed before signing. I have requested that of two presses, and both were happy to remove that clause. But you do have to ask.