TLDR: Salesforce is facing a proposed class-action lawsuit filed by authors who allege the company used thousands of copyrighted books without permission to train its artificial intelligence models, including its xGen AI series and models powering Einstein Copilot. The lawsuit highlights the growing legal challenges faced by tech companies over the use of copyrighted material in AI development.
Cloud-computing giant Salesforce faces a proposed class-action lawsuit from a group of authors who claim the company unlawfully used their copyrighted works to train its artificial intelligence software. The complaint, filed in federal court on Wednesday, October 15, 2025, alleges that Salesforce infringed the authors’ copyrights by incorporating thousands of books into the training datasets for its xGen AI series of large language models (LLMs) and for other generative AI tools, such as Einstein Copilot, that are powered by AI firm Cohere’s models.
Authors E. Molly Tanzer and Jennifer Gilmore are among the plaintiffs. Other reports also mention bestselling writers such as Jonathan Franzen, Jodi Picoult, and George Saunders, which suggests a potentially broader legal action. The lawsuit specifically points to Salesforce’s use of the ‘notorious RedPajama and The Pile datasets’ in training its models. Both datasets are said to contain the ‘Books3 corpus’, a collection of hundreds of thousands of copyrighted books acquired without the authorization or consent of their creators.
Joseph Saveri, the attorney representing the authors, underscored the critical need for transparency from companies developing AI products that rely on copyrighted material. ‘It’s important that companies that use copyrighted material for AI products are transparent,’ Saveri stated, adding, ‘It’s also only fair that our clients are fairly compensated when this happens.’ The plaintiffs are seeking substantial damages and an injunction to prevent Salesforce from any further unauthorized use of their content in AI training.
Adding a layer of irony to the situation, the lawsuit reportedly cites previous statements by Salesforce CEO Marc Benioff, who has publicly criticized other AI companies for using ‘stolen’ training data and suggested that compensating content creators for their work would be ‘very easy to do.’ The complaint argues that Benioff’s own company should adhere to these principles.
Salesforce has not issued a public statement on the lawsuit, and a company spokesperson declined to comment. The case is part of a growing trend: numerous authors and content owners have filed similar lawsuits against other major tech firms, including OpenAI, Microsoft, and Meta Platforms, all alleging the misuse of copyrighted material for AI model training. Notably, Anthropic reached a landmark $1.5 billion settlement in August with a separate group of authors over similar copyright infringement claims, a sign of how costly such disputes can become.


