U.S. AI Action Plan Proposals: AI Companies Seek “Freedom” to Access Data & Copyrighted Information for LLM Training

In their U.S. AI Action Plan proposals, OpenAI and Google ask for the freedom to train LLMs on copyrighted and publicly available data.

Written by Jackie Leavitt (Editor at Large)

Reviewed by Aleksander Hougen (Chief Editor)

Last Updated: 24 Mar'25

In response to the White House’s request for input on the U.S. AI Action Plan, Google and OpenAI have submitted proposals asking the U.S. government to let them train their LLMs on copyrighted data.

Both proposals amount to the companies asking the government to exempt LLMs from established copyright law, and Google’s proposal edges into privacy law as well.

In its proposal and online statement, OpenAI cites national security, economic growth and competition with China as reasons to promote its “freedom-focused policy proposals,” including “freedom of intelligence” and “freedom to innovate and learn.”

“[W]e are at the doorstep of the next leap in prosperity: the Intelligence Age. But we must ensure that people have freedom of intelligence, by which we mean the freedom to access and benefit from AI as it advances, protected from both autocratic powers that would take people’s freedoms away, and layers of laws and bureaucracy that would prevent our realizing them.”

— OpenAI AI Action Plan Statement

OpenAI’s copyright strategy argues that “[t]he federal government can both secure Americans’ freedom to learn from AI, and avoid forfeiting our AI lead to the PRC by preserving American AI models’ ability to learn from copyrighted material.”

It’s worth noting that OpenAI faces many ongoing lawsuits, including suits brought by The New York Times and several Authors Guild writers, alleging that the company has already trained its models on copyrighted material.

Google’s proposal also discusses copyright access (though Google’s statement on the proposal does not).

Section 1.D.ii of the proposal argues that “[t]hree areas of law can impede appropriate access to data necessary for training leading models: copyright, privacy, and patents.”

Most concerning about Google’s proposal is its request to exempt publicly available data from privacy regulations for the benefit of training AI models.

“Balanced privacy laws that recognize exemptions for publicly available information will avoid inadvertent conflicts with AI or copyright standards, or other impediments to the development of AI systems. A federal privacy regulatory framework should define categories of publicly available data and anonymous data that are treated differently than personally identifying data.”

— Google’s AI Action Plan statement

If you’re wondering what publicly available information (PAI) is, the definition varies from state to state, but it can generally encompass data from many sources, including:

The internet, including publications, blogs, discussion groups, social media, etc.
Global media sources
Public government data
Professional and academic publications
Commercial data
Technical reports, patents and business documents
Technical data, public domain information, IP addresses
Deep web or dark web information

That is a lot of personal data open to collection, and if you care at all about data privacy, you should be concerned.

We will continue to cover this story as the White House evaluates proposals.