AI code of practice: first draft and first copyright meeting
On 1 August 2024, the EU’s Artificial Intelligence Act entered into force. This regulation carries significant implications for the cultural and creative sectors as AI increasingly transforms artistic processes and leverages cultural data. Among other provisions, the AI Act introduces obligations for providers of general-purpose AI (GPAI) models, including obligations related to transparency and copyright.
To define the technical measures and policies GPAI providers must implement to meet these obligations, the European Commission’s AI Office is facilitating the drafting of a Code of Practice. By May 2025, this document will outline best practices and measures to support providers in complying with their legal requirements. The Code is being developed through a multistakeholder process involving nearly 1000 participants from industry, academia, civil society, and rightsholder organisations. Culture Action Europe also participates in this working group. The process is led by Chairs—renowned experts—who consolidate stakeholder input to draft successive iterations of the document.
A first draft
Last week, the AI Office published the first draft of the Code. Below is an overview of the main measures related to transparency and copyright.
Measure 3: Internal Copyright Policy
- Providers of GPAI models must implement an internal policy ensuring compliance with EU copyright laws across the entire lifecycle of their models. They should also assign clear responsibilities within their organisations to oversee this policy.
- Providers of GPAI models must perform copyright due diligence on upstream parties before contracting them and ensure that these entities have respected rights reservations. In the context of AI model development, ‘upstream’ refers to the process of collecting and preparing the datasets used to train the model.
- Providers of GPAI models should take steps to mitigate the risk that downstream systems produce copyright-infringing outputs. ‘Downstream’ refers to later stages where the AI model, being essentially a statistical model, is integrated into tools or applications for real-world use. Providers are urged to avoid overfitting their models (when the model learns the training data too closely, including its noise or specific details) and should require downstream entities to prevent repeated generation of outputs identical or recognisably similar to protected works. This measure does not apply to SMEs.
Measure 4: Identifying and Complying with Rights Reservations
- Providers should only use crawlers that respect the robots.txt protocol.
- Providers should ensure that a rights reservation expressed through robots.txt does not negatively affect the findability of that content in their search engines.
- Providers should respect other appropriate machine-readable means to express a rights reservation at the source and/or work level according to widely used industry standards.
- Providers, excluding SMEs, should collaborate to develop and adopt interoperable machine-readable standards for expressing rights reservations.
- Crawling activities must exclude pirated sources, such as those listed on the European Commission’s Counterfeit and Piracy Watch List or national equivalents.
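The robots.txt protocol referenced above works by grouping rules under a crawler’s user-agent token. A minimal illustrative sketch of how a website could reserve rights against AI-training crawlers while remaining indexable for search (the crawler tokens shown are examples of commonly used AI-training user agents, not an exhaustive or authoritative list):

```text
# Reserve rights against AI-training crawlers (example tokens)
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

# All other crawlers, including search indexers, remain allowed
User-agent: *
Allow: /
```

As the meeting discussion below notes, this mechanism operates at the level of a whole site or path, which is one reason rightsholders argue it cannot express work-level reservations on its own.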
Measure 5: Transparency
- Providers will publish information on their websites about the measures they adopt to identify and comply with rights reservations, written in clear and understandable language.
- This information should include the names of all crawlers used for GPAI model training and their relevant robots.txt features.
- Providers are encouraged to designate a single point of contact to allow rightsholders to communicate directly and promptly lodge complaints regarding the use of protected works in GPAI model development.
- Providers will draw up, keep up to date, and provide to the AI Office, upon its request, information about the data sources used for training, testing and validation, and about authorisations to access and use protected content for the development of a GPAI model.
Transparency and Copyright
On 21 November, the first meeting of the Working Group on Transparency and Copyright, co-chaired by Nuria Oliver and Alexander Peukert, took place. Pre-selected participants, representing both rightsholders and tech companies, briefly presented their positions on the first draft of the Code of Practice. Below, Culture Action Europe shares generalised feedback from the meeting (in line with the Chatham House Rule, the names of organisations are not disclosed).
- Providers’ copyright policies should go beyond merely respecting opt-outs, even though this is a crucial aspect. They should also incorporate measures to establish robust licensing frameworks and encourage collaboration with Collective Management Organisations and key rightsholders.
- Many rightsholders argued that relying solely on the robots.txt protocol for opting out is insufficient and risks being misapplied to AI training permissions. Rightsholders should be able to use other machine-readable mechanisms, such as opting out via terms and conditions on a website, public repositories of rights reservations, public declarations, or Automated Content Recognition (ACR) technology to remove protected content from datasets.
- Some participants suggested establishing an official public registry to explicitly record rights reservations. This registry would provide legal certainty for all stakeholders and enable tracking the dates of rights reservations, facilitating the removal of protected data from datasets as needed. However, one participant opposed the proposal, arguing that it could place an undue burden on rightsholders.
- Regarding upstream copyright compliance, rightsholders argued that it should not be limited to a simple pre-check of datasets—GPAI model providers should require third parties to provide full traceability of the data they supply and details about their collection methods. The concept of ‘reasonable due diligence’ needs further elaboration.
- Ensuring downstream copyright compliance requires GPAI model providers to share detailed information about the data used for training with the AI Office and downstream entities. This is the only way to ensure that AI outputs are not generated using illegal or infringing content.
However, others noted that downstream providers are often the only entities capable of properly assessing and managing copyright compliance within their specific operational context: they may handle their own protected content or hold licences that fall outside the control of GPAI providers.
- Authors and rightsholders must be compensated for the prior unauthorised and illegal use of copyrighted works by GPAI providers. The Code of Practice should include a provision requiring AI providers to commit, through their copyright policies, to compensating for such unauthorised use. The Code should also establish a framework for sanctions and measures to address non-compliance.
At the same time, tech company representatives stressed the need to stay within the scope of the AI Act, avoiding additional obligations: ‘We’re here to finish the rules under the AI Act, nothing more, nothing less.’ They questioned the AI Office’s role, arguing it is ‘not a copyright enforcer’ and that its responsibilities in verifying copyright compliance are unclear.
They also pointed to technical challenges, including the unfeasibility of work-level rights reservations and the difficulty of downstream compliance. Predicting infringing outputs, they argued, is nearly impossible with current technology, and imposing copyright compliance on downstream providers lies outside the AI Act’s scope.
Both the next meeting and the publication of the second iteration of the Code of Practice are expected to take place in January 2025.
Culture Action Europe, together with the Michael Culture Association, has prepared considerations regarding the implementation of the AI Act, developed through our Action Group on AI & Digital. This paper forms the basis of the feedback we are providing in the Code of Practice drafting process.