Author
Ms Xiaomeng Zhang
Organisation/Institution
Peking University, School of New Media
Country
CHINA
Panel
Information Technology Law
Title
A Study on the Regulated Use of Training Data in Generative Artificial Intelligence
Abstract
This paper examines how the rise of generative artificial intelligence has pushed the legal status of training data into public and regulatory focus in Asia. Training data is vital to model performance, yet the lawfulness of its use is often uncertain and difficult to evaluate under current rules. Based on representative cases in recent Chinese judicial practice, this study identifies four major areas of dispute: copyright, patent protection, data-related economic interests, and data collections with competitive value that existing legal categories do not fully cover. These cases reveal gaps in the stability and clarity of current legal guidance. The paper then analyzes the practical obstacles that accompany the use of training data: ownership is often unclear, authorization chains are fragmented and hard to confirm, and institutions engaged in research or testing, though they may not aim for commercial gain, still handle large volumes of data. These conditions create risks for fair competition, diversity of cultural expression, and ethical expectations in the wider information environment. To address these concerns, the paper proposes a three-part governance approach. First, legal measures should clarify different types of data interests, refine judicial standards for infringement and unfair competition, and encourage coordinated enforcement by administrative regulators. Second, data governance should support traceable data sources, reliable data transactions, and clear rules that distinguish commercial from non-commercial use. Third, risk prevention should include early identification of economic, cultural, and ethical risks, along with practical monitoring and open disclosure requirements. The goal of this paper is to support a more stable and sustainable framework for training data governance in Asia. It aims to contribute to regional dialogue on justice, responsibility, and institutional development in an era shaped by generative AI.
Keywords: Generative Artificial Intelligence; Training Data; Collective Data Interests; Regulated Use
Biography
I am currently a PhD candidate in Communication at Peking University, where my research examines how generative artificial intelligence is reshaping the cognitive frameworks, operational processes, and risk structures of China’s mainstream media organizations. Working in collaboration with a faculty advisor specializing in civil law, I engage in interdisciplinary research that connects technological transformation with legal and regulatory questions, including rights boundaries, responsibility allocation, and governance mechanisms under the civil law framework. Building on my doctoral work, my broader academic interests span a wide range of digital governance issues. These include the normative use of AI training data, platform governance and algorithmic accountability, institutional evolution in the digital economy, and the restructuring of public administration and regulatory systems amid rapid technological change. My research integrates perspectives from communication studies, law, and public policy to analyze how new technologies generate emerging normative challenges across institutional, technical, and societal dimensions. I have participated in several research projects focused on AI governance, technology regulation, and digital rule-making, with an emphasis on comparative regulatory approaches and the development of adaptive governance frameworks in Asia and beyond. I aim to contribute to an evidence-based dialogue on how legal systems can respond to the risks and opportunities associated with generative AI and other frontier technologies.