Author
Ms Jialun Li
Organisation/Institution
Peking University, School of New Media
Country
CHINA
Panel
Intellectual Property Rights
Title
A Study on the Regulated Use of Training Data in Generative Artificial Intelligence
Abstract
This paper examines how the rapid expansion of generative artificial intelligence has brought the legal status of training data to the center of public and regulatory debate in China. Although training data is essential to model performance, its use is often opaque, loosely supervised, and difficult to evaluate under existing legal rules. Drawing on a set of representative cases from recent Chinese judicial practice, the study identifies four types of disputes that frequently appear in litigation: conflicts involving copyright, patent rights, data-related economic interests, and data collections that carry competitive value but do not fall under established legal categories. These cases show that current rules do not give courts or developers stable or consistent guidance. The paper then analyzes several common obstacles in the practical use of training data: many data sources have uncertain ownership; authorization chains are fragmented and difficult to verify; and institutions that use data for research or technical testing often operate in a zone where intentions are non-commercial but the scale of data use is extensive. These conditions create risks for market fairness, cultural diversity, and ethical expectations in the broader information environment. In response, the paper proposes a structured set of governance strategies. The first concerns law: clearer statutory recognition of different types of data interests, more specific judicial standards for infringement and unfair competition, and coordinated administrative supervision. The second concerns data governance: data tracing systems, reliable data trading platforms, and clear rules distinguishing commercial from non-commercial use can reduce uncertainty. The third concerns risk prevention: economic, cultural, and ethical risks should be identified at an early stage and addressed through transparent procedures and practical monitoring tools.
The paper aims to support a more stable and sustainable framework for training data governance in Asia, and to contribute to regional discussions on justice, responsibility, and long-term institutional development.
Biography
I am currently a PhD candidate in Communication at Peking University, where my research examines how generative artificial intelligence is reshaping the cognitive frameworks, operational processes, and risk structures of China's mainstream media organizations. In collaboration with a faculty advisor specializing in civil law, I conduct interdisciplinary research that connects technological transformation with legal and regulatory questions, including rights boundaries, responsibility allocation, and governance mechanisms under the civil law framework. Building on my doctoral work, my broader academic interests span a wide range of digital governance issues, including the normative use of AI training data, platform governance and algorithmic accountability, institutional evolution in the digital economy, and the restructuring of public administration and regulatory systems amid rapid technological change. My research integrates perspectives from communication studies, law, and public policy to analyze how new technologies generate normative challenges across institutional, technical, and societal dimensions. I have participated in several research projects focused on AI governance, technology regulation, and digital rule-making, with an emphasis on comparative regulatory approaches and the development of adaptive governance frameworks in Asia and beyond. I aim to contribute to an evidence-based dialogue on how legal systems can respond to the risks and opportunities associated with generative AI and other frontier technologies.