How Tight AI Regs Hurt Chinese Firms
A lament: “Life is too hard!! Friends, who can lend me a rope?”
Anonymous contributor L-squared returns to ChinaTalk, this time with a rundown of China’s interim and forthcoming AI-generated content (AIGC) regulations.
In July 2023, the Cyberspace Administration of China (CAC) announced interim generative AI regulations. At that time, analysts noted that the measures were less onerous than those proposed in an April draft. For instance, changes from the draft included specifying that the rules would apply only to public-facing services and vowing to attach “equal importance to [generative AI] development and security.” In this deep dive, however, I show that the scope of regulatory targets is actually wider than expected, and Chinese AI diffusion is being seriously compromised by confused and overbearing regulatory action.
The Regulations in a Nutshell
The generative AI regulations confirm that strict content restrictions apply to AI-generated outputs and outline various responsibilities for service providers. These relate to training data, labeling of outputs, data protection, responding to illegal content, user transparency, and so on.
The regulations follow and should be understood in conjunction with previously introduced provisions for recommender algorithms and “deep synthesis.” The former introduced the obligation to conduct security assessments and register algorithms, and those provisions were carried over in the regulations for deep synthesis and generative AI. The deep synthesis regulations significantly overlap with those for generative AI but are less focused on text, as they were drafted before the rise of ChatGPT and its Chinese competitors. Among their provisions is a requirement for app stores operating in China to check that an app has completed algorithm registration before allowing it to list.
Broad Regulatory Scope
In theory, the requirement to conduct a security assessment and register algorithms applies only to providers of services “with public opinion properties or the capacity for social mobilization.” One law firm’s analysis, however, reports that in its experience this attribute is understood broadly: merely having information-exchange functions, such as commenting and publishing, could be enough to trigger the requirement.
In what was thought by some to be a get-out for services aimed at business users, the generative AI regulations declare that entities that develop and apply generative AI but do not provide services to the domestic public are not required to comply. But according to a 2B software company employee in Beijing who is going through the algorithm registration process, Western media reports failed to recognize that “the public” 公众 can be interpreted as covering both individual consumers and business employees. This means 2B services are expected to comply with the provisions (though they tend to be subject to less onerous requirements than their 2C counterparts). Of 151 deep synthesis algorithms registered in June and August 2023, 50 target business customers, according to brief summaries provided by CAC. Among these are algorithms for voice synthesis, digital avatar creation, and customer service.
A caveat is that organizations developing AIGC tools for their own internal use don’t seem to be required to follow the regulations. One investment professional told me that many Chinese startups and big companies are using this exemption to build bots based on open-source models like WizardLM and MPT to improve their productivity. Of course, not all companies have the technical know-how required to adapt open-source models in this way.
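To make the internal-use route concrete, here is a minimal sketch of what such a bot might look like, built with the Hugging Face transformers library. The model checkpoint and prompt are illustrative assumptions; a real deployment would add serving infrastructure, evaluation, and safety layers.

```python
# Minimal sketch: an internal productivity bot on an open-source checkpoint.
# Model ID and prompt are illustrative; MPT checkpoints need
# trust_remote_code=True to load their custom architecture.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mosaicml/mpt-7b-instruct"  # illustrative open-source model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = "Summarize this internal meeting transcript: ..."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```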
Unclear Requirements
The software company employee I spoke to, who is handling registration for his firm, said that regulators were still not clear about what they wanted. This confusion makes the process lengthy and inefficient: one news article noted that registration could take two to three months and cited an employee who was in contact with regulators nearly three times a week.
In a video on the social media platform Xiaohongshu 小红书, an algorithm-registration consultant offering to hand-hold clients through the painful registration process captures the mood. The sketch imagines a developer despairing over compliance, with one frame reading, “Life is too hard!! Friends, who can lend me a rope?”
Draft basic security requirements released for comment in October (and analyzed by Matt Sheehan here) should provide more clarity on how the generative AI regulations will be enforced. The requirements are not yet finalized and are likely getting pushback from industry. Even so, the current draft confirms the strictest possible interpretation of the regulations. For instance:
In case there was any confusion about the instruction in the regulations to use “data and foundational models that have lawful sources,” the security requirements confirm that companies cannot use foundation models that haven’t passed the domestic registration process. As foreign-developed models will generally not attempt to register, this means Chinese companies must rely on less-capable domestic alternatives.
The security requirements also prevent the use of information blocked on the Chinese internet as training data — a big impediment for developers, given that Wikipedia is often an important part of training corpora and many datasets are hosted on Hugging Face.
The wording of the regulations seems to allow modifications to a model’s code to go unreported, as long as key information about the model (such as the type of algorithm and field of application) remains the same. Yet the security requirements state that when providers make important updates to or upgrade a model, they must repeat the security assessment and re-register with the authorities.
In an attempt at balance, here’s one requirement in the draft that seems relatively sensible and beneficial for consumers: services must allow users to opt out of having their data used for training, and can’t hide this option more than four clicks away from the main page. For comparison, in the US the FTC sued Amazon for putting a Prime cancellation button six clicks away.
Negative Impact on the Market
Greater regulation is generally associated with less competition (though anti-competitive impacts can be reduced by well-designed regulation). In this case, early evidence of the registration system favoring larger companies with extensive compliance teams comes from a shallow analysis of firms included on CAC’s lists of successfully registered deep-synthesis algorithms in June and August 2023. Large firms make up the highest proportion, at 40%.
Company scale was determined by the Qichacha 企查查 platform, which uses an AI model based on government guidelines for company categorization. I made upward adjustments for ten companies that were clearly subsidiaries of large technology firms — but as this was not a thorough check, the number of large firms may still be an underestimate.
Smaller, resource-strapped companies face an unenviable set of options when deciding their technology strategy under the generative AI regulations. Without the budgets to train their own models from scratch, they are restricted to basing their services on one of the existing models approved by the CAC. Within that option set, they can:
Spend precious cash on a closed-access model;
Directly use an open-source one like Baichuan-13B (saving money but taking a hit on performance); or
Fine-tune an open-source model (improving performance but complicating compliance, because regulators need details of the fine-tuning process; see the sketch after this list).
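As a rough illustration of the third option, here is a hedged sketch of parameter-efficient fine-tuning with LoRA via the peft library, along with the kind of process details a filer might need to record for regulators. The checkpoint, target modules, hyperparameters, and compliance fields are all illustrative assumptions, not requirements drawn from the regulations.

```python
# Sketch of option 3: LoRA fine-tuning of an open-source model with peft.
# All identifiers and hyperparameters below are illustrative.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_id = "baichuan-inc/Baichuan-13B-Chat"  # illustrative checkpoint
model = AutoModelForCausalLM.from_pretrained(base_id, trust_remote_code=True)

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["W_pack"],  # Baichuan's fused QKV projection (assumption)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # only the LoRA adapters train

# Since regulators reportedly want details of the fine-tuning process,
# a filer would keep records like these alongside the training run:
compliance_record = {
    "base_model": base_id,
    "method": "LoRA",
    "lora_rank": 8,
    "training_data_source": "internal customer-service logs (hypothetical)",
}
```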
Further, most firms likely must pay for a content moderation API supplied by an Internet giant such as Baidu or Tencent, given the high barriers to setting up reliable moderation infrastructure themselves.
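To show where such a moderation API would sit in a provider’s pipeline, here is a hedged sketch of a wrapper that screens model outputs before they reach users. The endpoint, request fields, and response format are hypothetical stand-ins, not the actual Baidu or Tencent interfaces.

```python
# Hypothetical moderation gate in front of model outputs. The endpoint and
# JSON schema are placeholders, not a real vendor API.
import requests

MODERATION_URL = "https://moderation.example.com/v1/check"  # hypothetical

def is_output_allowed(text: str, api_key: str) -> bool:
    """Send generated text to a third-party moderation service before display."""
    resp = requests.post(
        MODERATION_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        json={"text": text},
        timeout=5,
    )
    resp.raise_for_status()
    # Fail closed: if the response lacks an "allowed" field, block the output.
    return resp.json().get("allowed", False)

# Usage: only release the model's reply if the moderation check passes.
# if is_output_allowed(reply, API_KEY):
#     send_to_user(reply)
```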
While big companies are relatively less hampered by the regulations, they aren’t immune from challenges — in fact, their large user base means they are more likely to attract negative attention if their AI systems misbehave. iFlytek’s share price dropped 10% after its AI-powered study assistant tablet described Mao Zedong as “narrow-minded” and “intolerant” for starting the Cultural Revolution.
Any Cause for Hope?
Are there options for companies that lack the will or capacity to navigate this compliance minefield? Small companies with products that don’t need to rely on app stores for customer acquisition could choose to eschew compliance. But doing so would put them in the perverse position of having to avoid growing big enough to attract regulator attention — not exactly a recipe for generating high-growth startups!
Alternatively, companies could forgo the Chinese market entirely and seek opportunities abroad. But expansion into high-value Western markets is likely to be difficult given stiff competition and increasing scrutiny of Chinese firms.
Another avenue that big tech companies will be pursuing is trying to shape China’s upcoming AI Law to be more aligned with their interests (though their interests might not always match those of smaller firms, which are less equipped to lobby). The law, which was included in the 2023 legislative work plan of the State Council, will likely be a comprehensive law consolidating and building on existing AI regulations. Many details about the legislation, including the timeline for review and approval of the State Council’s draft by the legislature, remain unclear. But a proposal put forward by scholars at the Chinese Academy of Social Sciences, which will serve as a reference for the State Council’s draft, gives some clues as to what might end up in the law.
Some elements could be helpful to providers, such as allowing appeals against regulator decisions and reducing the maximum processing time for algorithm registrations from thirty days to five. At the same time, the proposal would require providers to complete audits every two years and would place special obligations on providers of foundation models: they would need to produce a yearly “social responsibility report” and establish an independent oversight body made up primarily of external members.
Implications for National Competitiveness
What are the implications of all this for China’s competitiveness in AIGC vis-à-vis liberal democracies? According to a prominent benchmark for Chinese large language models, the best Chinese foundation models are still less capable than GPT-4. (Stay tuned for my next post, which aims to give you a better sense of their relative capabilities using some qualitative demonstrations.)
There are several reasons for this gap, and it’s not clear how much of a role compliance pressure plays versus other factors like talent. But I think it’s reasonable to assume that the extra lengths Chinese developers must go to in ensuring their outputs are politically acceptable will raise compliance costs and reduce the helpfulness and honesty of their models relative to Western ones.
As Jeff Ding’s work highlights, a state’s capacity to diffuse, or widely adopt, innovations is an important but often overlooked element of its S&T capabilities. The Chinese regulations are likely to slow diffusion, not only by reducing the quality of the best foundation models available in China, but also by imposing overly burdensome regulations on even low-risk applications. All of that is likely to impede startups and SMEs hoping to build useful products and services on top of foundation models, slowing the rate of adoption of generative AI across the economy. By contrast, the EU’s AI Act and Biden’s executive order indicate a more sensible risk-based approach that should have less of a dampening effect on diffusion.
Check out L-squared’s previous ChinaTalk post on Hugging Face’s disappearance from the Chinese market.