China seeks AI ‘catch-up’ by creating its own ChatGPT-like tools
The Chinese government has been pouring billions of yuan into research and development in the artificial intelligence (AI) sector and the development of AI talent through new university programmes.
But ChatGPT has left Chinese companies and universities playing catch-up and ignited a new ‘arms race’ with the United States, focused on generative AI. Since the emergence of ChatGPT, the central government and city authorities have announced new funds to entice promising start-ups that might develop a Chinese ChatGPT.
One university professor in southern China said that, even in the short period since the public version of ChatGPT was launched in November last year, Chinese universities with strong AI departments have come under pressure to expand postgraduate and research student recruitment and to put in place schemes to entice back Chinese students and researchers abroad who specialise in generative AI.
China wants its own intelligent natural language systems for reasons that range from the linguistic, namely the need for Chinese-language systems to keep pace with English and other global languages, to the purely political, relating to its goals as a global science and technology power.
“We are under pressure to show that we can respond to this new AI challenge and that we are not behind the United States in this area,” the professor said, speaking on condition of anonymity.
Nonetheless, technology experts in China have noted an ambivalence about ChatGPT-like tools, particularly given the authorities’ need to control the dissemination of information and to quickly censor topics deemed sensitive. This has led some universities and technology companies to adopt a conservative, low-key approach while at the same time trying to show they are competing to be among the first to launch a Chinese version of ChatGPT.
ChatGPT is barred in China, in part because it was trained on data from outside China’s internet firewall, which keeps out Western influences. However, many have used proxy servers or VPNs to access it, creating a buzz in China. For a short period this month the government even blocked VPN access, to reassert control amid the excitement.
“We’ve seen a lot of companies in China join the frenzy and announce that they’re working on or will be rolling out similar ChatGPT-like tools or integrating them into their search engines. Some of that is purely because of the current hype, and that’s going to cool down a little bit,” Hanna Dohmen, senior researcher at the Center for Security and Emerging Technology at Georgetown University in the United States, told University World News.
“But to what extent those are going to be as good or potentially even better than US or Western products is hard for us to say at this time,” Dohmen acknowledged.
Nonetheless she described the emergence of ChatGPT as a “turning point” in global research competition in AI.
“While technology competition in AI is not new, we’re now seeing advancements in generative AI being commercialised and rolled out to the public, and that is going to have significant consequences for the technology rivalry between the United States and China,” she said.
Fudan University tool crashes
After the initial shock at being ‘left behind’, technology experts in China have been debating how wide the gap is between China and America, and how quickly it can be bridged.
But the rush to release generative AI language products before they are ready has already produced early casualties – an indication of a considerable gap in generative AI.
Shanghai’s Fudan University has developed its own ChatGPT-like service, MOSS, an AI-powered language tool with one-tenth the capabilities of ChatGPT but billed as being “sufficient for academic research”, according to media reports on its launch last month.
It was intended to show Fudan University’s AI tech prowess, including relative to Chinese tech companies, from which it has recruited a number of high-profile researchers in recent years.
But the Fudan University team was forced to make a public apology when MOSS crashed hours after launch on 20 February, due to a sudden surge of traffic on Chinese social media platform Weibo. The Fudan team has now said MOSS will no longer be openly available to the public. The project was expected to be an open source tool.
MOSS, said to be able to generate conversation as well as ask and answer questions, was developed by a team in the Fudan University Natural Language Processing Group led by Qiu Xipeng, a professor in Fudan’s School of Computer Science. It was backed by the Shanghai Artificial Intelligence Laboratory, set up in July 2020 as a world-class national-level research institute with collaborations with other universities in China, Hong Kong and abroad.
The Fudan University team, which initially described MOSS as a conversational language model like ChatGPT, played down the comparison the day after its launch, saying they had much to improve.
“MOSS is still a very immature model; it still has a long way to go before reaching ChatGPT. An academic research lab like us is unable to produce a model whose ability nears ChatGPT,” a statement published on the university website said, noting that the computing resources available were not enough to support such large traffic.
The team statement added that “as an academic group we do not have sufficient engineering experience, creating a very bad experience and first impression on everyone, and we hereby express our heartfelt apologies to everyone”.
Commercial versions more likely to emerge
Other universities in China are said to be working on ChatGPT-like tools but, unwilling to risk their reputations after the Fudan debacle, they are not rushing to launch, the professor in southern China said.
Fudan University at least had AI industry connections and brought together more than a dozen AI company executives and academics to analyse the development of ChatGPT, security risks and areas of potential use, he said.
While Fudan University was the first to launch, it is generally accepted among China tech industry experts that it will be the country’s behemoth private tech companies that will roll out a domestic competitor to ChatGPT.
Baidu, the Beijing company that runs China’s internet search engine equivalent to Google – which is barred in China – on 7 February announced that it would launch its Ernie chatbot sometime this month, after embedding the bot into its search service. But it is still unclear what Ernie will be able to do.
ChatGPT is built on a large language model trained on a huge amount of language data. Ernie was based not just on Chinese internet and social media data but also English language information that includes Wikipedia and Reddit, both blocked in China.
In a flurry of announcements in the wake of the ChatGPT buzz, e-commerce giant Alibaba and multimedia titan Tencent also quickly announced plans for their own AI chatbots. Alibaba said it is testing a ChatGPT-like service, while its in-house research institute Damo is developing its own AI chatbot.
Chinese media recently reported that the Damo Academy applied for a patent for “human-machine dialogue and pre-trained language model training methods, systems and electronic devices”, which can improve the accuracy of question-and-answer interactions.
Hao Peiqiang, a Tianjin-based independent developer who used to run a search engine company in China, said on his YouTube channel that he expected Baidu’s Ernie bot to lag far behind ChatGPT.
“Baidu’s answer to ChatGPT may live well within the Great Firewall because it doesn’t have to compete with international counterparts. As long as it can satisfy the needs of the domestic market, it will still generate money,” he said.
Tencent has set up a team to develop a chatbot called HunyuanAide. The company earlier said it has a strategy for related research and technology, which it will advance “in an orderly manner”.
“Similar to US companies, Chinese companies like Baidu have been working on training these large language models for a couple of years now,” said Georgetown’s Hanna Dohmen. “There is certainly a lot of data that China has access to that is different from what the West has access to and that is a concern,” she said, referring to the bias and distortion that bug language models including ChatGPT.
“The increase in misinformation at scale is very real with generative AI tools if they are not regulated properly, or if the right policies are not in place,” she warned.
But in China data is seen as an advantage. Chinese companies “have massive data, probably more data than the United States or Europe has, and they have fewer regulations on that data. So they have a data advantage,” Caroline Wagner, a professor at Ohio State University and an expert on China technology, told University World News.
Control and censorship
But control and censorship are key additional considerations for Chinese developers.
“China is really trying to get ahead of emerging technologies and the risks that they potentially pose to political and social disruption, that the Chinese government is wanting to control,” noted Dohmen.
China has already brought in deep synthesis regulations which came into effect on 10 January. The regulations include any ‘synthetically generated’ text and audio, including text generated by chatbots.
One of its provisions states: “Content that goes against existing laws is prohibited, as is content that endangers national security and interests, damages the national image or disrupts the economy” – seen by many as a very broad set of restrictions.
“China is the first country in the world to comprehensively regulate deep synthesis technology, with such broad and sweeping controls that are at the discretion of the Chinese government,” said Dohmen.
“That is a key concern because it allows the Chinese government to suppress any content that is contrary to the economic, political and national security interests of Beijing.
“Chinese companies are going to have to comply with those regulations. And that leaves a lot of discretion to the Chinese government, in terms of what it finds acceptable, within the bounds of its desire to regulate content, and its censorship efforts,” Dohmen said.
But she added: “Excessive restrictions, content regulation and censorship could hinder commercialisation and further innovation of such technologies.”
Regulators have already told major Chinese tech companies not to offer ChatGPT services to the public amid growing alarm in Beijing over its uncensored replies to user queries.
“Tech companies need to consider the political consequences,” said one Chinese technology expert in Beijing, speaking on condition of anonymity. “The problem with ChatGPT-type technologies is their unpredictability,” she said.
“China’s security apparatus has identified and watches academic dissidents to stop them from speaking out, but ChatGPT cannot be identified and controlled in this way. It will need a whole new security mechanism,” she added.
China’s regulators require anything posted online to be reviewed in advance to ensure it does not contain banned topics or terms.
Baidu has a comprehensive filter for sensitive terms. But a ChatGPT-like bot that generates content could be harder to constrain without substantial human moderation of the kind that currently exists.
ChatYuan, a small-scale chatbot launched in late January by Xu Liang, lead developer at start-up Yuanyu Intelligence, was suspended just a day later after circulating responses on China’s economy and the Russian war in Ukraine that contradicted official versions.
Export controls could hinder development
China’s drive to develop ChatGPT-like services could be hindered by US export controls that restrict China’s access to the advanced chips needed to drive powerful AI engines.
“The US and the West may have a chips advantage and try to take advantage of that for a short amount of time while China tries to multisource chips or build their own – but it’s going to be a long haul for them to build their own,” noted Caroline Wagner.
This is one of the reasons that some China technology experts – including Zhang Yalin, founder and chief operating officer of Shanghai-based AI chip start-up Enflame Technology – believe a Chinese tool to rival ChatGPT could be some three to five years away.
Even China’s Science and Technology Minister Wang Zhigang acknowledged that catching up would not be easy.
ChatGPT has advantages in delivering results in real time, which is “very difficult to achieve”, Wang said at a press conference on the opening day of the National People’s Congress annual session on 5 March.
China has done a lot of planning and research in the fields of natural language processing and natural language understanding over the years and has made some achievements, according to Wang. But for China to attain the kind of results achieved by OpenAI, the country needs to “wait and see”, he said.