According to an AI-based analytics project
A machine learning platform has identified hundreds of "hidden" Chinese AI companies, prompting speculation that the Chinese AI ecosystem is much bigger than previously thought.
To take an in-depth look at China's AI landscape, Georgetown University’s Center for Security & Emerging Technology (CSET) partnered with British automation specialist Amplyfi.
The pair have used supervised machine learning models, NLP, and unstructured data to identify large numbers of AI companies not listed in major commercial databases such as Crunchbase and PEData.
Using Chinese-language text extracted from news outlets and web searches, the researchers identified hundreds of thousands of companies potentially linked to AI, with 110,000 of these coming from the Chinese tech publication 36kr alone.
With the help of a second model and a team of annotators, the researchers were able to narrow this down to 888 actual organizations, only 21 percent of which had AI in their description in datasets like Crunchbase.
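As a rough check on those figures, the reported percentage can be converted back into approximate company counts. This is only a back-of-the-envelope sketch: the variable names are illustrative, and because the 21 percent figure is rounded, the resulting counts are estimates rather than numbers taken from the report.

```python
# Figures as reported in the article.
confirmed_orgs = 888      # organizations left after the second model and annotators
share_labeled_ai = 0.21   # share with AI in their description in structured datasets

# Approximate counts (assumption: the 21 percent applies to all 888 organizations).
labeled = round(confirmed_orgs * share_labeled_ai)
hidden = confirmed_orgs - labeled
hidden_share = hidden / confirmed_orgs

print(f"described as AI in structured data: about {labeled}")
print(f"not described as AI ('hidden'): about {hidden} ({hidden_share:.0%})")
```

On these assumptions, roughly four in five of the confirmed organizations would be missed by a search that relies on structured descriptions alone.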
Filling the gaps
The authors of the research paper titled ‘Using Machine Learning to Fill Gaps in Chinese AI Market Data’ argue that relying solely on structured data fails to provide a comprehensive picture of the global AI business landscape.
“Organizations that are making decisions without considering the value of unstructured and deep-web data are ‘flying blind’ and may not be aware of the huge steps forward in the abilities of machines to read, analyse and support strategic decision making,” commented Chris Ganje, CEO at Amplyfi.
The AI industry in China is growing rapidly, with more than $30bn in funding made available to startups since 2016.
“More than three quarters of the model-identified, AI-involved companies we examined are not labeled or described as AI-related in structured datasets,” stated the report. “The sheer volume of these ‘hidden’ companies suggests that no matter one’s definition of AI activity, using structured data alone, even from the best providers, will yield an incomplete understanding of China’s AI industry.”
The team behind the report believes its work has significant implications for accurately mapping the global AI ecosystem, including in the US, where it is hypothesized that data on many AI companies continues to fall through the cracks. The tracking project has received fresh investment and the team now plans to refine its models to unearth similar omissions elsewhere in the world.