Building AI Datasets For Southeast Asia

Ciarán Daly

February 25, 2019

by Isaac Tan

"The things that make Southeast Asia unique to tourists and visitors could also be why the region struggles with the adoption of AI."

MALAYSIA - Southeast Asia (SEA) is a goldmine of business opportunities. Individually, its 11 countries may not be able to match the strength of countries like the United States. But collectively, the region offers a lot of potential for companies to grow and boost their businesses here.

With a population of over 640 million people - a majority of whom are below 35 years old - Southeast Asia is hungry for new technology and extremely receptive to foreign imports. More than half of the region's people are internet users, with an additional 3.8 million connecting to the internet for the first time every month. Consequently, this rapidly growing market is already very lucrative for tech firms.

Unfortunately, this doesn’t seem to be the case when it comes to the adoption of artificial intelligence and machine learning. With the exception of Singapore, Southeast Asian countries seem to be lagging behind in terms of AI or ML-enabled technology. It’s surprising that a region with a larger population than the European Union isn’t able to keep up with the technological advancements unfolding in other parts of the world.

One glance at the image below will tell you all you need to know about how skewed towards the Western hemisphere the race for AI currently is. While we can assume that the United States leads the charge because of its long-term investments in growing digital ecosystems such as Silicon Valley and crucial technological hubs across the country, China’s 5-year development plan helped make it a strong competitor and places it hot on the heels of the US.


The things that make Southeast Asia unique to tourists and visitors could also be the main reasons why the region struggles with the adoption of AI:

Language diversity

Most countries in SEA are at least bilingual. In Malaysia alone, it’s not uncommon to encounter people who can speak up to five languages and dialects. Tools based on natural language processing (NLP) will have to be specifically trained to cater to each region’s native slang and accent. Popular languages like English are spoken with a mix of local flair, even giving rise to what seem like whole new languages, such as “Singlish” and “Manglish”.

This means that preparing training data that can cater to every single country is an extremely tedious process, as it requires large amounts of speech and voice data not just from each of the 11 countries, but from the many languages, dialects, and accents spoken within them.
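To make the scale of that task concrete, here is a minimal sketch of how a team might check which language varieties are still short on recorded speech. Everything in it is an assumption for illustration: the function name `coverage_gaps`, the `(country, language, hours)` record layout, and the sample figures are made up and do not come from any real dataset.

```python
from collections import defaultdict


def coverage_gaps(samples, required, min_hours):
    """Return the required (country, language) pairs with too little audio.

    samples   -- iterable of (country, language, hours) tuples (hypothetical layout)
    required  -- list of (country, language) pairs the product must support
    min_hours -- minimum hours of speech wanted per pair (an arbitrary target)
    """
    totals = defaultdict(float)
    for country, language, hours in samples:
        totals[(country, language)] += hours
    # Pairs with no recordings at all count as 0.0 hours.
    return {
        pair: totals.get(pair, 0.0)
        for pair in required
        if totals.get(pair, 0.0) < min_hours
    }


# Illustrative, made-up figures only.
samples = [
    ("Malaysia", "Malay", 12.0),
    ("Malaysia", "Manglish", 1.5),
    ("Singapore", "Singlish", 0.5),
    ("Malaysia", "Malay", 3.0),
]
required = [
    ("Malaysia", "Malay"),
    ("Malaysia", "Manglish"),
    ("Singapore", "Singlish"),
    ("Vietnam", "Vietnamese"),
]

for pair, hours in coverage_gaps(samples, required, min_hours=10.0).items():
    print(pair, hours)
```

Even in this toy example, three of the four required varieties fall short of the target, which is the kind of gap that multiplies quickly across 11 countries.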

Unique road and traffic infrastructure

Tech news is constantly abuzz with updates from the autonomous vehicle industry. But the reality of the situation is far from what we may think. By itself, it’s already tricky enough to get the right hardware and algorithm for a fully functional autonomous vehicle. At the moment, no one has successfully released one outside the safety of a geofenced area. This is because it’s a huge responsibility to teach a self-driving car the difference between a human, a tree, and another vehicle.

Ask anyone about their encounter with traffic in countries like Indonesia or Vietnam and they will probably tell you about the notorious traffic jams or floods of motorcycles that weave in between cars.

If training a self-driving car in heavily monitored regions like Europe is already such a difficult task, imagine the lengths one has to go to in order to build a self-driving system that can function in Vietnam’s tricky traffic conditions.

But the situation isn’t the same across every country in SEA. As of 2018, countries have allocated over $320 billion to improve infrastructure in the region, from road improvements to high-speed rail systems. So there is hope yet for AI to step into the picture as part of these development plans.

Racial diversity

Those in the industry know very well by now the challenges that facial recognition software faces in terms of racial diversity. Because most of the technology is developed by companies headquartered in the Western hemisphere, programs tend to identify male, Caucasian faces more accurately than those of others.

In countries like the US, where the majority of the population is Caucasian, developers of facial recognition systems are already facing extreme scrutiny for the lack of inclusivity in the data used to train their algorithms. While a shift is already starting to happen, with major companies like IBM taking up the challenge, this is going to be very difficult to execute in a diverse region like SEA.
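One way to make this kind of bias visible is to measure accuracy separately for each demographic group rather than as a single overall number. The sketch below is only an illustration under stated assumptions: the function names, the `(group, predicted_id, actual_id)` record layout, and the sample records are hypothetical, and it is not the method used by any particular vendor.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Compute recognition accuracy per group.

    records -- iterable of (group, predicted_id, actual_id) tuples
               (a hypothetical layout for evaluation results)
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [correct, total]
    for group, predicted, actual in records:
        counts[group][0] += int(predicted == actual)
        counts[group][1] += 1
    return {group: correct / total for group, (correct, total) in counts.items()}


def accuracy_gap(per_group):
    """Difference between the best-served and worst-served group."""
    return max(per_group.values()) - min(per_group.values())


# Illustrative, made-up evaluation records only.
records = [
    ("group_a", "id1", "id1"),
    ("group_a", "id2", "id2"),
    ("group_b", "id3", "id3"),
    ("group_b", "id4", "id9"),  # a misidentification
]

per_group = accuracy_by_group(records)
print(per_group)
print(accuracy_gap(per_group))
```

A single overall accuracy of 75% would hide the fact that, in this toy data, one group is recognized perfectly and the other only half the time.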

Building truly intelligent AI and ML systems that can cater towards a Southeast Asian market is a challenge that not every company is willing to meet, partly because of how difficult it is to build a dataset that’s tailored for each country. But the returns can be immense and worth the investment.

Players in the local market, namely from Singapore and Malaysia, have already begun making their move, but a lot more collaboration is needed in order to capitalize on the region’s true potential. In 2018, McKinsey reported on the colossal impact that AI will have on the global economy, including $2.6T in additional value in marketing and sales and up to an 11.6% impact on the travel industry.

Companies can only do so much to initiate a regional shift. Policymakers and governments would do well to follow in China’s footsteps and commit to long-term plans to develop their countries’ technological prowess.


Isaac is a Product Manager and data geek. He is currently working at Supahands to bring innovative ideas to life, helping the team build the world's most efficient workforce by combining machine and human intelligence. You can typically find him combing over mountains of data and working between design and engineering teams to deliver cutting-edge products and services for both businesses and users.
