November 30, 2023
A new algorithmic model built around learning to play games including chess and poker could power new decision-making systems. Is it a step towards AGI?
Student of Games (SoG), built by former Google DeepMind staff, was unveiled in a paper published in the journal Science. It is a general-purpose system designed to understand and play both ‘perfect’ information games – where each player can see all the information needed to play, such as chess and Go – and games involving ‘imperfect’ information, where some of it is hidden, such as poker and Scotland Yard.
It is designed to be robust, unifying various approaches to game-learning under one algorithm. Previously, systems like the famous Deep Blue, which defeated chess grandmaster Garry Kasparov, were built to play a single game. A more recent example is DeepMind's AlphaGo, which played Go but not chess; its successor, AlphaZero, mastered three perfect-information games – chess, shogi and Go – but not poker, an imperfect-information game.
SoG’s versatility could make it a useful decision-making system in the form of an AI agent. Several members of the team behind it are now at EquiLibre Technologies, a startup attempting to use game theory to build algorithmic trading tools.
The researchers are looking to expand the system, including bringing down the "substantial" computing resources required to attain results in challenging domains. "An interesting question is whether this level of play is achievable with less computational resources," according to the paper.
The Student of Games algorithm uses a game tree – a graph of the possible moves in a game. The system’s neural networks learn and refine strategies suited to different game types.
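To make the idea of a game tree concrete, here is a minimal sketch – not the SoG implementation, just an illustration of the data structure, using hypothetical move labels for a two-ply fragment of noughts and crosses:

```python
# Hypothetical sketch of a game tree: each node is a game state,
# each edge is a move. Not the SoG implementation.
from dataclasses import dataclass, field


@dataclass
class Node:
    state: str
    children: dict = field(default_factory=dict)  # move label -> child Node


# A two-ply fragment rooted at the empty board, with three
# illustrative move types per turn.
root = Node("empty")
for move in ("corner", "edge", "center"):
    child = Node(move)
    root.children[move] = child
    for reply in ("corner", "edge", "center"):
        child.children[reply] = Node(f"{move}->{reply}")

# Counting leaves shows how quickly game trees branch.
leaves = sum(len(c.children) for c in root.children.values())
print(leaves)  # 9
```

Even with just three options per turn, the tree grows ninefold in two plies – which is why full trees for games like Go are far too large to enumerate, and why systems search them selectively.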
Using a technique called Growing-Tree Counterfactual Regret Minimization (GT-CFR), the system’s game tree grows dynamically as it searches, letting it refine its strategies. It further improves those strategies through Sound Self-Play: by playing games against itself, it learns from its mistakes.
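The regret-minimization idea at the core of CFR can be sketched in a few lines. This is not GT-CFR itself – just the classic regret-matching update on rock-paper-scissors, where self-play drives the average strategy toward the uniform equilibrium:

```python
# Minimal regret-matching sketch (the update at the heart of CFR),
# shown on rock-paper-scissors self-play. Illustrative only.
import random

ACTIONS = 3  # rock, paper, scissors
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]  # row player's payoff


def strategy_from(regrets):
    """Play each action in proportion to its positive regret."""
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    return [p / total for p in pos] if total > 0 else [1 / ACTIONS] * ACTIONS


def sample(strategy):
    r, cum = random.random(), 0.0
    for a, p in enumerate(strategy):
        cum += p
        if r < cum:
            return a
    return ACTIONS - 1


def train(iters=50000, seed=0):
    random.seed(seed)
    regrets = [0.0] * ACTIONS
    strategy_sum = [0.0] * ACTIONS
    for _ in range(iters):
        strat = strategy_from(regrets)
        for a, p in enumerate(strat):
            strategy_sum[a] += p
        opp = sample(strat)  # self-play: the opponent uses the same strategy
        mine = sample(strat)
        # Regret = what each action would have earned minus what we earned.
        for a in range(ACTIONS):
            regrets[a] += PAYOFF[a][opp] - PAYOFF[mine][opp]
    total = sum(strategy_sum)
    return [s / total for s in strategy_sum]


avg = train()  # average strategy approaches uniform (~1/3 each)
```

GT-CFR extends this kind of update to much larger games by growing the searched portion of the game tree on the fly rather than enumerating it up front.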
In simple terms, it is like repeatedly losing rounds of a video game because you keep using the same tactics: by trying something different and experimenting, a player can achieve a better result. SoG improves from experience in the same way.
SoG builds upon prior research from DeepMind, which routinely explored ways to improve AI’s decision-making skills via video games. Systems like AlphaZero are “predecessors” to SoG, the authors said.
To improve on these prior systems, SoG meshes together earlier concepts – the search techniques and deep neural networks of AlphaGo, along with the game-theoretic reasoning and imperfect-information search of DeepStack, an earlier system built to play poker.
The resulting model performs well across a range of games, and can even beat the strongest openly available AI agent at heads-up no-limit Texas Hold 'em poker.
The holy grail of AI is artificial general intelligence (AGI) – the idea that AI systems can perform any task a human can, autonomously.
Research into agent-based systems – the idea that AI can perform routine tasks on its own – could be an early step toward achieving AGI.
DeepMind has been experimenting with agents for years – from Sparrow, which is designed to avoid harmful responses, to the more recent Multiagent Society, created with MIT, in which multiple AI systems debate a prompt to achieve a better output.
AGI may still be a long way off, with experts still debating how even to define it. But research into agent-based systems like SoG could bring it ever so slightly closer.
Ben Wodecki is the Jr. Editor of AI Business, covering a wide range of AI content. Ben joined the team in March 2021 as assistant editor and was promoted to Jr. Editor. He has written for The New Statesman, Intellectual Property Magazine, and The Telegraph India, among others. He holds an MSc in Digital Journalism from Middlesex University.