A major scientific leap from Google’s sister company
Alphabet's DeepMind has developed an artificial intelligence system that can predict the structure that proteins will fold into.
The AlphaFold system can accurately determine a protein’s 3D shape from its amino-acid sequence, a huge scientific achievement that could have a profound effect on how we understand diseases and develop drugs.
“A very special moment”
Living cells are composed of billions of different proteins, each of which has a complex 3D shape that defines what it does and how it works.
More than 200 million proteins have been discovered, and the number continues to rise. But we know the exact shape of only a few hundred thousand.
Each protein is a chain built from 20 types of amino acids, arranged in different orders. The interactions between those amino acids make the protein fold, and scientist Cyrus Levinthal estimated in 1969 that a typical protein has some 10^300 possible conformations.
A major goal of computational biologists has therefore been to work out how to predict a protein's shape just from looking at a string of amino acids.
Using brute force computing is essentially impossible, given the astronomical number of configurations, so scientists have increasingly looked to artificial intelligence as a way to achieve this goal.
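To put that number in perspective, here is a rough back-of-the-envelope sketch in Python. The 10^300 figure is Levinthal's estimate quoted above; the sampling rate of one conformation per operation on an exascale machine is purely a hypothetical assumption chosen for illustration, as is the rounded age of the universe.

```python
# Back-of-the-envelope estimate of how long brute-force enumeration would take.
# Only CONFORMATIONS comes from the article; the other constants are assumptions.

CONFORMATIONS = 10 ** 300                 # Levinthal's estimate for a typical protein
SAMPLES_PER_SECOND = 10 ** 18             # assumption: one conformation per operation at exascale
SECONDS_PER_YEAR = 365 * 24 * 60 * 60     # 31,536,000 seconds
AGE_OF_UNIVERSE_YEARS = 138 * 10 ** 8     # assumption: roughly 13.8 billion years


def brute_force_years(conformations: int, samples_per_second: int) -> int:
    """Whole years needed to visit every conformation once at the given rate."""
    return conformations // (samples_per_second * SECONDS_PER_YEAR)


years = brute_force_years(CONFORMATIONS, SAMPLES_PER_SECOND)

# Report only the order of magnitude; the exact figure is meaningless at this scale.
print(f"Brute force would take on the order of 10^{len(str(years)) - 1} years")
print("Longer than the age of the universe:", years > AGE_OF_UNIVERSE_YEARS)
```

Even under these generous assumptions, the search would outlast the universe by hundreds of orders of magnitude, which is why exhaustive enumeration is not a realistic option.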
Enter AlphaFold. Trained on the sequences and structures of about 170,000 proteins held in the RCSB Protein Data Bank and other protein databases, it can accurately predict the shape of proteins simply from their sequence of amino acids. The system was trained on 128 TPUv3 cores over a "few weeks."
This month, AlphaFold defeated around 100 teams in a protein-structure prediction challenge called the Critical Assessment of protein Structure Prediction (CASP), set up 25 years ago to encourage research in the field.
"We have been stuck on this one problem – how do proteins fold up – for nearly 50 years," CASP co-founder and chair Professor John Moult said.
"To see DeepMind produce a solution for this, having worked personally on this problem for so long and after so many stops and starts, wondering if we’d ever get there, is a very special moment."
CASP uses a metric called the Global Distance Test (GDT), which scores a predicted structure from 0 to 100: roughly, the percentage of amino acids that lie within a threshold distance of their correct position.
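As a rough illustration of how a GDT-style score works, here is a minimal Python sketch. It is not CASP's official scoring code. It assumes the commonly used GDT_TS variant, which averages the share of residues within 1, 2, 4 and 8 angstroms of the reference, and it assumes the two structures have already been superimposed (the real procedure also searches for the best superposition). The function name `gdt_ts` and the toy coordinates are made up for this example.

```python
import math

# Distance cutoffs in angstroms; assumed here to follow the common GDT_TS convention.
GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def gdt_ts(predicted, reference, cutoffs=GDT_TS_CUTOFFS):
    """Simplified GDT-style score (0 to 100) for two pre-aligned structures.

    `predicted` and `reference` are equal-length lists of (x, y, z) positions,
    one per residue. No superposition search is performed in this sketch.
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("structures must be non-empty and the same length")

    # Per-residue distance between the predicted and reference position.
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]

    # Fraction of residues within each cutoff, then averaged over the cutoffs.
    fractions = [sum(d <= c for d in distances) / len(distances) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy example: four residues, displaced by 0.5, 1.5, 3.0 and 10.0 angstroms.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]

print(gdt_ts(predicted, reference))   # 56.25
print(gdt_ts(reference, reference))   # 100.0
```

In the toy example, one residue in four is within 1 angstrom, two within 2, and three within both 4 and 8, so the score is the average of 25, 50, 75 and 75.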
A score of 90 GDT has long been considered the benchmark to beat, as it is similar to what can be obtained from experimental lab methods (a process that can take months or years, require expensive equipment, and still fail). In the latest CASP assessment, AlphaFold achieved a median score of 92.4 GDT across all targets, with an average error of about the width of an atom.
For the hardest protein targets, it had a median score of 87.0 GDT.
"This computational work represents a stunning advance on the protein-folding problem, a 50-year-old grand challenge in biology," said Nobel laureate and President of the Royal Society, Professor Venki Ramakrishnan.
"It has occurred decades before many people in the field would have predicted. It will be exciting to see the many ways in which it will fundamentally change biological research."
The hope is that the breakthrough will allow scientists to understand how proteins work, making drug development easier, and potentially setting humanity on a path towards being able to develop enzymes that can eat plastic, or absorb carbon.
Significant work is still required, particularly on how proteins combine to form complexes, and how they interact with RNA, DNA, and small molecules.
"AlphaFold is a once in a generation advance, predicting protein structures with incredible speed and precision," said Arthur Levinson, former CEO of Genentech and current CEO of Alphabet's Calico. "This leap forward demonstrates how computational methods are poised to transform research in biology and hold much promise for accelerating the drug discovery process."
DeepMind plans to publish a paper detailing AlphaFold's achievement, but it was coy on whether it would release the algorithm itself.
“We’re right at the beginning of exploring how best to enable other groups to use our structure predictions,” the company said.