As science gets too big for human minds

Sebastian Moss

August 4, 2021

4 Min Read

We're collecting too much data.

With increasingly powerful instruments outputting ever more complex scientific data, researchers are struggling to make discoveries that require sifting through it all.

Lawrence Berkeley National Laboratory thinks the answer lies in artificial intelligence. The Department of Energy lab has unveiled a multi-year project to introduce innovative autonomous discovery techniques across a broad set of problems.

A study tool for studying tools for studying

Led by the Center for Advanced Mathematics for Energy Research Applications (CAMERA), the effort is a broad, international push to revolutionize scientific discovery.

In the first paper published as part of the project, appearing in Nature Reviews Physics, CAMERA research scientist Marcus Noack describes Gaussian processes (GPs) for autonomous data acquisition.

The paper demonstrates that GP-based autonomous data acquisition can lead "to the effective and efficient acquisition of high-value datasets at large experimental facilities," with little to no human intervention.

The study notes that the next step is to imbue the GP machine learning framework with domain knowledge, so that it can "learn, adapt, confirm or reject certain knowledge bases if the collected data warrants it."

This would prove invaluable in data-intensive studies, the paper said. "Full domain awareness and HPC readiness of Gaussian-process driven autonomous data acquisition will lead to the acceleration of scientific discovery in biological, chemical, physical and materials sciences."

While Gaussian processes were used here, other stochastic processes could serve instead.

“More and more experimental fields are taking advantage of this new optimal and autonomous data acquisition because, when it comes down to it, it's always about approximating some function, given noisy data,” Noack said in a statement.
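Noack's point, that autonomous acquisition boils down to approximating a function from noisy data, is exactly what Gaussian-process regression does. The minimal sketch below conditions a GP on a handful of noisy samples and returns a posterior mean (the function estimate) and variance (the uncertainty) at new points; the squared-exponential kernel, hyperparameters, and toy sine target are illustrative assumptions, not details from the paper.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.5, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D points."""
    sq_dists = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dists / length_scale ** 2)

def gp_posterior(x_train, y_train, x_test, noise=0.1):
    """Posterior mean and variance of a GP conditioned on noisy observations."""
    K = rbf_kernel(x_train, x_train) + noise ** 2 * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    K_ss = rbf_kernel(x_test, x_test)
    K_inv = np.linalg.inv(K)
    mean = K_s.T @ K_inv @ y_train          # best estimate of the hidden function
    cov = K_ss - K_s.T @ K_inv @ K_s        # how unsure the model still is
    return mean, np.diag(cov)

# Eight noisy samples of a hidden signal stand in for instrument readings.
rng = np.random.default_rng(0)
x_train = rng.uniform(0, 2 * np.pi, 8)
y_train = np.sin(x_train) + 0.1 * rng.normal(size=8)
x_test = np.linspace(0, 2 * np.pi, 100)
mean, var = gp_posterior(x_train, y_train, x_test)
```

The variance output is what makes the method useful for steering experiments: it tells the system where its picture of the function is still fuzzy.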

His work, and that of the wider CAMERA team, initially focused on synchrotron beamline experiments (essentially, experiments that probe matter with X-ray beams around 10 billion times brighter than the sun). Synchrotrons have been used to make discoveries in fuel cells, solar energy, superconductors, and even vaccines.

With Lawrence Berkeley in the midst of upgrading its Advanced Light Source facility to produce soft X-ray light 100 times brighter, and to capture significantly more data, it was the perfect area for CAMERA to start with.

The team is now looking to use the same AI tools and methods for other scientific fields.

In April, a workshop on autonomous discovery in science and engineering sponsored by CAMERA and chaired by Noack attracted hundreds of scientists from around the world, the lab said.

“We are still in the early days with this, but much progress has been made in the past year,” said Martin Böhm, an instrument scientist in the spectroscopy group of Institut Laue-Langevin in Grenoble, France, and a co-author on the Nature Reviews Physics paper.

“For spectrometry, for example, it offers a new way of doing experiments and lets the instruments do the work, which results in time savings for users.”

As part of the effort, CAMERA has released gpCAM, its Gaussian process-based autonomous data acquisition software, open sourcing it for other researchers.
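The workflow a tool like gpCAM automates can be sketched as a loop: fit a GP to the measurements taken so far, then send the instrument to the point where the posterior is most uncertain. The sketch below is a conceptual illustration only, not gpCAM's actual API; the stand-in "instrument" function, kernel, and settings are all assumptions for the example.

```python
import numpy as np

def rbf(a, b, ls=0.4):
    """Squared-exponential kernel between two sets of 1-D positions."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def instrument(x, rng):
    """Stand-in for a beamline measurement: a hidden signal plus noise."""
    return np.sin(3 * x) + 0.05 * rng.normal(size=np.shape(x))

rng = np.random.default_rng(1)
x_data = np.array([0.1, 0.9])             # two seed measurements
y_data = instrument(x_data, rng)
candidates = np.linspace(0.0, 1.0, 200)   # positions the instrument can visit

for _ in range(10):                       # autonomous acquisition loop
    K = rbf(x_data, x_data) + 0.05 ** 2 * np.eye(len(x_data))
    K_s = rbf(x_data, candidates)
    K_inv = np.linalg.inv(K)
    # Posterior variance at each candidate (prior variance is 1.0 here).
    var = 1.0 - np.sum(K_s * (K_inv @ K_s), axis=0)
    x_next = candidates[np.argmax(var)]   # steer to the most uncertain point
    x_data = np.append(x_data, x_next)
    y_data = np.append(y_data, instrument(x_next, rng))
```

Each pass through the loop replaces a decision a human operator would otherwise make between measurements, which is what lets experiments run unattended.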

John Thomas, a post-doctoral research fellow in Berkeley Lab’s Molecular Foundry, said that he is using gpCAM to help with his photo-coupled scanning probe microscopy work on thin-film semiconducting systems.

“Nanoscale applications that make use of artificial intelligence and machine learning algorithms, specifically for scanning probe systems, have been an interest in the Weber-Bargioni group [at the Foundry] for some time,” Thomas said. “We became interested in using Gaussian processes toward autonomous discovery in the summer of 2020.”

He added: “Autonomous driving of scanning probe instruments, without the need for constant human operation, can optimize tool performance for engineers and scientists by continuing experiments during off-business hours or providing routes for simultaneous tasks within a given workflow; that is, the tool can be set up for an autonomous run while the user can efficiently make use of the time allowed."

Another early user is Aaron Michelson, a graduate researcher in the Oleg Gang group at Columbia University working on DNA origami-based self-assembly.

The tool is helping him study the thermal annealing history of DNA origami superlattices at the nanoscale, among other research avenues.

“DNA nanotechnology in the pursuit of self-assembling functional material often suffers from a limited ability to sample the large parameter space for synthesis,” he said. “Either this requires a large volume of data to be collected or a more efficient solution to experimentation.

“Autonomous discovery can be directly incorporated in both mining large datasets and guiding new experiments. This allows the researcher to steer away from mindlessly making more samples and puts us in the driver's seat to make decisions.”
