BioEmu AI reveals protein choreography in biological conditions

Proteins aren’t rigid sculptures. They twist, flex, and sometimes unravel, and these movements are essential to understanding how they function. Some proteins, like enzymes, open like clamshells to grab molecules. Others, such as signalling proteins, shift shape to control cell processes. Still others briefly reveal hidden gaps where drugs can bind. Artificial intelligence (AI) tools like AlphaFold have made structure prediction routine, but they typically yield just one stable form, a single frame from what is really a moving picture.

A new deep learning system called BioEmu, developed by Microsoft and researchers at Rice University in the US and Freie Universität Berlin in Germany, predicts the full range of shapes a protein naturally explores under biological conditions, known as its equilibrium ensemble. This allows high-resolution modelling of protein flexibility at scale. Described in Science, BioEmu is faster and cheaper than slower, more classical simulation approaches, enabling large-scale predictions of protein function.

To understand BioEmu’s significance, it helps to see what it’s up against. The gold standard for modelling protein flexibility is molecular dynamics (MD), which tracks atomic movements in steps of millionths of a billionth of a second (femtoseconds), using software such as GROMACS or purpose-built supercomputers such as Anton.

Despite its ultrafine resolution and accuracy, MD is slow and costly. Simulating motions over microseconds or milliseconds can take tens of thousands of GPU-hours, even on supercomputers.
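A rough back-of-the-envelope sketch shows why the gap between femtosecond steps and microsecond or millisecond motions is so expensive. The 2-femtosecond time step and the throughput of 100 nanoseconds per day are illustrative assumptions, not figures from the study, and the function names are hypothetical.

```python
# Illustrative arithmetic only: the time step and throughput are assumptions.
FS_PER_NANOSECOND = 10**6
FS_PER_MICROSECOND = 10**9
FS_PER_MILLISECOND = 10**12


def steps_needed(duration_fs: int, timestep_fs: int = 2) -> int:
    """Integration steps required to cover duration_fs (rounded up)."""
    if duration_fs < 0 or timestep_fs <= 0:
        raise ValueError("duration must be >= 0 and timestep must be > 0")
    return -(-duration_fs // timestep_fs)  # ceiling division on integers


def days_at_throughput(duration_fs: int, ns_per_day: float) -> float:
    """Wall-clock days for one simulation at an assumed throughput."""
    if ns_per_day <= 0:
        raise ValueError("throughput must be > 0")
    return (duration_fs / FS_PER_NANOSECOND) / ns_per_day


if __name__ == "__main__":
    for label, duration in [("1 microsecond", FS_PER_MICROSECOND),
                            ("1 millisecond", FS_PER_MILLISECOND)]:
        steps = steps_needed(duration)
        days = days_at_throughput(duration, ns_per_day=100.0)
        print(f"{label}: {steps:,} steps, about {days:,.0f} days")
```

Under these assumptions a single microsecond already needs half a billion steps, and a millisecond needs a thousand times more, which is why long simulations are usually spread across many GPUs or specialised hardware.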

BioEmu sidesteps this bottleneck by relying on an AI diffusion model. To train BioEmu, researchers first fed it protein structure data drawn from millions of AlphaFold-predicted assemblies, 200 milliseconds of MD simulations spanning thousands of proteins, and half a million mutant sequences from experimental stability measurements. During training, random noise is added to these structures step by step until nothing recognisable remains. It’s like dropping a sugar cube into a glass of water: the original structure, clear and defined, is gradually dissolved. BioEmu’s real task is to learn how to run that process in reverse: from noise to a sugar cube. Once trained, it can generate thousands of plausible protein conformations from scratch.
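The sugar-cube picture can be made concrete with a toy sketch of the noising step used in standard diffusion models. This is not BioEmu's code: the linear noise schedule, the step count and the function names are all assumptions chosen for illustration, and the "structure" is just a short list of numbers. The reverse step here cheats by reusing the exact noise that was added; a trained model has to predict that noise instead.

```python
import math
import random


def signal_fraction(t: int, num_steps: int = 1000,
                    beta_start: float = 1e-4, beta_end: float = 0.02) -> float:
    """Share of the original signal variance left after t noising steps
    under an assumed linear noise schedule."""
    kept = 1.0
    for i in range(t):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        kept *= 1.0 - beta
    return kept


def add_noise(coords, kept, rng):
    """Forward process: blend the coordinates with Gaussian noise."""
    noise = [rng.gauss(0.0, 1.0) for _ in coords]
    noisy = [math.sqrt(kept) * x + math.sqrt(1.0 - kept) * e
             for x, e in zip(coords, noise)]
    return noisy, noise


def remove_noise(noisy, noise, kept):
    """Reverse of add_noise when the added noise is known exactly.
    A trained network would supply an estimate of `noise` instead."""
    return [(y - math.sqrt(1.0 - kept) * e) / math.sqrt(kept)
            for y, e in zip(noisy, noise)]


if __name__ == "__main__":
    rng = random.Random(0)
    toy_structure = [1.0, -0.5, 2.0, 0.25]  # stand-in for atomic coordinates
    for t in (0, 250, 500, 1000):
        print(f"step {t}: signal kept = {signal_fraction(t):.5f}")
    kept = signal_fraction(500)
    noisy, noise = add_noise(toy_structure, kept, rng)
    print("recovered:", [round(x, 6) for x in remove_noise(noisy, noise, kept)])
```

The printed signal fraction falls towards zero as the steps accumulate, which is the dissolving sugar cube; recovering the original is only possible here because the noise is known, and learning to estimate it is the hard part that training addresses.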

BioEmu excelled at benchmarks. It captured large shape changes in enzymes, local unfolding that switches proteins on or off, and fleeting cryptic pockets, temporary crevices that can serve as drug docking sites, like in the cancer-linked protein Ras. It predicted 83% of large shifts and 70–81% of small changes accurately, including open and closed forms of a vital enzyme called adenylate kinase. It also handled hard-to-predict proteins that don’t have a fixed 3D structure, and predicted how mutations affect protein stability.

Fast but not fully detailed

While MD simulates how proteins move over time, including interactions with water and drugs, BioEmu quickly generates snapshots of all the stable shapes a protein is likely to adopt. It can produce thousands of these structures in minutes to hours on a single GPU. But it can’t show how a process unfolds.

“If a researcher wants to understand how a drug reaches a hidden binding site, MD can reveal the step-by-step pathway,” says Kalairasan Ponnuswamy, bioinformatician and assistant professor at SRM Institute of Science and Technology. “BioEmu shows the final shapes, not how the protein gets there.”

MD also handles temperature shifts, membranes, and other conditions that BioEmu’s static predictions can’t yet model.

BioEmu also can’t model cell walls, drug molecules or pH changes, and unlike AlphaFold it doesn’t indicate how reliable its predictions are. It’s also limited to single chains and can’t model how proteins interact, a key part of most biological processes and drug targets.

“It’s better seen as a hypothesis-generating tool than a source of final conclusions,” says Ponnuswamy.

As the system grows to handle more complex proteins and chemical interactions, researchers may still need experiments or older simulation methods to validate what it proposes.

Still, the conceptual advance is clear. If AlphaFold provided the protein world’s blueprint, BioEmu sketches its choreography. By capturing flexibility quickly across thousands of proteins, it enables large-scale drug discovery and function studies with fewer resource constraints, Ponnuswamy notes: “Tasks that took weeks will now take hours.”

He does, however, emphasise the need for proper training and skill-set acquisition.

“Future scientists will not only need a deep grounding in physics and chemistry, they’ll also need fluency in machine learning and physical modelling to unlock the true potential of such hybrid approaches.”

The researchers see BioEmu and MD as complementary tools. BioEmu can quickly generate a range of plausible conformations, which MD can then explore in detail. This hybrid approach could greatly reduce simulation time while preserving fidelity.

Anirban Mukhopadhyay is a geneticist by training and science communicator from Delhi.

Published – July 20, 2025 05:30 am IST
