CryoSPIN ❄️߷

Improving Ab-Initio Cryo-EM Reconstruction with Semi-Amortized Pose Inference

NeurIPS 2024

Shayan Shekarforoush (1,2), David Lindell (1,2), Marcus Brubaker (1,2,3,4), David Fleet (1,2,3)
(1) University of Toronto, (2) Vector Institute, (3) Google Research, (4) York University
MAIN_FIGURE

CryoSPIN consists of two stages. (i) An auto-encoding stage, in which an image encoder with multiple heads maps the input image to a set of pose candidates; a projection is then computed for each candidate by slicing through the volume decoder in Fourier space. The projections are compared with the input image, and the candidate with the minimum error is used. (ii) An auto-decoding stage, in which the pose parameters for all images are stored in a look-up table and directly optimized using stochastic gradient descent (SGD).
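The two stages described above can be illustrated with a deliberately small toy sketch. It is not the cryoSPIN implementation: the "volume" is assumed to be a fixed set of 2D points, a "pose" is a single rotation angle, the "projection" is just the rotated x-coordinates, and the candidate list stands in for the output of a multi-head encoder. All names (`POINTS`, `project`, `select_best`, `refine`, `pose_table`) are hypothetical, and the gradient is taken by finite differences rather than by backpropagation.

```python
import math

# Toy stand-in for the volume decoder: a fixed set of 2D points (assumption).
POINTS = [(1.0, 0.0), (0.0, 2.0), (-1.5, 0.5)]

def project(theta):
    """Toy 'projection' at pose angle theta: x-coordinates of the rotated points."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * x + s * y for x, y in POINTS]

def error(theta, image):
    """Squared error between the projection at theta and the observed image."""
    return sum((p - q) ** 2 for p, q in zip(project(theta), image))

def select_best(candidates, image):
    """Stage (i): keep the candidate pose whose projection has the minimum error."""
    return min(candidates, key=lambda theta: error(theta, image))

def refine(theta, image, lr=0.05, steps=200, eps=1e-5):
    """Stage (ii): gradient descent directly on a stored pose.

    Uses a central finite-difference gradient to keep the sketch dependency-free.
    """
    for _ in range(steps):
        grad = (error(theta + eps, image) - error(theta - eps, image)) / (2 * eps)
        theta -= lr * grad
    return theta

# One synthetic "image" generated at a known pose of 0.7 rad.
image = project(0.7)

# Hypothetical candidates, as if produced by four encoder heads.
candidates = [0.0, 0.9, 2.5, -2.0]

# Look-up table: image index -> current pose estimate.
pose_table = {0: select_best(candidates, image)}
pose_table[0] = refine(pose_table[0], image)
```

In this toy setting the candidate 0.9 has the lowest error and the refinement moves it close to the generating angle of 0.7. In the method itself the look-up table holds one entry per image and is updated with stochastic mini-batches, which this single-image sketch does not attempt to reproduce.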


Abstract

Cryo-EM is an increasingly popular method for determining the atomic-resolution 3D structure of macromolecular complexes (e.g., proteins) from noisy 2D images captured by an electron microscope. The computational task is to reconstruct the 3D density of the particle, along with the 3D pose of the particle in each 2D image, for which the posterior pose distribution is highly multi-modal. Recent developments in cryo-EM have focused on deep learning, for which amortized inference has been used to predict pose. Here, we address key problems with this approach and propose a new semi-amortized method, cryoSPIN, in which reconstruction begins with amortized inference and then switches to a form of auto-decoding to refine poses locally using stochastic gradient descent. Through evaluation on synthetic datasets, we demonstrate that cryoSPIN is able to handle multi-modal pose distributions during the amortized inference stage, while the later, more flexible stage of direct pose optimization yields faster and more accurate convergence of poses compared to baselines. On experimental data, we show that cryoSPIN outperforms the state-of-the-art cryoAI in speed and reconstruction quality.


Reconstruction Results


Semi-Amortized vs. Fully-Amortized

We branch the running reconstruction into two after a certain number of epochs: the first branch continues using the encoder (Fully-Amortized), while the second switches to direct pose optimization (Semi-Amortized). We mark the trajectory of pose estimates for example images over a heat map of the approximate (log) pose posterior (marginalized over in-plane rotations to obtain the view direction), evaluated on a uniform grid on the unit sphere \( S^2 \). After gnomonic projection, we show the neighborhood around the optimization trajectories. Each cell covers an area of 10 degrees, and the error of the final estimate is also provided for each example.
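For readers unfamiliar with the gnomonic projection mentioned above, the following is a minimal sketch of the standard vector form: a unit view direction is mapped onto the plane tangent to the sphere at a chosen center by dividing its tangent-plane components by its component along the center. The function names and the way the tangent basis is built are assumptions made for illustration, not details taken from the paper.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(a):
    n = math.sqrt(dot(a, a))
    return tuple(x / n for x in a)

def tangent_basis(center):
    """Return two orthonormal vectors spanning the plane tangent to the sphere at `center`."""
    # Pick a helper axis that is not (nearly) parallel to the center direction.
    helper = (1.0, 0.0, 0.0) if abs(center[0]) < 0.9 else (0.0, 1.0, 0.0)
    e1 = normalize(cross(helper, center))
    e2 = cross(center, e1)
    return e1, e2

def gnomonic(v, center, e1, e2):
    """Project unit vector v onto the tangent plane at `center` (gnomonic projection)."""
    d = dot(v, center)
    if d <= 0.0:
        # Points 90 degrees or more away from the center have no finite image.
        raise ValueError("point is not in the hemisphere around the center")
    return (dot(v, e1) / d, dot(v, e2) / d)

# Example: a direction 10 degrees away from the center (0, 0, 1).
center = (0.0, 0.0, 1.0)
e1, e2 = tangent_basis(center)
a = math.radians(10.0)
x, y = gnomonic((math.sin(a), 0.0, math.cos(a)), center, e1, e2)
```

A point at angular distance α from the center lands at distance tan(α) from the origin of the plane, so small neighborhoods are shown with little distortion and great-circle arcs appear as straight lines.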


Multi-modal Pose Posterior

We compare our multi-head encoder, which returns multiple pose candidates per image, with the cryoAI encoder, which obtains two pose estimates using input augmentation. We plot the pose estimates of each method over the approximate (log) pose posterior (marginalized over in-plane rotations), visualized on a uniform grid on the unit sphere. The cryoAI encoder fails to handle the multi-modality of the pose distribution, while our method accounts for the uncertainty by capturing multiple modes.
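The heat maps referred to above show a log posterior over view directions after marginalizing out the in-plane rotation. One common way to compute such a marginal from a discretized table is a log-sum-exp over the in-plane axis, sketched below. The table layout (`log_post[i][j]` for view direction `i` and in-plane angle `j`) and the assumption of a uniform in-plane grid, so that a plain sum approximates the integral up to a constant, are illustrative choices rather than details from the paper.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def marginalize_inplane(log_post):
    """Marginalize a table of unnormalized log-posterior values over in-plane angles.

    log_post[i][j] is the log posterior (up to an additive constant) for view
    direction i and in-plane angle j; the result has one entry per view direction.
    """
    return [logsumexp(row) for row in log_post]

# Tiny example: two view directions, two in-plane angles each.
table = [
    [math.log(1.0), math.log(3.0)],
    [math.log(2.0), math.log(2.0)],
]
view_log_post = marginalize_inplane(table)
```

Subtracting the maximum before exponentiating matters in practice, because log-likelihood values for noisy images can be large and negative, and a direct `exp` would underflow to zero.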


Citation

@article{shekarforoush2024improving,
      title={Improving Ab-Initio Cryo-EM Reconstruction with Semi-Amortized Pose Inference},
      author={Shekarforoush, Shayan and Lindell, David B and Brubaker, Marcus A and Fleet, David J},
      journal={arXiv preprint arXiv:2406.10455},
      year={2024}
}
