Abstract: We propose a parametric forward model for single particle cryo-electron microscopy (cryo-EM), and employ stochastic variational inference to infer posterior distributions of the physically interpretable latent variables. Our cryo-EM forward model accounts for the biomolecular configuration (via spatial coordinates of pseudo-atoms, in contrast with traditional voxelized representations) the global pose, the effect of the microscope (contrast transfer function’s defocus parameter). To account for conformational heterogeneity, we use the anisotropic network model (ANM). We perform experiments on synthetic data and show that the posterior of the scalar component along the lowest ANM mode and the angle of 2D in-plane pose can be jointly inferred with deep neural networks. We also perform Fourier frequency marching in the simulation and likelihood during training of the neural networks, as an annealing step.