The dataset is publicly available on Zenodo under DOI 10.5281/zenodo.17197808 and can be downloaded from [this record](https://zenodo.org/records/17197808).

The dataset consists of a pickle file containing a simulated cellular protein mixture dataset with SNR 0.01 and a missing wedge angle of 30°. It contains 4000 subtomograms and their metadata. Each subtomogram belongs to one of four macromolecule classes: FAS, proteasome, ribosome, or TriC. Each class has 1000 subtomograms. The dataset is organized as a dictionary in which each element has the following fields:

`subtomo` is the subtomogram of size (48, 48, 48)

`protein_name` is the name of the macromolecular structure

`PDB_ID` is the RCSB PDB ID of the structure

`template` is the template for the macromolecule used to simulate the data. It is of size (48, 48, 48)

`angle` is the Euler angle by which the template is rotated

`translation` is the translation vector by which the template is translated

`map` is the rotated and translated template

`class_id` is the class label assigned to the subtomogram (based on protein name).
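
The fields above can be inspected with a short script once the file is downloaded. The following is a minimal sketch, not an official loader: the helper names `load_dataset`, `iter_entries`, and `summarize` are hypothetical, the file name in the usage comment is a placeholder, and the code assumes the top-level object is either a dictionary whose values are per-subtomogram records or a list of such records. Unpickling the real file will likely require NumPy to be installed, and pickle files should only be loaded from trusted sources.

```python
import pickle
from collections import Counter


def load_dataset(path):
    """Unpickle the dataset file at `path` (only do this for trusted files)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def iter_entries(dataset):
    """Return the per-subtomogram records as a list.

    Assumption: the top-level object is a dict of records (keys are ignored)
    or a plain sequence of records.
    """
    if isinstance(dataset, dict):
        return list(dataset.values())
    return list(dataset)


def summarize(dataset):
    """Count subtomograms per protein name and collect the distinct volume shapes."""
    entries = iter_entries(dataset)
    counts = Counter(entry["protein_name"] for entry in entries)
    # `shape` is read with getattr so that non-array stand-ins do not crash.
    shapes = {getattr(entry["subtomo"], "shape", None) for entry in entries}
    return counts, shapes


# Usage (the path is a placeholder; substitute the file downloaded from Zenodo):
# data = load_dataset("path/to/downloaded_file.pickle")
# counts, shapes = summarize(data)
# print(counts)  # per-class subtomogram counts
# print(shapes)  # distinct subtomogram shapes found in the file
```

If the description above holds for the downloaded file, the summary should report 1000 subtomograms for each of the four classes and a single shape of (48, 48, 48). A different result would suggest that the layout assumed in `iter_entries` does not match the file.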

