CryoDRGN2 quickstart
There are two commands for ab initio reconstruction, cryodrgn abinit_homo
and cryodrgn abinit_het
for homogeneous and heterogeneous ab initio reconstruction, respectively:
# homogeneous ab initio reconstruction
(cryodrgn) $ cryodrgn abinit_homo -h
# heterogeneous ab initio reconstruction
(cryodrgn) $ cryodrgn abinit_het -h
Setup
Downsample your particles to a box size of 128 either with
cryodrgn downsample
or with other tools.If you have a large dataset (>500k images), we recommend training on a subset of particles for initial testing. Use
cryodrgn_utils select_random
to select a random subset of particles.# get a random selection of 200k particles from a dataset of 1,423,124 particles (cryodrgn) $ cryodrgn_utils select_random 1423124 -n 200000 -o ind200k.pkl
You can then train on only the random subset with the argument
--ind ind200k.pkl
For reference, ab initio heterogeneous reconstruction on a dataset containing 218k 128x128 particles took 20 hours to train on a single V100 GPU.
Example usage
# homogeneous reconstruction
(cryodrgn) $ cryodrgn abinit_homo [particles] --ctf [ctf.pkl] -o [output_directory] >> output.log
# heterogeneous reconstruction
(cryodrgn) $ cryodrgn abinit_het [particles] --ctf [ctf.pkl] --zdim 8 -o [output_directory] >> output.log
Note on training settings
The default translational search extent is +/- 10 pixels (
--t-extent 10
). If your particles are not well-centered, you can use a wider search extent, e.g. +/- 40 pixels (--t-extent 40
).Poses are updated every 5 epochs (
--ps-freq 5
) to alternate between pose search (slow) and standard cryodrgn1 training (fast) using the last iteration’s poses.The default pose search settings are not tuned for high accuracy alignments (a tradeoff of accuracy vs. compute speed). You can increase the resolution of the pose search with `
The default training time is 30 epochs. A typical use case is to run for 30 epochs, check the results (
cryodrgn analyze
), then extend training to 60 epochs. You can extend by rerunning with-n 60 --load latest
. If your dataset is very large, you may want to reduce the pose search freqency--ps-freq
and the number of epochs-n
.During training, pose search epochs will get successively slower. This is because the parameter
--l-ramp-epochs 25
increases the max resolution from a Fourier radius of 12 pixels (--l-start
) to 32 pix (--l-end
) over the first 25 epochs of training.Example training time course (1 V100 GPU)
2022-01-20 18:00:59 # =====> Epoch: 1 Average gen loss = 0.8816, KLD = 0.8981, total loss = 0.8817; Finished in 1:13:20.758477 2022-01-20 18:02:07 Using previous iteration poses 2022-01-20 18:15:20 # =====> Epoch: 2 Average gen loss = 0.8825, KLD = 1.6127, total loss = 0.8826; Finished in 0:13:13.168348 2022-01-20 18:16:27 Using previous iteration poses 2022-01-20 18:29:41 # =====> Epoch: 3 Average gen loss = 0.8818, KLD = 1.8766, total loss = 0.8819; Finished in 0:13:14.378679 2022-01-20 18:30:48 Using previous iteration poses 2022-01-20 18:44:02 # =====> Epoch: 4 Average gen loss = 0.8811, KLD = 2.0323, total loss = 0.8812; Finished in 0:13:13.887047 2022-01-20 18:45:09 Using previous iteration poses 2022-01-20 18:58:23 # =====> Epoch: 5 Average gen loss = 0.8808, KLD = 2.1141, total loss = 0.8809; Finished in 0:13:14.298884 2022-01-20 20:30:25 # =====> Epoch: 6 Average gen loss = 0.8783, KLD = 2.0354, total loss = 0.8784; Finished in 1:30:55.173547 2022-01-20 20:31:46 Using previous iteration poses 2022-01-20 20:45:00 # =====> Epoch: 7 Average gen loss = 0.878, KLD = 2.1416, total loss = 0.8782; Finished in 0:13:14.021982 2022-01-20 20:46:20 Using previous iteration poses 2022-01-20 20:59:35 # =====> Epoch: 8 Average gen loss = 0.8778, KLD = 2.1859, total loss = 0.8780; Finished in 0:13:14.906471 2022-01-20 21:00:43 Using previous iteration poses 2022-01-20 21:13:57 # =====> Epoch: 9 Average gen loss = 0.8776, KLD = 2.2222, total loss = 0.8778; Finished in 0:13:14.233966 2022-01-20 21:15:04 Using previous iteration poses 2022-01-20 21:28:19 # =====> Epoch: 10 Average gen loss = 0.8775, KLD = 2.2439, total loss = 0.8776; Finished in 0:13:14.907938 2022-01-20 23:28:54 # =====> Epoch: 11 Average gen loss = 0.8769, KLD = 2.2428, total loss = 0.8771; Finished in 1:59:27.793799 2022-01-20 23:30:01 Using previous iteration poses 2022-01-20 23:43:15 # =====> Epoch: 12 Average gen loss = 0.8769, KLD = 2.3463, total loss = 0.8770; Finished in 0:13:14.289931 2022-01-20 23:44:23 Using previous iteration poses 2022-01-20 23:57:37 # =====> Epoch: 13 Average gen loss = 0.8767, KLD = 2.3692, total loss = 0.8769; Finished in 0:13:14.531232 2022-01-20 23:58:45 Using previous iteration poses 2022-01-21 00:11:59 # =====> Epoch: 14 Average gen loss = 0.8766, KLD = 2.3928, total loss = 0.8768; Finished in 0:13:14.821960 2022-01-21 00:13:07 Using previous iteration poses 2022-01-21 00:26:22 # =====> Epoch: 15 Average gen loss = 0.8765, KLD = 2.4063, total loss = 0.8767; Finished in 0:13:15.422771 2022-01-21 02:58:58 # =====> Epoch: 16 Average gen loss = 0.8762, KLD = 2.3726, total loss = 0.8764; Finished in 2:31:28.195825 2022-01-21 03:00:05 Using previous iteration poses 2022-01-21 03:13:07 # =====> Epoch: 17 Average gen loss = 0.8762, KLD = 2.4672, total loss = 0.8764; Finished in 0:13:02.429271 2022-01-21 03:14:14 Using previous iteration poses 2022-01-21 03:27:17 # =====> Epoch: 18 Average gen loss = 0.876, KLD = 2.4911, total loss = 0.8762; Finished in 0:13:02.989886 2022-01-21 03:28:24 Using previous iteration poses 2022-01-21 03:41:27 # =====> Epoch: 19 Average gen loss = 0.876, KLD = 2.5077, total loss = 0.8762; Finished in 0:13:02.990354 2022-01-21 03:42:34 Using previous iteration poses 2022-01-21 03:55:37 # =====> Epoch: 20 Average gen loss = 0.8759, KLD = 2.5235, total loss = 0.8761; Finished in 0:13:02.651984 2022-01-21 07:09:20 # =====> Epoch: 21 Average gen loss = 0.8756, KLD = 2.4737, total loss = 0.8758; Finished in 3:12:35.716970 2022-01-21 07:10:27 Using previous iteration poses 2022-01-21 07:23:30 # =====> Epoch: 22 Average gen loss = 0.8757, KLD = 2.5532, total loss = 0.8759; Finished in 0:13:02.696251 2022-01-21 07:24:37 Using previous iteration poses 2022-01-21 07:37:40 # =====> Epoch: 23 Average gen loss = 0.8756, KLD = 2.5753, total loss = 0.8758; Finished in 0:13:03.271111 2022-01-21 07:38:47 Using previous iteration poses 2022-01-21 07:51:52 # =====> Epoch: 24 Average gen loss = 0.8755, KLD = 2.5935, total loss = 0.8757; Finished in 0:13:05.296650 2022-01-21 07:52:59 Using previous iteration poses 2022-01-21 08:06:03 # =====> Epoch: 25 Average gen loss = 0.8755, KLD = 2.6057, total loss = 0.8757; Finished in 0:13:03.414230 2022-01-21 12:13:51 # =====> Epoch: 26 Average gen loss = 0.8753, KLD = 2.5529, total loss = 0.8755; Finished in 4:06:41.267859 2022-01-21 12:14:59 Using previous iteration poses 2022-01-21 12:28:01 # =====> Epoch: 27 Average gen loss = 0.8753, KLD = 2.6321, total loss = 0.8755; Finished in 0:13:02.813657 2022-01-21 12:29:09 Using previous iteration poses 2022-01-21 12:42:11 # =====> Epoch: 28 Average gen loss = 0.8752, KLD = 2.6528, total loss = 0.8754; Finished in 0:13:02.511731 2022-01-21 12:43:18 Using previous iteration poses 2022-01-21 12:56:21 # =====> Epoch: 29 Average gen loss = 0.8752, KLD = 2.6635, total loss = 0.8754; Finished in 0:13:02.821824 2022-01-21 12:57:28 Using previous iteration poses 2022-01-21 13:10:31 # =====> Epoch: 30 Average gen loss = 0.8751, KLD = 2.6744, total loss = 0.8753; Finished in 0:13:02.737088
Questions and contact
If you have any questions about the method or software, please file a GitHub issue:
https://github.com/zhonge/cryodrgn/issues
Or post in the cryoDRGN Google Group: https://groups.google.com/g/cryodrgn.
Reference
CryoDRGN2 software was developed by Ellen Zhong & Adam Lerer with software support from Vineet Bansal. If you find the ab initio tools in cryoDRGN useful, please cite:
Zhong, Lerer, Davis, Berger. ICCV 2021.