cryoDRGN Installation
We provide installation instructions that assume Anaconda is used to manage dependencies. Anaconda is a Python package and environment manager that handles complex dependencies between Python packages by creating isolated Python environments. We recommend creating a separate environment for cryoDRGN to prevent conflicts with the dependencies of other software packages.
Compute/hardware requirements:
High performance linux workstation or cluster
NVIDIA GPUs
Dependencies:
python
pytorch
cudatoolkit
numpy
pandas
Additional dependencies for visualization:
matplotlib
seaborn
scipy 1.4.0+
scikit-learn
umap
jupyterlab
ipywidgets
plotly and cufflinks
The software has been tested on Python 3.7-3.9 and pytorch 1.0-1.7, 1.12.
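As a quick way to see which of the dependencies listed above are visible to the active Python interpreter, a minimal sketch along the following lines may help. The import names are assumptions based on common packaging conventions (for example, pytorch is imported as torch, scikit-learn as sklearn, and the umap dependency as umap); cudatoolkit is not a Python module and is therefore not checked here.

```python
import importlib.util
import sys

# Import names are assumptions: pytorch -> "torch", scikit-learn -> "sklearn".
CORE = ["torch", "numpy", "pandas"]
VISUALIZATION = ["matplotlib", "seaborn", "scipy", "sklearn", "umap",
                 "jupyterlab", "ipywidgets", "plotly", "cufflinks"]


def check_modules(names):
    """Return {module_name: bool} telling whether each module can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for group, names in (("core", CORE), ("visualization", VISUALIZATION)):
        for name, found in check_modules(names).items():
            print(f"[{group}] {name}: {'found' if found else 'MISSING'}")
```

The check only reports whether a module can be located; it does not import it or verify its version.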
1) Install Anaconda
On most platforms, installation consists of running a shell script (e.g. one of the Anaconda Installers) on the command line. The script prompts you to choose a base directory where all downloaded software and environments will be stored.
See the official Anaconda documentation and follow its installation instructions.
Once Anaconda is activated, the name of the active environment should appear at the start of the command prompt, e.g.:
(base) $
2) Setting up the cryoDRGN environment
First, create a new conda environment named cryodrgn (or another name of your choice):
(base) $ conda create --name cryodrgn python=3.9
Activate the environment. Your command prompt will usually indicate the environment you are in with (environment name) before the prompt:
(base) $ conda activate cryodrgn
(cryodrgn) $
Install pytorch and cudatoolkit into your new cryodrgn environment:
(cryodrgn) $ conda install pytorch cudatoolkit=11.7 -c pytorch
Replace the cudatoolkit version with the version of CUDA installed with your GPU drivers. You can check the CUDA version with nvidia-smi:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   41C    P0    N/A /  N/A |      5MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1420      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+
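If you prefer to read the CUDA version programmatically rather than by eye, the following is a minimal sketch that extracts the "CUDA Version" field from the nvidia-smi header. The helper names parse_cuda_version and detect_cuda_version are hypothetical, and the sketch assumes the header format shown in the sample output, which may differ between driver releases.

```python
import re
import subprocess


def parse_cuda_version(smi_output):
    """Extract the 'CUDA Version' field from nvidia-smi text, or return None."""
    match = re.search(r"CUDA Version:\s*([0-9]+(?:\.[0-9]+)*)", smi_output)
    return match.group(1) if match else None


def detect_cuda_version():
    """Run nvidia-smi and parse its output; return None if it cannot be run."""
    try:
        result = subprocess.run(["nvidia-smi"], capture_output=True,
                                text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_cuda_version(result.stdout)


# Header line copied from the sample output in this guide.
header = "| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |"
print(parse_cuda_version(header))  # prints 11.7
```

On a machine without NVIDIA drivers, detect_cuda_version() returns None instead of raising.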
Don't forget to include -c pytorch to get the software from the official pytorch channel. To customize the installation line for your situation, see PyTorch's Start locally page.
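After the install finishes, it can be useful to confirm that PyTorch sees the GPU before moving on. The sketch below uses standard PyTorch attributes (torch.__version__, torch.version.cuda, torch.cuda.is_available(), torch.cuda.device_count()); the function name torch_cuda_report is hypothetical, and it returns None when torch cannot be imported.

```python
def torch_cuda_report():
    """Summarize the PyTorch install, or return None if torch is not importable."""
    try:
        import torch
    except ImportError:
        return None
    return {
        "torch_version": torch.__version__,
        "built_with_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
        "gpu_count": torch.cuda.device_count(),
    }


report = torch_cuda_report()
print(report if report is not None else "torch is not installed in this environment")
```

If cuda_available is False on a machine with an NVIDIA GPU, a likely cause is a mismatch between the cudatoolkit version and the driver's supported CUDA version.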
Optional step for older pytorch versions (pre-version 1.6). (Newer versions of pytorch natively support mixed precision training.)
For accelerated training speeds (~3x faster) on GPUs with tensor cores (Nvidia Volta, Turing, and Ampere GPU architectures), install the apex package into the active conda environment. See their official documentation and installation instructions here.
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir ./
3) Install cryoDRGN
Obtain the cryoDRGN source code by cloning the git repository, then run pip install . in the checkout folder. This also installs the packages that cryoDRGN depends on.
# Clone source code and install
(cryodrgn) $ git clone https://github.com/zhonge/cryodrgn.git
(cryodrgn) $ cd cryodrgn
(cryodrgn) $ pip install .
Alternatively, if you prefer not to use git, you can download a ZIP file of the latest release directly from https://github.com/zhonge/cryodrgn/releases. For example, if the latest release is cryodrgn-1.1.0.zip:
(cryodrgn) $ unzip cryodrgn-1.1.0.zip
(cryodrgn) $ cd cryodrgn-1.1.0
(cryodrgn) $ pip install .
4) Testing the Installation
Once installed, you should be able to call the cryodrgn executable and see a list of commands:
(cryodrgn) $ cryodrgn -h
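If the command is not found, one common cause is that the conda environment is not activated. A minimal sketch for checking whether the executable is on the current PATH, using only the standard library (the helper name find_executable is hypothetical):

```python
import shutil


def find_executable(name="cryodrgn"):
    """Return the full path of `name` on PATH, or None if it is not found."""
    return shutil.which(name)


path = find_executable()
print(path if path else "cryodrgn not found on PATH; is the environment activated?")
```

When the environment is active, the reported path would normally sit inside that environment's bin directory.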
There is a small testing dataset in the source code that you can use to run cryodrgn and verify that all the dependencies were installed correctly:
(cryodrgn) $ cd [sourcecode directory]/testing
(cryodrgn) $ ./quicktest.sh
It should take ~20 seconds to run and reach a final loss around 0.08 in version 1.0 and 0.03 in version 1.1+. The output should look something like:
+ cryodrgn train_vae data/hand.mrcs -o output/toy_recon_vae --lr .0001 --seed 0 --poses data/hand_rot.pkl --zdim 10 --pe-type gaussian
2022-09-20 16:30:05 /home/vineetb/.conda/envs/cryodrgn/bin/cryodrgn train_vae data/hand.mrcs -o output/toy_recon_vae --lr .0001 --seed 0 --poses data/hand_rot.pkl --zdim 10 --pe-type gaussian
2022-09-20 16:30:05 Namespace(particles='/home/vineetb/cryodrgn/cryodrgn/testing/data/hand.mrcs', outdir='/home/vineetb/cryodrgn/cryodrgn/testing/output/toy_recon_vae', zdim=10, poses='/home/vineetb/cryodrgn/cryodrgn/testing/data/hand_rot.pkl', ctf=None, load=None, checkpoint=1, log_interval=1000, verbose=False, seed=0, ind=None, invert_data=True, window=True, window_r=0.85, datadir=None, lazy=False, preprocessed=False, max_threads=16, tilt=None, tilt_deg=45, num_epochs=20, batch_size=8, wd=0, lr=0.0001, beta=None, beta_control=None, norm=None, amp=True, multigpu=False, do_pose_sgd=False, pretrain=1, emb_type='quat', pose_lr=0.0003, qlayers=3, qdim=1024, encode_mode='resid', enc_mask=None, use_real=False, players=3, pdim=1024, pe_type='gaussian', feat_sigma=0.5, pe_dim=None, domain='fourier', activation='relu', func=<function main at 0x7f47d8676790>)
2022-09-20 16:30:06 Use cuda True
2022-09-20 16:30:06 Loading dataset from /home/vineetb/cryodrgn/cryodrgn/testing/data/hand.mrcs
2022-09-20 16:30:06 Loaded 100 64x64 images
2022-09-20 16:30:06 Windowing images with radius 0.85
2022-09-20 16:30:06 Computing FFT
2022-09-20 16:30:06 Spawning 16 processes
2022-09-20 16:30:06 Symmetrizing image data
2022-09-20 16:30:06 Normalized HT by 0 +/- 94.426513671875
2022-09-20 16:30:06 WARNING: No translations provided
2022-09-20 16:30:07 Using circular lattice with radius 32
2022-09-20 16:30:07 HetOnlyVAE(
(encoder): ResidLinearMLP(
(main): Sequential(
(0): Linear(in_features=3208, out_features=1024, bias=True)
(1): ReLU()
(2): ResidLinear(
(linear): Linear(in_features=1024, out_features=1024, bias=True)
)
(3): ReLU()
(4): ResidLinear(
(linear): Linear(in_features=1024, out_features=1024, bias=True)
)
(5): ReLU()
(6): ResidLinear(
(linear): Linear(in_features=1024, out_features=1024, bias=True)
)
(7): ReLU()
(8): Linear(in_features=1024, out_features=20, bias=True)
)
)
(decoder): FTPositionalDecoder(
(decoder): ResidLinearMLP(
(main): Sequential(
(0): Linear(in_features=202, out_features=1024, bias=True)
(1): ReLU()
(2): ResidLinear(
(linear): Linear(in_features=1024, out_features=1024, bias=True)
)
(3): ReLU()
(4): ResidLinear(
(linear): Linear(in_features=1024, out_features=1024, bias=True)
)
(5): ReLU()
(6): ResidLinear(
(linear): Linear(in_features=1024, out_features=1024, bias=True)
)
(7): ReLU()
(8): Linear(in_features=1024, out_features=2, bias=True)
)
)
)
)
2022-09-20 16:30:07 9814038 parameters in model
2022-09-20 16:30:07 6455316 parameters in encoder
2022-09-20 16:30:07 3358722 parameters in decoder
2022-09-20 16:30:07 Warning: z dimension is not a multiple of 8 -- AMP training speedup is not optimized
2022-09-20 16:30:08 # =====> Epoch: 1 Average gen loss = 1.10873, KLD = 3.386924, total loss = 1.108840; Finished in 0:00:01.346893
2022-09-20 16:30:09 # =====> Epoch: 2 Average gen loss = 0.740441, KLD = 8.101010, total loss = 0.740694; Finished in 0:00:00.542031
2022-09-20 16:30:09 # =====> Epoch: 3 Average gen loss = 0.535575, KLD = 10.675920, total loss = 0.535908; Finished in 0:00:00.541637
2022-09-20 16:30:10 # =====> Epoch: 4 Average gen loss = 0.368407, KLD = 13.592397, total loss = 0.368831; Finished in 0:00:00.541102
2022-09-20 16:30:11 # =====> Epoch: 5 Average gen loss = 0.234344, KLD = 16.974737, total loss = 0.234873; Finished in 0:00:00.544139
2022-09-20 16:30:11 # =====> Epoch: 6 Average gen loss = 0.140454, KLD = 19.307134, total loss = 0.141056; Finished in 0:00:00.541751
2022-09-20 16:30:12 # =====> Epoch: 7 Average gen loss = 0.0899037, KLD = 20.093284, total loss = 0.090530; Finished in 0:00:00.544378
2022-09-20 16:30:13 # =====> Epoch: 8 Average gen loss = 0.0641149, KLD = 20.714445, total loss = 0.064761; Finished in 0:00:00.543761
2022-09-20 16:30:13 # =====> Epoch: 9 Average gen loss = 0.0496119, KLD = 20.798030, total loss = 0.050260; Finished in 0:00:00.545478
2022-09-20 16:30:14 # =====> Epoch: 10 Average gen loss = 0.0408589, KLD = 21.089181, total loss = 0.041516; Finished in 0:00:00.548600
2022-09-20 16:30:15 # =====> Epoch: 11 Average gen loss = 0.0338546, KLD = 21.053582, total loss = 0.034511; Finished in 0:00:00.560902
2022-09-20 16:30:15 # =====> Epoch: 12 Average gen loss = 0.0290218, KLD = 21.509603, total loss = 0.029692; Finished in 0:00:00.549171
2022-09-20 16:30:16 # =====> Epoch: 13 Average gen loss = 0.0252569, KLD = 21.402734, total loss = 0.025924; Finished in 0:00:00.554416
2022-09-20 16:30:17 # =====> Epoch: 14 Average gen loss = 0.0222708, KLD = 21.686829, total loss = 0.022947; Finished in 0:00:00.549451
2022-09-20 16:30:17 # =====> Epoch: 15 Average gen loss = 0.0196031, KLD = 21.829715, total loss = 0.020284; Finished in 0:00:00.547152
2022-09-20 16:30:18 # =====> Epoch: 16 Average gen loss = 0.0175662, KLD = 21.648027, total loss = 0.018241; Finished in 0:00:00.548684
2022-09-20 16:30:19 # =====> Epoch: 17 Average gen loss = 0.0159719, KLD = 21.876881, total loss = 0.016654; Finished in 0:00:00.540562
2022-09-20 16:30:19 # =====> Epoch: 18 Average gen loss = 0.0147737, KLD = 21.754937, total loss = 0.015452; Finished in 0:00:00.541675
2022-09-20 16:30:20 # =====> Epoch: 19 Average gen loss = 0.0133148, KLD = 21.684366, total loss = 0.013991; Finished in 0:00:00.543656
2022-09-20 16:30:20 # =====> Epoch: 20 Average gen loss = 0.0124398, KLD = 21.621814, total loss = 0.013114; Finished in 0:00:00.543454
2022-09-20 16:30:21 Finished in 0:00:15.839427 (0:00:00.791971 per epoch)
Verify that the output contains Use cuda True in the first few lines to confirm that cryoDRGN will use your GPU for training.
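Both checks (the Use cuda True line and the final loss) can be automated if the output of the quick test is saved to a file. The sketch below assumes the log format shown in the sample output above; the function name summarize_log is hypothetical, and the loss value to compare against is left to you, since the expected value depends on the cryoDRGN version.

```python
import re

# Matches the per-epoch summary lines in the sample output above.
EPOCH_RE = re.compile(r"Epoch:\s*(\d+)\s+Average gen loss = ([0-9.eE+-]+), "
                      r"KLD = ([0-9.eE+-]+), total loss = ([0-9.eE+-]+);")


def summarize_log(log_text):
    """Return (used_cuda, last_epoch, final_total_loss) parsed from log text."""
    used_cuda = "Use cuda True" in log_text
    epochs = EPOCH_RE.findall(log_text)
    if not epochs:
        return used_cuda, None, None
    last = epochs[-1]
    return used_cuda, int(last[0]), float(last[3])


# Lines copied from the sample output in this guide.
sample = """\
2022-09-20 16:30:06 Use cuda True
2022-09-20 16:30:20 # =====> Epoch: 19 Average gen loss = 0.0133148, KLD = 21.684366, total loss = 0.013991; Finished in 0:00:00.543656
2022-09-20 16:30:20 # =====> Epoch: 20 Average gen loss = 0.0124398, KLD = 21.621814, total loss = 0.013114; Finished in 0:00:00.543454
"""
print(summarize_log(sample))  # prints (True, 20, 0.013114)
```

To apply it to a real run, read the saved output into a string first and pass that string to summarize_log.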
Updating cryoDRGN
To update to a later version, obtain the updated software either with git checkout <version> or by downloading it directly from https://github.com/zhonge/cryodrgn, then rerun pip install . in your cryodrgn anaconda environment:
(cryodrgn) $ cd /path/to/repo
(cryodrgn) $ git checkout 1.1.0 # or `git pull origin master` to get the latest sw
(cryodrgn) $ pip install .
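To confirm which version ended up installed after an update, a minimal sketch using the standard library's importlib.metadata (available in Python 3.8 and later) is shown here. It assumes the installed distribution is named cryodrgn; the helper name installed_version is hypothetical.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_version(dist_name="cryodrgn"):
    """Return the installed version string for a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print(installed_version())
```

A result of None means no distribution of that name is installed in the active environment.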
To keep multiple versions of cryoDRGN in parallel, you will need to create a new anaconda environment and re-install all the dependencies.
Known Issues
In the jupyter notebook: no attribute from_dcm or from_matrix, e.g.:
AttributeError: type object 'scipy.spatial.transform.rotation.Rotation' has no attribute 'from_dcm'
Make sure you are using scipy version 1.4.0 or later.
If you are using scipy version 1.6.0 or later, make sure you are using cryodrgn version 0.3.2 or later.
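The two version rules above can be expressed as a small check. This is a sketch only: the helper names version_tuple and rotation_api_status are hypothetical, and the thresholds are taken directly from the two statements above.

```python
def version_tuple(version):
    """Convert '1.4.0' or '1.6.0rc1' to a 3-tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:  # pad short versions such as '1.4'
        parts.append(0)
    return tuple(parts[:3])


def rotation_api_status(scipy_version):
    """Map a scipy version string to the advice given in this section."""
    v = version_tuple(scipy_version)
    if v < (1, 4, 0):
        return "too old: upgrade scipy to 1.4.0 or later"
    if v < (1, 6, 0):
        return "ok"
    return "ok with cryodrgn 0.3.2 or later"


try:
    import scipy
    print(scipy.__version__, "->", rotation_api_status(scipy.__version__))
except ImportError:
    print("scipy is not installed in this environment")
```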
In cryodrgn analyze: Running UMAP hangs for certain versions of umap. This issue is under active investigation.
If you run into any issues getting cryoDRGN installed, please file a GitHub issue that includes all the commands you used and their output.