.. _`parabricks-quickstart`:

=====================
Parabricks Quickstart
=====================

.. contents::
   :depth: 1
   :local:

.. admonition:: Compute Resources

   - Have questions or need help with compute, including activation or issues? Follow `this link. `__
   - :ref:`User Agreement `

.. admonition:: Docker Usage

   - This page assumes a working knowledge of using Docker to create images and push them to a repository for use. If you need to review that information, please see the links below.
   - :ref:`Docker and the RIS Compute Service `
   - :ref:`Docker Basics: Building, Tagging, & Pushing A Custom Docker Image `

Image Details
-------------

- Docker image hosted at nvcr.io/nvidia/clara/clara-parabricks:4.0.0-1
- Official documentation for `Parabricks version 4.0.0. `__

Getting Started
---------------

- Connect to the compute client.

  .. code::

     ssh wustlkey@compute1-client-1.ris.wustl.edu

- Prepare the computing environment before submitting a job.

  .. code::

     # Use the scratch file system for temp space
     export SCRATCH1=/scratch1/fs1/${COMPUTE_ALLOCATION}

     # Use Active storage for input and output data
     export STORAGE1=/storage1/fs1/${STORAGE_ALLOCATION}/Active

     # Use host-level communications for the GPUs
     export LSF_DOCKER_NETWORK=host

     # Use the debug flag when trying to figure out why your job failed to launch on the cluster
     #export LSF_DOCKER_RUN_LOGLEVEL=DEBUG

     # Override the entrypoint, since the Parabricks container sets its own but our cluster, by default, requires /bin/sh
     export LSF_DOCKER_ENTRYPOINT=/bin/sh

     # Create the tmp dir
     export TMP_DIR="${SCRATCH1}/parabricks-tmp"
     [ ! -d "$TMP_DIR" ] && mkdir "$TMP_DIR"

- Submit the job. The basic command is shown below; a complete worked example that combines these steps appears at the end of this page.

  .. code::

     bsub -n 16 -M 64GB -R 'gpuhost rusage[mem=64GB] span[hosts=1]' -q general -gpu "num=1:j_exclusive=yes" -a 'docker(nvcr.io/nvidia/clara/clara-parabricks:4.0.0-1)' pbrun command options

Known Issues
------------

- Parabricks selects GPUs based on ``NVIDIA_VISIBLE_DEVICES``, which defaults to 'all' regardless of the quantity and device numbers of the GPU(s) reserved at runtime. As a result, the software may attempt to run on GPU(s) the job does not have access to. At this time it is advised to prepend ``pbrun`` with the following.

  .. code::

     for VAR in $(printenv | grep CUDA_VISIBLE_DEVICES); do export ${VAR/CUDA/NVIDIA}; done

Additional Information
----------------------

- Cores (``-n``) and memory (``-M`` and ``mem``) may need to be adjusted depending on the data set used.

  - A 1 GPU server should have 64GB of CPU RAM and at least 16 CPU threads.
  - A 2 GPU server should have 100GB of CPU RAM and at least 24 CPU threads.
  - A 4 GPU server should have 196GB of CPU RAM and at least 32 CPU threads.
  - It is suggested to keep the GPUs at 4 and the RAM at 196GB unless your data set is smaller than the 5GB test data set. A multi-GPU submission example appears at the end of this page.
  - There are diminishing returns when using more GPUs on small data sets.

- Replace ``command`` with any of the ``pbrun`` commands such as ``fq2bam``, ``bqsr``, ``applybqsr``, or ``haplotypecaller``.
- Please refer to the official `Parabricks documentation `__ for additional direction.

Earlier Versions
----------------

Earlier versions are still available but no longer directly supported by RIS. Please refer to the latest version for direct support.

.. toctree::
   :maxdepth: 1

   deprecated-tools/parabricks-deprecated-quickstart
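
Worked Example
--------------

The pieces above can be combined into a single end-to-end submission. The sketch below is illustrative only: the reference, FASTQ, and output paths are hypothetical placeholders under ``${STORAGE1}``, the script name ``run_fq2bam.sh`` is arbitrary, and the ``pbrun fq2bam`` options shown (``--ref``, ``--in-fq``, ``--out-bam``, ``--tmp-dir``) should be checked against the official documentation for your Parabricks version. It assumes the environment variables from the Getting Started section are set in the submission shell and are carried into the job environment.

.. code::

   # Write a small job script so the pbrun invocation and the GPU workaround
   # stay together and quoting stays simple. The heredoc delimiter is quoted,
   # so ${STORAGE1} and ${TMP_DIR} are expanded at run time inside the job.
   cat > "${STORAGE1}/run_fq2bam.sh" <<'EOF'
   #!/bin/bash
   # Known Issues workaround: restrict Parabricks to the GPUs reserved for this job.
   for VAR in $(printenv | grep CUDA_VISIBLE_DEVICES); do export ${VAR/CUDA/NVIDIA}; done

   # Hypothetical inputs and outputs -- replace with your own data.
   pbrun fq2bam \
       --ref ${STORAGE1}/reference/GRCh38.fa \
       --in-fq ${STORAGE1}/fastq/sample_R1.fastq.gz ${STORAGE1}/fastq/sample_R2.fastq.gz \
       --out-bam ${STORAGE1}/results/sample.bam \
       --tmp-dir ${TMP_DIR}
   EOF

   # Submit the script with the same 1 GPU resources as the basic command above.
   bsub -n 16 -M 64GB -R 'gpuhost rusage[mem=64GB] span[hosts=1]' \
       -q general -gpu "num=1:j_exclusive=yes" \
       -a 'docker(nvcr.io/nvidia/clara/clara-parabricks:4.0.0-1)' \
       /bin/bash "${STORAGE1}/run_fq2bam.sh"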
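
For larger data sets, the same submission can be scaled to the 4 GPU profile recommended under Additional Information (196GB of CPU RAM and at least 32 CPU threads); only the ``bsub`` resource flags change, since Parabricks uses every GPU visible to the job and the Known Issues workaround limits visibility to the reserved devices. A sketch, reusing the hypothetical ``run_fq2bam.sh`` script from the previous example:

.. code::

   bsub -n 32 -M 196GB -R 'gpuhost rusage[mem=196GB] span[hosts=1]' \
       -q general -gpu "num=4:j_exclusive=yes" \
       -a 'docker(nvcr.io/nvidia/clara/clara-parabricks:4.0.0-1)' \
       /bin/bash "${STORAGE1}/run_fq2bam.sh"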