Using OFED in Docker Images

Overview

This documentation explains how to make sure your Docker image uses the most appropriate OFED version for the Scientific Compute Platform.

Installing the Correct Version

Shown below are example Dockerfile instructions that install the OFED 5.4 driver on Red Hat Enterprise Linux 7.7.

ENV MOFED_VERSION=5.4-3.1.0.0
ENV OS_VERSION=rhel7.7
ENV PLATFORM=x86_64
# Directory the tarball unpacks to; used below for the install and the cleanup.
ENV MOFED_DIR=MLNX_OFED_LINUX-${MOFED_VERSION}-${OS_VERSION}-${PLATFORM}
# wget is installed here in case the base image does not already provide it.
RUN cd /tmp/ && yum install -y wget pciutils numactl-libs gtk2 atk cairo gcc-gfortran tcsh lsof libnl3 libmnl ethtool tcl tk perl make libusbx fuse-libs && \
    wget -q http://content.mellanox.com/ofed/MLNX_OFED-${MOFED_VERSION}/${MOFED_DIR}.tgz && \
    tar -xvf ${MOFED_DIR}.tgz && \
    ${MOFED_DIR}/mlnxofedinstall --user-space-only --without-fw-update -q --distro ${OS_VERSION} && \
    # Remove the unpacked installer and the tarball from /tmp to keep the layer small.
    rm -rf /tmp/${MOFED_DIR} /tmp/${MOFED_DIR}.tgz && \
    yum clean all

The same approach applies to Ubuntu. The snippet differs, but it uses the same MOFED_VERSION, 5.4-3.1.0.0.

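ENV MOFED_VERSION=5.4-3.1.0.0
ENV OS_VERSION=ubuntu20.04
ENV PLATFORM=x86_64
# Directory the tarball unpacks to; used below for the install and the cleanup.
ENV MOFED_DIR=MLNX_OFED_LINUX-${MOFED_VERSION}-${OS_VERSION}-${PLATFORM}
# The package names below are the assumed Ubuntu 20.04 equivalents of the RHEL
# packages (for example numactl-libs -> libnuma1, gcc-gfortran -> gfortran),
# because the RHEL names are not available from apt. Adjust them to your base image.
RUN cd /tmp/ && apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends wget pciutils libnuma1 libgtk2.0-0 libatk1.0-0 libcairo2 gfortran tcsh lsof libnl-3-200 libnl-route-3-200 libmnl0 ethtool tcl tk perl make libusb-1.0-0 libfuse2 && \
    wget -q http://content.mellanox.com/ofed/MLNX_OFED-${MOFED_VERSION}/${MOFED_DIR}.tgz && \
    tar -xvf ${MOFED_DIR}.tgz && \
    ${MOFED_DIR}/mlnxofedinstall --user-space-only --without-fw-update -q --distro ${OS_VERSION} && \
    # Remove the unpacked installer and the tarball from /tmp to keep the layer small.
    rm -rf /tmp/${MOFED_DIR} /tmp/${MOFED_DIR}.tgz && \
    apt-get clean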
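
Both snippets build the download location from the same three values, so a typo in any of them produces a URL that does not exist. As a minimal sketch for checking the result before a long image build, the helpers below compose the tarball name and URL in the same way. The function names `mofed_tarball` and `mofed_url` are hypothetical and are not part of OFED.

```shell
#!/bin/sh
# Hypothetical helpers that mirror how the Dockerfile snippets compose the
# MLNX_OFED tarball name and download URL.

mofed_tarball() {
    # $1 = MOFED_VERSION, $2 = OS_VERSION, $3 = PLATFORM
    printf 'MLNX_OFED_LINUX-%s-%s-%s.tgz\n' "$1" "$2" "$3"
}

mofed_url() {
    # Same arguments as mofed_tarball; prints the full download URL.
    printf 'http://content.mellanox.com/ofed/MLNX_OFED-%s/%s\n' \
        "$1" "$(mofed_tarball "$1" "$2" "$3")"
}

# Print the URL used by the RHEL 7.7 snippet.
mofed_url 5.4-3.1.0.0 rhel7.7 x86_64
```

Passing `ubuntu20.04` as the second argument prints the URL used by the Ubuntu snippet.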

Once you have the correct OFED version installation code in your Dockerfile, you can build and push the image as you normally would.
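
After building, it can be useful to confirm which OFED version actually ended up in the image. The sketch below is one possible check, under two assumptions that should be verified against your image: that the `ofed_info` utility is on the image's PATH after a user-space-only install, and that `ofed_info -s` prints a single line of the form `MLNX_OFED_LINUX-<version>:`. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for checking the OFED version inside a built image.

parse_ofed_version() {
    # $1 = one line of `ofed_info -s` output (format is an assumption).
    # Strips the "MLNX_OFED_LINUX-" prefix and a trailing colon, if present.
    printf '%s\n' "$1" | sed -e 's/^MLNX_OFED_LINUX-//' -e 's/:$//'
}

check_image_ofed() {
    # $1 = Docker image tag, $2 = expected OFED version.
    # Succeeds only when the version reported inside the image matches.
    found=$(parse_ofed_version "$(docker run --rm "$1" ofed_info -s)")
    [ "$found" = "$2" ]
}
```

For example, `check_image_ofed <Docker image tag> 5.4-3.1.0.0 && echo "OFED version matches"` would run the check against a locally available image. If the image defines an ENTRYPOINT, the `docker run` arguments may need adjusting.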

Testing Your Image

Shown below are the steps to run a test job.

  • Create a bsub file called test.bsub as shown below. Please replace <Docker image tag> with your Docker image tag and <MPI program> with the MPI program you want to run.

#BSUB -q subscription
#BSUB -R "span[ptile=1]"
#BSUB -a "docker(<Docker image tag>)"
#BSUB -G compute-ris
#BSUB -oo lsf-%J.log

mpirun -np $NP <MPI program>
  • Run your test. Shown below is an example command. Please replace <Number of processes> with the number of processes to run. Because test.bsub requests span[ptile=1], each process is placed on its own exec node.

export NP=<Number of processes> && \
LSF_DOCKER_NETWORK=host \
LSF_DOCKER_IPC=host \
LSF_DOCKER_SHM_SIZE=20G \
bsub -n $NP < test.bsub
  • A test script is available at https://github.com/WashU-IT-RIS/docker-osu-micro-benchmarks.git. Shown below are the instructions for running an OSU Benchmark test.
    • Clone the repository.

    git clone https://github.com/WashU-IT-RIS/docker-osu-micro-benchmarks.git
    
    • Change directory to docker-osu-micro-benchmarks.

    cd docker-osu-micro-benchmarks
    
    • Run an OSU Benchmark test.
      • Replace <test> with the OSU test that you want to run. For example, use osu_bw for the OSU bandwidth test.

      • Replace <compute-group> with the compute group you are a member of.

    QUEUE=subscription bin/osu-test.sh <test> -G <compute-group>
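
The test.bsub file from the first step above can also be generated instead of edited by hand, which avoids leaving a placeholder in place by mistake. The following is a minimal sketch, not an official tool: `write_test_bsub` is a hypothetical helper name, and the queue and compute group are copied unchanged from the example file, so adjust them if yours differ.

```shell
#!/bin/sh
# Hypothetical helper that prints a test.bsub file with the two placeholders
# filled in. $NP is left unexpanded so it is resolved when the job runs.

write_test_bsub() {
    # $1 = Docker image tag, $2 = MPI program to pass to mpirun
    cat <<EOF
#BSUB -q subscription
#BSUB -R "span[ptile=1]"
#BSUB -a "docker($1)"
#BSUB -G compute-ris
#BSUB -oo lsf-%J.log

mpirun -np \$NP $2
EOF
}
```

Running `write_test_bsub <Docker image tag> <MPI program> > test.bsub` writes the file, which can then be submitted with the bsub command from the second step.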
    

Docker Images Identified That Utilize OFED

If a Docker image you use appears here, you will likely need to update your image.

Current as of 6/10/22

Docker Image                                                          OFED Version
gcr.io/ris-registry-shared/base-terminal                              4.7-3.2.9.0
gcr.io/ris-registry-shared/base-terminal:latest                       4.7-3.2.9.0
gcr.io/ris-registry-shared/base-x                                     4.7-3.2.9.0
gcr.io/ris-registry-shared/base-x-cuda                                4.7-3.2.9.0
ruikang/api_wrf                                                       4.9-2.2.4.0
ruikang/api_wrf:latest                                                4.9-2.2.4.0
us.gcr.io/ris-appeng-shared-dev/bayly-nli:centos7                     4.9-2.2.4.0
us.gcr.io/ris-appeng-shared-dev/compiler-base:oneapi2021.1.1_centos7  4.9-2.2.4.0
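
To compare a listed version against a target version in a script, a version-aware sort is one option. The sketch below assumes GNU `sort` (for the `-V` option) and assumes the target is the 5.4-3.1.0.0 release used in the Dockerfile examples above. `ofed_older_than` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 is strictly older than
# version $2. Relies on GNU sort's -V (version sort) option.

ofed_older_than() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Example: flag a version taken from the list above.
if ofed_older_than 4.7-3.2.9.0 5.4-3.1.0.0; then
    echo "image is older than the target and likely needs updating"
fi
```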