.. _`data-between-computes`:

=====================================
Copying Data Between Compute Clusters
=====================================

.. contents::
   :depth: 2
   :local:

Target Audience
---------------

This document describes a procedure for transferring data between the McDonnell Genome Institute's **compute0** cluster and the WUIT RIS **compute1** cluster.

* You must have login credentials for both compute environments.
* You must have read/write permissions to the relevant storage volumes.
* Be mindful of your `$USER` name on both clusters; some users have differing user IDs on the two systems.

Build or find a container with ssh and rsync present
----------------------------------------------------

This Dockerfile constructs an Ubuntu based container with rsync and openssh; the examples below use the resulting `mcallaway/rsync:latest` image from Docker Hub. A minimal version looks like this:

::

  cat >> Dockerfile <<EOF
  # Minimal example image with sshd, ssh, rsync, and pv available
  FROM ubuntu:18.04
  RUN apt-get update && apt-get install -y openssh-server openssh-client rsync pv
  EOF

Create a dedicated SSH key pair and append the public key to your authorized_keys file; the private key (`~/.ssh/ssh_user_rsa_key`) is used by the rsync and scp examples below:

::

  ssh-keygen -t rsa -N '' -f ~/.ssh/ssh_user_rsa_key
  cat ~/.ssh/ssh_user_rsa_key.pub >> ~/.ssh/authorized_keys

Create an sshd_config that refers to the path to the above SSH keys, for example:

::

  mkdir -p ~/etc
  cat > ~/etc/sshd_config <<EOF
  # Example user-level sshd configuration; adjust the port and paths as needed
  Port 8200
  HostKey $HOME/.ssh/ssh_user_rsa_key
  AuthorizedKeysFile $HOME/.ssh/authorized_keys
  PidFile $HOME/etc/sshd.pid
  EOF

Create an entrypoint script that starts sshd with that configuration:

::

  cat > ~/etc/sshd_entrypoint.sh <<EOF
  #!/bin/bash
  # Minimal entrypoint: print a marker, then run sshd in the foreground with our config
  echo "Starting sshd as \$USER"
  exec /usr/sbin/sshd -D -f \$HOME/etc/sshd_config
  EOF
  chmod +x ~/etc/sshd_entrypoint.sh

Launch an sshd job into compute1
--------------------------------

On compute1, launch an interactive job that runs the container with the entrypoint script above, so that an sshd listening on your chosen port is running on a compute1 exec node. Export `LSF_DOCKER_VOLUMES` so that your `/storage1` Active volume is available inside the container, then submit the job with the same `bsub -Is -a 'docker(mcallaway/rsync:latest)'` pattern shown for compute0 below, giving `~/etc/sshd_entrypoint.sh` as the command. The output looks like:

::

  Job <...> is submitted to queue <...>.
  <<Waiting for dispatch ...>>
  <<Starting on compute1-exec-163.ris.wustl.edu>>
  latest: Pulling from mcallaway/rsync
  5c939e3a4d10: Already exists
  c63719cdbe7a: Already exists
  19a861ea6baf: Already exists
  651c9d2d6c4f: Already exists
  bf91b5efbfd8: Pull complete
  bb3a7dd7dc67: Pull complete
  Digest: sha256:2366f9b805855764fa7202aaf3f29b5ced4c7af2463fa7570b9ea73a7eb72e58
  Status: Downloaded newer image for mcallaway/rsync:latest
  docker.io/mcallaway/rsync:latest
  Starting sshd as mcallawa

Note the exec node the job lands on (here `compute1-exec-163.ris.wustl.edu`); that is the host you will push data to.

.. note::

   Interactive jobs will soon have "runtime limits" on the order of 3 days (to be determined).
   Non-interactive (batch) jobs will have much longer runtime limits, likely 6 weeks. Be wary of
   "losing" (forgetting about) running jobs, but note they'll be killed sooner or later. Be
   mindful of this in large (multi-day) transfers. Use log files to keep track of batch jobs,
   e.g. `bsub -eo job.err -oo job.out ...` to capture stderr and stdout.

Launch an rsync job into compute0
---------------------------------

Now launch an rsync job in compute0 to "push" to compute1. Note the use of environment variables here to specify the path your data is coming from, the host and port of the sshd job above, and the use of the "host" network for Docker:

::

  export LSF_DOCKER_VOLUMES="/gscmnt/temp403:/gscmnt/temp403"
  export LSF_DOCKER_NETWORK=host
  bsub -q research-hpc -Is -a 'docker(mcallaway/rsync:latest)' bash
  Job <2665534> is submitted to queue <research-hpc>.
  <<Waiting for dispatch ...>>
  <<Starting on blade18-2-2>>
  latest: Pulling from mcallaway/rsync
  5c939e3a4d10: Already exists
  c63719cdbe7a: Already exists
  19a861ea6baf: Already exists
  651c9d2d6c4f: Already exists
  bf91b5efbfd8: Already exists
  bb3a7dd7dc67: Pull complete
  Digest: sha256:2366f9b805855764fa7202aaf3f29b5ced4c7af2463fa7570b9ea73a7eb72e58
  Status: Downloaded newer image for mcallaway/rsync:latest
  mcallawa@blade18-2-2:~$ HOST=compute1-exec-163.ris.wustl.edu   # The compute1 exec node above
  mcallawa@blade18-2-2:~$ PORT=8200
  mcallawa@blade18-2-2:~$ rsync --archive --whole-file --verbose --stats --progress -e "ssh -p $PORT -i $HOME/.ssh/ssh_user_rsa_key" /gscmnt/temp403/systems/git_srv.tar.gz $USER@$HOST:/storage1/fs1/mcallawa/Active/data/
  sending incremental file list
  git_srv.tar.gz
   32,555,073,536  97%  124.27MB/s  0:00:05

.. note::

   The same warnings apply here regarding job duration, termination at runtime limits, and the
   use of output and error files.

.. note::

   You can also simply use `scp -P $PORT -i $HOME/.ssh/ssh_user_rsa_key $SRC $USER@$HOST:$DEST`
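For transfers long enough to run into the interactive runtime limit, the same rsync can be submitted as a non-interactive batch job whose output and errors are captured in log files. This is a minimal sketch reusing the example paths and hosts above; the log file names are arbitrary:

::

  export LSF_DOCKER_VOLUMES="/gscmnt/temp403:/gscmnt/temp403"
  export LSF_DOCKER_NETWORK=host
  HOST=compute1-exec-163.ris.wustl.edu   # the compute1 exec node running sshd
  PORT=8200

  # Batch submission: stdout is written to rsync.out, stderr to rsync.err (overwritten on resubmission)
  bsub -q research-hpc -oo rsync.out -eo rsync.err -a 'docker(mcallaway/rsync:latest)' \
    "rsync --archive --whole-file --stats -e 'ssh -p $PORT -i $HOME/.ssh/ssh_user_rsa_key' /gscmnt/temp403/systems/git_srv.tar.gz $USER@$HOST:/storage1/fs1/mcallawa/Active/data/"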
Note that rsync has some computational overhead; is tar over ssh faster?
--------------------------------------------------------------------------

Instead of using "rsync", one can also use "tar over ssh". Note here the use of `pv` to measure the rate of data crossing the pipe, which serves as a simple progress indicator:

::

  mcallawa@blade18-2-2:~$ HOST=compute1-exec-163.ris.wustl.edu
  mcallawa@blade18-2-2:~$ PORT=8200
  mcallawa@blade18-2-2:~$ tar cf - /gscmnt/temp403/systems/mcallawa/data/ | pv | ssh -p $PORT -i ~/.ssh/ssh_user_rsa_key $USER@$HOST 'tar xf -'
  tar: Removing leading `/' from member names
  4.27GiB 0:00:29 [ 148MiB/s] [     <=>     ]

Caveats
-------

* We observe a single, single-threaded rsync job transferring at around 150 MB/s.
* Use more than one job across different hosts to increase throughput.
* Cumulative bandwidth between these two clusters is 2x40 Gb/s.
* Launch several jobs across different pairs of hosts to parallelize, but remember this is a
  shared system; be mindful of others. As a community we need to keep our combined transfers
  within a cumulative total of about 80 Gb/s of network consumption, which is hard without QoS
  tools. A sketch of a parallel transfer appears at the end of this document.
* Be careful with your file paths; strip the trailing "/" where needed (rsync copies a source
  directory's contents, rather than the directory itself, when the source path ends in "/").
* Rsync and tar will preserve symbolic links, whereas Globus and Samba do not.
* Many people are likely to use this process, so you will need to pick a network port that is
  not already in use. Check whether a port is open by using `bhosts`:

::

  # Show me all hosts in the "general" host group with port 8200 open
  bhosts -w -R 'select[port8200=1]' general
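As a sketch of the parallelization caveat above, two batch jobs can each push a different directory to a different compute1 exec node and port. The second exec node, the extra port, and the `dir1`/`dir2` paths here are hypothetical; each target node needs its own sshd job set up as described earlier:

::

  export LSF_DOCKER_VOLUMES="/gscmnt/temp403:/gscmnt/temp403"
  export LSF_DOCKER_NETWORK=host

  # Job 1: push dir1 to the sshd listening on compute1-exec-163, port 8200
  bsub -q research-hpc -oo push1.out -eo push1.err -a 'docker(mcallaway/rsync:latest)' \
    "rsync --archive --whole-file --stats -e 'ssh -p 8200 -i $HOME/.ssh/ssh_user_rsa_key' /gscmnt/temp403/systems/dir1/ $USER@compute1-exec-163.ris.wustl.edu:/storage1/fs1/mcallawa/Active/data/dir1/"

  # Job 2: push dir2 to a second sshd on a different exec node and port (hypothetical)
  bsub -q research-hpc -oo push2.out -eo push2.err -a 'docker(mcallaway/rsync:latest)' \
    "rsync --archive --whole-file --stats -e 'ssh -p 8201 -i $HOME/.ssh/ssh_user_rsa_key' /gscmnt/temp403/systems/dir2/ $USER@compute1-exec-164.ris.wustl.edu:/storage1/fs1/mcallawa/Active/data/dir2/"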