.. _`ris-rclone`:

=======================
Moving Data With Rclone
=======================

.. contents::
   :local:
   :depth: 2

What is Rclone?
===============

From https://hub.docker.com/r/rclone/rclone:

| Rclone ("rsync for cloud storage") is a command line program
| to sync files and directories to and from different cloud storage providers.

Overview
========

You will install rclone on your local computer and run `rclone config` there to create a credential file that lets rclone connect to your WUSTL Box account. After you copy that file to your home directory on an RIS compute1 client, you can access your Box storage from an rclone container running on a compute1 exec node.

Prerequisites
=============

#. A WUSTL Box account
#. A user account for RIS storage1 and compute1 services

Building an Endpoint
====================

I. Installation
---------------

- For macOS users, run the following command to install rclone with Homebrew.

  .. code::

     > brew install rclone

- For Windows users, download the relevant archive file for your environment from https://rclone.org/downloads/. Then, extract the `rclone.exe` binary from the archive.

- For Linux/BSD users, run the following command to install rclone.

  .. code::

     > curl https://rclone.org/install.sh | sudo bash

II. Configuration
-----------------

a. Creating the configuration file for the connection to WUSTL Box

   1. Open a terminal on the machine where rclone has been installed.

   2. Run `rclone config` to start the interactive process.

      .. code::

         > rclone config
         No remotes found - make a new one
         n) New remote
         s) Set configuration password
         q) Quit config
         n/s/q>

   3. Type `n` to set up a new remote connection. You will be asked for the name of the new remote connection.

      .. code::

         n/s/q> n
         name>

   4. Type a name for the new remote connection, for example `Box`. You will then be asked for the storage type.

      .. code::

         name> Box
         Type of storage to configure.
         Enter a string value. Press Enter for the default ("").
         Choose a number from below, or type in your own value

   5. Type `box` for the storage type.

      .. code::

         Type of storage to configure.
         Enter a string value. Press Enter for the default ("").
         Choose a number from below, or type in your own value
          1 / 1Fichier
            \ "fichier"
          2 / Alias for an existing remote
            \ "alias"
          3 / Amazon Drive
            \ "amazon cloud drive"
          4 / Amazon S3 Compliant Storage Provider (AWS, Alibaba, Ceph, Digital Ocean, Dreamhost, IBM COS, Minio, Tencent COS, etc)
            \ "s3"
          5 / Backblaze B2
            \ "b2"
          6 / Box
            \ "box"
          7 / Cache a remote
            \ "cache"
          8 / Citrix Sharefile
            \ "sharefile"
          9 / Dropbox
            \ "dropbox"
         10 / Encrypt/Decrypt a remote
            \ "crypt"
         11 / FTP Connection
            \ "ftp"
         12 / Google Cloud Storage (this is not Google Drive)
            \ "google cloud storage"
         13 / Google Drive
            \ "drive"
         14 / Google Photos
            \ "google photos"
         15 / Hubic
            \ "hubic"
         16 / In memory object storage system.
            \ "memory"
         17 / Jottacloud
            \ "jottacloud"
         18 / Koofr
            \ "koofr"
         19 / Local Disk
            \ "local"
         20 / Mail.ru Cloud
            \ "mailru"
         21 / Mega
            \ "mega"
         22 / Microsoft Azure Blob Storage
            \ "azureblob"
         23 / Microsoft OneDrive
            \ "onedrive"
         24 / OpenDrive
            \ "opendrive"
         25 / OpenStack Swift (Rackspace Cloud Files, Memset Memstore, OVH)
            \ "swift"
         26 / Pcloud
            \ "pcloud"
         27 / Put.io
            \ "putio"
         28 / QingCloud Object Storage
            \ "qingstor"
         29 / SSH/SFTP Connection
            \ "sftp"
         30 / Sugarsync
            \ "sugarsync"
         31 / Tardigrade Decentralized Cloud Storage
            \ "tardigrade"
         32 / Transparently chunk/split large files
            \ "chunker"
         33 / Union merges the contents of several upstream fs
            \ "union"
         34 / Webdav
            \ "webdav"
         35 / Yandex Disk
            \ "yandex"
         36 / http Connection
            \ "http"
         37 / premiumize.me
            \ "premiumizeme"
         38 / seafile
            \ "seafile"
         Storage> box
         ** See help for box backend at: https://rclone.org/box/ **

   6. Leave the following questions blank by pressing Enter: `client_id`, `client_secret`, `box_config_file`, `access_token`.

      .. code::

         OAuth Client Id
         Leave blank normally.
         Enter a string value. Press Enter for the default ("").
         client_id>
         OAuth Client Secret
         Leave blank normally.
         Enter a string value. Press Enter for the default ("").
         client_secret>
         Box App config.json location
         Leave blank normally.

         Leading `~` will be expanded in the file name as will environment variables such as `${RCLONE_CONFIG_DIR}`.

         Enter a string value. Press Enter for the default ("").
         box_config_file>
         Box App Primary Access Token
         Leave blank normally.
         Enter a string value. Press Enter for the default ("").
         access_token>

   7. Type `user` so that rclone acts on behalf of your user account.

      .. code::

         Enter a string value. Press Enter for the default ("user").
         Choose a number from below, or type in your own value
          1 / Rclone should act on behalf of a user
            \ "user"
          2 / Rclone should act on behalf of a service account
            \ "enterprise"
         box_sub_type> user

   8. Accept the default values for the remaining questions, `Edit advanced config?` and `Use auto config?`. rclone then prints a link and waits for the authorization code.

      .. code::

         Edit advanced config? (y/n)
         y) Yes
         n) No (default)
         y/n>
         Remote config
         Use auto config?
          * Say Y if not sure
          * Say N if you are working on a remote or headless machine
         y) Yes (default)
         n) No
         y/n>
         If your browser doesn't open automatically go to the following link: http://127.0.0.1:53682/auth?state=#####################
         Log in and authorize rclone for access
         Waiting for code...

   9. If the browser did not open automatically, open the link in a browser on the machine where `rclone config` is running.

   10. Log in to WUSTL Box with your credentials and approve the access in your Duo app.

       .. image:: rclone/Box_login.png

   11. Grant rclone access to Box. You will then see a confirmation of the process. Box will send you an email notification with the subject `Box login from "rclone"`.

       .. image:: rclone/Grant_rclone_the_Box_access.png

       .. image:: rclone/Rclone_config_success.png

   12. Close the browser. The configuration for the rclone connection to Box is displayed in your terminal. For example:

       .. code::

          Got code
          --------------------
          [Box]
          type = box
          box_sub_type = user
          token = {"access_token":"###########################","token_type":"bearer","refresh_token":"##############################################","expiry":"2020-12-11T12:45:22.744758-06:00"}
          --------------------
          y) Yes this is OK (default)
          e) Edit this remote
          d) Delete this remote
          y/e/d>

   13. Type `y` if the configuration looks correct. The new remote connection then appears in the list of remotes.

       .. code::

          y/e/d> y
          Current remotes:

          Name                 Type
          ====                 ====
          Box                  box

   14. Type `q` to finish the interactive process.

       .. code::

          e) Edit existing remote
          n) New remote
          d) Delete remote
          r) Rename remote
          c) Copy remote
          s) Set configuration password
          q) Quit config
          e/n/d/r/c/s/q> q

b. Copying the credential file to the home directory on compute1

   1. Confirm that the rclone configuration file exists, using the terminal where `rclone config` was run.

      a. On Mac and Linux:

         .. code::

            > ls -la $HOME/.config/rclone/rclone.conf

      b. On Windows (using CMD or PowerShell):

         .. code::

            > dir %APPDATA%\rclone\rclone.conf

         .. admonition:: Windows Command Assumptions

            The above command assumes the ``rclone`` configuration file is in its default folder. Please see the rclone documentation for more information. It is also assumed that the ``%APPDATA%`` environment variable is set to the correct location. Replace ``%APPDATA%`` with the correct path if needed. The Windows commands in this section use CMD syntax; in PowerShell, write ``$env:APPDATA`` instead of ``%APPDATA%``.

   2. (Optional) View the content of the file to check the remote storage you have just created.

      a. On Mac and Linux:

         .. code::

            > view $HOME/.config/rclone/rclone.conf

      b. On Windows (using CMD or PowerShell):

         .. code::

            > type %APPDATA%\rclone\rclone.conf

   3. Copy the file to your compute1 home directory. For example (replacing ``<wustlkey>`` with your WUSTL key):

      a. On Mac and Linux:

         .. code::

            > scp $HOME/.config/rclone/rclone.conf <wustlkey>@compute1-client-1.ris.wustl.edu:~/.rclone.conf

      b. On Windows (using CMD or PowerShell):

         .. code::

            > scp %APPDATA%\rclone\rclone.conf <wustlkey>@compute1-client-1.ris.wustl.edu:~/.rclone.conf

III. Test
---------

a. Use `ssh` to connect to a compute1 client from a terminal. You will get a shell in your compute1 home directory.

b. Verify that the rclone configuration file is in your home directory.

   .. code::

      > ls -la .rclone.conf

c. Run `bsub` to start an rclone container on a compute1 exec node.

   .. code::

      > LSF_DOCKER_ENTRYPOINT=/bin/sh bsub -Is -G group-name -q general-interactive -a 'docker(rclone/rclone)' /bin/sh

d. Run `rclone lsd` to check the connection from the compute1 exec node to your Box storage by listing its directories. For example:

   .. code::

      > rclone lsd Box:/

Use Case
========

From Box to Storage1
--------------------

Example: A user has a file `File_A` in WUSTL Box. The file needs to be copied to the storage1 space `/storage1/fs1/${STORAGE_ALLOCATION}/Active`.

1. Use `ssh` to connect to a compute1 client from a terminal. For example:

   .. code::

      > ssh compute1-client-1.ris.wustl.edu

2. Verify that the rclone configuration file is in the home directory.

   .. code::

      > ls -la $HOME/.rclone.conf

3. Prepare to mount the storage1 space in the job.

   .. code::

      > export LSF_DOCKER_VOLUMES=/storage1/fs1/${STORAGE_ALLOCATION}/Active:/storage1/fs1/${STORAGE_ALLOCATION}/Active

4. Run `bsub` to start an rclone container.

   .. code::

      > LSF_DOCKER_ENTRYPOINT=/bin/sh bsub -Is -G group-name -q general-interactive -a 'docker(rclone/rclone)' /bin/sh

5. Copy `File_A` from WUSTL Box to the storage1 space. In the commands below, `/my_storage1` stands for the path at which the storage1 space is mounted inside the container; with the mount from step 3, that path is `/storage1/fs1/${STORAGE_ALLOCATION}/Active`.

   .. code::

      > rclone ls Box:/File_A
      314572800 File_A
      > ls /my_storage1/File_A
      ls: /my_storage1/File_A: No such file or directory
      > rclone copy Box:/File_A /my_storage1/

6. Verify that the file is in the storage1 space.

   .. code::

      > ls /my_storage1/File_A
      /my_storage1/File_A

7. Exit the rclone container.

   .. code::

      > exit

References
==========

- Rclone https://rclone.org
- Rclone Commands https://rclone.org/commands/