Compute Quick Start¶
Have questions or need help with compute, including activation or issues? Follow this link.
1. Getting Connected¶
If you are off campus, you will need to use a VPN to access compute1.
Instructions for accessing the WashU VPNs can be found here: https://it.wustl.edu/items/connect/
If you run into issues using the VPN, you will need to follow the directions in the previous link to contact WashU IT proper.
- WashU has several VPNs. compute1 can be accessed from the following VPNs:
- Access to the Compute Service is via the SSH protocol to one of several numbered access points with names like compute1-client-N, where N is a number.
The first portion of the name, compute1, is the name of the cluster.
At the time of this writing, there is one compute cluster named compute1, but there are likely to be more in the future.
The second portion of the name, client-1, identifies the first client in the cluster. There are currently four clients on compute1: client-1, client-2, client-3, and client-4.
You can use SSH to connect to the client host with your WUSTL Key username.
Here is an example of connecting to client-1.
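A minimal sketch of that connection follows. It assembles the host name from the cluster name and client number described above, then prints the resulting SSH command. The username `wustlkey` is a placeholder, and the domain suffix `ris.wustl.edu` is an assumption that this guide does not state, so confirm it with RIS before relying on it.

```shell
# Placeholder values: replace with your own WUSTL Key username and client number.
WUSTLKEY="wustlkey"
CLIENT_N=1

# Assumption: the domain suffix is not given in this guide; confirm it with RIS.
DOMAIN="ris.wustl.edu"

# Host names follow the pattern compute1-client-N.
HOST="compute1-client-${CLIENT_N}.${DOMAIN}"

# Print the command; copy it into your terminal to connect.
echo "ssh ${WUSTLKEY}@${HOST}"
```

Changing `CLIENT_N` to 2, 3, or 4 targets one of the other clients.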
These are shared systems whose purpose is to launch compute jobs into the HPC environment, not to run “compute loads” directly. Be aware of the load you impose on these systems and be courteous to your co-workers. If your work is deemed to consume too many resources on the compute clients, your sessions will be terminated.
2. The User $HOME Directory¶
Users are identified by their WashU WUSTL Key IDs. Upon first login to a compute1-client host, the system will create a new home directory for you. From the command line, you can run the following command to confirm your home directory.
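One simple way to do this, sketched here as an illustration rather than the exact command from the original page, is to print the `HOME` variable that the shell sets at login. The helper variable `HOME_DIR` is introduced only for this example.

```shell
# Capture and print the home directory of the current login session.
HOME_DIR="${HOME}"
echo "${HOME_DIR}"
```

On a compute1 client the output would normally be of the form /home/wustlkey, where wustlkey is your own WUSTL Key ID.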
RIS has observed that some users’ home directories do not follow the /home/wustlkey pattern but instead look like /home/IDC-ID-12345. We are working to correct this.
Before launching a job, you must ensure you are a member of a compute- group.
RIS mediates access to services by groups and you will be unable to submit a job if you are not part of one.
Usually a user will be part of only one group.
You can find out which groups you are in by running the groups command.
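As a sketch of that check, the snippet below filters the output of `groups` so that only names beginning with `compute-` remain. The helper name `compute_groups` is hypothetical and exists only for this example.

```shell
# Hypothetical helper: keep only group names that start with "compute-".
# Reads a space-separated list of groups on standard input.
compute_groups() {
  tr ' ' '\n' | grep '^compute-' || true
}

# List your compute- groups; empty output means you are not in one yet.
groups | compute_groups
```

If nothing is printed, you are not yet a member of a compute- group and job submission will fail.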
The compute cluster offers a number of queues. This is how the scheduler organizes jobs.
You can use the bqueues command to see a list of queues available.
- We have learned over time that the proliferation of queues to support special projects and features becomes unwieldy, so we strive to keep the minimum number of queues possible.
general : This is the default queue where non-interactive jobs land. Most of your high performance work should land here.
general-interactive : This is the queue where the interactive feature is supported. There is a job run time limit of 24 hours here.
If you have large numbers of jobs or intensive jobs you should use the general queue.
If your work is ad hoc or otherwise needs an interactive session, you should use the general-interactive queue.
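The queue guidance above can be sketched as a small helper. The function name `pick_queue` and its `interactive` argument are hypothetical. The `bqueues` call is guarded so that the snippet also runs on machines without LSF installed.

```shell
# Hypothetical helper: interactive work goes to general-interactive,
# everything else goes to the default general queue.
pick_queue() {
  if [ "$1" = "interactive" ]; then
    echo "general-interactive"
  else
    echo "general"
  fi
}

QUEUE="$(pick_queue interactive)"
echo "Selected queue: ${QUEUE}"

# bqueues exists only where LSF is installed, so guard the call.
if command -v bqueues >/dev/null 2>&1; then
  bqueues "${QUEUE}"
else
  echo "bqueues not found; run this on a compute1 client"
fi
```

Keep in mind the 24 hour run time limit on general-interactive when choosing a queue for long-running work.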
5. Starting Your First Job¶
This example runs a simple job to demonstrate how to use the system: it runs the date command inside an Alpine Docker image.
After you connect to the compute1 environment as shown above, launch the job with the following command.
bsub -Is -q general-interactive -a 'docker(alpine)' date
If you are a member of multiple LSF User Groups, you must also specify an LSF User Group, either with -G group_name or by setting the LSB_SUB_USER_GROUP variable.
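Both ways of naming the group are sketched below. The group name `compute-examplelab` is a placeholder, not a real group. The commands are printed rather than executed so that the snippet can be tried anywhere.

```shell
# Placeholder group name; substitute one of your own compute- groups.
GROUP_NAME="compute-examplelab"

# Option 1: pass the group on the command line with -G.
CMD_WITH_G="bsub -Is -G ${GROUP_NAME} -q general-interactive -a 'docker(alpine)' date"
echo "${CMD_WITH_G}"

# Option 2: export the variable once, then omit -G from later submissions.
export LSB_SUB_USER_GROUP="${GROUP_NAME}"
echo "bsub -Is -q general-interactive -a 'docker(alpine)' date"
```

Setting the variable in your shell startup file saves repeating -G on every submission, at the cost of making the chosen group less visible in each command.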
Congratulations! You have now run your first job within the RIS Compute environment. The rest of the documentation found within The User Manual can help you get working on your data. Compute Recipes has documentation on common use cases within the service.
6. Here Are a Few Next Steps¶
Connecting to My Research Storage Space¶
You can find out more about getting connected to storage here.
You can also find out more information about mounting storage into a docker image with variables here.
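As a rough sketch of what mounting storage with a variable can look like, the snippet below sets a volume mapping and prints a job command that would list the mounted path. The variable name `LSF_DOCKER_VOLUMES` and its `source:destination` format are assumptions not stated in this guide, and the path is a placeholder. Check the linked storage documentation for the authoritative form.

```shell
# Placeholder path; replace with the path to your own storage allocation.
STORAGE_PATH="/path/to/your/storage"

# Assumption: LSF_DOCKER_VOLUMES holds "source:destination" pairs that are
# mounted into the container when the job starts.
export LSF_DOCKER_VOLUMES="${STORAGE_PATH}:${STORAGE_PATH}"

# Print the job command instead of running it, so this works off-cluster.
echo "bsub -Is -q general-interactive -a 'docker(alpine)' ls ${STORAGE_PATH}"
```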
Learning More About Utilization of High Performance Computing¶
Our workshops go over the basics of using a Linux environment and the compute1 environment and can be found here.
There is also a lot of good documentation about the different aspects of using the compute1 HPC platform within the Compute Recipes section of the User Manual.
Understanding What a Container Is¶
Put simply, a container consists of an entire runtime environment: an application, plus all its dependencies, libraries and other binaries, and configuration files needed to run it, bundled into one package.
By containerizing the application platform and its dependencies, differences in OS distributions and underlying infrastructure are managed.
- WashU RIS uses Docker to manage the containers within compute1.
You can find more information on using Docker here.
You can find documentation on creating and using Docker images within the compute1 environment here.
You can find further documentation on how docker relates to the compute1 environment here.
- There are developers who create containers and you can often find official Docker images from the developer.
Docker Hub is where you can find a list of publicly available Docker images.
You can also host your own images on Docker Hub for your own or your group’s use.
- RIS also hosts containers that have been developed for users. These are developed on a case-by-case basis.
These images are hosted on a RIS-owned registry:
You can find information about such images in the tools section of the RIS User Manual.
If you have questions or are unable to get a docker image to work, you can request help from RIS here.
Understanding the LSF Environment and bsub¶
LSF is an IBM workload management platform that includes a job scheduler and is designed for high performance computing.
There are environment variables that are used within this system that you can read more about here.
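To see which of these variables are set in a given shell, a filter like the one below can help. The helper name `lsf_vars` is hypothetical. It assumes that the variables of interest carry an `LSB_` or `LSF_` prefix, as `LSB_SUB_USER_GROUP` does.

```shell
# Hypothetical helper: keep only lines that define LSB_ or LSF_ variables.
lsf_vars() {
  grep -E '^(LSB|LSF)_' || true
}

# Show the LSF-related variables in the current environment.
# Outside of a job this may print nothing.
env | lsf_vars
```

Running this inside an interactive job and again on a client host shows which variables the scheduler adds for a job.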
You can learn more about the job submission command bsub, including its options, here.