Managed GPU support requires Gigantum Desktop >=0.5.9 or Gigantum CLI >=0.15 to work.
The Gigantum Client manages starting a Project container with GPU acceleration available, when supported. This lets you move a project between a laptop without GPU acceleration and a workstation with GPU acceleration easily.
There are bases with CUDA installed that should be used to enable GPU support (filter bases by the CUDA version you want to use when creating a project). These Bases are currently built on the Nvidia CUDA runtime Docker images. Gigantum currently supports CUDA 9.0, 9.1, 9.2, and 10.0 on a Linux Docker host only.
When you start a Project, the Client first checks whether the Project is CUDA enabled:
- If CUDA is not enabled, start normally
- If CUDA is enabled, check for compatibility with the Nvidia drivers installed on the host
  - If drivers are missing or not compatible, start normally
  - If drivers are compatible, start with nvidia-docker and shared memory set to 2GB
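The start-up decision described in the list can be sketched as a small shell function. This is only an illustration, not the Client's actual implementation: the function names are hypothetical, and the minimum driver major versions are an assumption based on Nvidia's published CUDA compatibility table rather than anything stated in this guide.

```shell
#!/usr/bin/env bash

# Hypothetical helper: minimum driver major version for each supported CUDA
# version. ASSUMPTION: values follow Nvidia's CUDA compatibility table.
min_driver_for_cuda() {
  case "$1" in
    9.0)  echo 384 ;;
    9.1)  echo 390 ;;
    9.2)  echo 396 ;;
    10.0) echo 410 ;;
    *)    return 1 ;;  # CUDA version not in the supported list
  esac
}

# Hypothetical helper: decide how a Project would be started.
# $1 = CUDA version of the Project's Base (empty if not CUDA enabled)
# $2 = major version of the host Nvidia driver (empty if no driver found)
# Prints "gpu" or "normal".
decide_start_mode() {
  local cuda="$1" driver="$2" min
  # Project is not CUDA enabled: start normally.
  [ -z "$cuda" ] && { echo normal; return; }
  # Drivers are missing: start normally.
  [ -z "$driver" ] && { echo normal; return; }
  # Unsupported CUDA version: treat as not compatible.
  min=$(min_driver_for_cuda "$cuda") || { echo normal; return; }
  if [ "$driver" -ge "$min" ]; then
    echo gpu      # compatible: start with nvidia-docker
  else
    echo normal   # driver too old: start normally
  fi
}
```

For example, `decide_start_mode 10.0 430` prints `gpu`, while `decide_start_mode 10.0 390` prints `normal` because the driver is older than the assumed minimum for CUDA 10.0.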
Unfortunately, there are multiple paths for installing appropriate Nvidia drivers on a workstation. Below we provide example routes to configuring Ubuntu 18.04 on a cloud provider, along with some more general notes.
If you have configured Docker and Gigantum to run on your own machine (as described in the Installation Overview), you've already completed step 1! Note that if you want to be able to start and stop Gigantum remotely, you will need to configure the Gigantum CLI; see the instructions for Remote workstations. We have a Quick-start Script to streamline setup for you.
All that remains is to continue with configuring Nvidia-specific components below.
For more detail, please see our full article about the use of Docker Machine. The only additional concern is to explicitly provision a GPU instance. For example, assuming docker-machine is configured correctly to work with EC2, the following command will provision the cheapest GPU instance available. The only difference from the Docker Machine tutorial is that we switch the instance type to p2.xlarge. Note that while Nvidia has released an up-to-date AMI for p3 instances, we are not aware of a public image with a recent Nvidia driver (supporting CUDA 10) for p2 instances.
PowerShell:
docker-machine create --driver amazonec2 `
  --amazonec2-ami ami-0ac019f4fcb7cb7e6 `
  --amazonec2-instance-type p2.xlarge `
  --amazonec2-region us-east-1 `
  gigantum-server
Bash:
docker-machine create --driver amazonec2 \
  --amazonec2-ami ami-0ac019f4fcb7cb7e6 \
  --amazonec2-instance-type p2.xlarge \
  --amazonec2-region us-east-1 \
  gigantum-server
As described in the full article, Docker Machine can be used with many providers, but not all providers have GPUs available. See the Docker Machine Driver documentation for details. Note also that some providers (such as Google Cloud Platform, aka GCP) do not provide default instance types with GPUs. While there are several approaches to dealing with this, the simplest is to use the GCP console manually (more tips on using GPUs with a service like GCP are below).
Once your server is running, you can connect to it with docker-machine ssh gigantum-server. To reduce the need to use sudo, please add the default user to the docker group. In this step we'll also install the Gigantum CLI. Both of these steps will take full effect the next time you log in (and you'll need to log out and back in below).
sudo usermod -aG docker $USER
sudo apt-get install -y python3-pip
pip3 install --user gigantum
Having SSH'd into your remote machine, continue with configuring nvidia-specific components below.
You can find an overview of provisioning different providers in Working remotely including Google and Amazon cloud platforms, using Docker Machine, and using your own remote workstation.
While it would be impossible to describe all the ways GPUs could be obtained across providers, we provide specific examples using the AWS Console and Google Cloud Platform console. Regarding AWS, we again note that while Nvidia has released an up-to-date AMI for p3 instances, we are not aware of a public image with a recent Nvidia driver (supporting CUDA 10) for p2 instances. We therefore suggest using a standard machine image and installing the graphics drivers yourself. Note also that some vendors, such as GCP, offer different GPUs in different regions. The difference in cost can be significant: for example, the cost for a Tesla P100 instance (available in us-east-1) is about twice the cost of a Tesla P4 instance. The described approaches can be adapted to other providers, but if you need further assistance, you can ask in our discussion forum.
Once your server is provisioned, the steps are no different than for a personal remote server. You'll want to SSH in to:
- Install Docker,
- Set up the Gigantum CLI, and
- Continue with configuring Nvidia-specific components below.
Nvidia driver versions
If you are able to choose, please use driver version 430, the latest version that we have verified works with Gigantum. Earlier versions are also supported, provided they meet the minimum driver version required by the CUDA version you use.
There are several ways to get the latest Nvidia drivers. CUDA is not needed on the host OS to support Docker or Gigantum. However, if you do install CUDA on your host (e.g., to enable other GPU applications outside of Gigantum or Docker), you will also get an appropriate version of nvidia-driver included. Instructions on installing CUDA are provided by Nvidia. You will generally have a GeForce or Titan card in consumer machines, and a Tesla card on cloud instances.
On Ubuntu 18.04 it is easier and often more robust to simply use the graphics-drivers PPA. The following commands install the current stable graphics driver and should be run on the Docker host (SSH to it if needed; see the instructions in Step 1).
sudo add-apt-repository -y ppa:graphics-drivers/ppa
sudo apt-get install -y linux-aws nvidia-headless-430 nvidia-utils-430
(Note that this also installs linux-aws at the same time, which may prompt you to accept or reject changes to GRUB's menu.lst. You can safely accept them.)
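The following commands will then install nvidia-docker2, which lets Docker containers use the Nvidia drivers (note that the repository setup reads the distribution from /etc/os-release rather than hard-coding it, so it has been left general to other Linux distributions):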
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
  sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt-get install -y nvidia-docker2
For these drivers to be loaded, you will need to reboot your system.
If you are using Docker Machine, you'll need to exit from the Linux Docker host (ctrl-D or type exit). Then from your local command prompt, run:
docker-machine restart gigantum-server
Otherwise, please use the standard method to restart the machine or instance you are using.
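After the reboot, it can be useful to confirm that the driver actually loaded before starting Gigantum. The following is a minimal sketch, assuming nvidia-smi was installed with the driver packages; the function names are hypothetical and only used for illustration.

```shell
#!/usr/bin/env bash

# Extract the major version from a driver string such as "430.50".
driver_major() {
  printf '%s\n' "${1%%.*}"
}

# Report the driver version seen by the host, or explain why it is missing.
check_host_driver() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found: driver is not installed or not loaded"
    return 1
  fi
  local version
  # Ask nvidia-smi for just the driver version of the first GPU.
  version=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
  echo "driver $version (major $(driver_major "$version"))"
}
```

Running `check_host_driver` on the Docker host should report a 430-series driver if the steps above succeeded. To also check the container side, you could run nvidia-smi inside a CUDA image with `docker run --rm --runtime=nvidia <cuda image> nvidia-smi`; the image to use (for example an `nvidia/cuda` tag matching your CUDA version) is an assumption and depends on what is currently published.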
While we could in theory directly connect to a remote machine IP, it's easier for a number of reasons to use SSH tunneling. If you are running on a local workstation, you may skip this step.
ssh -L 10000:localhost:10000 <user>@<server address>
docker-machine ssh gigantum-server -L 10000:localhost:10000
At this point, you should be able to run gigantum install followed by gigantum start as usual. If you're running on a remote server, you'll need to manually enter the following URL into your browser: http://localhost:10000 (we are able to use localhost to access a remote machine because of port forwarding).
Once you've configured your remote server with docker-machine, it's easy to manage it with docker-machine stop gigantum-server, docker-machine start gigantum-server, or docker-machine rm gigantum-server. Note that a stopped server will likely cost on the order of pennies a month, so there's no hurry to delete it! Stopping the machine also stops the Gigantum Client.
If you are running on other forms of remote servers, please see the relevant documentation on how to stop and/or delete those instances.