Nvidia DGX Station

In Spring 2018, the ITS Research Computing Center acquired one Nvidia DGX Station and three Nvidia DGX-1 servers to start a pilot project studying GPU computing, container technology, cloud computing, and related topics.  In this setup, the Nvidia servers are used as compute nodes, forming a small cluster with Kubernetes as the job manager.  A separate virtual machine provides administrative functions for Kubernetes.

The DGX Station has 20 physical CPU cores and 4 Tesla V100 GPUs with 64GB of memory.  Each of the 3 DGX-1 servers has 40 physical CPU cores and 8 Tesla V100 GPUs with 512GB of memory.

In October 2018, 2 Dell PowerEdge C4140 machines, each with 40 physical CPU cores, 4 Tesla V100 GPUs, and 256GB of memory, were migrated from the Longleaf cluster to the DGX cluster.

To gain access to the DGX cluster, browse the “User Account” page to get started.

After your DGX account is available and you have created your Kubernetes token for the DGX cluster, you can log into one of the following machines to access the cluster.

  1. Longleaf login nodes: One can log into the Longleaf login nodes to access the DGX cluster.  This is useful if your working directories are already on ITS Research Computing-maintained file systems, such as /proj or /pine.  From there, one can check the status of the DGX cluster and submit jobs through Kubernetes.
  2. VCL with GPU: One can get a reservation for, and connect to, the VCL image “TarHeel Linux, CentOS 7 (Full Blade with GPU)” to access the DGX cluster.  This is useful if you would like to test out Docker images and run executables interactively, with or without a GPU, in the VCL session.  One can then submit long-running jobs to the DGX cluster.
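As a concrete illustration, submitting a GPU job through Kubernetes as described above usually involves a job manifest.  The sketch below is a minimal example only; the job name, container image, and resource request are illustrative assumptions, not site defaults:

```yaml
# Hypothetical GPU job manifest for a Kubernetes-managed DGX cluster.
apiVersion: batch/v1
kind: Job
metadata:
  name: gpu-test                # illustrative job name
spec:
  template:
    spec:
      containers:
      - name: cuda-container
        image: nvcr.io/nvidia/cuda:10.0-base   # assumed image; substitute your own
        command: ["nvidia-smi"]                # lists the GPUs visible to the container
        resources:
          limits:
            nvidia.com/gpu: 1                  # request one GPU
      restartPolicy: Never
```

After logging into one of the machines above, one would typically submit such a manifest with `kubectl apply -f gpu-job.yaml` and inspect it with `kubectl get jobs` and `kubectl logs`.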
[Image: Nvidia DGX-1 server]