About Gromacs

Gromacs is a versatile software package for molecular dynamics: it simulates the Newtonian equations of motion for systems with hundreds to millions of particles.  Gromacs provides extremely high computational performance, and with GPU acceleration it runs extremely fast on the Longleaf and DGX clusters.  In this tutorial, we discuss how to run Gromacs jobs at the UNC Research Computing Center.

Preparing a Gromacs Job

First of all, we designate a directory in which to run the Gromacs job.  Whether it is a test or a production run, it is always a good idea to have a separate directory for each job.  In this tutorial, we run the Gromacs job in the /proj file system.

mkdir -p /proj/its/cdpoon/project/adh
cd /proj/its/cdpoon/project/adh

Here we download the standard Gromacs benchmark data set, ADH.

wget ftp://ftp.gromacs.org/benchmarks/ADH_bench_systems.tar.gz
tar -zxvf ADH_bench_systems.tar.gz

We then have a directory named ADH that contains four data sets.  We are going to focus on one of them.

cd ADH/adh_cubic

In that directory, two Gromacs parameter files are provided.  We are going to create a link to one of them using the default Gromacs parameter filename.

ln -s pme_verlet.mdp grompp.mdp

Gromacs allows us to use default filenames to simplify commands.  For example, the default parameter filename is grompp.mdp, the default input atom coordinate filename is conf.gro, and the default topology filename is topol.top.  If we do not use the default filenames, we have to specify the filenames on the command line.
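For instance, if we had skipped the symbolic link above, an equivalent grompp command with explicit filenames would look like the following sketch; the -f, -c, -p, and -o options select the parameter, coordinate, topology, and output files, respectively.

gmx grompp -f pme_verlet.mdp -c conf.gro -p topol.top -o topol.tpr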

Next, we are going to create the Gromacs run input file.  That is the file Gromacs needs to run the computation.  To do that, we set up Gromacs in our computing environment.  In this tutorial, we use Gromacs 2021.3.

module load gcc/9.1.0
module load cuda/11.4
source /nas/longleaf/apps/gromacs/2021.3/avx2_256-cuda11.4/bin/GMXRC.bash
gmx grompp

A new file named topol.tpr is created; that is the Gromacs run input file.  Then we can submit the job to run on either the Longleaf or the DGX cluster.
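Optionally, as a quick sanity check (not part of the benchmark instructions), we can confirm that the run input file exists and peek at its contents with gmx dump.

ls -lh topol.tpr
gmx dump -s topol.tpr | head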

Running Gromacs on the Longleaf Cluster

To run a Gromacs job on the Longleaf cluster, it is much easier to use a Slurm job submission script.  For the ADH benchmark job, we create something like the following.  We name this script run.slurm.

#!/bin/bash

#SBATCH --job-name=adh_cubic
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=10
#SBATCH --mem=1G
#SBATCH --time=4:00:00
#SBATCH --partition=beta-gpu
#SBATCH --output=log.%x.%j
#SBATCH --gres=gpu:1
#SBATCH --qos=gpu_access

unset OMP_NUM_THREADS
module load gcc/9.1.0
module load cuda/11.4
source /nas/longleaf/apps/gromacs/2021.3/avx2_256-cuda11.4/bin/GMXRC.bash

# Change to working directory
cd /proj/its/cdpoon/project/adh/ADH/adh_cubic

# Run Gromacs MD
gmx_gpu mdrun -ntmpi 1 -ntomp 10 -update gpu -nb gpu -bonded gpu -pme gpu

In this script, we ask to allocate 1 NVIDIA A100 GPU, 10 CPU cores, 1 GB of memory, and a maximum run time of 4 hours, and we request the beta-gpu partition.  If you do not yet have permission to use any of the GPU partitions, email research@unc.edu and state your needs.  To submit this job to Longleaf, we use the following command.  Note the Slurm job ID in the output.

sbatch run.slurm
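If the submission succeeds, sbatch prints a confirmation containing the job ID; the output looks similar to the following (the number shown here is only an illustration).

Submitted batch job 12345678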

To monitor the progress of this job, we can use the following command.  Replace <ONYEN> with your real ONYEN.  The Slurm job ID also appears in the output.

squeue -u <ONYEN>
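The squeue output lists the job ID, partition, job name, state, and elapsed time.  To narrow the listing to a single job, standard Slurm options such as -j also work.

squeue -j <JOB_ID>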

When the job is finished, it is a good idea to check the job efficiency to make sure that the resource allocation was not excessive.  Replace <JOB_ID> with the job’s real ID.  The job ID can also be extracted from the Slurm log filename, which has the form log.<job name>.<job ID> per the --output setting above.

seff <JOB_ID>
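seff prints a short efficiency summary.  The output looks roughly like the following; the exact fields depend on the Slurm version, and every number below is made up purely for illustration.

Job ID: 12345678
Cluster: longleaf
State: COMPLETED (exit code 0)
Cores: 10
CPU Utilized: 02:45:10
CPU Efficiency: 75.00% of 03:40:13 core-walltime
Memory Utilized: 620.00 MB
Memory Efficiency: 60.55% of 1.00 GB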

If the memory request was much larger than what the job actually used, cut it down for the next run.

Gromacs keeps its log in a file named md.log.  Read that file to see how the job ran.  If the job finished successfully, you should see the job performance summary at the end of the file.
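To pull out just the performance numbers, which Gromacs reports in ns/day and hours/ns at the end of md.log, something like the following is usually enough.

tail -n 20 md.log
grep "Performance:" md.log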

Running Gromacs on the DGX Cluster

To run Gromacs jobs on the DGX cluster, follow these directions.

To submit a Gromacs job to the DGX cluster, we need to create a YAML file that defines the job name, resource requirements, Docker image, location of the work directory, and so on.  For the work directory, we are going to use /proj, which is locally mounted on the DGX cluster.  This example also uses a Docker image created at UNC by the Research Computing Center, which can be pulled from the NVIDIA GPU Cloud (NGC) registry.  This requires that you have an NGC account and access to our UNC Research Computing Center private Docker registry.

In Spring 2021, we implemented Volcano for job scheduling.  The following YAML file shows how to submit a Gromacs job to a Volcano scheduler GPU queue with 1 GPU, 8 CPUs, and 4 GB of memory.

apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: job_name
spec:
  minAvailable: 1
  schedulerName: volcano
  queue: queue_name
  policies:
  - event: PodEvicted
    action: RestartJob
  tasks:
  - replicas: 1
    name: task_name
    policies:
    - event: TaskCompleted
      action: CompleteJob
    backoffLimit: 5
    activeDeadlineSeconds: time_limit
    template:
      metadata:
        name: volcano-job
        labels:
          environment: research
      spec:
        restartPolicy: Never
        imagePullSecrets:
        - name: your_secret
        volumes:
        - name: proj
          hostPath:
            path: /proj 
            type: Directory
        containers:
        - name: gromacs 
          image: nvcr.io/uncchrc/gromacs:2020.4-cuda9.2-ubuntu18.04
          resources:
            requests:
              cpu: 8
              memory: 4Gi
              nvidia.com/gpu: 1
            limits:
              cpu: 8
              memory: 4Gi
              nvidia.com/gpu: 1
          volumeMounts:
          - name: proj
            mountPath: /proj
            readOnly: false
          command:
            - "/bin/bash"
            - "-c"
            - >
              cd work_directory &&
              gmx_command

In the above YAML file, change the placeholder tags to match your job.

job_name : Name of the Kubernetes job.  It has to be unique; no other job in the cluster should have the same name.

queue_name : Name of the Volcano queue.

task_name : Name of the task.  This name is used when creating the pod name.

time_limit : Time limit for the job in seconds.  When the job exceeds this limit, it is terminated.  For example, set it to 86400 for a one-day time limit.

your_secret : The Kubernetes secret.  This secret holds the credentials the container uses to access your NGC registry; follow the directions on the “Nvidia GPU Cloud” page to create your own Kubernetes secret.

work_directory : The work directory, which should be in /proj.

gmx_command : The Gromacs gmx command.  To run molecular dynamics with GPU acceleration using 1 GPU and 8 CPUs, use the command “gmx_gpu mdrun -ntomp 8 -ntmpi 1”, as shown in the filled-in example below.
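As an illustration, filling in the command section with this tutorial’s own work directory and the mdrun command above would look roughly like this (a sketch; adjust the path to your own /proj location):

          command:
            - "/bin/bash"
            - "-c"
            - >
              cd /proj/its/cdpoon/project/adh/ADH/adh_cubic &&
              gmx_gpu mdrun -ntomp 8 -ntmpi 1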

In this example, we allocate 8 CPU cores, 4 GB of memory, and 1 GPU to run the job.  Change these numbers according to your job requirements.  This YAML also asks to use the image nvcr.io/uncchrc/gromacs with tag 2020.4-cuda9.2-ubuntu18.04.

To find all the Volcano queues in the setup, use this command.

kubectl get queue

To submit your Gromacs job, run this command.

kubectl create -f name_yaml

Replace name_yaml with the real filename of your YAML file.

We have created a script named “kubelist” to list all your Volcano jobs in the DGX cluster.

kubelist

When the job starts to run, Kubernetes creates a pod whose name is based on the job name and the task name.  We can use the following commands to list all pods in the DGX cluster.  Add the -owide option for a longer listing.

kubectl get pod
kubectl get pod -owide

From the list, you can find the pod name, pod_name, of the job you submitted.  While the pod is running, its status shows as “Running”.  We can also view the output of the pod with the following command.

kubectl logs pod_name

Once the job is finished, you will have to delete the Volcano job from the job list.  Use the following command to delete your Volcano job.  It is important to delete all completed Volcano jobs to keep the list clean.

kubectl delete vcjob job_name