SSTMap Solvation Structure and Thermodynamic Mapping

Running SSTMap Calculations

SSTMap implements two approaches for mapping water structure and thermodynamics on to solute surfaces, such as protein binding sites, the hydration site analysis (HSA) and Grid Inhomogeneous Solvation Theory (GIST). The command lines to run each of these analyses are simply run_hsa and run_gist which are the main command-line tools in SSTMap. Here we provide examples of the command line arguments required to run these analyses with different popular MD packages.

A detailed list of command-line arguments for these programs and the MD trajectory requirements can be found at the bottom of this page. For a detailed description of the output genrated by these programs, see this post.

Amber

$ run_hsa -i testcase.prmtop -t md100ps.nc -l ligand.pdb -s 0 -f 100 -o testcase

$ run_gist -i testcase.prmtop -t md100ps.nc -l ligand.pdb -g 20 20 20 -s 0 -f 100 -o testcase

Since amber prmtop file contains both non-bonded parameters and topology information for the system, therefore, you can leave out the -p flag, which is used to specify additional parameter files.

Charmm/NAMD

run_hsa -i testcase.psf -t md100ps.dcd -l ligand.pdb -p toppar/ -s 0 -f 100 -o testcase
$ run_gist -i testcase.psf -t md100ps.dcd -l ligand.pdb -g 10 10 10 -p toppar/ -s 0 -f 100 -o testcase

An additional -p flag is mandatory for Charmm, NAMD, Gromacs in order to obtain the non-bonded parameters for energy calculations. These additional files are required by the underlying Parmed parsers.

NAMD uses the same forcefileds as Charmm so the same input files are valid for NAMD.

Gromacs

$ run_hsa -i testcase.gro -t md100ps.xtc -l ligand.pdb -p params.top -s 0 -f 100 -o testcase
$ run_gist -i testcase.gro -t md100ps.xtc -l ligand.pdb -p params.top -g 20 20 20 -s 0 -f 100 -o testcase

OpenMM

OpenMM is an MD toolkit that allows running simulations using Amber99SB forcefield or custom forcefields (written as xml file formats as described here). Indeed, systems can also be build from coordinates and topology files for Charmm and Gromacs. Depending on which of the three forcefields was used to build the system for the OpenMM simulation, the correspnding run_hsa and run_gist commands for the packages will be valid.

Desmond/Others

The desmond cms file format, which contains forcefield and topology information for simulations run in Desmond, is not supported by ParmEd, the underlying parameter and topology parser program. However, with three additional steps, it is possible to use Desmond simulations for SSTMap calculations.

  • Step 1: Convert cms file into pdb file. This can be done in Schrodinger’s Maestro or VMD.

  • Step 2: Convert dtr file into a netcdf file. Although MFTraj allows reading dtr file format but the cross-platform trajectory loaders of mdtraj (e.g., mdtraj.load() or mdtraj.load_frame()) do not accept dtr file format. Only a specific loader mdtraj.load_dtr() is available for this purpose. In SSTMap code, the trajectory reading is only done through MDTraj’s cross-platform loaders. As a result, to process Desmond simulation through SSTMap it is necessary to convert them into netcdf format (although alternative format, dcd, xtc, H5 etc could also be used but netcdf is recommended).Th dtr to netcdf conversion can be be done in VMD using a Tcl script or using a Python script that uses MDTraj’s dtr loader. An example Tcl script for this purpose dtr_to_netcdf.tcl is shown below:

mol new input.cms type mae first 0 last -1 step 1 filebonds 1 autobonds 1 waitfor all
mol addfile 1traj/clickme.dtr type dtr first 0 last -1 step 10 filebonds 1 autobonds 1 waitfor all
animate write output.nc beg 1 end -1

This cane be run on the command-line as:

$ vmd -dispdev text -eoftext < dtr_to_netcdf.tcl

Similarly, an example Python script for this, dtr_to_netcdf.py is given below:

traj = md.load_dtr("traj/clickme.dtr", top=input.pdb)
traj.save_netcdf("converted.nc")
  • Step 3: Finally, extract non-bonded parameters for each atom in the system. This can be done by running an auxiliary Python script provided in SSTMap repository under scripts folder: desmond_extract_nbparams.py. This script uses Schrodinger Suite’s Python API to parse cms file and extract non-bonded parameters. In ordder to run this script, make sure $SCHRODINGER enviornment variable is set on your terminal (which is set by default when a Desmond or Schrodinger installation is done.). For example:
    $SCHRODINGER/run desmond_extract_nbparams.py input.cms
    

    will generate a text file called inputcms_nb_parms.txt. This text file is a \(N \times 3\) matrix of non-bonded prameters, where N is the number of atoms and three columns correspond to sigma, epsilon and charge parameters for each atom in the system.

Once the convetred pdb file, netcdf file and parameters text file is available, run_hsa ad run_gist can be run as:

$ run_hsa -i testcase.pdb -t md100ps.nc -l ligand.pdb -p params.txt -s 0 -f 100 -o testcase

run_gist -i testcase.pdb -t md100ps.nc -l ligand.pdb -p params.txt -g 20 20 20 -s 0 -f 100 -o testcase

This approach is applicable to any MD package that’s not supported. SSTMap calculations are feasibale as long as you can convert your topology and trajectory files to suitable format and extract non-bonded paramters for every atom of your system into a text file with the same format as given above.

MD Trajectory Requirements for SSTMap

SSTMap calculations require an explicit solvent molecular dynamics trajectory with restrained solute. While SSTMap is agnostic of the water model used, however, it has been tested only for TIP3P, TIP4P, TIP4P-Ewald, TIP5P and OPC models. This list will be updated as we test more water models. Also note that currently only simulation run in orthorhombic periodic boundary conditions are supported. We intend to provide support for non-orthorhombic periodic boundary conditions in future. The simulations can be generated in one of the following packages: Amber, Charmm, Gromacs, NAMD, OpenMM and Desmond. This list is based on simulation packages that are supported by MDTraj and ParmEd, both of which are dependecnies of SSTMap. It’s however, possible to expand the applicability of SSTMap calculations even beyond these packages, as long as the trajectory and toplogy formats can be converted to those currently supported (See the Desmond example for this apporach).

Command-line arguments

A list of command-line arguments for run_hsa and run_gist (with brief explanations) can be obtained at the terminal by running these commands without any argument or with -h flag.

$ run_hsa -h
usage: run_hsa [-h] -i INPUT_TOP -t INPUT_TRAJ -l LIGAND [-f NUM_FRAMES]
               [-c CLUSTERS] [-p PARAM_FILE] [-s START_FRAME]
               [-d BULK_DENSITY] [-b CALC_HBONDS] [-o OUTPUT_PREFIX]

Run SSTMap site-based calculations (Hydration Site Analysis) through command-
line.

required arguments:
  -i INPUT_TOP, --input_top INPUT_TOP
                        Input toplogy File.
  -t INPUT_TRAJ, --input_traj INPUT_TRAJ
                        Input trajectory file.
  -l LIGAND, --ligand LIGAND
                        Input ligand PDB file.
  -f NUM_FRAMES, --num_frames NUM_FRAMES
                        Total number of frames to process.

optional arguments:
  -h, --help            show this help message and exit
  -c CLUSTERS, --clusters CLUSTERS
                        PDB file containing cluster centers.
  -p PARAM_FILE, --param_file PARAM_FILE
                        Additional parameter files, specific for MD package
  -s START_FRAME, --start_frame START_FRAME
                        Starting frame.
  -d BULK_DENSITY, --bulk_density BULK_DENSITY
                        Bulk density of the water model.
  -b CALC_HBONDS, --calc_hbonds CALC_HBONDS
                        True or False for whether to calculate h-bonds during
                        calculations.
  -o OUTPUT_PREFIX, --output_prefix OUTPUT_PREFIX
                        Prefix for all the results files.
$ run_gist -h
usage: run_gist [-h] -i INPUT_TOP -t INPUT_TRAJ -l LIGAND -g GRID_DIM GRID_DIM
                GRID_DIM [-f NUM_FRAMES] [-p PARAM_FILE] [-s START_FRAME]
                [-d BULK_DENSITY] [-b CALC_HBONDS] [-o OUTPUT_PREFIX]

Run SSTMap grid-based (GIST) calculations through command-line.

required arguments:
  -i INPUT_TOP, --input_top INPUT_TOP
                        Input toplogy File.
  -t INPUT_TRAJ, --input_traj INPUT_TRAJ
                        Input trajectory file.
  -l LIGAND, --ligand LIGAND
                        Input ligand PDB file.
  -g GRID_DIM GRID_DIM GRID_DIM, --grid_dim GRID_DIM GRID_DIM GRID_DIM
                        grid dimensions e.g., 10 10 10
  -f NUM_FRAMES, --num_frames NUM_FRAMES
                        Total number of frames to process.

optional arguments:
  -h, --help            show this help message and exit
  -p PARAM_FILE, --param_file PARAM_FILE
                        Additional parameter files, specific for MD package
  -s START_FRAME, --start_frame START_FRAME
                        Starting frame.
  -d BULK_DENSITY, --bulk_density BULK_DENSITY
                        Bulk density of the water model.
  -b CALC_HBONDS, --calc_hbonds CALC_HBONDS
                        True or False for whether to calculate h-bonds during
                        calculations.
  -o OUTPUT_PREFIX, --output_prefix OUTPUT_PREFIX
                        Prefix for all the results files.

The arguments supplied for -i, -t and -p flags vary depending on the MD packages used for simulation. For demonstrative purposes, we use input topology and trajectories from a repository of test cases, which is available on Github. You can download the full test suite from here (since Github repository doesn’t contain trajectory files). For a given platform, cd to its sub-directory and run the commands as shown below.