Volume 30, Issue 14
Software News and Update
Free Access

RedMD—Reduced molecular dynamics package

Adam Górecki

Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, 02‐106 Warsaw, Poland

Department of Biophysics, Faculty of Physics, University of Warsaw, 02‐089 Warsaw, Poland

Search for more papers by this author
Marcin Szypowski

Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, 02‐106 Warsaw, Poland

Search for more papers by this author
Maciej Długosz

Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, 02‐106 Warsaw, Poland

Search for more papers by this author
Joanna Trylska

Corresponding Author

E-mail address: joanna@icm.edu.pl

Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, 02‐106 Warsaw, Poland

Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, 02‐106 Warsaw, PolandSearch for more papers by this author
First published: 24 August 2009
Citations: 6

Abstract

We developed a software package (RedMD) to perform molecular dynamics simulations and normal mode analysis of reduced models of proteins, nucleic acids, and their complexes. With RedMD one can perform molecular dynamics simulations in a microcanonical ensemble, with Berendsen and Langevin thermostats, and with Brownian dynamics. We provide force field and topology generators which are based on the one‐bead per residue/nucleotide elastic network model and its extensions. The user can change the force field parameters with the command line options that are passed to generators. Also, the generators can be modified, for example, to add new potential energy functions. Normal mode analysis tool is available for elastic or anisotropic network models. The program is written in C and C++ languages and the structure/topology of a molecule is based on an XML format. OpenMP technology for shared‐memory architectures was used for code parallelization. The code is distributed under GNU public licence and available at http://bionano.icm.edu.pl/software/. © 2009 Wiley Periodicals, Inc. J Comput Chem, 2009

Introduction

Biomolecules are not static and their atoms undergo fluctuations leading to internal collective motions that are essential to molecular function. One of the widely used theoretical techniques to investigate internal movements is molecular dynamics (MD). At atomic resolution MD uses an anharmonic potential energy function and solves numerically, using discrete time steps, the Newton's equations of motion. As a result, MD provides trajectories i.e., the coordinates and momenta/velocities of N atomic nuclei in the system as a function of time. Analyses of MD data allow one to derive and combine information about functionally relevant internal motions.

Another such technique is normal mode analysis (NMA) in which the potential energy surface is approximated as harmonic around a configuration that represents its minimum. The 3N × 3N matrix of second derivatives of the potential energy function with respect to atomic displacements (i.e., the Hessian) is constructed. Next, the eigenproblem is solved leading to representation of the motions as a combination of vibrational normal modes.1-5 NMA is usually used to characterize the slowest, large‐amplitude motions of a molecule.

However, both for MD and NMA, simulations with all‐atomic descriptions of biomolecules are often too computationally demanding. In MD, due to large number of atoms and femtosecond time steps used to integrate the equations of motions, it is still difficult to approach biologically relevant time scales. For larger systems, all‐atomic NMA is also computationally expensive due to minimization of the molecular mechanics energy function and memory requirements to diagonalize the Hessian. Moreover, the number of crystallographic structures of large macromolecular complexes available in Protein Data Bank6 is constantly growing and many of those structures provide only low resolution information about the system. Therefore, to achieve both microsecond simulation time scales and retrieve dynamics of low resolution structures, reduced models of proteins and nucleic acids together with simplified force fields have recently become popular.

These simplified models reduce the system's number of degrees of freedom by coarse‐graining its representation; each residue or nucleotide is composed of only one or a few interacting centers. The formula for the effective potential energy (more accurately a potential of mean force) is usually parameterized based on some reference configurations derived from crystallographic data or short all‐atom MD simulations. Coarse‐grained or reduced models were mainly applied in NMA but to incorporate anharmonicity of motions their applications in MD have also been extensive. For a review of the reduced models, their advantages and drawbacks see for example.5, 7-9

To perform coarse‐grained MD simulations, one usually needs to modify the already existing all‐atom MD packages such as DL_POLY,10 NAMD,11 Amber,12 or Charmm.13 This needs preparation of the coarse‐grained force field in a format that is read by those packages and is often not straightforward. Contrary to many all‐atom MD programs, there are very few packages that are exclusively dedicated to perform reduced MD dynamics both for proteins and nucleic acids. One of them is the YUP molecular simulation package14 based on Python programming language. Recently, a MARTINI15 coarse‐grained force field for molecular dynamics for proteins and lipids was implemented into Gromacs.16 On the other hand, the normal mode analysis tools are numerous and often available as web servers. However, their main drawback is that the tools work mainly for proteins and applications to nucleic acids are often not possible. In this work, we provide a complete package to perform reduced MD and NMA for various one‐bead models both for proteins and nucleic acids.

The RedMD package that we have developed allows the user to build a model of a molecule according to the chosen force field (i.e., reduce the degrees of freedom and generate the topology and structure files based on the input PDB or XML PDB files), visualize, verify and edit the model, as well as conduct an MD simulation and output its results. All force fields currently implemented in RedMD are based on one‐bead mapping. Each amino acid or nucleotide is represented as a single spherical bead centered either on a Cα carbon (CA) or P phosphorus with a mass corresponding to the total mass of a given amino acid or nucleotide. The force field generators include those based on elastic network models2, 17 and their extensions initially developed for the ribosome,18 HIV‐1 protease,19 nucleosome,20 and myoglobin21 (REACH). Force field parameters can be modified using options passed to command line driven force field generators. One can also write extensions to the code in order to modify implemented potentials or apply one's own force field. RedMD is efficiently scalable on shared‐memory multiprocessor computer systems. We also provide a NMA tool (RedNMA) that allows for vibrational analysis within the framework of elastic2 or anisotropic network models.22, 23 The code is written in C and C++ languages and is available for Linux platforms. We provide the documentation and user's manual. The program distribution contains the source code and examples.

General Overview of the RedMD Package

The structure and force field of the molecule are generated in an XML‐based format. RedMD uses the XML ‐ Extensible Markup Language (http://www.w3.org/XML). We provide utilities to produce the input force field XML files with each program working from a command line and using GNU arguments parsing. The steps to be performed while using RedMD are presented in Figure 1 and described below. Details are provided in the user's manual available at http://bionano.icm.edu.pl/software/.

  • 1

    As an input the user needs to provide the coordinates of atomic positions together with atom names. Currently supported formats are Protein Data Bank PDB and XML PDB.6 The program also reads the gzipped .xml.gz files. The PDB‐downloaded XML is recommended if one wants to use the information on the secondary structure of DNA or RNA such as Watson‐Crick base pair connections. The file may contain coordinates of all heavy atoms of the molecule or the its CA and/or P atoms. In the first case, the program automatically extracts the CA and/or P positions.

  • 2

    Next, the basic topology of the molecule is extracted and saved in an XML format with the *.txml extension. The tools used to extract the topology are RedMD_extractPDBXML or RedMD_extractPDB.

  • 3

    The topology is a scaffold that contains information about the reduced representation of the molecule i.e., extracted pseudoatoms, connectivity within linear chains (trace) and if applicable the secondary base pair connections in DNA or RNA. The latter can be only extracted from the PDBML files.

  • 4

    Some PDB files are incomplete and lack certain Cα or P atoms; the user needs to either add them before generating the topology or decide to put a connection between pseudo‐atoms that are not subsequent in the chain. In the following step, we recommend that the user verifies and if needed modifies the topology with our preview/editing tools that work together with the VMD package.24 All modifications, additions or removals of pseudo‐atoms and connections, can be also performed directly by editing the *.txml file.

  • 5

    Next, the user chooses the force field to generate the entire biomolecular model for further MD simulations. Currently, we provide five force field generators. First, is a simple harmonic elastic network model2 which contains three parameters: harmonic spring constant, and the cutoff values for the Cα and P pseudoatoms (see next section). Next, are the previously developed coarse‐grained parameterizations of the ribosome,18 HIV‐1 protease19 and nucleosome,20 and the REACH model.21 The user may easily change the force field parameters with command line options.

  • 6

    As a result of the previous step one obtains the molecular structure *.sxml file that contains information on its topology and all interactions between pseudoatoms. This file serves as one of the input files for an MD simulation. The *.sxml force field file is also in the XML format which is described in detail in the next sections. The file contains atom numbers, names, masses, coordinates, possible constraints, and parameters of all bonded and nonbonded interactions in the system (e.g., harmonic, Morse, quartic). Apart from coordinates it is also possible to give initial momenta.

  • 7

    In this step, we recommend that the user verifies the structure file by viewing the *.sxml with our preview/editing tools that work under the VMD program.24

  • 8

    This stage involves input preparation for generating the MD trajectory. The input is a text file containing various keywords describing the requested simulation. One needs to define the basic set of parameters such as the time step, number of MD integration steps and temperature. Exemplary input files with detailed descriptions of keywords are provided with the RedMD distribution.

  • 9

    The last part is the MD simulation that generates the trajectory and outputs the energetics of the system in time.

image

Flowchart of the RedMD package showing how to perform molecular dynamics simulations. Figures of the small 30S ribosomal subunit are based on the 1GIX PDB entry. [Color figure can be viewed in the online issue, which is available at www.interscience.wiley.com.]

Force Field Generators

The currently implemented force fields are based on the elastic network model (ENM) and its extensions. We implemented one bead models in which a residue or a nucleotide is represented by one interacting center, Cα (CA) carbon or P phosphorus atom. In the ENM such pseudoatoms are interconnected by harmonic potentials. Only few parameters define the interactions—a single harmonic constant for the CA–CA, P–P and CA–P interactions and cutoff values to account only for connections that fall within a predefined cutoff distance. The scheme of the ENM model is presented in Figure 2.

image

Reduced elastic network model of the bovine pancreatic trypsin inhibitor (PDB code 5PTI, protein trace shown as cartoon). CA carbons are shown as spheres and interactions as lines. Each CA pseudoatom interacts only with pseudoatoms that are located within a predefined cutoff distance.

Another force field includes the extension of the ENM previously applied to the ribosome.18 The general idea of the model is presented in Figure 3. The potential energy function characterizing various interactions of the elastic net is partitioned into the following terms:

equation image(1)
The bonded potentials are applied to 1–2, 1–3, and 1–4 pseudo‐atom connections in the sequence. In case of nucleic acids base‐pairing interactions, Ebp, can be also included because they are usually provided in the PDBML files. The nonbonded potentials are divided into two kinds according to the pseudoatom distance: local and nonlocal. The local nonbonded interactions are realized with a Morse potential whose equilibrium bond length is taken from the initial configuration and the well depth decreases with the increasing distance between beads. This accounts for weakening of interaction strengths with distance. The number of the local interactions does not change in the simulation. They serve to assure the stability of local secondary structures in the one‐bead only model. The nonlocal nonbonded Morse is generic and its parameters depend only on the pseudoatom type (CA···CA, P···P, CA···P). These interactions are applied between Rcutoff and Rurn:x-wiley:01928651:media:JCC21223:tex2gif-stack-1 schematically shown in Figure 3. The pair list of nonlocal interactions is updated every chosen number of integration steps. The use of the Morse potential functions allows for larger fluctuations from the starting configurations. The interaction terms in eq. (1) are functions of a single degree of freedom and were parameterized based on statistical analysis obtained from the crystallographic structure of the ribosome. The potential energy terms were approximated as Boltzmann inversion of the corresponding pair distribution function resulting from this statistics.25, 26 Details of the model and parameterization of the force field were described in detail elsewhere.18

image

Left: Pseudobonded harmonic interactions for subsequent pseudoatoms in the chains. Right: Schematic representations of the two cutoff distances for local and generic nonlocal nonbonded interactions.

Another implemented model is based on the force field initially developed for HIV‐1 protease.19, 27 The 1–3 and 1–4 interactions act on the angle and not on the distance as in the model presented in Figure 3. This is a knowledge‐based parameterization as well and the statistics was derived from a set of crystal structures of HIV‐1 protease. The use of the bistable potential for the CA–CA–CA interactions and Morse potential for the nonbonded interactions enables large configurational changes such as a 20 Å wide opening of flaps covering the active site of the enzyme.

Furthermore, we included a recently developed force field for the nucleosome.20 Its parameterization was derived based on statistical analysis of an all‐atom MD simulation of the nucleosome in explicit solvent. The force field contains local harmonic interactions, as well as nonbonded Morse potentials.

We also implemented the REACH model21 that is based on a different approach than the ones presented above. In the REACH model the potential energy of the system is defined as:

equation image(2)
where N is the number of residues, dij or durn:x-wiley:01928651:media:JCC21223:tex2gif-stack-2 is the distance between the dynamical or equilibrium coordinates of Cα atoms i and j, and kij is the force constant for the elastic spring acting between atoms i and j. In the original REACH approach, the force constants were calculated from the variance‐covariance matrix obtained from atomistic MD simulations for myoglobin.21

It should be emphasized that majority of the models implemented in the RedMD package were originally developed for specific molecules (HIV‐1 protease, ribosome, nucleosome, and monomeric or dimeric myoglobin in case of the REACH force field) and tested only for a limited number of systems. However, these force fields with their flexible energy functions and modifiable parameters constitute useful templates for any users willing to introduce their own parameterization. Details are provided in the user's manual available at http://bionano.icm.edu.pl/software/.

XML‐Based Format of Molecular Force Field

In the RedMD program the molecular topology and structure files are generated in the XML‐based format (http://www.w3.org/TR/REC‐xml/). XML is often applied as a format for document storage but its hierarchical structure can be also efficiently used in representing molecular data—the structure and interactions. The XML syntax is clear, its format is strict, and thus the XML‐formatted files can be easily parsed. It is easy to create and modify self‐describing, XML‐formatted force fields. There are no rules on the line format, such as e.g., in the plain text PDB6 or PSF format of CHARMM.13 The XML format does not require certain number of characters per sign or any column aligning. This format allows storing every desired property. A disadvantage is that the information stored in XML‐based formats usually takes more disk space, but the compression of XML by common algorithms is very effective. RedMD uses functions implemented in the XML Input/Output library libxml2 (http://xmlsoft.org/) available with many Linux distributions that allows operations on GNU gzipped XML files.

An exemplary *.sxml structure file is presented in Figure 4. It contains one root element delimited with tags <STRUCTURE> </STRUCTURE>. The NONBONDED element stores information on nonbonded nonlocal interactions, in this example defined between CA pseudoatoms. These are van der Waals like‐type interactions modeled with the Morse potential and applied up to the cutoff distance of 20 Å (<NLMORSE cutoff = "20.00">). The parameters of the Morse function are defined with attribute values alpha, E0, and l0 of the PAIR element. The MOLECULE element has an attribute molId of a unique value mol1. This uniqueness is particularly useful in linking together the *.sxml files of two different molecules. The elements nested within <MOLECULE></MOLECULE> tags are in this example:

  • ATOM—pseudoatoms with attributes specifying indexes (id), names(name), positions (x,y,z), residues (resName), chains (chainID), sequence within chains (resSeq) and masses (m)

  • BOND—harmonic bonds between pseudoatoms with attributes identifying connected pseudoatoms (idAtom1, idAtom2), force constants (k) and equilibrium values (l0) for harmonic bond lengths

  • DIH_HARM—dihedral angles modeled with harmonic potentials with attributes identifying involved atoms (idAtom1, idAtom2, idAtom3, idAtom4), force constants (kAlpha), and equilibrium values (alpha0)

  • ANGLE_QUAR—bond angles modeled with double‐well potentials with attributes identifying involved atoms (idAtom1, idAtom2, idAtom3) and parameters defining the potential function (see RedMD manual or ref.19)

  • MORSE—local nonbonded interactions (described with Morse functions) between atoms given with attributes idAtom1 and idAtom2, with the shape of Morse functions defined with attributes alpha, l0 and E0.

image

Exemplary *.sxml structure file defining the topology and interactions in the molecule. See text for description of the format.

The XML format enables hierarchy of elements in the file. Types of interactions can be grouped together with a <GROUP> </GROUP> tag. The clustering can be performed according to any element or attribute such as BOND or mass m.

With such XML‐based description of a biomolecule, information on the force field and coordinates needed to perform simulations such as normal mode analysis or MD is gathered in one single file. This is an advantage because in most of MD packages the user needs separate files with parameter sets and initial coordinates. Another advantage of our XML format is the ease of combining molecules into one *.sxml structure file without the need to change atomic ids or numbering.

Previewing and Verification Tools

We provide additional tools that enable the user to visualize and verify the topology and structure of the molecule. These are described below.

  • RedMD_prevModel—from the *.txml topology or *.sxml structure files (or its gzipped version) this program produces a PDB file containing information about the backbone connections, i.e., the trace of the molecule or pseudoatom interactions. The user may choose which tag to output. The <BOND/> or <MORSE/> tags are represented in the end of the output PDB file as lines begining with the CONECT phrase. This file can be visualized under VMD24 (see exemplary presentation of a molecule in Fig. 5). This allows to double check if each atom <ATOM /> has only two subsequent connections in each chain, if the molecular trace is linear, and detect the missing connections between the subsequent pseudoatoms in each chain.

  • RedMD_grepNode—a tool to extract nodes and its properties from the *.sxml file. This enables plotting data of interest such as e.g., the parameters of the Morse potential or coordinates of the molecule.

  • RedMD_updateXML—a program to update coordinates and momenta in the *.sxml file using an external PDB file.

  • sxmlplugin.so—VMD plugin to read the *.txml and *.sxml files.

image

Visualization of the RedMD topology and structure files for the topoisomerase I/DNA complex (PDB entry 1A36). A: All‐atom model. B: Molecular topology with Cα atoms in magenta and P in blue and backbone connections (subsequent 1‐2 bonds and base pairing). C: Harmonic 1‐2 (magenta and blue), 1‐3, and 1‐4 pseudobonds (green) characterized with the tag BOND from the *.sxml structure file. D: Harmonic pseudobonds (BOND colors as in C) and local nonbonded interactions (MORSE, orange) between pseudoatoms.

Molecular Dynamics Simulations

Molecular dynamics simulations are performed with the main simulation module—RedMD. To begin an MD simulation, two files are required: a *.sxml structure file and an input simulation control file which sets up the MD simulation parameters requested by the user. Input and output file parameters need to be in the following units: length, Å; energy, kcal/mol; temperature, K; time, ps; mass, atomic units. The MD input file is a simple text file composed of keywords and its values.

MD simulations can be performed in the microcanonical ensemble, canonical ensemble with Berendsen thermostat,28 with Langevin bath,29 and by means of Brownian dynamics.30 We solve the Langevin equation with Brünger‐Brooks‐Karplus integrator (BBK).31 The supported output trajectory formats are .xyz, .pdb and Charmm/X‐PLOR .dcd.13, 32 An output text file that monitors the temperature and potential energy components as a function of the simulation step is also written out. For the microcanonical ensemble we provide two integration algorithms: velocity Verlet and leap‐frog.33 One can also apply the shake or rattle34 algorithm to constrain certain pseudobonds (e.g., trace of the molecule) and increase the integration time step. The files needed to restart a simulation are automatically outputed.

OpenMP Parallelization

Parallel optimization is performed using the force distribution model with the OpenMP Application Program Interface (http://openmp.org) for shared‐memory architectures. Based on the information stored in the *.sxml file, the gradient of the potential energy function E(X) is calculated in order to obtain forces acting on pseudoatoms as a function of their coordinates, F(X). In parallel calculations, in every MD integration step, RedMD partitions (X) into components calculated by Nthread different processes (also called threads). Therefore, every parallel process k calculates its partial forces:

equation image(3)
Next, the main thread (the so called Thread 0) collects energies and partial forces calculated by every other thread:
equation image(4)
The aforementioned operation is justified because the potential and forces are additive and E(X) is a sum of either only bonded potential energy terms as in case of ENM or E(X) = Ebonded + Eurn:x-wiley:01928651:media:JCC21223:tex2gif-stack-3 + Eurn:x-wiley:01928651:media:JCC21223:tex2gif-stack-4 for other models [see eq. (1)]. For the first two terms, the number of interactions does not need to be updated in the simulation because it depends only on the initial configuration of the molecule. For best scaling one needs to assign the calculations of those terms uniformly among threads so every thread obtains preferentially the same number of potential energy terms of certain type. Therefore, the bonded Ebonded and local nonbonded Eurn:x-wiley:01928651:media:JCC21223:tex2gif-stack-5 potential energy terms are divided into separate types (e.g., in case of the HIV‐1 protease force field into BOND, DIH_HARM, ANGLE_QUAR, MORSE). The number of interactions of each type is then calculated and equally partitioned for calculations by different threads (in the simplest ENM model only Ebonded terms are subject to partitioning)
equation image(5)
with the corresponding force
equation image(6)
where p is the index of a potential term. This method is applicable to every new potential provided that one has an analytical formula or numerical method for calculating ∇Etype,p(X).

For the nonbonded nonlocal Eurn:x-wiley:01928651:media:JCC21223:tex2gif-stack-6 term the approach is different because this term depends on positions of all atoms. In our case, Eurn:x-wiley:01928651:media:JCC21223:tex2gif-stack-7 is a pairwise radial function that disappears at large distances greater than a predefined Rurn:x-wiley:01928651:media:JCC21223:tex2gif-stack-8 (see Fig. 3). In the MD simulation a list of pseudo‐atom pairs interacting with nonlocal nonbonded potential is created and refreshed every chosen number of steps (10 by default). Recomputing of nonlocal nonbonded interactions is the most time consuming operation of Eurn:x-wiley:01928651:media:JCC21223:tex2gif-stack-9 calculation and this part is OpenMP optimized as follows. Pairs of atoms that interact with nonbonded nonlocal potentials and contain atom i with property: i%Nthread = k are verified by a certain thread, k. Every thread searches an assigned partial list of pairs (pairsk), for pairs satisfying the condition r < Rpairlist, where r is the distance between atoms within a pair and Rpairlist > Rurn:x-wiley:01928651:media:JCC21223:tex2gif-stack-10. Every thread needs to search almost equal number of pairs as |pairsk| ≈ N(N − 1)/(2Nthread) where N is the number of atoms. Every thread stores its extracted pairs and calculates Eurn:x-wiley:01928651:media:JCC21223:tex2gif-stack-11 and its gradient.

The described algorithm provides almost linear scaling for most simulated problems on shared memory architectures. The speed of the OpenMP optimization of the force calculations for a test NVE simulation with Leap frog integration algorithm is presented in Figure 6.

image

The speedup of force calculation in 5000 steps of the NVE simulation for the nucleosome model (PDB code 1ID3). The speedup is defined as the ratio of time on one thread to time on all threads Nthreads. MD simulations were carried out on a Sun Fire X2200M server with two Quad‐Core AMD Opteron(tm) Processors 2354 running at 2200 MHz.

Normal Mode Analysis Tool

In addition to MD programs, we provide a reduced normal mode analysis tool (RedNMA). This tool enables analysis of vibrational motions through diagonalization of the Hessian using M. Tirion's elastic network model2 or the anisotropic network model with nonuniform spring force constants.22, 23 Similar to MD, a biomolecule is represented on the residue level – amino acids are replaced with Cα and nucleotides with P pseudoatoms. Beads are assigned either unit masses or total masses of corresponding residues when (as an option) a mass‐weighted Hessian is applied. The Hessian eigenproblem is solved using either LAPACK routines (http://www.netlib.org/lapack) or, if only a limited set of lowest eigenvalues (thus lowest frequency molecular motions) is requested, with the ARPACK package (http://www.caam.rice.edu/software/ARPACK). The RedNMA output consists of the full Hessian in the sparse format, eigenvalues and eigenvectors, relative frequencies of molecular motions, collectivities of normal modes, and normalized temperature factors (crystallographic B‐factors). Additionally, xyz‐formatted trajectories are outputed to illustrate motions along each of the normal mode vectors.

Benchmarks

Table 1 shows the RedNMA memory requirements for normal mode analysis. Memory usage tests were performed for five systems: the apo‐form of the catalytic subunit of protein kinase CK2 (PDB entry 1LR4, 327 pseudoatoms), elongation factor G (1FNM, 655 pseudoatoms), nucleosome core particle (1KX5, 1266 pseudoatoms), HSP90/SBA1 closed chaperone complex (2CG9, 1457 pseudoatoms) and the 30S ribosomal subunit (1GIX, 4094 pseudoatoms). In each case a full eigenproblem was solved, i.e., all eigenvalues of the Hessian were determined.

Table 1. Memory Required to Perform Normal Mode Analysis for Various Molecules Annotated by Their PDB Codes
PDB code Hessian order memory (MB)
1LR4 981 21.5
1FNM 1965 70.7
1KX5 3798 239.6
2CGJ 4371 316.4
1GIX 12282 2393.1
Table 2. Average Times (with standard deviations) Needed to Generate 5000 Steps of Molecular Dynamics Trajectory Using Different Simulation Algorithms
Simulation algorithm Time (s)
Langevin dynamics 695.5 ± 1.3
NVE ensemble (leap frog integrator) 703.9 ± 0.3
Brownian dynamics 694.4 ± 2.3
NVT ensemble (Berendsen thermostat) 685.0 ± 1.2
  • Averages were calculated over 20 intervals of 5000 integration steps.

Table 2 presents a real time cost needed to perform 5000 simulation steps using different integration algorithms. Timings were collected for the 30S ribosomal subunit (PDB entry 1J5E, 3909 pseudoatoms) using a single core of a Quad‐Core AMD Opteron(tm) processor 2354 running at 2.3 GHz. Local interactions in this system consist of 12182 BOND and 62446 MORSE terms and these numbers are constant during simulations. The list of nonlocal terms (NLMORSE) was refreshed every 10 steps. Simulation time differences among algorithms arise from their different complexity and different magnitude of molecular fluctuations. This results in a different number of nonlocal nonbonded terms evaluated during various types of simulations.

Examples

To illustrate the performance and usefulness of the RedMD package, we show the results of reduced MD and NMA calculations for two proteins whose secondary structures are presented in Figure 7. One is the apo‐form of the second catalytic subunit of protein kinase CK2 (PDB code 1LR4, resolution 2.0 Å, 327 CA carbons) and the other—the triclinic hen egg‐white lysozyme (3LZT, resolution 0.9 Å, 129 CA carbons).

image

Cartoon representations of the apo‐form of the second catalytic subunit of protein kinase CK2 (top, 1LR4) and hen egg‐white lysozyme (bottom, 3LZT). Structures are colored according to amino acid serial numbers (left, Black‐White‐Green color scale) and experimental B factors (right, Black‐White‐Red color scale). Experimental B factors are not normalized.

MD simulations were performed with elastic network, HIV‐1 protease, ribosome and REACH models. Default values of force field parameters were used following their original applications;18-21, 27 only the cutoff for truncating the nonbonded interactions was modified to reach the most significant agreement with experimantal data. For each molecule and force field a 1μs‐long MD trajectory was generated with the Berendsen algorithm for temperature coupling. Integration time step was set to 0.1 ps for the ENM, 0.04 ps for the ribosome model, and 0.02 ps for the HIV‐1 protease and REACH models. NMA was performed with the RedNMA tool and elastic network model2 also using various cutoff distances.

A comparison of the residual flexibility determined from the reduced MD and NMA simulations with the experimental temperature (B) factors taken from the PDB entries is shown in Figure 8. To enable such a comparison, the B‐factors for each pseudoresidue were normalized according to:35

equation image(7)
with μ being the average value computed over a set of temperature factors and σ the standard deviation. The correlation coefficient between the experimental (BX) and theoretical (BT) temperature factors was computed using the Pearson product‐moment coefficient defined as:
equation image(8)
where index i runs over N Cα or P atoms.

image

Correlation coefficients between the NMA or MD derived B‐factors obtained with different force fields and experimental values. Cutoff distances (in Å) used to truncate the interactions are shown in parentheses. For comparison, the correlation of experimental B‐factors with those obtained with an all‐atom MD simulation of lysozyme (MD) is also shown. The latter was a 1.6 ns long MD trajectory of explicitly solvated lysozyme performed with NAMD11 package and Charmm13 force field.

As shown in Figure 8, apart from the ENM, the applied force fields perform reasonably well, considering their dedication to specific molecular systems, and give correlation coefficients above 0.6. For lysozyme, the highest correlation with experimental data of above 0.8 is observed when the protein is represented with the HIV‐1 protease model. The performance of NMA is worse in this case; the best correlation coefficient obtained for the cutoff of 17 Å is below 0.7. In case of the CK2 protein, the best performance is observed for the ribosome model with a correlation coefficient above 0.75. The highest similarity to experimental data giving a correlation coefficient above 0.8 results from the NMA performed with a cutoff of 28 Å. The overall performance of HIV‐1 protease, ribosome and REACH force fields seems to be better for the smaller of the two proteins, i.e., the lysozyme. The opposite is observed, consistently, for the ENM and NMA.

For a few cases, the temperature factors are also plotted as a function of the residue number and presented in Figure 9. Two best agreements with the PDB B‐factors are shown: for the lysozyme with the HIV‐1 protease force field for MD and for the CK2 protein with the NMA model.

image

Normalized values of experimental and theoretical B‐factors of hen egg‐white lysozyme (top, 3LZT) and protein kinase CK2 (bottom, 1LR4) plotted as a function of residue numbers.

In general, the simulation B‐factors give reasonable correspondence with the experimental ones. Nevertheless, a more thorough and systematic study is needed to clarify the matter of transferability of the presented force fields between different protein structural classes (both lysozyme and CK2 belong to an α/β class), as well as different molecular systems. However, as previously stated, various energy functions implemented in RedMD might serve as useful templates for constructing new parameterizations based either on experimental crystallographic or NMR data or all‐atom MD simulations.

Conclusions

We developed a reduced MD package that allows one to perform simulations of biomolecules on a micro‐ to milisecond time scales. The package is freely available and can be downloaded from the http://bionano.icm.edu.pl group web page. To represent molecular topology and interactions the code makes use of the XML format that is easy to read, write and edit. The force field generators are based on elastic network models and their extensions. MD can be performed with leap frog and velocity Verlet integration algorithms, NVE and NVT ensembles, and Langevin and Brownian dynamics propagation schemes. Holonomic constraints are also implemented. OpenMP parallelization for shared memory architectures provides nearly linear scaling of the code (according to our tests up to eight processors). The NMA tool for vibrational analysis is also provided. Future plans include adding different potential energy functions and new force field generators, for example, the heterogenous elastic network model.36

Acknowledgements

The authors express special thanks to Konrad Wawruch for helpful advice in designing the framework of the code. J.T. likes to thank Valentina Tozzini, Chia‐en A. Chang, Karine Voltz, and J. Andrew McCammon for their work on the development of the extensions to elastic network models that were used in the code.

      The full text of this article hosted at iucr.org is unavailable due to technical difficulties.