First Molecular Dynamic Simulation
Step-8. Analysis

Pietro Buffa

If everything was doing well, we can start analyzing our system after simulation. MD produces a really big quantity of data so, it is always better to have clear what we want to analyze into our system from the beggining. In this part of our tutorial, we will introduce some specific basic tools in order to show you how you can use them.

In our last step, we generated the following files:

md_01.tpr (File generated by grompp which contains the starting structure of your simulation: coordinates, velocities, the molecular topology and all the simulation parameters)
md_01.trr (The full precision trajectory containing the positions, velocities and forces over time)
md_01.xtc (A light weight trajectory, containing only coordinates in low precision)
md_01.edr (Energy related parameters over time)
md_01.gro (Fixed-column coordinate file format for the system)
md_01.cpt (Checkpoint File produced by mdrun at specified intervals)
md_01.log (File containing information about the simulation)

As we have already seen, in general, each analysis produce graphs in the form of .xvg files which can be viewed using specific external programs like "xmgrace plotting tool".

To check if the simulation finished properly, the internal tool "gmxcheck" allows you to verify if the simulation ran for the established time:

> gmxcheck -f md_01.trr

The program "trjconv" is normaly used as the first post-processing tool in order to correct a trajectory for periodicity, or to extract specific frames from a trajectory for analysis. Let's say to the program to generate a "corrected" trajectory file which can be .xtc or .pdb:

> trjconv -s md_01.tpr -f md_01.xtc -o md_01-trajectory.xtc -pbc mol -ur compact

> trjconv -s md_01.tpr -f md_01.xtc -o md_01-trajectory.pdb -pbc mol -ur compact

Select "System" from the list.

1] Graphical Visualization of the System

The file md_01-trajectory.pdb includes all the trajectories generated. You can input this file to the program VMD in order to have a graphical and dynamic visualization of your simulated system.

2] Calculate the Convergence of the System (RMSD)

We can now evaluate the structural stability of MD simulations using the Root Mean Square Deviation (RMSD) as an indicator of convergence of the structure towards an equilibrium state. Typing the following command:

> g_rms -s md_01.tpr -f md_01-trajectory.xtc -o rmsd.xvg -tu ns

The (-tu) flag will output the results in terms of ns even if the trajectory was written in ps.

Select "Backbone" from the list and use "xmgrace plotting tool" to get the graphical representation of the result.

The plot shows the RMSD relative to the our minimized, equilibrated system. As you can see, the MD simulation shows to be stable after the first 300 ps and the average RMSD is arround 0.8 angstrom.

3] Calculate the Root Mean Square Fluctuations (RMSF)

The Root Mean Square Fluctuations (RMSF) is a measurement of the thermal motions of a residue. It captures, for each atom, the fluctuation about its average position. This analysis gives an overview about the flexible regions into the protein and corresponds to the crystallographic b-factors (temperature factors). Typing the following command:

> g_rmsf -s md_01.tpr -f md_01.trajectory.xtc -o rmsf.xvg

Use "xmgrace plotting tool" to get the graphical representation of the result.

The plot shows the RMSF relative to the our minimized, equilibrated system. As aspected, our protein has just few thermal motions regions

4] Calculate the Radius of Gyration of the Protein (Rg)

The radius of gyration of the protein gives an indication of the shape (compactness) of the molecule at each time. If a protein is folded, it will maintain a relatively steady value of Rg. If a protein unfolds, its Rg will change over time. With this second analysis we want to analyze the Rg for Lysozyme. Typing the following command:

> g_gyrate -s md_01.tpr -f md_01-trajectory.xtc -o gyrate.xvg

Select "Backbone" from the list and use "xmgrace plotting tool" to get the graphical representation of the result.

As we can see, the reasonably invariant Rg of the protein shows that it remains very stable in its compact fold for all the time of simulation at 300K.

5] Calculate the Solvent Accessible Surface Area of the Protein (SASA)

One property which can be of interest is the surface area of the protein which is accessible to solvent, it is commonly called Solvent Accessible Surface Area (SASA). The tool that we are going to use here also allows to calculate the average SASA over time per residue using (-or) flag. Typing the following command:

> g_sas -s md_01.tpr -f md_01-trajectory.xtc -o sasa.xvg -or res-sasa.xvg

Select "Protein" from the list and use "xmgrace plotting tool" to get the graphical representation of the result.

Specific residues can be investigated generating an "index file" which contains the numbers of the residues that you are going to analyze. Thanks to this file the program will give you an overview about the behaviour of these residues over time.

[Torna su]