GROMACS Trajectory Tutorial#
This tutorial explains in detail the
1k-water-100-frame_groxtc example input file,
covering how ChemNetworks reads GROMACS trajectories and reduces a per-atom hydrogen bonding graph to a per-molecule
network.
The System#
The trajectory is a box of 1,000 water molecules using a three-point model (3,000 atoms) simulated in GROMACS and stored
across two files: a .gro topology file and a .xtc binary coordinate file. The .gro file supplies the molecular
structure and atom naming for a single frame, while the .xtc file supplies the coordinates for all 100 frames. Read
together as a groxtc trajectory, each frame provides the mol_id, mol_type, atom_id, atom_type,x, y, and
z node attributes.
The first lines of the .gro file show the GROMACS atom naming convention, where every water residue (SOL) contains
one oxygen (OW) and two distinct hydrogens (HW1, HW2):
SOL
3000
1SOL OW 1 1.656 1.573 2.451 0.5751 -0.1557 -0.9540
1SOL HW1 2 1.730 1.516 2.429 0.1127 -1.2115 0.1140
1SOL HW2 3 1.599 1.568 2.374 1.1211 -0.1550 -1.3691
2SOL OW 4 1.287 2.932 2.616 -0.8504 -0.1453 -0.0601
...
The goal is to find every water-water hydrogen bond, build a graph in which each water molecule is a single node and each hydrogen bond is an edge, and then compute graph-theoretic properties (degree, betweenness, connected components) for every molecule.
SETTINGS Block#
TIMESTEPS = 0 -1 analyzes from the first frame through the last available
frame, using the default stride of 1. The -1 placeholder for the final timestep avoids hardcoding the frame count,
so every frame in the .xtc file is processed.
TRAJECTORY Block#
[TRAJECTORY]
NAME = 1k_water_100_frame_groxtc_example
FILE = production.gro production.xtc
TYPE = groxtc
SCALE = 10
PBC.DIM = read
NAME = 1k_water_100_frame_groxtc_example assigns an identifier that is referenced
later by the ADD.NODES.TRAJECTORY command.
FILE = production.gro production.xtc lists two files. The groxtc reader combines
them: the .gro file provides the per-atom structure and naming, and the .xtc file provides the coordinates for each
frame. The order is significant, with the .gro topology first and the .xtc coordinates second.
TYPE = groxtc selects the combined GROMACS reader. Unlike a standalone .gro
file, the groxtc type does not provide velocities even when they are present in the .gro file, because the
coordinates are taken from the .xtc frames.
SCALE = 10 multiplies every coordinate by 10. GROMACS stores coordinates in
nanometers, but the Z-matrix distance constraints below are written in angstroms. Scaling by 10 converts the trajectory
to angstroms so that the dist ranges (such as 1.2 to 2.5) are interpreted in the correct unit.
PBC.DIM = read reads the simulation box dimensions from the trajectory each
frame. For a groxtc trajectory, the box vectors are read from the .xtc frames. ChemNetworks then applies the
minimum image convention to all distance calculations, correctly handling hydrogen bonds that span a periodic boundary.
ZMATRIX Block#
[ZMATRIX]
NAME = Water
ROW = $H
ROW = $O 1 $HB
ROW = $H 2 $CB 1 $ANG
$H = attr atom_name H
$O = attr atom_name O
$HB = dist 1.2 2.5
$CB = dist 0.0 1.2
$ANG = angle 135 180
This three-row ZMATRIX block describes a single hydrogen bond between two water molecules. Reading the rows:
- Row 1 defines a hydrogen atom (
$H). - Row 2 defines an oxygen (
$O) within$HB(1.2 to 2.5 angstroms) of the row 1 hydrogen. This is the hydrogen bond itself, the acceptor oxygen of one molecule approached by the donor hydrogen of another. - Row 3 defines a second hydrogen (
$H) within$CB(0.0 to 1.2 angstroms) of the row 2 oxygen, forming an angle$ANG(135 to 180 degrees) at that oxygen between the covalently bonded hydrogen (row 3) and the hydrogen-bonded hydrogen (row 1). The short$CBdistance is the covalent O-H bond, and the wide angle keeps the hydrogen bond close to linear.
The $VAR = ... lines define each variable.
attr atom_name H matches nodes whose atom_name attribute equals H. Note
that the trajectory itself has no atom_name attribute, only the GROMACS atom_type values OW, HW1, and HW2.
The atom_name attribute is created in the GRAPH block below, before the search runs.
dist 0.0 1.2 matches pairs of nodes within that distance range, and
angle 135 180 matches triplets forming an angle in that range. See
Z-matrix Commands for all available commands.
GRAPH Block#
[GRAPH]
NAME = OH
ADD.NODES.TRAJECTORY = 1k_water_100_frame_groxtc_example
ADD.ATTRIBUTE.NODE.NODE = atom_type OW atom_name O
ADD.ATTRIBUTE.NODE.NODE = atom_type HW1 atom_name H
ADD.ATTRIBUTE.NODE.NODE = atom_type HW2 atom_name H
ZMATRIX_SEARCH = Water
ADD.EDGES.ZMATRIX = Water 2 1
ANALYSIS = merge_equal_nodes mol_id atom_type:ignore atom_name:ignore first
ANALYSIS = degree
ANALYSIS = betweenness node
ANALYSIS = connected_components
WRITE.EDGES = ./edg_files/edges
WRITE.NODES = ./nds_files/nodes
WRITE.NODES.ATTRIBUTES = degree betweenness connected_components
WRITE.ZMATRIX = Water ./zmt_files/zmatrices
ADD.NODES.TRAJECTORY = 1k_water_100_frame_groxtc_example populates the
graph with all 3,000 atoms from the trajectory defined earlier. Each atom becomes a node carrying its trajectory
attributes (mol_id, mol_type, atom_type, atom_id, x, y, z).
Unifying The Hydrogen Names#
ADD.ATTRIBUTE.NODE.NODE = atom_type OW atom_name O
ADD.ATTRIBUTE.NODE.NODE = atom_type HW1 atom_name H
ADD.ATTRIBUTE.NODE.NODE = atom_type HW2 atom_name H
GROMACS distinguishes the two hydrogens in each water molecule as HW1 and HW2, but the Z-matrix needs to treat them
identically. Each ADD.ATTRIBUTE.NODE.NODE command sets a new
atom_name attribute on every node matching a given atom_type: OW becomes O, while both HW1 and HW2 become
H. After these three commands, the Z-matrix variables $O and $H can match against the single atom_name
attribute regardless of the original GROMACS naming.
Because the search reads the atom_name attribute, these commands must appear before
ZMATRIX_SEARCH.
See Command Ordering
for the full set of ordering rules.
Building Edges From The Search#
ZMATRIX_SEARCH = Water runs the hydrogen bond template against the graph,
recording every set of atoms that satisfies it along with the calculated distances and angle.
ADD.EDGES.ZMATRIX = Water 2 1 creates an edge for every match between the
node at row 2 (the acceptor oxygen) and the node at row 1 (the donor hydrogen). At this stage the graph is still
per-atom, so each edge connects an oxygen to a hydrogen.
Reducing To A Per-Molecule Network#
ANALYSIS = merge_equal_nodes mol_id atom_type:ignore atom_name:ignore first
ANALYSIS = degree
ANALYSIS = betweenness node
ANALYSIS = connected_components
merge_equal_nodes mol_id ... contracts every node sharing a
mol_id value into a single node, so each of the 1,000 water molecules collapses to one node and the per-atom hydrogen
bond edges become molecule-to-molecule edges. The attribute combination atom_type:ignore atom_name:ignore first
drops the now-ambiguous atom_type and atom_name attributes and keeps the remaining attributes from the first atom
of each molecule.
The three remaining analyses operate on this per-molecule network.
degree records each molecule's number of hydrogen bonds in the degree
node attribute. betweenness node records node betweenness
centrality in the betweenness attribute.
connected_components labels each molecule with the ID of the
hydrogen bonded cluster it belongs to in the connected_components attribute.
Writing Output#
WRITE.EDGES = ./edg_files/edges
WRITE.NODES = ./nds_files/nodes
WRITE.NODES.ATTRIBUTES = degree betweenness connected_components
WRITE.ZMATRIX = Water ./zmt_files/zmatrices
Unlike the Water-LiCl example, the write commands here appear directly under the [GRAPH] block
rather
than in a [GRAPH.WRITE] sub-block. Both forms are equivalent.
WRITE.EDGES = ./edg_files/edges writes the molecule-to-molecule edge list to
./edg_files/edges.0.edg, ./edg_files/edges.1.edg, and so on.
WRITE.NODES = ./nds_files/nodes writes one node file per frame, and
WRITE.NODES.ATTRIBUTES = degree betweenness connected_components
restricts the node files to the three analysis results.
WRITE.ZMATRIX = Water ./zmt_files/zmatrices writes the raw hydrogen bond search
results, including the matched atom indices and the calculated distances and angle for every match at each frame.
Running the Example#
From the example directory:
for serial execution, or:
for parallel execution using MPI, which distributes the 100 frames across the available processes. The output is identical regardless of parallelization: