The Input File
The input file is split into four command blocks: [SETTINGS], [TRAJECTORY], [ZMATRIX], and [GRAPH]. Each of these command blocks contains a set of commands that perform different functions in constructing the graph. Their specific functions and corresponding input file commands are described on their respective pages.
When running ChemNetworks, the input file is passed to the executable with the -i CLI argument.
The rest of this page describes the input file itself: its overall format, how command blocks and individual commands are defined and set, and conventions shared by many input file commands.
Input File Format
The input file is a plain text file with the (optional) .cn extension. Lines beginning with # are treated as
comments and ignored. A # that appears mid-line does not start a comment, so SOME = thing # comment is interpreted
as having three parameters (thing, #, and comment), not one. Blank lines are also ignored. The file is organized
into blocks, each beginning with a block header and containing a set of command definitions.
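The comment and blank-line rules above can be sketched in a few lines of Python. This is an illustration only, not ChemNetworks code: the function names are hypothetical, and it assumes that leading whitespace before a # is allowed on comment lines and that parameters are whitespace-separated tokens, which matches the "three parameters" statement above.

```python
def meaningful_lines(text):
    """Drop blank lines and whole-line comments; keep everything else."""
    kept = []
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: a line is a comment if its first non-blank character is '#'.
        if not line or line.startswith("#"):
            continue
        kept.append(line)
    return kept


def parameters(command_line):
    """Split the value of a KEY = VALUE line into whitespace-separated parameters."""
    _key, _sep, value = command_line.partition("=")
    # A mid-line '#' is not special, so it survives as an ordinary parameter.
    return value.split()


sample = """\
# a whole-line comment
SOME = thing # comment

OTHER = value
"""

lines = meaningful_lines(sample)
params = parameters(lines[0])
print(lines)
print(params)
```

Here the whole-line comment and the blank line disappear, while the # inside the SOME line is kept as one of its parameters.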
Block Headers
A block header is a line containing a name enclosed in square brackets. Block names are case-sensitive and uppercase.
[TRAJECTORY], [ZMATRIX], and [GRAPH] blocks can appear multiple times. Each instance must begin with a
NAME command as its first entry to uniquely identify it. [SETTINGS] appears once and does not require a NAME
command.
Blocks may appear in any order in the file. They are processed in the order they appear.
Dot notation can be used in the block header to address a sub-level of a command block. For example, [GRAPH.WRITE] is
equivalent to placing WRITE.* commands directly under [GRAPH]. Internally, the two forms produce the same
command paths. A NAME command is not required when re-entering a block this way. For example, the below code block
demonstrates calling the GRAPH.WRITE.EDGES command twice, the second time using the sub-block format.
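As a rough illustration of how a block header and a command key combine into one command path, the Python sketch below parses a small sample input held in a string. The parser, the function name command_paths, the graph name, and the output paths are all hypothetical; the sketch only models path construction and does not track which named [GRAPH] instance a sub-block belongs to.

```python
SAMPLE = """\
[GRAPH]
NAME = example_graph
WRITE.EDGES = ./edges/first_*

[GRAPH.WRITE]
EDGES = ./edges/second_*
"""


def command_paths(text):
    """Return (full_command_path, value) pairs in file order."""
    pairs = []
    prefix = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and whole-line comments are ignored
        if line.startswith("[") and line.endswith("]"):
            prefix = line[1:-1].strip()  # e.g. "GRAPH" or "GRAPH.WRITE"
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in command line: {raw!r}")
        if prefix is None:
            raise ValueError(f"command outside of any block: {raw!r}")
        # Header prefix and key are joined with a dot, so both forms
        # of the header produce the same command path.
        pairs.append((f"{prefix}.{key.strip()}", value.strip()))
    return pairs


paths = command_paths(SAMPLE)
for path, value in paths:
    print(path, "=", value)
```

Both WRITE.EDGES under [GRAPH] and EDGES under [GRAPH.WRITE] come out as GRAPH.WRITE.EDGES, which is the equivalence described above.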
Command Definitions
Commands use a KEY = VALUE format, one per line. Leading and trailing whitespace is trimmed from both the key and
value. The = sign is required.
Dot-delimited keys group related commands hierarchically. For example, WRITE.EDGES and WRITE.EDGES.ATTRIBUTES.EDGES
share a common prefix. This is the same mechanism that makes sub-block headers like [GRAPH.WRITE] work.
Command keys are case-sensitive and uppercase. Command values are used as provided.
File Naming Convention
WRITE commands accept an optional $file_name parameter. By default, output files are named using
<base>.<timestep>.<ext>. If a custom $file_name is provided, the timestep is appended as
<file_name>.<timestep>.<ext>.
Every instance of a * character in the file name will be replaced with the timestep. For example,
GRAPH.WRITE.EDGES = ./edges/edges_*_output produces ./edges/edges_0_output.edg, ./edges/edges_1_output.edg, etc.
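The naming rules can be summarized with a short Python sketch. The function output_name is hypothetical, and it assumes, based on the example above, that a file name containing * has the timestep substituted in place rather than appended a second time. The value of base in the default case is not specified here, so it is passed in as a placeholder.

```python
def output_name(timestep, ext, file_name=None, base="output"):
    """Build an output file name for one timestep.

    `base` stands in for the default base name, which this page does
    not specify; "output" is only a placeholder.
    """
    if file_name is None:
        return f"{base}.{timestep}.{ext}"
    if "*" in file_name:
        # Assumption drawn from the example: every '*' becomes the
        # timestep and no extra ".<timestep>" suffix is added.
        return f"{file_name.replace('*', str(timestep))}.{ext}"
    return f"{file_name}.{timestep}.{ext}"


names = [output_name(t, "edg", "./edges/edges_*_output") for t in range(2)]
print(names)
print(output_name(5, "edg", "my_edges"))
```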
Attribute Resolution
The LAMMPS, MOL2, and PDB output formats resolve node attributes using fallback chains. The first attribute found is used; if none are present, the listed default is applied.
- Atom names: atom_type, atom_name, element, name, type (default "X")
- Residue names: mol_type, moltype, residue_name, resname, res_name (default "UNK")
- Residue IDs: mol_id, molid, residue_id, resid, res_id (default 1)
- Bond type numbering: Unique bond types are assigned sequential integer IDs based on the atom type pairing at each edge. Pairings are order-independent (e.g., O-H and H-O receive the same ID).
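A minimal Python sketch of the fallback chains and of order-independent bond type numbering follows. The helper names are hypothetical. Two details are assumptions rather than documented behavior: that bond type IDs start at 1, and that they are assigned in the order pairings are first seen.

```python
ATOM_NAME_CHAIN = ("atom_type", "atom_name", "element", "name", "type")
RESIDUE_NAME_CHAIN = ("mol_type", "moltype", "residue_name", "resname", "res_name")
RESIDUE_ID_CHAIN = ("mol_id", "molid", "residue_id", "resid", "res_id")


def resolve(attrs, chain, default):
    """Return the first attribute in `chain` present on the node, else `default`."""
    for key in chain:
        if key in attrs:
            return attrs[key]
    return default


def bond_type_ids(edges):
    """Map each unordered atom-type pairing to a sequential integer ID.

    `edges` is an iterable of (atom_type_a, atom_type_b) tuples.
    Assumption: IDs start at 1 and follow first-seen order.
    """
    ids = {}
    for a, b in edges:
        pair = tuple(sorted((a, b)))  # O-H and H-O collapse to the same key
        ids.setdefault(pair, len(ids) + 1)
    return ids


node = {"element": "O", "name": "OW", "resname": "SOL"}
atom_name = resolve(node, ATOM_NAME_CHAIN, "X")
residue_name = resolve(node, RESIDUE_NAME_CHAIN, "UNK")
residue_id = resolve(node, RESIDUE_ID_CHAIN, 1)
bond_ids = bond_type_ids([("O", "H"), ("H", "O"), ("H", "H")])
print(atom_name, residue_name, residue_id, bond_ids)
```

In this example element wins over name because it comes earlier in the atom-name chain, and the residue ID falls back to the default because none of its attributes are present.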
Attribute Combination
When operations need to merge attributes (e.g., contracting nodes, simplifying edges, converting between directed and undirected graphs), the combination behavior is specified using a space-separated map string of colon-delimited attribute:method tokens.
Specification
Tokens with a colon, such as weight:sum or distance:min, set the combination method for one individual attribute.
A token without a colon, such as first or ignore, names only a method; that method becomes the default for every
attribute not listed explicitly. If no default is provided, ignore is used. If multiple defaults
are provided, the right-most default is used.
Note that the node index is not an attribute that can be controlled with this command.
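The token rules above can be sketched as a small Python parser. The function name parse_combination_spec is hypothetical and the sketch does not validate method names; it only separates per-attribute methods from the default.

```python
def parse_combination_spec(spec):
    """Split a map string into ({attribute: method}, default_method)."""
    per_attribute = {}
    default = "ignore"  # used when no bare method token is given
    for token in spec.split():
        if ":" in token:
            attribute, method = token.split(":", 1)
            per_attribute[attribute] = method
        else:
            # A bare token is a default; later ones override earlier
            # ones, so the right-most default wins.
            default = token
    return per_attribute, default


methods, default = parse_combination_spec("weight:sum distance:min first")
print(methods, default)
print(parse_combination_spec("weight:sum"))
print(parse_combination_spec("sum weight:max last"))
```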
Combination Methods
The available combination methods are:
- ignore, leaves the attribute empty.
- first, uses the attribute value of the node with the smallest index.
- last, uses the attribute value of the node with the largest index.
- sum, uses the sum of the attributes.
- prod/product, uses the product of the attributes.
- min, uses the minimum of the attributes.
- max, uses the maximum of the attributes.
- mean, uses the average of the attributes.
- med/median, uses the median of the attributes.
- rand/random, uses a random attribute.
- cat/concat/concatenate, concatenates all the attributes.
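One way to picture these methods is as a dispatch over a list of attribute values. The Python sketch below is illustrative only: the combine function is hypothetical, it assumes the values are already ordered by node index, it represents an empty attribute as None, and it concatenates values as strings without a separator.

```python
import math
import random
import statistics

# Map the documented aliases onto one canonical name each.
ALIASES = {
    "product": "prod",
    "median": "med",
    "random": "rand",
    "concat": "cat",
    "concatenate": "cat",
}


def combine(method, values):
    """Combine `values` (assumed ordered by node index) with `method`."""
    method = ALIASES.get(method, method)
    if method == "ignore":
        return None  # assumption: "empty" is modeled as None
    if method == "first":
        return values[0]
    if method == "last":
        return values[-1]
    if method == "sum":
        return sum(values)
    if method == "prod":
        return math.prod(values)
    if method == "min":
        return min(values)
    if method == "max":
        return max(values)
    if method == "mean":
        return statistics.mean(values)
    if method == "med":
        return statistics.median(values)
    if method == "rand":
        return random.choice(values)
    if method == "cat":
        return "".join(str(v) for v in values)
    raise ValueError(f"unknown combination method: {method!r}")


print(combine("sum", [1, 2, 3]))
print(combine("product", [2, 3, 4]))
print(combine("concatenate", ["O", "H", "H"]))
```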
Example
For the specification:
mol_type:first x:mean y:mean z:mean charge:sum atom_index:ignore atom_type:cat first
The attributes for an example water molecule would be contracted as shown below:
| Attr | Node 1 | Node 2 | Node 3 | New Node |
|---|---|---|---|---|
| i | 0 | 1 | 2 | 0 |
| x | 0.00 | 0.76 | -0.76 | 0.000 |
| y | -0.07 | 0.52 | 0.52 | 0.323 |
| z | 0.0 | 0.0 | 0.0 | 0.000 |
| charge | -0.834 | 0.417 | 0.417 | 0.000 |
| mol_id | 1 | 1 | 1 | 1 |
| mol_type | SOL | SOL | SOL | SOL |
| atom_index | 1 | 2 | 3 | >ignored< |
| atom_type | O | H | H | OHH |
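The table can be checked with a short Python sketch that applies the specification to the three nodes. This is not ChemNetworks code: the contract function is hypothetical, only the five methods used in the example are implemented, ignored attributes are simply left out of the result, and the node index i is omitted because it cannot be controlled through the specification.

```python
from statistics import mean

SPEC = "mol_type:first x:mean y:mean z:mean charge:sum atom_index:ignore atom_type:cat first"

# The three water atoms from the table, ordered by node index.
NODES = [
    {"x": 0.00, "y": -0.07, "z": 0.0, "charge": -0.834,
     "mol_id": 1, "mol_type": "SOL", "atom_index": 1, "atom_type": "O"},
    {"x": 0.76, "y": 0.52, "z": 0.0, "charge": 0.417,
     "mol_id": 1, "mol_type": "SOL", "atom_index": 2, "atom_type": "H"},
    {"x": -0.76, "y": 0.52, "z": 0.0, "charge": 0.417,
     "mol_id": 1, "mol_type": "SOL", "atom_index": 3, "atom_type": "H"},
]

# Only the methods this example needs.
METHODS = {
    "first": lambda values: values[0],
    "mean": mean,
    "sum": sum,
    "cat": lambda values: "".join(str(v) for v in values),
}


def contract(nodes, spec):
    """Merge node attribute dicts according to a combination map string."""
    per_attribute, default = {}, "ignore"
    for token in spec.split():
        if ":" in token:
            attribute, method = token.split(":", 1)
            per_attribute[attribute] = method
        else:
            default = token  # right-most bare token is the default
    merged = {}
    for attribute in nodes[0]:
        method = per_attribute.get(attribute, default)
        if method == "ignore":
            continue  # attribute is left empty on the new node
        merged[attribute] = METHODS[method]([n[attribute] for n in nodes])
    return merged


new_node = contract(NODES, SPEC)
for attribute, value in new_node.items():
    print(attribute, value)
```

Note that mol_id has no entry in the specification, so it falls through to the trailing default first, which is why the new node keeps mol_id 1.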