Creating Z-matrix Commands

Z-matrix commands define the criteria used to match atoms during the structure search. Each command evaluates a geometric or attribute-based condition on a set of nodes and reports whether that set satisfies the condition. Commands self-register with a factory, and a CMake module (cmake/modules/AutoIncludeFunc.cmake) auto-generates a zmatrix.h include file from all .hpp files in src/zmatrix/, so adding a new command only requires creating the command file in that directory.

This guide walks through creating a new command using the Distance command (src/zmatrix/distance.hpp) as a reference. For the full list of existing commands and their usage, see Z-matrix Commands.

Command Base Class

All commands inherit from Command (src/zmatrix/command.hpp), which defines three virtual methods:

virtual void validate() {};
virtual std::tuple<bool, std::string> execute(igraph_t* graph, const std::vector<int>& nodes, const igraph_matrix_t& coords) {
    return std::make_tuple(false, "");
};
virtual bool execute(double value) {
    return false;
};
  • validate() parses the parameter string and checks that the values are valid.
  • execute(igraph_t*, const std::vector<int>&, const igraph_matrix_t&) evaluates the command on a set of node indices. The coordinate matrix contains pre-fetched vertex positions (rows are nodes, columns 0/1/2 are x/y/z). Returns a success flag and a string representation of the computed value (written to .zmt output files).
  • execute(double) is an optional fast path for pairwise commands that receive a pre-computed value from the Verlet neighbor list.

The base class also defines several member variables that control how the command integrates with the search:

  • num_node_args: Number of nodes the command operates on. Determines which Z-matrix column position the command can occupy (1 for first column, 2 for second, 3 for third, 4 for fourth, -1 for any).
  • pairwise_command: When true, the command participates in Verlet neighbor list generation. Requires min_value and max_value to be set during validation.
  • uses_coordinates: When true, indicates that the command reads from the coordinate matrix. The search uses this flag to determine whether to build the coordinate matrix for the current timestep.
  • is_value_stored: When true, the search calls execute(double) with the pre-computed value instead of execute(igraph_t*, const std::vector<int>&, const igraph_matrix_t&).
  • min_value, max_value: Distance bounds used by the Verlet neighbor list to filter candidate pairs.
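
The interaction between is_value_stored and the two execute overloads can be sketched as follows. This is a minimal illustration, not the project's search code: MockCommand, Coords, RangeCheck, and evaluate are hypothetical stand-ins, and the igraph types are replaced by a plain coordinate table so the sketch compiles on its own.

```cpp
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the coordinate matrix:
// rows are nodes, columns 0/1/2 are x/y/z.
using Coords = std::vector<std::vector<double>>;

// Simplified stand-in for the Command base class (igraph types omitted).
struct MockCommand {
    int num_node_args = -1;
    bool pairwise_command = false;
    bool uses_coordinates = false;
    bool is_value_stored = false;
    double min_value = 0.0;
    double max_value = 0.0;

    virtual ~MockCommand() = default;

    // General path: evaluate the command from node indices and coordinates.
    virtual std::tuple<bool, std::string> execute(const std::vector<int>&, const Coords&) {
        return std::make_tuple(false, "");
    }

    // Fast path: evaluate the command from a pre-computed value.
    virtual bool execute(double) { return false; }
};

// Assumed dispatch rule: use the fast path only when the command
// declares that a stored value is available.
inline bool evaluate(MockCommand& cmd, const std::vector<int>& nodes,
                     const Coords& coords, double stored_value) {
    if (cmd.is_value_stored) {
        return cmd.execute(stored_value);
    }
    return std::get<0>(cmd.execute(nodes, coords));
}

// Example pairwise command that accepts stored values in [min_value, max_value].
struct RangeCheck : MockCommand {
    using MockCommand::execute;  // keep the other overload visible in this class

    RangeCheck(double lo, double hi) {
        num_node_args = 2;
        is_value_stored = true;
        min_value = lo;
        max_value = hi;
    }

    bool execute(double value) override {
        return value >= min_value && value <= max_value;
    }
};
```

Overriding only one overload in a derived class hides the other by name, which is why the sketch adds a using-declaration. A command that overrides both overloads, as Distance does, does not need it.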

Distance Command Walkthrough

Namespace and Subclass

#pragma once

#include <sstream>
#include <igraph.h>

#include "command.hpp"

namespace ChemNetworks::zmatrix {
    class Distance : public Command {

Commands live in the ChemNetworks::zmatrix namespace and inherit from Command.

Self-Registering Boilerplate

    public:
        const static bool registered;
        static std::shared_ptr<Command> create_shared_ptr(std::string params) {
            return std::make_shared<Distance>(params);
        }
        const inline static std::string name = "dist";
  • registered is a static bool that triggers registration at program startup.
  • create_shared_ptr is the factory method passed to the factory during registration.
  • name is the string used in input files to refer to this command (e.g., dist in $B = dist 0.0 1.2).

The registration itself happens at file scope, outside the class:

    const bool Distance::registered =
        ZMatrix_Command_Factory::register_command(Distance::name, Distance::create_shared_ptr);
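
To see why defining a static bool is enough to register a command, the pattern can be reproduced in isolation. The sketch below is an assumption about how such a factory might look, not the actual ZMatrix_Command_Factory: SketchFactory, its create method, and the simplified Command struct are hypothetical, and only the register_command call shape is taken from the excerpt above. It requires C++17 for the inline static member.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Simplified stand-in for the real Command base class.
struct Command {
    explicit Command(std::string p) : params(std::move(p)) {}
    virtual ~Command() = default;
    std::string params;
};

// Hypothetical factory; the real ZMatrix_Command_Factory may differ.
class SketchFactory {
public:
    using Creator = std::function<std::shared_ptr<Command>(std::string)>;

    // Stores the creator under the command name. The returned bool exists
    // only so the call can initialize a static variable.
    static bool register_command(const std::string& name, Creator creator) {
        registry()[name] = std::move(creator);
        return true;
    }

    // Looks up a command by name; returns nullptr if it was never registered.
    static std::shared_ptr<Command> create(const std::string& name, const std::string& params) {
        auto it = registry().find(name);
        if (it == registry().end()) {
            return nullptr;
        }
        return it->second(params);
    }

private:
    // A function-local static is constructed on first use, so the map exists
    // even when registration runs during static initialization.
    static std::map<std::string, Creator>& registry() {
        static std::map<std::string, Creator> instance;
        return instance;
    }
};

class Distance : public Command {
public:
    const static bool registered;
    static std::shared_ptr<Command> create_shared_ptr(std::string params) {
        return std::make_shared<Distance>(params);
    }
    const inline static std::string name = "dist";

    explicit Distance(std::string params) : Command(std::move(params)) {}
};

// Runs before main(): initializing the bool calls register_command.
const bool Distance::registered =
    SketchFactory::register_command(Distance::name, Distance::create_shared_ptr);
```

With this in place, a call such as SketchFactory::create("dist", "0.0 1.2") returns a Distance instance, while an unknown name returns nullptr.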

Constructor

        Distance(std::string params) : Command(params) {
            num_node_args = 2;
            is_value_stored = true;
            uses_coordinates = true;
            Distance::validate();
        }

The constructor sets the command's node count and flags, then calls validate() to parse the parameter string. For Distance, num_node_args = 2 places it in the second Z-matrix column (a two-body command), is_value_stored = true enables the fast execute(double) path during the search, and uses_coordinates = true signals that this command reads from the coordinate matrix.

Validation

        void validate() override {
            std::stringstream ss(params);
            ss >> min_distance >> max_distance;

            if (not(min_distance >= 0 && max_distance >= 0)) {
                IO::Log::error("ZMatrix command {} : distance criteria must be positive.", name);
            }

            if (min_distance > max_distance) {
                IO::Log::error("ZMatrix command {} : minimum distance must be <= maximum distance.", name);
            }

            squared_min_distance = pow(min_distance, 2);
            squared_max_distance = pow(max_distance, 2);
            min_value = min_distance;
            max_value = max_distance;
            pairwise_command = true;
        }

validate() parses the parameter string using std::stringstream, checks that both bounds are non-negative and that the minimum does not exceed the maximum (calling IO::Log::error() to terminate on invalid input), and sets min_value, max_value, and pairwise_command for Verlet neighbor list integration. Pre-computing the squared bounds lets the coordinate-based execute overload compare squared distances directly, so sqrt is only called for pairs that pass the check.
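
The excerpt above does not show a check on the stream state after extraction, so a parameter string with a missing or non-numeric value would not be reported by the two range checks alone. A new command may want to test for that explicitly. The following is a minimal sketch under that assumption; parse_bounds is a hypothetical helper, and it returns std::nullopt where a real command would presumably call IO::Log::error().

```cpp
#include <optional>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: parses "min max" and returns the pair,
// or std::nullopt if the input is malformed or out of range.
inline std::optional<std::pair<double, double>> parse_bounds(const std::string& params) {
    std::istringstream ss(params);
    double lo = 0.0;
    double hi = 0.0;

    // Extraction fails if a value is missing or is not a number.
    if (!(ss >> lo >> hi)) {
        return std::nullopt;
    }
    // Both bounds must be non-negative.
    if (lo < 0.0 || hi < 0.0) {
        return std::nullopt;
    }
    // The minimum must not exceed the maximum.
    if (lo > hi) {
        return std::nullopt;
    }
    return std::make_pair(lo, hi);
}
```

For example, parse_bounds("0.5 1.5") yields a value, while parse_bounds("1.2") and parse_bounds("abc 1.2") both yield std::nullopt.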

Execution

        bool execute(const double value) override {
            return value >= min_distance && value <= max_distance;
        }

        std::tuple<bool, std::string> execute(igraph_t* igraph, const std::vector<int>& nodes, const igraph_matrix_t& coords) override {
            const double dx = MATRIX(coords, nodes[0], 0) - MATRIX(coords, nodes[1], 0);
            const double dy = MATRIX(coords, nodes[0], 1) - MATRIX(coords, nodes[1], 1);
            const double dz = MATRIX(coords, nodes[0], 2) - MATRIX(coords, nodes[1], 2);

            const double squared_r_distance = dx * dx + dy * dy + dz * dz;
            if (squared_r_distance >= squared_min_distance && squared_r_distance <= squared_max_distance) {
                return std::make_tuple(true, std::to_string(sqrt(squared_r_distance)));
            }

            return std::make_tuple(false, "");
        }

The execute(double) overload receives a pre-computed distance from the Verlet neighbor list and checks it against the bounds. This is the path used during the structure search when is_value_stored is true.

The execute(igraph_t*, const std::vector<int>&, const igraph_matrix_t&) overload computes the distance directly from the pre-fetched coordinate matrix using MATRIX(coords, node, col) where columns 0, 1, 2 correspond to x, y, z. It returns a tuple of the success flag and the computed distance as a string.

Placing the File

Place the command file in src/zmatrix/. CMake will pick it up automatically during the configure step and add it to the generated zmatrix.h include file, which ensures the static registered bool is initialized at program startup and triggers factory registration.

Commands Without Verlet Neighbor Lists

Not all commands need Verlet neighbor lists. The Attribute command (src/zmatrix/attribute.hpp) is a one-body command with no pairwise optimization:

        Attribute(std::string params) : Command(params) {
            this->params = params;
            num_node_args = 1;
            Attribute::validate();
        }

It implements only validate() and the execute(igraph_t*, const std::vector<int>&, const igraph_matrix_t&) overload. It sets no min_value or max_value, leaves pairwise_command at its default, and does not override execute(double). The Angle and Dihedral commands follow the same pattern, with num_node_args set to 3 and 4 respectively and uses_coordinates = true.
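
As an illustration of the geometry a three-body command evaluates, the sketch below computes an angle from three points and checks it against bounds. It is not the project's Angle implementation: the names Point, angle_degrees, and angle_in_range are hypothetical, and the choice of the middle point as the vertex and of degrees as the unit are assumptions. Plain arrays replace the igraph coordinate matrix so the example stands alone.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <optional>

using Point = std::array<double, 3>;

// Angle in degrees at vertex b, formed by the points a-b-c.
// Returns std::nullopt when a or c coincides with b (angle undefined).
inline std::optional<double> angle_degrees(const Point& a, const Point& b, const Point& c) {
    const Point u = {a[0] - b[0], a[1] - b[1], a[2] - b[2]};
    const Point v = {c[0] - b[0], c[1] - b[1], c[2] - b[2]};

    const double dot = u[0] * v[0] + u[1] * v[1] + u[2] * v[2];
    const double norm_u = std::sqrt(u[0] * u[0] + u[1] * u[1] + u[2] * u[2]);
    const double norm_v = std::sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]);
    if (norm_u == 0.0 || norm_v == 0.0) {
        return std::nullopt;
    }

    // Clamp guards against rounding pushing the cosine slightly outside [-1, 1].
    const double cosine = std::clamp(dot / (norm_u * norm_v), -1.0, 1.0);
    const double pi = std::acos(-1.0);
    return std::acos(cosine) * 180.0 / pi;
}

// True when the angle is defined and lies in [min_deg, max_deg].
inline bool angle_in_range(const Point& a, const Point& b, const Point& c,
                           double min_deg, double max_deg) {
    const std::optional<double> angle = angle_degrees(a, b, c);
    return angle.has_value() && *angle >= min_deg && *angle <= max_deg;
}
```

In a real command, the three points would be read from the coordinate matrix with MATRIX(coords, nodes[i], col), in the same way the Distance command reads its two.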

Two-body commands can also skip Verlet optimization by leaving pairwise_command as false. This is necessary when the distance cutoff cannot be determined from the command's parameters alone. For example, a command that computes an energy from per-particle attributes (such as charges) cannot know the distance cutoff without knowing the attribute values, which vary per pair. In this case, set num_node_args = 2 but leave pairwise_command and is_value_stored at their defaults. The search will call execute(igraph_t*, const std::vector<int>&, const igraph_matrix_t&) for every candidate pair rather than using a neighbor list.
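
The charge-based case can be sketched as follows. This is a hypothetical command, not one that exists in the project: PairEnergySketch, its charges member, and the energy expression q_i * q_j / r (in arbitrary units) are all assumptions made for illustration, and a plain coordinate table replaces the igraph types.

```cpp
#include <cmath>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the coordinate matrix:
// rows are nodes, columns 0/1/2 are x/y/z.
using Coords = std::vector<std::vector<double>>;

// Hypothetical two-body command that accepts a pair when its
// energy q_i * q_j / r lies in [min_energy, max_energy].
struct PairEnergySketch {
    int num_node_args = 2;
    bool pairwise_command = false;  // left at the default: no neighbor list
    bool is_value_stored = false;   // left at the default: no execute(double)
    bool uses_coordinates = true;

    double min_energy = 0.0;
    double max_energy = 0.0;
    std::vector<double> charges;    // assumed per-node attribute, indexed by node

    std::tuple<bool, std::string> execute(const std::vector<int>& nodes,
                                          const Coords& coords) const {
        const double dx = coords[nodes[0]][0] - coords[nodes[1]][0];
        const double dy = coords[nodes[0]][1] - coords[nodes[1]][1];
        const double dz = coords[nodes[0]][2] - coords[nodes[1]][2];
        const double r = std::sqrt(dx * dx + dy * dy + dz * dz);
        if (r == 0.0) {
            return std::make_tuple(false, "");  // coincident nodes: energy undefined
        }

        const double energy = charges[nodes[0]] * charges[nodes[1]] / r;
        if (energy >= min_energy && energy <= max_energy) {
            return std::make_tuple(true, std::to_string(energy));
        }
        return std::make_tuple(false, "");
    }
};
```

Two pairs at the same separation can give different results here because the outcome depends on the charges, which is why no single distance cutoff can be derived from the command's parameters.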