Basic Usage

metatensor-models is designed for direct usage from the command line (CLI). The general help of metatensor-models can be accessed using

metatensor-models --help

We now demonstrate how to train and evaluate a model from the command line. For this example we use the SOAP-BPNN architecture and a subset of the QM9 dataset. You can obtain the reduced dataset from our website.

Training

To train models, metatensor-models uses a dynamic override strategy for your training options: the default options of an architecture can be composed with and overridden by your custom options.yaml and, on top of that, by the command line override grammar. For reference and reproducibility, metatensor-models always writes the fully expanded options, including all overrides, to options_restart.yaml. This restart options file is written into a subfolder named after the current date and time inside the output directory of your training run.
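
For example, to change a default hyperparameter you only list the keys you want to override inside the architecture section of your options.yaml; everything you do not specify keeps its default value. The sketch below assumes hyperparameter names such as training.num_epochs and training.batch_size, which should be checked against the SOAP-BPNN documentation page:

# only the overridden keys need to be listed; all other options keep their defaults
architecture:
  name: experimental.soap_bpnn
  training:
    num_epochs: 10 # assumed hyperparameter names; see the architecture documentation
    batch_size: 8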

The sub-command to start a model training is

metatensor-models train

To train a model you have to define your options. This includes the specific architecture you want to use and the data, including the training systems and target values.

The default model and training hyperparameters for each architecture are listed on its corresponding documentation page. We will use these minimal options to run an example training using the default hyperparameters of a SOAP-BPNN model:

# architecture used to train the model
architecture:
  name: experimental.soap_bpnn

# Mandatory section defining the parameters for system and target data of the
# training set
training_set:
  systems: "qm9_reduced_100.xyz" # file where the positions are stored
  targets:
    energy:
      key: "U0" # name of the target value

test_set: 0.1  # 10 % of the training_set is randomly split off and used as test set
validation_set: 0.1 # 10 % of the training_set is randomly split off and used as validation set

For each training run a new output directory in the format output/YYYY-MM-DD/HH-MM-SS, based on the current date and time, is created. This output directory is used to store checkpoints, the train.log log file, as well as the restart file options_restart.yaml. To start the training, create an options.yaml file in the current directory and type

metatensor-models train options.yaml
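
Schematically, the run directory created for this training contains the files described above; the checkpoint file names below are an assumption and may differ between versions:

output/
└── YYYY-MM-DD/
    └── HH-MM-SS/
        ├── options_restart.yaml # fully expanded options of this run
        ├── train.log            # training log
        └── *.ckpt               # checkpoint(s); exact names are an assumption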

The function saves the final model `model.pt` to the current output folder for later evaluation. All command line flags of the train sub-command can be listed via

metatensor-models train --help

Evaluation

The sub-command to evaluate an already trained model is

metatensor-models eval

Besides the trained model, you will also have to provide a file containing the systems and, optionally, target values for evaluation. The structure of this eval.yaml is exactly the same as that of a dataset in the options.yaml file.

systems: "qm9_reduced_100.xyz" # file where the positions are stored
targets:
  energy:
    key: "U0" # name of the target value

Note that the targets section is optional. If it is present, the evaluation will also calculate and report RMSE values of the predictions with respect to the reference values loaded from the targets section.
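
If you only want predictions and have no reference values, the targets section can be left out entirely; a minimal systems-only eval.yaml then reduces to:

systems: "qm9_reduced_100.xyz" # file where the positions are stored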

We now evaluate the model on the training dataset. The first argument specifies the trained model and the second an options file containing the path of the dataset for evaluation.

metatensor-models eval model.pt eval.yaml

The evaluation command predicts those properties the model was trained against; here "U0". The predictions, together with the systems, are written to a file named ``output.xyz`` in the current directory. The written file starts with the following lines

head -n 20 output.xyz

All command line flags of the eval sub-command can be listed via

metatensor-models eval --help

Molecular simulations

The trained model can also be used to run molecular simulations. You can find out how to do this in the Tutorials section.