The Test Suite

The MESA test suites live in star/test_suite, binary/test_suite, and astero/test_suite.

Running Tests

You can count the total number of available tests:

./count_tests

You can list the available tests:

./list_tests

You can get the name of a test (or tests) by specifying one or more integers as arguments:

./list_tests <N>

You can run all tests:

./each_test_run

You can run an individual test by specifying a single integer, corresponding to the number from the list of tests:

./each_test_run <N>

After a test runs, the files out.txt and err.txt will contain the output from stdout and stderr respectively.

Philosophy of the tests

The MESA test suite serves multiple roles. For developers, it provides day-to-day checks that code changes did not cause regressions and provides longer term opportunities to monitor the evolving performance of the code. For users, the test suite primarily serves as a source of examples.

The coverage and quality of the test suite must be sufficient to ensure that as MESA is developed, it retains its key capabilities (e.g., the ability to evolve through the He flash or to robustly evolve massive stars). To this end, a number of test suite cases descend from examples presented in MESA instrument papers.

Anatomy of a test

Note

If you want to add a new test case, the first step is to understand the layout and motivation of the standard tests, as described in this section. As a starting point, a simple template exists at star/test_suite/test_case_template (and the example files rendered below are contained therein). For more complex situations, you may want to take inspiration from existing test cases.

In the test case directory

All MESA test suite cases must be designed to be multi-part (that is, able to run multiple inlists), even if they only run a single inlist in practice.

The script that runs a single part is called rn1 (which usually looks like the rn script in the stock work directory). The restart script re is usually stock and so restarts a single part.

The rn script is responsible for the logic related to running multiple parts. Generally, each test part has a header inlist (inlist_*_header), which then points to other inlists. At the start of each part, the header inlist is copied to inlist, which is the file that MESA reads as its primary inlist.

A good test should be able to regenerate its starting model. The rn script should look something like:

#!/bin/bash

# uncomment the following line to skip the optional inlists
# MESA_SKIP_OPTIONAL=t

# this provides the definition of do_one (run one part of test)
# do_one [inlist] [output model] [LOGS directory]
MESA_DIR=../../..
source "${MESA_DIR}/star/test_suite/test_suite_helpers"

date "+DATE: %Y-%m-%d%nTIME: %H:%M:%S"

# check if can skip building starting model
if [ -n "$MESA_SKIP_OPTIONAL" ]; then
   cp standard_start.mod start.mod
else
   do_one inlist_start start.mod
   cp start.mod standard_start.mod
fi

do_one inlist_x_header final.mod

date "+DATE: %Y-%m-%d%nTIME: %H:%M:%S"

echo 'finished x'

When the environment variable MESA_SKIP_OPTIONAL is set, some parts of the test run may be skipped by copying standard versions of saved models (which must be included in the test case).
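The effect of MESA_SKIP_OPTIONAL on the control flow can be tried out without running MESA. The following is a dry-run sketch only: do_one is replaced by a stub that records its arguments and creates an empty model file, and run_parts is a hypothetical wrapper around the logic shown in the rn script above. The real do_one comes from test_suite_helpers and does considerably more.

```shell
# Dry-run sketch of the rn control flow. do_one here is a STUB (an assumption
# for illustration); the real one comes from test_suite_helpers and runs MESA.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

calls=""   # records which inlists the stub was asked to run

do_one() {
   # stub: remember the inlist name and create an empty output model
   calls="$calls $1"
   : > "$2"
}

run_parts() {
   if [ -n "$MESA_SKIP_OPTIONAL" ]; then
      cp standard_start.mod start.mod   # reuse the saved model
   else
      do_one inlist_start start.mod     # regenerate the starting model
      cp start.mod standard_start.mod
   fi
   do_one inlist_x_header final.mod     # the required part always runs
}

: > standard_start.mod   # stands in for the saved model shipped with the test

MESA_SKIP_OPTIONAL=t
run_parts
echo "parts run with skipping:$calls"

calls=""
MESA_SKIP_OPTIONAL=
run_parts
echo "parts run without skipping:$calls"
```

With the variable set, only inlist_x_header is run and start.mod is a copy of the saved model; with it empty, both parts run and the saved model is refreshed.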

It is essential to make use of test_suite_helpers as this ensures that important information is produced in a TestHub-friendly format.

Similarly, it is essential that the run_star_extras in the test suite case use the test_suite_extras (see star/job/test_suite_extras_def.inc and star/job/test_suite_extras.inc). The calls to these hooks (see below) generate the necessary information for the MESA Test Hub.

      subroutine extras_startup(id, restart, ierr)
         integer, intent(in) :: id
         logical, intent(in) :: restart
         integer, intent(out) :: ierr
         type (star_info), pointer :: s
         ierr = 0
         call star_ptr(id, s, ierr)
         if (ierr /= 0) return
         call test_suite_startup(s, restart, ierr)
      end subroutine extras_startup

      subroutine extras_after_evolve(id, ierr)
         integer, intent(in) :: id
         integer, intent(out) :: ierr
         type (star_info), pointer :: s
         real(dp) :: dt
         ierr = 0
         call star_ptr(id, s, ierr)
         if (ierr /= 0) return
         call test_suite_after_evolve(s, ierr)
      end subroutine extras_after_evolve

Properties of a good test

A test case must not write to stderr. The presence of output on stderr will cause the test to be classified as a failure. This restriction catches stop statements, calls to mesa_error, or other error conditions.
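Before committing a test, it can help to confirm locally that a run left err.txt empty. A minimal sketch, assuming the err.txt file name described in Running Tests; the helper name stderr_is_clean is hypothetical and not part of MESA:

```shell
# Hypothetical helper (not part of MESA): succeeds only if the given
# stderr capture file is missing or empty.
stderr_is_clean() {
   # -s is true when the file exists and has a size greater than zero
   if [ -s "$1" ]; then
      echo "FAIL: $1 contains output:"
      cat "$1"
      return 1
   fi
   echo "OK: $1 is empty"
}

# Demonstration on throwaway files rather than a real test directory
demo_dir=$(mktemp -d)
: > "$demo_dir/clean_err.txt"
echo "STOP 1" > "$demo_dir/dirty_err.txt"

stderr_is_clean "$demo_dir/clean_err.txt"
stderr_is_clean "$demo_dir/dirty_err.txt" || echo "(this run would be classified as a failure)"
```

In a test case directory, the equivalent call after a run would be stderr_is_clean err.txt.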

A good test should run relatively quickly. Costly parts can be skipped over using a saved model when MESA_SKIP_OPTIONAL is set.

A good test should check its stopping condition before producing an output model (i.e., set required_termination_code_string).

A good test should use run_star_extras to perform more detailed physical and/or numerical checks or report longitudinally interesting values to the TestHub (see MESA Test Hub).

A good test should be numerically converged. This is particularly important in order to ensure that the test is robust. Unconverged tests can often fail in response to innocuous changes. Models should always be sufficiently converged that they run reliably and that any quantities checked by the test are converged to better than the tolerances of those checks.

Note

In some instances, performing good science with a given test case may require even tighter convergence criteria that are practically excluded by test suite run time considerations. In such a case, note this fact in the test case documentation.

A good test should have physically-motivated time step limits. (This is often also an important part of ensuring convergence.) The test should trigger few retries and have few time steps limited by solver convergence or by catch-all quantities like varcontrol. The star_job options show_retry_counts_when_terminate = .true. and show_timestep_limit_counts_when_terminate = .true. will summarize the retries and timestep limits encountered during the run.

Some tests should only be run in certain circumstances (e.g., if GYRE is installed, if OP MONO opacities are present). Such a test should still compile and run even when these conditions are not met, but should output the string “this test was intentionally skipped”. When this string is present in the test output, the test scripts will consider the test a success and no further checks will be performed.
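One place to emit the skip string is a guard near the top of the rn script. The sketch below assumes the optional dependency can be detected through an environment variable such as GYRE_DIR; that detection method and the helper name skip_unless_available are assumptions for illustration, and existing test cases may detect their requirements differently. Only the printed string is taken from the description above.

```shell
# Sketch of a guard for an rn script. Checking an environment variable is an
# ASSUMED way of detecting the optional dependency; adapt it as needed.
skip_unless_available() {   # usage: skip_unless_available "<value to check>"
   if [ -z "$1" ]; then
      # exact string that the test scripts look for in the output
      echo "this test was intentionally skipped"
      return 1
   fi
   return 0
}

if skip_unless_available "$GYRE_DIR"; then
   echo "dependency found; the test parts would run here"
fi
```

In a real rn script the skip branch would presumably end the script at that point instead of continuing to the do_one calls.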

README

Test suite cases should have a README.rst file that contains a brief description of the test and its purpose.

If possible, articulate the criteria that indicate a passing test and include information that would allow someone else to evaluate the test status. Supplementary material like plots, plotting scripts, etc., should go in a subdirectory docs.

For tests that are likely to serve as examples of MESA usage, provide additional information about key options or important caveats. This assists users who are adapting the test case to their own science.

The README file should end with a line that describes when the most recent significant changes to this test case occurred. This gives both users and developers a sense of how “fresh” the information is.

Last-Updated: 2020-10-01 (mesa r14552) by Josiah Schwab

In the test suite directory

The existence of a sub-directory in test_suite is not sufficient to tell MESA to perform a given test. The list of tests associated with each module lives in the file test_suite/do1_test_source. This has a series of entries like:

do_one 1.3M_ms_high_Z "stop because log_surface_luminosity >= log_L_upper_limit" "final.mod" x300

The 4 arguments to do_one are:

1. test name (this should be the directory name)
2. termination condition (this string must exist in the test output for the test to be considered a success)
3. model filename (this is a file from the end of the run that will have its checksum computed)
4. photo filename (MESA will restart from this file and check that the checksum of the file was the same after the run and the restart)
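The role of the termination condition string can be sketched as a plain text search in the captured output. This is a hypothetical re-implementation for illustration (the function name check_output and its return labels are not MESA's); the real logic lives in the test suite scripts and also performs the checksum and restart checks.

```shell
# Hypothetical re-implementation for illustration; the real logic lives in
# the MESA test scripts and may differ in detail.
check_output() {   # usage: check_output <output file> <termination string>
   if grep -qF "this test was intentionally skipped" "$1"; then
      echo "skipped"   # counted as a success, no further checks
   elif grep -qF -- "$2" "$1"; then
      echo "pass"      # required termination string was found
   else
      echo "fail"
   fi
}

# Demonstration on a throwaway file standing in for out.txt
demo_out=$(mktemp)
echo "stop because log_surface_luminosity >= log_L_upper_limit" > "$demo_out"

check_output "$demo_out" "stop because log_surface_luminosity >= log_L_upper_limit"
check_output "$demo_out" "stop because max_model_number"
```

The -F option makes grep treat the termination string literally, which matters because strings such as the one above contain characters that are special in regular expressions.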

The model filename argument permits the special value skip which causes this check to be skipped. The photo filename argument permits the special value skip which causes this check to be skipped, or auto which restarts from the antepenultimate file (determined by filesystem modification times).
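To make the meaning of auto concrete, the following sketch picks the antepenultimate (third-newest) file in a directory by modification time. It mirrors the description above and is not MESA's own code; the function name is hypothetical, and the approach assumes file names without newlines.

```shell
# Sketch only: pick the third-newest file in a directory by modification
# time. This mirrors the description of "auto"; it is not MESA's own code.
antepenultimate_file() {   # usage: antepenultimate_file <directory>
   # ls -t lists newest first, so line 3 is the antepenultimate file
   ls -t "$1" | sed -n '3p'
}

# Demonstration with fake photos and explicit timestamps (YYYYMMDDhhmm)
photos=$(mktemp -d)
touch -t 202001010100 "$photos/x100"
touch -t 202001010200 "$photos/x150"
touch -t 202001010300 "$photos/x200"
touch -t 202001010400 "$photos/x250"

antepenultimate_file "$photos"   # prints x150
```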

Once an entry in do1_test_source has been added, the test can be run as described in Running Tests.

In the docs directory

Once the README.rst file is created, a link in docs/source/test_suite should be created so that it will be rendered as part of the documentation.

ln -s ../../../star/test_suite/test_case/README.rst test_case.rst

This instructs Sphinx to incorporate the contents of the README.rst file.

An entry linking to the test case page should be included in docs/source/test_suite.rst. This page will eventually list all cases. Briefly describing the test cases all in one place will make it easier for people to find a test of interest.

The description should be one sentence broadly describing what the case does and then one optional sentence about anything interesting illustrated by the case (e.g., other_hooks).

MESA Test Hub

The MESA Test Hub records the results of nightly test runs. The owner/maintainer of the MESA Test Hub is Bill Wolf.

When a test is run through each_test_run, a file testhub.yml will be produced. This is the information that will be reported to the TestHub by mesa_test. The file will look similar to this:

---
test_case: make_co_wd
module: :star
omp_num_threads: 36
run_optional: false
fpe_checks: false
inlists:
    - inlist: inlist_co_core_header
      runtime_minutes:   3.20
      model_number:         1008
      star_age:     4.3063267775397134E+08
      num_retries:            7
      log_rel_run_E_err:        -7.0353816429993685
      steps:         1008
      retries:            7
      redos:            0
      solver_calls_made:         1015
      solver_calls_failed:            7
      solver_iterations:        10280
    - inlist: inlist_remove_env_header
      runtime_minutes:   2.62
      model_number:         1568
      star_age:     4.3064418442484504E+08
      num_retries:           30
      log_rel_run_E_err:        -6.9640200618973518
      steps:          961
      retries:           34
      redos:            0
      solver_calls_made:          995
      solver_calls_failed:           34
      solver_iterations:         6403
    - inlist: inlist_settle_header
      runtime_minutes:   2.44
      model_number:         1746
      star_age:     4.3359174056506151E+08
      num_retries:           10
      log_rel_run_E_err:       -10.1236273392339484
      steps:          184
      retries:            4
      redos:            0
      solver_calls_made:          184
      solver_calls_failed:            0
      solver_iterations:         1097
mem_rn: 11815904
success_type: :run_test_string
restart_photo: x650
mem_re: 4635248
success_type: :photo_checksum
checksum: cb6df95a221722e7317a6e53c9c61272
outcome: :pass

The output is collected in a variety of places. The highest level information (i.e. no indent) that summarizes the test case itself comes from each_test_run. The inlist name information (i.e. lines with -) comes from do_one used in the test suite rn scripts (which is provided by test_suite_helpers).

Note

If a test suite case contains a file named .ignore_checksum, the checksum value will not be reported to the TestHub.

The per-inlist information about the performance of MESA is provided by the test_suite_extras (see star/job/test_suite_extras_def.inc and star/job/test_suite_extras.inc) included in the run_star_extras / run_binary_extras of the test suite case. In particular, in the star/binary after_evolve hook, the call to test_suite_after_evolve (and its callees) appends the relevant info to the testhub.yml file.

This output is generated each time MESA terminates (except after a restart). Therefore, the per-inlist quantities that can be reported by TestHub are those accessible within a single part of a MESA run. By default, we report

runtime_minutes

steps

retries

redos

solver_calls_made

solver_calls_failed

solver_iterations

In a multi-part test case, the per-part values can be summed to give the properties of the complete run.
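As an illustration of that summation, the sketch below totals a per-inlist field across all parts of a testhub.yml-style file. It relies only on the "key: value" line layout shown in the example file above, not on a YAML parser, and the function name sum_field is hypothetical. The sample data is an abbreviated copy of the make_co_wd example.

```shell
# Sum a per-inlist field over all parts of a testhub.yml-style file.
# Relies only on the "key: value" layout shown above, not on a YAML parser.
sum_field() {   # usage: sum_field <field name> <file>
   # exact match on the first column, so "retries:" does not match "num_retries:"
   awk -v key="$1:" '$1 == key { total += $2 } END { print total + 0 }' "$2"
}

# Abbreviated copy of the example file above (three parts)
demo_yml=$(mktemp)
cat > "$demo_yml" <<'EOF'
inlists:
    - inlist: inlist_co_core_header
      runtime_minutes:   3.20
      steps:         1008
      retries:            7
    - inlist: inlist_remove_env_header
      runtime_minutes:   2.62
      steps:          961
      retries:           34
    - inlist: inlist_settle_header
      runtime_minutes:   2.44
      steps:          184
      retries:            4
EOF

echo "total steps:   $(sum_field steps "$demo_yml")"             # 2153
echo "total retries: $(sum_field retries "$demo_yml")"           # 45
echo "total minutes: $(sum_field runtime_minutes "$demo_yml")"   # 8.26
```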

TestHub also reports quantities that can reflect information preserved by MESA across parts. These are transmitted via their inclusion in the model file. That means the values reported by cases that use saved models to skip optional parts will be influenced by the performance at the time the saved model was generated. Additionally, some parts may include inlist options that reset or modify these quantities. TestHub reports the values at the end of each part, but the precise meaning of these quantities cannot be understood without reference to the details of the test case.

model_number

star_age

num_retries

log_rel_run_E_err

Note

These values can be useful when diagnosing test case issues because they directly correspond to quantities in the terminal output.

Some test cases may want to output additional information. To do so, set elements in the provided arrays testhub_extras_names and testhub_extras_vals. The values in these arrays (at the time the after evolve hook is called) will be automatically added to testhub.yml and reported to the TestHub.

Warning

The names set in testhub_extras_names (reported as extra_testhub_names) must be unique within each test case. In a multi-part test, you cannot report a value with the same name in each part. Use some extra identifier to break this ambiguity (e.g., he_core_mass_part1, he_core_mass_part2).

For example, in c13_pocket, the run_star_extras sets:

! put target info in TestHub output
testhub_extras_names(1) = 'max_c13'; testhub_extras_vals(1) = max_c13
testhub_extras_names(2) = 'mass_max_c13' ; testhub_extras_vals(2) = mass_max_c13
testhub_extras_names(3) = 'pocket_mass_c13'; testhub_extras_vals(3) = pocket_mass_c13
testhub_extras_names(4) = 'delta_surface_c12'; testhub_extras_vals(4) = delta_surface_c12

which results in the additional output

- inlist: inlist_c13_pocket_header
  ...
  extra_testhub_names:
        - 'max_c13'
        - 'mass_max_c13'
        - 'pocket_mass_c13'
        - 'delta_surface_c12'
  extra_testhub_vals:
        -     7.5183958374071852E-02
        -     5.7873466406466934E-01
        -     3.3303926652486070E-05
        -     7.2015481753889745E-05
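Because the names must be unique across all parts of a test case, a quick scan of testhub.yml for repeated names can catch mistakes in multi-part tests. The sketch below assumes the quoted "- 'name'" layout shown in the output above; the function name is hypothetical, and the sample file uses made-up part names and placeholder values purely to trigger a duplicate.

```shell
# List extra names that appear more than once in a testhub.yml-style file.
# Assumes the quoted "- 'name'" layout shown in the example output above.
duplicate_extra_names() {   # usage: duplicate_extra_names <file>
   # keep only lines of the form  - 'name'  and strip them to the bare name
   sed -n "s/^ *- '\([^']*\)' *\$/\1/p" "$1" | sort | uniq -d
}

# Made-up two-part sample (placeholder values) with one repeated name
demo_yml=$(mktemp)
cat > "$demo_yml" <<'EOF'
    - inlist: inlist_part1_header
      extra_testhub_names:
            - 'he_core_mass'
      extra_testhub_vals:
            -     3.1E-01
    - inlist: inlist_part2_header
      extra_testhub_names:
            - 'he_core_mass'
            - 'c_core_mass'
      extra_testhub_vals:
            -     4.2E-01
            -     1.0E-01
EOF

duplicate_extra_names "$demo_yml"   # prints he_core_mass
```

An empty result means that no name is reported more than once.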

Setting up new machine with MESA TestHub

Make an account on the TestHub by messaging one of the admins (Bill Wolf). You will receive a token that you will need later in order to submit logs to the logs server. Remember also the email and password you use; you’ll need these later in the terminal.
Make sure ruby is installed, for example using the Ruby version manager (rvm).

If using the rvm, follow the instructions on that page to install the gpg keys. If this does not work, then execute the next line (\curl -sSL https://get.rvm.io | bash -s stable) instead, and follow the instructions printed in the terminal.

That line only installs the rvm; you also need Ruby itself. One can execute \curl -sSL https://get.rvm.io | bash -s stable --ruby as per the rvm installation page, but that requires sudo access. For a local installation, one can follow this StackOverflow answer.

Download and set up mesa_test, by doing the following:

gem install mesa_test

mesa_test setup (here you will supply your email, password, and token from earlier; this will create a settings file in ~/.mesa_test/config.yml)

mesa_test install_and_test main will check out the main branch, test it, and submit the results to the TestHub.

If you want to set up mesa_test to run automatically on a cluster, Rob Farmer has created a set of scripts that work with the Slurm workload manager. These scripts pull all commits and submit a job to the cluster queue for each new commit. You must edit the paths in all of the scripts to point to your own directories.

You can set up mesa_test or Rob’s cluster script to run recurrently as a cronjob by doing crontab -e to edit the cronjob table.

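Add for example: 10 * * * * ~/mesa/mesa-helios-test/runMesaTest.sh >/dev/null 2>&1 to make it run at 10 minutes past every hour (or swap out runMesaTest with a mesa_test command). To run every 10 minutes instead, use */10 * * * * as the schedule on cron implementations that support step values. The redirection at the end of that line discards the output, which prevents cron from emailing you each time it runs.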

Continuous integration testing

Multiple developers have set up their machines to enable continuous integration testing. These machines will automatically pull the changes in the repository, run the test suite, and report back to the TestHub. To use these machines more efficiently, they respond to certain keywords found in the commit message.

Note

It is up to each person providing the computing resources to implement each keyword. Thus some machines will ignore these keywords and run the test suite normally. Therefore, these are only “requests” for the computing machines, not “orders”.

The message (with brackets) may appear anywhere in the commit message.

[ci skip]

Compile MESA but do not run the test suite. Useful when changes only touch documentation or when the changes cannot affect the final result.

[ci split]

Splits the running of the test suite between machines. Currently, if set, cannon will run the first half of the test cases while helios will run the second half.

[ci optional]

Runs MESA with the environment variable MESA_SKIP_OPTIONAL unset. This requests that all parts of each test case be run (i.e., including optional parts).

[ci optional n]

Where n is an integer. Same as [ci optional] but only run the first n test cases.

[ci fpe]

Compiles and runs MESA with the environment variable MESA_FPE_CHECKS_ON=1 set. This requests that we turn on additional debugging checks.

[ci converge]

Runs the test suite with the environment variable MESA_TEST_SUITE_RESOLUTION_FACTOR set to a factor, giving a different temporal and spatial resolution (and max model number).
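For anyone implementing these keywords on their own machine, the matching step could look roughly like the sketch below. The function name ci_request and the labels it prints are hypothetical, and the sketch reports only the first keyword it recognizes; how each machine actually reacts to a request is up to its maintainer.

```shell
# Sketch of how a testing machine might interpret commit-message keywords.
# The function name and the printed labels are hypothetical.
ci_request() {   # usage: ci_request "<commit message>"
   case "$1" in
      *"[ci skip]"*)     echo "skip" ;;
      *"[ci split]"*)    echo "split" ;;
      *"[ci optional]"*) echo "optional" ;;
      *"[ci optional "[0-9]*"]"*)
         # pull the integer n out of "[ci optional n]"
         n=$(printf '%s\n' "$1" | sed -n 's/.*\[ci optional \([0-9][0-9]*\)\].*/\1/p')
         echo "optional $n" ;;
      *"[ci fpe]"*)      echo "fpe" ;;
      *"[ci converge]"*) echo "converge" ;;
      *)                 echo "default" ;;
   esac
}

ci_request "Fix typo in docs [ci skip]"              # prints skip
ci_request "[ci optional 10] retune mesh controls"   # prints optional 10
ci_request "Update eos tables"                       # prints default
```

Quoting the bracketed keywords inside the case patterns makes the brackets literal, so the keyword is matched anywhere in the message, as described above.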