Running tests in a scheduler
Tests can be run under a workload manager (scheduler) such as Slurm or PBS by adding the following options to canary run:
canary run [-b spec=(duration:T|count:{max,auto,N})[,layout:{flat,atomic}][,nodes:{any,same}]] -b scheduler=SCHEDULER -b workers=N ...
When run in “batch” mode, canary will group tests into “batches” and submit each batch to SCHEDULER.
Batching options
Batch spec
duration:T: create batches with an approximate run length of T seconds
count:max: one test per batch
count:auto: batch automatically, depending on the other options
count:N: create at most N batches
layout:flat: batches have no intra-batch dependencies but may have inter-batch dependencies
layout:atomic: batches have no inter-batch dependencies but may have intra-batch dependencies
nodes:any: tests are batched with respect to the node count of test cases
nodes:same: tests are batched with tests having the same node count
The default batch spec is duration:30m,nodes:any,layout:flat.
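To make the duration and nodes options more concrete, here is a minimal Python sketch of one way such batching could work. It is not canary's algorithm: the helper name make_batches, the (name, nodes, runtime) tuples, and the first-fit packing strategy are all assumptions chosen for illustration. With same_nodes=False the sketch ignores node counts entirely, which is only one possible reading of nodes:any.

```python
from collections import defaultdict


def make_batches(tests, duration, same_nodes=False):
    """Group (name, nodes, runtime_seconds) tuples into batches.

    Hypothetical helper, not canary's implementation: first-fit packing
    so each batch's total runtime stays at or below `duration` seconds.
    """
    groups = defaultdict(list)
    for test in tests:
        # same_nodes=True mimics nodes:same (one group per node count);
        # otherwise all tests share one group (an assumed reading of nodes:any).
        key = test[1] if same_nodes else None
        groups[key].append(test)

    batches = []
    for group in groups.values():
        open_batches = []  # [total_runtime, [names]] for this group only
        for name, _nodes, runtime in group:
            for batch in open_batches:
                if batch[0] + runtime <= duration:
                    batch[0] += runtime
                    batch[1].append(name)
                    break
            else:
                # No existing batch has room, so start a new one.
                open_batches.append([runtime, [name]])
        batches.extend(names for _total, names in open_batches)
    return batches


tests = [("a", 1, 600), ("b", 1, 900), ("c", 2, 1200), ("d", 1, 700), ("e", 2, 300)]
print(make_batches(tests, duration=1800))
# [['a', 'b', 'e'], ['c'], ['d']]
print(make_batches(tests, duration=1800, same_nodes=True))
# [['a', 'b'], ['d'], ['c', 'e']]
```

A test whose runtime alone exceeds the target still gets a batch of its own in this sketch, which is consistent with the run length being described as approximate.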
Note
-b spec=count:N and -b spec=duration:T are mutually exclusive.
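The spec grammar, its default, and the mutual-exclusion rule can be sketched as a small parser. This is an illustration only, not canary's parser: the function name parse_batch_spec and the error messages are hypothetical, values are kept as plain strings, and dropping the default duration when a count is given is an assumption that follows from the rule above.

```python
DEFAULT_SPEC = {"duration": "30m", "nodes": "any", "layout": "flat"}
KNOWN_KEYS = ("duration", "count", "layout", "nodes")
ALLOWED = {"layout": {"flat", "atomic"}, "nodes": {"any", "same"}}


def parse_batch_spec(spec):
    """Parse 'key:value,key:value' into a dict (illustrative only)."""
    given = {}
    for item in spec.split(","):
        key, sep, value = item.partition(":")
        if not sep or key not in KNOWN_KEYS:
            raise ValueError(f"invalid batch spec item: {item!r}")
        if key in ALLOWED and value not in ALLOWED[key]:
            raise ValueError(f"invalid value for {key}: {value!r}")
        given[key] = value

    if "count" in given and "duration" in given:
        raise ValueError("count and duration are mutually exclusive")

    merged = {**DEFAULT_SPEC, **given}
    if "count" in given:
        # Assumption: an explicit count replaces the default duration.
        del merged["duration"]
    return merged


print(parse_batch_spec("count:4"))
print(parse_batch_spec("duration:10m,nodes:same"))
```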
Batch scheduler#
-b scheduler=S: use scheduler S to run batches.
-b option=option: pass option to the scheduler. If option contains commas, it is split into multiple options at the commas. E.g., -b option="-q debug,-A ABC123" passes -q debug and -A ABC123 directly to the scheduler.
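The comma-splitting rule for -b option can be pictured with a few lines of Python. The helper name split_scheduler_options is hypothetical, and the sketch only shows the split at commas; whether canary trims whitespace or tokenizes each piece further is not stated here, so the trimming below is an assumption.

```python
def split_scheduler_options(option):
    """Split a -b option value at commas (illustrative, not canary's code)."""
    # Assumption: surrounding whitespace is trimmed and empty pieces dropped.
    return [part.strip() for part in option.split(",") if part.strip()]


print(split_scheduler_options("-q debug,-A ABC123"))
# ['-q debug', '-A ABC123']
```

One consequence of this rule is that a scheduler option whose value itself contains a comma cannot be passed as a single option in this way.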
The following schedulers are currently supported:
Note
The shell scheduler is not performant; its primary use is running examples on machines that don’t have an actual batch scheduler set up.
Batch concurrency
Batch concurrency can be controlled by
--workers=N: Submit N concurrent batches to the scheduler at any one time. The default is 5.
-b workers=N: Execute the batch asynchronously using a pool of at most N workers. By default, the maximum number of available workers is used.
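The two levels of concurrency can be sketched with nested worker pools in Python. This is a conceptual model only, assuming thread pools for simplicity (canary's own log output refers to a process pool); run_test, run_batch, and run_all are hypothetical names, and the test "execution" is a stand-in that always passes.

```python
from concurrent.futures import ThreadPoolExecutor


def run_test(name):
    # Stand-in for executing one test case.
    return f"{name}: PASS"


def run_batch(batch, batch_workers):
    # Inner pool: analogous to -b workers=N (tests running inside one batch).
    with ThreadPoolExecutor(max_workers=batch_workers) as pool:
        return list(pool.map(run_test, batch))


def run_all(batches, submit_workers, batch_workers):
    # Outer pool: analogous to --workers=N (batches in flight at one time).
    with ThreadPoolExecutor(max_workers=submit_workers) as pool:
        futures = [pool.submit(run_batch, b, batch_workers) for b in batches]
        return [f.result() for f in futures]


batches = [["a", "b"], ["c"], ["d", "e"]]
results = run_all(batches, submit_workers=2, batch_workers=1)
print(results)
# [['a: PASS', 'b: PASS'], ['c: PASS'], ['d: PASS', 'e: PASS']]
```

With batch_workers=1, the tests inside each batch run one after another, which corresponds to the "serial in each batch" example further down.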
Examples
Run the canary example suite in 4 batches
$ canary run --workers=1 -b scheduler=shell -b spec=count:4 .
INFO: Initializing empty canary workspace at .
INFO: Collecting generator files from .
INFO: Instantiating generators from collected files
INFO: Generating test specs from generators
WARNING: cmake not found, test cases cannot be generated
INFO: Searching for duplicated tests
INFO: Resolving test spec dependencies
INFO: Generated 81 test specs from 38 generators
INFO: Excluded 1 test spec during generation
Reason                                            Count
──────────────────────────────────────────────────────────
options=enable evaluated to False for options=[]  1
INFO: Caching test specs
INFO: Created selection 'silent-gorge'
INFO: Selecting test cases based on runtime environment
INFO: Excluded 13 test cases
Reason                      Count
────────────────────────────────────
insufficient slots of cpus  10
Resource unavailable: gpus  3
INFO: Starting session 2026-04-21T15-33-20.458951
INFO: Generated 4 test batches from 67 test cases
INFO: Starting process pool with max 1 workers
Job                    ID       Status                               Queued  Elapsed  Rank
────────────────────────────────────────────────────────────────────────────────────────────────────────────
TestBatch(id=d8d260c)  d8d260c  SUBMITTED                                             1/4
TestBatch(id=d8d260c)  d8d260c  STARTED                              1.3s             1/4
TestBatch(id=d8d260c)  d8d260c  NONE (16 PASS, 1 SKIPPED, 1 FAILED)  1.3s    4.3s     1/4
TestBatch(id=7135212)  7135212  SUBMITTED                                             2/4
TestBatch(id=7135212)  7135212  STARTED                              1.0s             2/4
TestBatch(id=7135212)  7135212  NONE (18 PASS, 2 TIMEOUT)            1.0s    6.0s     2/4
TestBatch(id=b05eef3)  b05eef3  SUBMITTED                                             3/4
TestBatch(id=b05eef3)  b05eef3  STARTED                              1.0s             3/4
TestBatch(id=b05eef3)  b05eef3  NONE (16 PASS, 1 DIFFED, 3 FAILED)   1.0s    4.0s     3/4
TestBatch(id=f3ca50c)  f3ca50c  SUBMITTED                                             4/4
TestBatch(id=f3ca50c)  f3ca50c  STARTED                              1.0s             4/4
TestBatch(id=f3ca50c)  f3ca50c  NONE (9 PASS)                        1.0s    3.0s     4/4
┌──────────────┬────────────┬────────────────┬───────────┬───────────┬─────────────────────────────────────────────────┐
│ Job          │ ID         │ Status         │ Queued    │ Elapsed   │ Details                                         │
├──────────────┼────────────┼────────────────┼───────────┼───────────┼─────────────────────────────────────────────────┤
│ skip         │ e01f382    │ SKIP (SKIPPED) │ 0.0s      │ 0.2s      │ Test exited with skip exit code = 80            │
│ xdiff-fail   │ f4263d4    │ FAIL (FAILED)  │ 0.0s      │ 0.2s      │ xdiff-fail: expected test to diff               │
│ timeout      │ 1c2f507    │ FAIL (TIMEOUT) │ 0.0s      │ 2.2s      │ Job timed out after 2.0 s.                      │
│ timeout      │ 7127eb6    │ FAIL (TIMEOUT) │ 0.0s      │ 2.2s      │ Job timed out after 2.0 s.                      │
│ diff         │ b4597e3    │ FAIL (DIFFED)  │ 0.0s      │ 0.2s      │ Test exited with diff exit code = 64            │
│ fail         │ a850e81    │ FAIL (FAILED)  │ 0.0s      │ 0.2s      │ Test exited with exit code = 65                 │
│ xfail-fail   │ 9db7d1b    │ FAIL (FAILED)  │ 0.0s      │ 0.2s      │ xfail-fail: expected to exit with code != 0     │
│ willfail     │ e4caa24    │ FAIL (FAILED)  │ 0.0s      │ 0.1s      │ Test exited with exit code = 1                  │
└──────────────┴────────────┴────────────────┴───────────┴───────────┴─────────────────────────────────────────────────┘
67/67 COMPLETE, 56 SUCCESS, 1 XDIFF, 2 XFAIL, 1 SKIPPED, 1 DIFFED, 4 FAILED, 2 TIMEOUT, in 00:00:17
INFO: Finished session in 17.49 s. with returncode 14
INFO: Updating view at /home/docs/checkouts/readthedocs.org/user_builds/canary-wm/checkouts/release-26.4.16/src/canary/examples/TestResults
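As a sanity check on the output above, the per-batch status strings can be tallied and compared with the final summary line. The following Python sketch copies the four status strings from this run; the helper name tally is hypothetical, and the regular expression simply assumes the "N LABEL" pattern seen in that output.

```python
import re
from collections import Counter


def tally(statuses):
    """Sum the 'N LABEL' counts found in per-batch status strings."""
    totals = Counter()
    for status in statuses:
        for count, label in re.findall(r"(\d+) ([A-Z]+)", status):
            totals[label] += int(count)
    return totals


# Status strings copied from the four batches in the run above.
first_run = [
    "NONE (16 PASS, 1 SKIPPED, 1 FAILED)",
    "NONE (18 PASS, 2 TIMEOUT)",
    "NONE (16 PASS, 1 DIFFED, 3 FAILED)",
    "NONE (9 PASS)",
]
totals = tally(first_run)
print(sum(totals.values()))  # 67
print(totals["PASS"])        # 59
```

The total of 67 matches the "67/67 COMPLETE" summary, and the 59 batch-level PASS results appear to correspond to the 56 SUCCESS, 1 XDIFF, and 2 XFAIL entries reported there.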
Run the canary example suite in 4 batches, running tests in serial in each batch
$ canary run --workers=1 -b scheduler=shell -b spec=count:4 -b workers=1 .
INFO: Initializing empty canary workspace at .
INFO: Collecting generator files from .
INFO: Instantiating generators from collected files
INFO: Generating test specs from generators
WARNING: cmake not found, test cases cannot be generated
INFO: Searching for duplicated tests
INFO: Resolving test spec dependencies
INFO: Generated 81 test specs from 38 generators
INFO: Excluded 1 test spec during generation
Reason                                            Count
──────────────────────────────────────────────────────────
options=enable evaluated to False for options=[]  1
INFO: Caching test specs
INFO: Created selection 'solar-aurora'
INFO: Selecting test cases based on runtime environment
INFO: Excluded 13 test cases
Reason                      Count
────────────────────────────────────
insufficient slots of cpus  10
Resource unavailable: gpus  3
INFO: Starting session 2026-04-21T15-33-38.772597
INFO: Generated 4 test batches from 67 test cases
INFO: Starting process pool with max 1 workers
Job                    ID       Status                               Queued  Elapsed  Rank
────────────────────────────────────────────────────────────────────────────────────────────────────────────
TestBatch(id=5026185)  5026185  SUBMITTED                                             1/4
TestBatch(id=5026185)  5026185  STARTED                              1.3s             1/4
TestBatch(id=5026185)  5026185  NONE (17 PASS, 1 SKIPPED)            1.3s    6.3s     1/4
TestBatch(id=d8f9169)  d8f9169  SUBMITTED                                             2/4
TestBatch(id=d8f9169)  d8f9169  STARTED                              1.0s             2/4
TestBatch(id=d8f9169)  d8f9169  NONE (16 PASS, 2 FAILED, 2 TIMEOUT)  1.0s    10.0s    2/4
TestBatch(id=a15f049)  a15f049  SUBMITTED                                             3/4
TestBatch(id=a15f049)  a15f049  STARTED                              1.0s             3/4
TestBatch(id=a15f049)  a15f049  NONE (17 PASS, 1 DIFFED, 2 FAILED)   1.0s    6.0s     3/4
TestBatch(id=3f0d83e)  3f0d83e  SUBMITTED                                             4/4
TestBatch(id=3f0d83e)  3f0d83e  STARTED                              1.0s             4/4
TestBatch(id=3f0d83e)  3f0d83e  NONE (9 PASS)                        1.0s    3.0s     4/4
┌──────────────┬────────────┬────────────────┬───────────┬───────────┬─────────────────────────────────────────────────┐
│ Job          │ ID         │ Status         │ Queued    │ Elapsed   │ Details                                         │
├──────────────┼────────────┼────────────────┼───────────┼───────────┼─────────────────────────────────────────────────┤
│ skip         │ e01f382    │ SKIP (SKIPPED) │ 0.0s      │ 0.2s      │ Test exited with skip exit code = 80            │
│ timeout      │ 7127eb6    │ FAIL (TIMEOUT) │ 0.0s      │ 2.2s      │ Job timed out after 2.0 s.                      │
│ timeout      │ 1c2f507    │ FAIL (TIMEOUT) │ 0.0s      │ 2.2s      │ Job timed out after 2.0 s.                      │
│ xdiff-fail   │ f4263d4    │ FAIL (FAILED)  │ 0.0s      │ 0.2s      │ xdiff-fail: expected test to diff               │
│ xfail-fail   │ 9db7d1b    │ FAIL (FAILED)  │ 0.0s      │ 0.2s      │ xfail-fail: expected to exit with code != 0     │
│ diff         │ b4597e3    │ FAIL (DIFFED)  │ 0.0s      │ 0.2s      │ Test exited with diff exit code = 64            │
│ fail         │ a850e81    │ FAIL (FAILED)  │ 0.0s      │ 0.2s      │ Test exited with exit code = 65                 │
│ willfail     │ e4caa24    │ FAIL (FAILED)  │ 0.0s      │ 0.1s      │ Test exited with exit code = 1                  │
└──────────────┴────────────┴────────────────┴───────────┴───────────┴─────────────────────────────────────────────────┘
67/67 COMPLETE, 56 SUCCESS, 1 XDIFF, 2 XFAIL, 1 SKIPPED, 1 DIFFED, 4 FAILED, 2 TIMEOUT, in 00:00:25
INFO: Finished session in 25.50 s. with returncode 14
INFO: Updating view at /home/docs/checkouts/readthedocs.org/user_builds/canary-wm/checkouts/release-26.4.16/src/canary/examples/TestResults