Rodrigo Arias Mallo
64f077c4f6
stages: prepend the stage name to messages
2021-04-16 09:29:33 +02:00
Rodrigo Arias Mallo
7c94997023
control: add trap for bad exit
2021-04-16 09:29:33 +02:00
Rodrigo Arias Mallo
bde54c69c5
sbatch: store queued status
2021-04-16 09:29:33 +02:00
Rodrigo Arias Mallo
422d359b48
script: stop on error by default
2021-04-16 09:29:33 +02:00
Rodrigo Arias Mallo
71c06d02da
stages: add baywatch stage to check the exit code
...
This workaround stage prevents srun from returning 0 to the upper stages
when a signal happens after MPI_Finalize. It writes the return code to a
file named .srun.rc.$rank and later checks that the file exists and
contains a 0. When the program is killed, it exits with a non-zero code
and the error is propagated to the baywatch stage, which aborts
immediately without creating the rc file.
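
The idea can be sketched in plain shell. This is only an illustration under assumptions, not the stage's actual code: the helper names run_rank and check_ranks are hypothetical, and only the .srun.rc.$rank file name comes from the description above.

```shell
#!/bin/sh
# Sketch only: run_rank and check_ranks are hypothetical helper names.

# Run a command on behalf of one rank. If the command fails (for example,
# because it was killed by a signal), return early with its status, so the
# rc file is never created.
run_rank() {
    rank=$1
    shift
    "$@" || return $?
    # Only reached after a clean exit, so this records a 0.
    echo "$?" > ".srun.rc.$rank"
}

# Check that every rank in 0..n-1 left an rc file containing 0.
check_ranks() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        f=".srun.rc.$i"
        [ -f "$f" ] || return 1
        [ "$(cat "$f")" = "0" ] || return 1
        i=$((i + 1))
    done
    return 0
}
```

With this layout, a missing rc file is enough to flag a failed rank, even when the launcher itself reports success.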
2021-04-16 09:29:26 +02:00
Rodrigo Arias Mallo
b0af9b8608
srun: add postSrun hook
2021-04-12 17:41:59 +02:00
Rodrigo Arias Mallo
87fa3bb336
sbatch: assert types to avoid silent parse errors
2021-03-19 16:37:31 +01:00
Rodrigo Arias Mallo
051a74b85d
srun: allow commands to run before srun
2021-02-26 17:00:09 +01:00
Rodrigo Arias Mallo
8a77900201
srun: don't expand variables on install
2021-02-26 16:59:29 +01:00
Rodrigo Arias Mallo
ebcbf91fbe
exec: allow manual specification of program path
2021-02-23 15:22:18 +01:00
Rodrigo Arias Mallo
e5561b8735
control: save total execution time
2021-02-08 14:14:08 +01:00
Rodrigo Arias Mallo
2b9c3da911
Add script stage
2021-01-12 18:19:49 +01:00
Rodrigo Arias Mallo
aeac1a6068
exec: Force newlines
...
Allow single line commands like pre="true"
2021-01-11 19:15:37 +01:00
Rodrigo Arias Mallo
130fe39c8e
exec: Abort on error
...
We need to exit on the first error, as otherwise we cannot track a bad
execution when no exec is done (when post is not empty).
2021-01-11 18:29:30 +01:00
Rodrigo Arias Mallo
7d4db6b6de
control: Exit on error
...
This prevents srun from silently returning with an error, without
actually queueing the job of a run.
2020-12-07 16:33:40 +01:00
Rodrigo Arias Mallo
1bdeca9e7d
unit: Remove dangerous slash from index names
2020-12-03 16:33:48 +01:00
Rodrigo Arias Mallo
c858f521bf
isolate: add $TMPDIR in the namespace
2020-12-03 13:22:10 +01:00
Rodrigo Arias Mallo
da4bbf8533
isolate: only load some files from /etc
2020-12-03 12:04:51 +01:00
Rodrigo Arias Mallo
f87d830218
isolate: preserve TERM
2020-12-02 13:06:55 +01:00
Rodrigo Arias Mallo
3d352fee19
isolate: allow argument passing
2020-12-02 13:06:35 +01:00
Rodrigo Arias Mallo
1f841649f8
exec: add support for nixPrefix
2020-12-02 11:57:40 +01:00
Rodrigo Arias Mallo
a147a396d9
trebuchet: add the experiment as attribute
2020-11-20 15:35:36 +01:00
Rodrigo Arias Mallo
8bc5656461
tools: recursive getExperiment
...
It allows getExperimentStage to be called from any stage above the
experiment.
2020-11-20 15:34:14 +01:00
Rodrigo Arias Mallo
d192a59fdc
control: Export the run iteration
2020-11-20 15:32:41 +01:00
Rodrigo Arias Mallo
734d494d96
stdexp: Allow extra mounts
2020-11-20 15:30:47 +01:00
David Alvarez
0c438d4dac
Setup for test experiment
2020-11-20 13:57:12 +01:00
Rodrigo Arias Mallo
e8f649327a
exec: Avoid variable expansion at build
...
All bash variables passed in env, pre or post are now expanded at
execution time.
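
One common way to defer expansion when a script is generated is a quoted here-document delimiter. The sketch below assumes that technique purely for illustration; the commit does not say how the stage achieves it, and gen_script and RANK are hypothetical names.

```shell
#!/bin/sh
# Sketch only: gen_script and the RANK variable are hypothetical.

# Write a small script to the given path. Quoting the delimiter ('EOF')
# stops the shell from expanding $RANK while the file is being written,
# so the variable is resolved only when the generated script runs.
gen_script() {
    out=$1
    cat > "$out" <<'EOF'
#!/bin/sh
echo "rank=$RANK"
EOF
    chmod +x "$out"
}
```

Running the generated file with different values of RANK then prints different results, because the reference to the variable is stored literally in the file.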
2020-11-20 13:54:45 +01:00
Rodrigo Arias Mallo
e1e34ddf75
exec: add pre and post code to allow cleanup tasks
2020-11-17 16:09:38 +01:00
Rodrigo Arias Mallo
641e752bd5
Add a trace message at unit evaluation
2020-11-17 11:12:12 +01:00
Rodrigo Arias Mallo
317409f6ac
Move index and out inside the user directory
2020-11-03 19:10:00 +01:00
Rodrigo Arias Mallo
5e2797bcde
Create index files for the experiments
2020-11-03 19:10:00 +01:00
Rodrigo Arias Mallo
efd7df068e
Print full experiment path
2020-11-03 19:10:00 +01:00
Rodrigo Arias Mallo
3bd4e61f3f
WIP: Testing with automatic fetching
2020-11-03 19:09:59 +01:00
Rodrigo Arias Mallo
59346fa97e
control: Add status file
2020-11-03 19:09:59 +01:00
Rodrigo Arias Mallo
4beb069627
WIP: postprocessing pipeline
...
Now each run is executed in an independent folder
2020-11-03 19:09:59 +01:00
Rodrigo Arias Mallo
2680dcb66f
Don't nest the unit results
...
The experiment directory now contains symlinks to the units, keeping the
old structure. The unit results are directly placed in the garlic out
directory.
2020-11-03 19:09:58 +01:00
Rodrigo Arias Mallo
c3659d316d
Add perf stage
2020-11-03 19:09:58 +01:00
Rodrigo Arias Mallo
80ccd1240a
Less verbose execution
2020-10-14 16:29:22 +02:00
Rodrigo Arias Mallo
9d8f7d9074
Print the experiment being run
2020-10-14 16:28:27 +02:00
Rodrigo Arias Mallo
c7d2e2d866
Write the unit config in a file
2020-10-14 16:27:47 +02:00
Rodrigo Arias Mallo
7a37913b4e
Set the ssh host from the machine config
2020-10-13 14:30:03 +02:00
Rodrigo Arias Mallo
a38ff31cca
Introduce the runexp stage
2020-10-13 13:00:59 +02:00
Rodrigo Arias Mallo
6ab448b10a
Fix trebuchet description
2020-10-09 20:28:00 +02:00
Rodrigo Arias Mallo
4de20d3aa5
Remove old stages and update some
2020-10-09 20:12:52 +02:00
Rodrigo Arias Mallo
27bc977590
Remove strace from isolate stage
2020-10-09 19:50:28 +02:00
Rodrigo Arias Mallo
332b738889
Move apps into garlic/apps
2020-10-09 16:42:06 +02:00
Rodrigo Arias Mallo
a576be8031
WIP stage redesign
2020-10-09 16:42:06 +02:00
Rodrigo Arias Mallo
654e243735
Include an index in the trebuchet
2020-10-09 16:42:06 +02:00
Rodrigo Arias Mallo
45afe7d391
Simplify experiment stage
2020-10-09 16:42:06 +02:00
Rodrigo Arias Mallo
d599b8c52f
New naming convention
2020-10-09 16:42:06 +02:00