.COVER .TL Garlic: User guide .AF "Barcelona Supercomputing Center" .AU "Rodrigo Arias Mallo" .COVEND .H 1 "Overview" Dependency graph of a complete experiment that produces a figure. Each box is a derivation and arrows represent \fBbuild dependencies\fP. .DS CB .PS linewid=0.9; right box "Source" "code" arrow <-> "Develop" above box "Program" arrow <-> "Experiment" above box "Results" arrow <-> "Data" "exploration" box "Figures" .PE .DE .H 1 "Development" .P The development phase consists in creating a functional program by modifying the source code. This process is generally cyclic, where the developer needs to compile the program, correct mistakes and debug the program. .P It requires to be running in the target machine. .\" =================================================================== .H 1 "Experimentation" The experimentation phase begins with a functional program which is the object of study. The experimenter then designs an experiment aimed at measuring some properties of the program. The experiment is then executed and the results are stored for further analysis. .H 2 "Writing the experiment configuration" .P The term experiment is quite overloaded in this document. We are going to see how to write the recipe that describes the execution pipeline of an experiment. .P Within the garlic benchmark, experiments are typically sorted by a hierarchy depending on which application they belong. Take a look at the \fCgarlic/exp\fP directory and you will find some folders and .nix files. .P Each of those recipes files describe a function that returns a derivation, which, once built will result in the first stage script of the execution pipeline. .P The first part of states the name of the attributes required as the input of the function. Typically some packages, common tools and options: .DS I .VERBON { stdenv , stdexp , bsc , targetMachine , stages , garlicTools }: .VERBOFF .DE .P Notice the \fCtargetMachine\fP argument, which provides information about the machine in which the experiment will run. You should write your experiment in such a way that runs in multiple clusters. .DS I .VERBON varConf = { blocks = [ 1 2 4 ]; nodes = [ 1 ]; }; .VERBOFF .DE .P The \fCvarConf\fP is the attribute set that allows you to vary some factors in the experiment. .DS I .VERBON genConf = var: fix (self: targetMachine.config // { expName = "example"; unitName = self.expName + "-b" + toString self.blocks; blocks = var.blocks; nodes = var.nodes; cpusPerTask = 1; tasksPerNode = self.hw.socketsPerNode; }); .VERBOFF .DE .P The \fCgenConf\fP function is the central part of the description of the experiment. Takes as input \fBone\fP configuration from the cartesian product of .I varConfig and returns the complete configuration. In our case, it will be called 3 times, with the following inputs at each time: .DS I .VERBON { blocks = 1; nodes = 1; } { blocks = 2; nodes = 1; } { blocks = 4; nodes = 1; } .VERBOFF .DE .P The return value can be inspected by calling the function in the interactive nix repl: .DS I .VERBON nix-repl> genConf { blocks = 2; nodes = 1; } { blocks = 2; cpusPerTask = 1; expName = "example"; hw = { ... }; march = "skylake-avx512"; mtune = "skylake-avx512"; name = "mn4"; nixPrefix = "/gpfs/projects/bsc15/nix"; nodes = 1; sshHost = "mn1"; tasksPerNode = 2; unitName = "example-b2"; } .VERBOFF .DE .P Some configuration parameters were added by .I targetMachine.config , such as the .I nixPrefix , .I sshHost or the .I hw attribute set, which are specific for the cluster they experiment is going to run. Also, the .I unitName got assigned the proper name based on the number of blocks, but the number of tasks per node were assigned based on the hardware description of the target machine. .P By following this rule, the experiments can easily be ported to machines with other hardware characteristics, and we only need to define the hardware details once. Then all the experiments will be updated based on those details. .H 2 "First steps" .P The complete results generally take a long time to be finished, so it is advisable to design the experiments iteratively, in order to quickly obtain some feedback. Some recommendations: .BL .LI Start with one unit only. .LI Set the number of runs low (say 5) but more than one. .LI Use a small problem size, so the execution time is low. .LI Set the time limit low, so deadlocks are caught early. .LE .P As soon as the first runs are complete, examine the results and test that everything looks good. You would likely want to check: .BL .LI The resources where assigned as intended (nodes and CPU affinity). .LI No errors or warnings: look at stderr and stdout logs. .LI If a deadlock happens, it will run out of the time limit. .LE .P As you gain confidence over that the execution went as planned, begin increasing the problem size, the number of runs, the time limit and lastly the number of units. The rationale is that each unit that is shared among experiments gets assigned the same hash. Therefore, you can iteratively add more units to an experiment, and if they are already executed (and the results were generated) is reused. .TC