bscpkgs: User guide
ABSTRACT
This repository contains a set of nix packages used in the Barcelona
Supercomputing Center by the Programming Models group.
The current setup uses the xeon07 machine to build packages, which are
automatically uploaded to MareNostrum4, due to lack of permissions in
the latter to perform the build safely.
Some preliminary steps must be done manually to be able to build and
install packages (derivations in nix jargon).
1. Introduction
To connect to xeon07 in one step, set up the SSH configuration file
~/.ssh/config (the ProxyJump option requires OpenSSH 7.3 or later) by
adding these lines:
Host cobi
  HostName ssflogin.bsc.es
  User your-username-here
Host xeon07
  ProxyJump cobi
  HostName xeon07
  User your-username-here
You should be able to connect with:
laptop$ ssh xeon07
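If you prefer to script this step, the following is a minimal sketch
that appends the two Host entries to a configuration file only when no
"Host xeon07" entry exists yet. The function name and its arguments are
hypothetical, they are not part of bscpkgs:

```shell
# Append the cobi/xeon07 entries to an SSH config file, unless a
# "Host xeon07" entry is already present. Hypothetical helper.
add_xeon07_hosts() {
    config=$1
    user=$2
    if [ -f "$config" ] && grep -q '^Host xeon07$' "$config"; then
        echo "xeon07 already configured in $config"
        return 0
    fi
    cat >> "$config" <<EOF
Host cobi
    HostName ssflogin.bsc.es
    User $user
Host xeon07
    ProxyJump cobi
    HostName xeon07
    User $user
EOF
    echo "xeon07 added to $config"
}

# Example call (left commented out so nothing is modified by accident):
# add_xeon07_hosts ~/.ssh/config your-username-here
```

Running it twice leaves the file unchanged the second time.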
1.1 Network access
To use nix you need to be able to download sources from the Internet.
Usually the download requires ports 22, 80 and 443 to be open for
outgoing traffic.

Check that you have network access in xeon07 provided by the
environment variables "http_proxy" and "https_proxy". Try to fetch a
webpage with curl, to ensure the proxy is working:
xeon07$ curl x.com
x
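If curl fails, a first thing to look at is whether both variables are
set at all. The following is a small sketch of such a check; the
function name is hypothetical and it only inspects the variables, it
does not contact the proxy:

```shell
# Print the value of each proxy variable and return non-zero if any of
# them is missing or empty. Hypothetical helper.
check_proxy_vars() {
    missing=0
    for var in http_proxy https_proxy; do
        # Indirect lookup of the variable named in $var (POSIX sh).
        eval "value=\${$var:-}"
        if [ -n "$value" ]; then
            echo "$var=$value"
        else
            echo "$var is not set"
            missing=1
        fi
    done
    return "$missing"
}

check_proxy_vars || echo "fix the proxy variables before using nix"
```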
1.2 SSH keys
Package sources are usually downloaded directly from the git server,
so you must be able to access all repositories without a password
prompt.
Most repositories at https://pm.bsc.es/gitlab are readable by any
logged-in user, but there are some exceptions (for example the nanos6
repository) where you must be explicitly granted read access.
If you don't have an SSH key at ~/.ssh/*.pub in xeon07, create a new
one without password protection by running:
xeon07$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (~/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in ~/.ssh/id_rsa.
Your public key has been saved in ~/.ssh/id_rsa.pub.
...
By default it will create the private key at ~/.ssh/id_rsa. Copy the
contents of your public ssh key in ~/.ssh/id_rsa.pub and paste it in
GitLab at:
https://pm.bsc.es/gitlab/profile/keys
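To decide whether the ssh-keygen step is needed at all, you can list
the public keys you already have. This is only a sketch: the function
name is hypothetical, and the optional directory argument exists just
to make it easy to try on another folder:

```shell
# List the public keys found in a directory (default: ~/.ssh).
# Returns non-zero when there is none. Hypothetical helper.
list_pubkeys() {
    dir=${1:-$HOME/.ssh}
    found=1
    for key in "$dir"/*.pub; do
        # Without a match the pattern stays literal, so test for a file.
        if [ -f "$key" ]; then
            echo "$key"
            found=0
        fi
    done
    return "$found"
}

list_pubkeys || echo "no public key found, run ssh-keygen"
```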
Then, configure it for use in the ~/.ssh/config file, adding:
Host bscpm03.bsc.es
  IdentityFile ~/.ssh/id_rsa
Finally verify the SSH connection to the server works and you get a
greeting from the GitLab server with your username:
xeon07$ ssh git@bscpm03.bsc.es
PTY allocation request failed on channel 0
Welcome to GitLab, @rarias!
Connection to bscpm03.bsc.es closed.
Verify that you can access the nanos6/nanos6 repository (otherwise you
first need to ask to be granted read access) at:
https://pm.bsc.es/gitlab/nanos6/nanos6
Finally, you should be able to download the nanos6/nanos6 git
repository without any password interaction by running:
xeon07$ git clone git@bscpm03.bsc.es:nanos6/nanos6.git
You will also need to access MareNostrum 4 from the xeon07 node, in
order to submit experiments. Add the following lines as well to the
~/.ssh/config file and set your user name:
Host mn0 mn1 mn2
  User your-mn4-username
  IdentityFile ~/.ssh/id_rsa
Then copy the key to MareNostrum 4 (it will ask you the first time for
your password):
xeon07$ ssh-copy-id -i ~/.ssh/id_rsa.pub mn1
And ensure that you can connect without a password:
xeon07$ ssh mn1
...
login1$
1.3 The bscpkgs repo
Once you have Internet access and have been granted access to the PM
GitLab repositories, you can begin building software with nix. First
ensure that the nix binaries are available from your shell in xeon07:
xeon07$ nix --version
nix (Nix) 2.3.6
Now you are ready to build and install packages with nix. Clone the
bscpkgs repository:
xeon07$ git clone git@bscpm03.bsc.es:rarias/bscpkgs.git
Nix looks in the current directory for a file named "default.nix"
that defines the packages, so go to the repo directory:
xeon07$ cd bscpkgs
Now you should be able to build nanos6:
xeon07$ nix-build -A bsc.nanos6
..
/nix/store/3i0qkdywm9xjv2cm1ldx9smb552sf6r1-nanos6-2.4-6f10a32
The installation is placed in the nix store (with the path stated in
the last line of the build process), with the "result" symbolic link
pointing to the same location:
xeon07$ readlink result
/nix/store/3i0qkdywm9xjv2cm1ldx9smb552sf6r1-nanos6-2.4-6f10a32
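A store path has the form /nix/store/<hash>-<name>. As an illustration,
the following sketch splits a path into those two parts using only
shell parameter expansion. The function name is hypothetical:

```shell
# Split a nix store path into its hash and name components.
# Hypothetical helper, shown for illustration only.
split_store_path() {
    base=${1##*/}            # drop the leading /nix/store/ part
    echo "hash=${base%%-*}"  # everything before the first dash
    echo "name=${base#*-}"   # everything after the first dash
}

split_store_path /nix/store/3i0qkdywm9xjv2cm1ldx9smb552sf6r1-nanos6-2.4-6f10a32
# prints:
#   hash=3i0qkdywm9xjv2cm1ldx9smb552sf6r1
#   name=nanos6-2.4-6f10a32
```

From the bscpkgs directory it could be called on the last build as
split_store_path "$(readlink result)".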
1.4 Configuration of mn4 (MareNostrum 4)
In order to execute the programs built at xeon07, you first need to
enter the nix environment. To do so, add the following line to the end
of the file ~/.bashrc in mn4:
export PATH=/gpfs/projects/bsc15/nix/bin:$PATH
Then log out and log in again (or source the ~/.bashrc file) and you
will have the `nix-setup` command available. This command executes
a new shell where the /nix store is available. To execute it:
mn4$ nix-setup
Now you will see a new shell, where you can access the nix store:
nix|mn4$ ls /nix
gcroots profiles store var
The last build of nanos6 can also be found in mn4 at the same
location:
/nix/store/3i0qkdywm9xjv2cm1ldx9smb552sf6r1-nanos6-2.4-6f10a32
Remember to enter the nix environment by running `nix-setup` when you
need something from the nix store.
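A script that depends on the store can guard against being run outside
the nix environment. The sketch below assumes that /nix/store is simply
not visible outside the shell started by `nix-setup`; the function name
is hypothetical:

```shell
# Succeed if the nix store directory is visible from the current shell.
# The optional argument exists only to try the check on another path.
have_nix_store() {
    [ -d "${1:-/nix/store}" ]
}

if have_nix_store; then
    echo "nix store available"
else
    echo "nix store not available, run nix-setup first"
fi
```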
You cannot perform any build operations from mn4: to do so use the
xeon07 machine.
2. Basic usage of nix
Nix is a package manager which makes it easy to handle the
reproducibility and configuration of packages and dependencies. See
more info here:
https://nixos.org/nix/manual/
We will only cover the basic usage of nix for the BSC packages.
2.1 The user environment
All nix packages are stored under the /nix directory. When you
"install" a binary from nix, a symlink is added to a directory
included in the $PATH variable. In particular, you should have
something similar to this in your $PATH:
xeon07$ echo $PATH | sed 's/:/\n/g' | grep nix
/home/Computational/rarias/.nix-profile/bin
/nix/var/nix/profiles/default/bin
The first one is your custom installation of packages that are stored
in your home directory and the second one is the default installation
which contains the nix tools (which are installed in the /nix
directory as well).
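The "\n" in the sed replacement used above is a GNU extension. If you
need the same listing on a system without GNU sed, a portable sketch
with tr could look like this (the function name is hypothetical):

```shell
# Print the entries of a colon-separated path list that mention nix,
# one per line. Returns non-zero if there is none.
nix_path_entries() {
    printf '%s\n' "$1" | tr ':' '\n' | grep nix
}

nix_path_entries "$PATH" || echo "no nix directories in PATH"
```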
Use `nix search` to look for official packages in the "nixpkgs"
channel (the default repository of packages):
xeon07$ nix search cowsay
warning: using cached results; pass '-u' to update the cache
* cowsay (cowsay)
  A program which generates ASCII pictures of a cow with a message
* neo-cowsay (neo-cowsay)
  Cowsay reborn, written in Go
* ponysay (ponysay-3.0.3)
  Cowsay reimplemention for ponies
* tewisay (tewisay-unstable-2017-04-14)
  Cowsay replacement with unicode and partial ansi escape support
When you need a program that is not available in your environment,
much like when you use "module load ...", you can use nix-env to modify
what is currently loaded. For example:
xeon07$ nix-env -iA nixpkgs.cowsay
Notice that you must prefix the package name with "nixpkgs.". The
command will download (if not already in the nix store), compile
(if necessary) and load the program `cowsay` from the nixpkgs
repository into the environment. You should be able to run it as:
xeon07$ cowsay "hello world"
 _____________
< hello world >
 -------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
You can now inspect the ~/.nix-profile/bin directory and see that a
new symlink was added, pointing to the actual installation of the
binary:
xeon07$ file ~/.nix-profile/bin/cowsay
/home/Computational/rarias/.nix-profile/bin/cowsay: symbolic link to
`/nix/store/673gczmhr5b449521srz2n7g1klykz6n-cowsay-3.03+dfsg2/bin/cowsay'
You can list the current packages installed in your environment by
running:
xeon07$ nix-env -q
cowsay-3.03+dfsg2
nix-2.3.6
Notice that this setup only affects your user environment. The change
is permanent: it persists in any new session until you modify the
environment again. It is also immediate: all your open sessions see
the new environment right away.
You can remove any package from the environment using:
xeon07$ nix-env -e cowsay
See the manual with `nix-env --help` if you want to know more details.
2.2 Building packages
Usually, all official packages are already compiled and distributed
from a cache server so you don't need to rebuild them again. However,
BSC packages are distributed only in source code form as we don't have
any binary cache server yet.
Nix will handle the build process without any user interaction (with a
few exceptions which you shouldn't have to worry about). If any other
user has already built the package then the build process is not
needed, and the package is used as is.
In order to build a BSC package go to the `bscpkgs` directory, and
run:
xeon07$ nix-build -A bsc.dummy
Notice the "bsc." prefix for BSC packages. The package will be built
and installed in the /nix directory, then a symlink named "result"
pointing to it is placed in the current directory:
xeon07$ find result/
result/
result/bin
result/bin/dummy
The way in which nix handles packages and dependencies ensures that
the build environment of any package is always exactly the same, so
the generated output should be the same if the builds are
deterministic.
You can check the reproducibility of the build by adding the "--check"
flag, which will rebuild the package and compare the checksum of every
file with the ones previously built:
xeon07$ nix-build -A bsc.dummy --check
...
xeon07$ echo $?
0
A return code of zero indicates that the output is bit by bit
identical to the one installed. Some packages include
non-deterministic information in the build process (such as the
current time), which will produce an error. Those packages must be
patched to ensure the output is deterministic.
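To get an intuition of what comparing two builds file by file means,
here is a rough sketch that checksums every regular file of two
directory trees and compares the listings. It is not how nix performs
the check internally: the function names are hypothetical, cksum is
used only because it is available everywhere, and file metadata such
as permissions is ignored:

```shell
# Print "checksum size path" for every regular file under a directory,
# sorted by path, so two trees can be compared line by line.
tree_checksums() {
    (cd "$1" && find . -type f -exec cksum {} + | sort -k 3)
}

# Succeed only if both trees hold the same files with the same
# contents.
same_tree() {
    [ "$(tree_checksums "$1")" = "$(tree_checksums "$2")" ]
}
```

A single differing byte in any file, for example an embedded build
date, is enough for same_tree to fail.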
Notice that if you "cd" into the "result/" directory you will be in
the /nix directory (as you have followed the symlink), where you don't
have write permission. Therefore if your program attempts to write to
the current directory it will fail. It is recommended to instead run
your program from the top directory:
xeon07$ result/bin/dummy
Hello world!
Or you can install it in the environment:
xeon07$ nix-env -i ./result
And "cd" into any directory where you want to output some files and
just run it by the name:
xeon07$ cd /tmp
xeon07$ dummy
Hello world!
Finally, you can remove it from the environment if you don't need it:
xeon07$ nix-env -e dummy
If you want to know more details use "nix-build --help" to see the
manual.
2.3 The build process
Each package is built following a programmable configuration
description written in the nix language. Builds in nix are performed
under very strict conditions. No access to any file in the file system
is allowed, unless stated in the dependencies, which are in the /nix
store only.
There is no network access in the build process and other restrictions
are enforced so that the build environment is reproducible. See more
details here:
https://nixos.wiki/wiki/Nix#Sandboxing
The top level "default.nix" file of bscpkgs serves as an index of all
BSC packages. You can see the definition for each package, for
example the nbody app:
nbody = callPackage ./bsc/apps/nbody/default.nix {
  stdenv = pkgs.gcc9Stdenv;
  mpi = intel-mpi;
  icc = icc;
  tampi = tampi;
  nanos6 = nanos6-git;
};
The compilation details are specified in the
"bsc/apps/nbody/default.nix" file. You can configure the package by
changing the inputs, for example, what specific implementation of
nanos6 or MPI you want to use. To change the MPI implementation to the
official MPICH package use:
nbody = callPackage ./bsc/apps/nbody/default.nix {
  stdenv = pkgs.gcc9Stdenv;
  mpi = pkgs.mpich; # Notice pkgs prefix for official packages
  icc = icc;
  tampi = tampi;
  nanos6 = nanos6-git;
};
Then you can rebuild the nbody package:
xeon07$ nix-build -A bsc.nbody
...
And verify that the binary is indeed linked to MPICH now:
xeon07$ ldd result/bin/nbody_mpi.N2.2048.exe | grep mpi
libmpi.so.12 => /nix/store/dwkkcv78a5bs8smflpx9ppp3klhz3i98-mpich-3.3.2/lib/libmpi.so.12 (0x00007f6be0f07000)
If you modify a package which another package requires as a
dependency, nix will rebuild all required packages to propagate your
changes on demand.
However, if you go back to the original configuration, the package
will still be in the /nix store (unless the garbage collector was
manually run and removed your old build), so you don't need to rebuild
it.
For example if nbody is configured back to use Intel MPI:
nbody = callPackage ./bsc/apps/nbody/default.nix {
  stdenv = pkgs.gcc9Stdenv;
  mpi = intel-mpi;
  icc = icc;
  tampi = tampi;
  nanos6 = nanos6-git;
};
Now no rebuild is required:
xeon07$ nix-build -A bsc.nbody
/nix/store/rbq7wrjcmg6fzd6yhrlnkfvzcavdbdpc-nbody
xeon07$ ldd result/bin/nbody_mpi.N2.2048.exe | grep mpi
libmpifort.so.12 => /nix/store/jvsjvxj2a08340fpdrqbqix9z3mpp3bd-intel-mpi-2019.7.217/lib/libmpifort.so.12 (0x00007f3a00402000)
libmpi.so.12 => /nix/store/jvsjvxj2a08340fpdrqbqix9z3mpp3bd-intel-mpi-2019.7.217/lib/libmpi.so.12 (0x00007f39fed34000)
Take a look at the different package description files in the
bscpkgs repository if you want to understand more details. The nix
pills are also a very good reference:
https://nixos.org/nixos/nix-pills/
2.4 Debugging the build process
It may happen that the build process fails in an unexpected way. Most
problems are related to missing dependencies and can be easily found
by looking at the error messages.
Other build problems are more subtle and require more debugging time.
One way of inspecting a build problem is by adding the breakpointHook
hook to the nativeBuildInputs array in a nix derivation (see
https://nixos.org/nixpkgs/manual/#ssec-setup-hooks for more info),
which will stop the build process and allow a shell to be attached to
the sandbox.
xeon07$ nix-build -A bsc.nbody
...
/nix/store/gvqm2yc9xx4vh3nglgckz8siya66jnkx-stdenv-linux/setup: line
83: fake-missing-command: command not found
build failed in buildPhase with exit code 127
To attach install cntr and run the following command as root:
cntr attach -t command \
cntr-/nix/store/sk2nsj7xfr62cjk6m3725ydfyswqz7n1-nbody
The command must be run as root, so you can use `sudo -i` to run it
(the -i option is required to load the shell profile which provides
the nix path containing the cntr tool):
xeon07$ sudo -i cntr attach -t command \
cntr-/nix/store/sk2nsj7xfr62cjk6m3725ydfyswqz7n1-nbody
nixbld@localhost:/var/lib/cntr> ls
bin build dev etc nix proc tmp var
Then you can inspect the build environment to see why the build
failed. Source the build/env-vars file to get the same environment
variables (which include the $PATH) of the build process.
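Sourcing the file changes the variables of your current shell for
good. If you only want to try a single command with the build
environment, a subshell keeps the change contained. This is a sketch
under some assumptions: the function name is hypothetical, the file
must be given with a path containing a slash (for example
./build/env-vars), and the real env-vars file is typically written in
bash syntax, so run it from bash:

```shell
# Run a command with the variables of an environment file loaded,
# without modifying the calling shell. Hypothetical helper.
run_with_env() {
    env_file=$1
    shift
    # The parentheses start a subshell, so the sourced variables
    # disappear once the command finishes.
    ( . "$env_file" && "$@" )
}

# Example call (commented out, it needs an attached sandbox):
# run_with_env ./build/env-vars sh -c 'echo "$PATH"'
```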
/* vim: set ts=2 sw=2 tw=72 fo=watqc expandtab spell autoindent: */