doc: introduce CUDA user guide

Signed-off-by: Connor Baker <ConnorBaker01@gmail.com>
(cherry picked from commit fa4f91662a)
This commit is contained in:
Connor Baker
2025-07-16 16:07:46 +00:00
committed by github-actions[bot]
parent 32cc0153bc
commit cf0d992bfd
2 changed files with 31 additions and 14 deletions

@@ -2,6 +2,8 @@
Compute Unified Device Architecture (CUDA) is a parallel computing platform and application programming interface (API) model created by NVIDIA. It's commonly used to accelerate computationally intensive problems and has been widely adopted for High Performance Computing (HPC) and Machine Learning (ML) applications.
## User Guide {#cuda-user-guide}
Packages provided by NVIDIA which require CUDA are typically stored in CUDA package sets.
Nixpkgs provides a number of CUDA package sets, each based on a different CUDA release. Top-level attributes providing access to CUDA package sets follow these naming conventions:
@@ -19,7 +21,7 @@ Here are two examples to illustrate the naming conventions:
All CUDA package sets include common CUDA packages like `libcublas`, `cudnn`, `tensorrt`, and `nccl`.
-## Configuring Nixpkgs for CUDA {#cuda-configuring-nixpkgs-for-cuda}
+### Configuring Nixpkgs for CUDA {#cuda-configuring-nixpkgs-for-cuda}
CUDA support is not enabled by default in Nixpkgs. To enable CUDA support, make sure Nixpkgs is imported with a configuration similar to the following:
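If the referenced snippet is not visible in this excerpt, a minimal sketch of such a configuration looks like the following. `cudaSupport` is the option named throughout this guide; `allowUnfree` is needed because the CUDA toolkit is unfree software.

```nix
# Minimal sketch: import Nixpkgs with CUDA support enabled.
import <nixpkgs> {
  config = {
    allowUnfree = true; # CUDA packages are unfree
    cudaSupport = true; # enable CUDA support globally
  };
}
```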
@@ -59,7 +61,7 @@ Certain CUDA capabilities are not targeted by default, including capabilities be
The `cudaForwardCompat` boolean configuration option determines whether PTX support for future hardware is enabled.
-## Configuring CUDA package sets {#cuda-configuring-cuda-package-sets}
+### Modifying CUDA package sets {#cuda-modifying-cuda-package-sets}
CUDA package sets are created by `callPackage`-ing `pkgs/top-level/cuda-packages.nix` with an explicit argument for `cudaMajorMinorVersion`, a string of the form `"<major>.<minor>"` (e.g., `"12.2"`), which informs the CUDA package set tooling which version of CUDA to use. The majority of the CUDA package set tooling is available through the top-level attribute set `_cuda`, a fixed-point defined outside the CUDA package sets.
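The instantiation described above can be sketched as follows. Treat this as illustrative rather than the exact in-tree call site; the real invocation may pass additional arguments.

```nix
# Illustrative sketch, per the description above: a CUDA package set is
# produced by callPackage-ing cuda-packages.nix with an explicit version.
pkgs.callPackage ./pkgs/top-level/cuda-packages.nix {
  cudaMajorMinorVersion = "12.2"; # a "<major>.<minor>" string
}
```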
@@ -97,9 +99,9 @@ final: prev: {
}
```
-## Extending CUDA package sets {#cuda-extending-cuda-package-sets}
+### Extending CUDA package sets {#cuda-extending-cuda-package-sets}
-CUDA package sets are scopes, so they provide the usual `overrideScope` attribute for overriding package attributes (see the note about `cudaMajorMinorVersion` and `_cuda` in [Configuring CUDA package sets](#cuda-configuring-cuda-package-sets)).
+CUDA package sets are scopes, so they provide the usual `overrideScope` attribute for overriding package attributes (see the note about `cudaMajorMinorVersion` and `_cuda` in [Modifying CUDA package sets](#cuda-modifying-cuda-package-sets)).
Inspired by `pythonPackagesExtensions`, the `_cuda.extensions` attribute is a list of extensions applied to every version of the CUDA package set, allowing modification of all versions of the CUDA package set without having to know what they are or find a way to enumerate and modify them explicitly. As an example, disabling `cuda_compat` across all CUDA package sets can be accomplished with this overlay:
@@ -113,7 +115,7 @@ final: prev: {
}
```
-## Using `cudaPackages` {#cuda-using-cudapackages}
+### Using `cudaPackages` {#cuda-using-cudapackages}
::: {.caution}
A non-trivial amount of CUDA package discoverability and usability relies on the various setup hooks used by a CUDA package set. As a result, users will likely encounter issues trying to perform builds within a `devShell` without manually invoking phases.
@@ -153,7 +155,15 @@ When using `callPackage`, you can choose to pass in a different variant, e.g. wh
Overriding the CUDA package set used by a package may cause inconsistencies, since the override affects neither the direct nor the transitive dependencies of the package. As a result, it is easy to end up with a package which uses a different CUDA package set than its dependencies. If at all possible, it is recommended to change the default CUDA package set globally, to ensure a consistent environment.
:::
-## Using `cudaPackages.pkgs` {#cuda-using-cudapackages-pkgs}
+### Nixpkgs CUDA variants {#cuda-nixpkgs-cuda-variants}
Nixpkgs CUDA variants are provided primarily for the convenience of selecting CUDA-enabled packages by attribute path. As an example, the `pkgsForCudaArch` collection of CUDA Nixpkgs variants allows one to access an instantiation of OpenCV with CUDA support for an Ada Lovelace GPU with the attribute path `pkgsForCudaArch.sm_89.opencv`, without needing to modify the `config` provided when importing Nixpkgs.
::: {.caution}
Nixpkgs variants are not free: they require re-evaluating Nixpkgs. Where possible, import Nixpkgs once, with the desired configuration.
:::
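The OpenCV example above can be sketched as an ordinary attribute access. This assumes an `<nixpkgs>` channel is available; adapt the import to your setup.

```nix
# Sketch: OpenCV built for Ada Lovelace (sm_89) by attribute path,
# without changing the config passed when importing Nixpkgs.
let
  pkgs = import <nixpkgs> { };
in
pkgs.pkgsForCudaArch.sm_89.opencv
```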
#### Using `cudaPackages.pkgs` {#cuda-using-cudapackages-pkgs}
Each CUDA package set has a `pkgs` attribute, which is a variant of Nixpkgs where the enclosing CUDA package set is made the default CUDA package set. This was done primarily to avoid package set leakage, wherein a member of a non-default CUDA package set has a (potentially transitive) dependency on a member of the default CUDA package set.
@@ -163,11 +173,11 @@ Package set leakage is a common problem in Nixpkgs and is not limited to CUDA pa
As an added benefit of `pkgs` being configured this way, building a package with a non-default version of CUDA is as simple as accessing an attribute. As an example, `cudaPackages_12_8.pkgs.opencv` provides OpenCV built against CUDA 12.8.
-## Using `pkgsCuda` {#cuda-using-pkgscuda}
+#### Using `pkgsCuda` {#cuda-using-pkgscuda}
The `pkgsCuda` attribute set is a variant of Nixpkgs configured with `cudaSupport = true;` and `rocmSupport = false;`. It is a convenient way to access a variant of Nixpkgs configured with the default set of CUDA capabilities.
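As a sketch, selecting a package from this variant is plain attribute access (assuming `pkgsCuda` has the shape described above):

```nix
# Sketch: OpenCV from the pkgsCuda variant, built with the default
# set of CUDA capabilities.
let
  pkgs = import <nixpkgs> { };
in
pkgs.pkgsCuda.opencv
```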
-## Using `pkgsForCudaArch` {#cuda-using-pkgsforcudaarch}
+#### Using `pkgsForCudaArch` {#cuda-using-pkgsforcudaarch}
The `pkgsForCudaArch` attribute set maps CUDA architectures (e.g., `sm_89` for Ada Lovelace or `sm_90a` for architecture-specific Hopper) to Nixpkgs variants configured to support exactly that architecture. As an example, `pkgsForCudaArch.sm_89` is a Nixpkgs variant extending `pkgs` and setting the following values in `config`:
@@ -191,7 +201,7 @@ To illustrate: support for Blackwell (e.g., `sm_100`) was only added in CUDA 12.
The `pkgsForCudaArch` attribute set makes it possible to access packages built for a specific architecture without needing to manually call `pkgs.extend` and supply a new `config`. As an example, `pkgsForCudaArch.sm_89.python3Packages.torch` provides PyTorch built for Ada Lovelace GPUs.
-## Running Docker or Podman containers with CUDA support {#cuda-docker-podman}
+### Running Docker or Podman containers with CUDA support {#cuda-docker-podman}
It is possible to run Docker or Podman containers with CUDA support. The recommended mechanism to perform this task is to use the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/index.html).
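On NixOS, the toolkit is enabled through a module option from the same option family used later in this section; a minimal sketch:

```nix
# Minimal NixOS sketch: generate CDI specifications via the
# NVIDIA Container Toolkit and enable a container runtime.
{
  hardware.nvidia-container-toolkit.enable = true;
  virtualisation.docker.enable = true; # or virtualisation.podman.enable
}
```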
@@ -236,7 +246,7 @@ $ nix run nixpkgs#jq -- -r '.devices[].name' < /var/run/cdi/nvidia-container-too
all
```
-### Specifying what devices to expose to the container {#cuda-specifying-what-devices-to-expose-to-the-container}
+#### Specifying what devices to expose to the container {#cuda-specifying-what-devices-to-expose-to-the-container}
You can choose which devices are exposed to your containers by using the identifiers in the generated CDI specification, as follows:
@@ -257,7 +267,7 @@ GPU 1: NVIDIA GeForce RTX 2080 SUPER (UUID: <REDACTED>)
By default, the NVIDIA Container Toolkit will use the GPU index to identify specific devices. You can change the way to identify what devices to expose by using the `hardware.nvidia-container-toolkit.device-name-strategy` NixOS attribute.
:::
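A sketch of switching the identification strategy; the `"uuid"` value is an assumption based on the UUIDs shown above, so consult the option's documentation for the accepted values:

```nix
# Sketch: identify devices by UUID instead of index.
{
  hardware.nvidia-container-toolkit.device-name-strategy = "uuid";
}
```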
-### Using docker-compose {#cuda-using-docker-compose}
+#### Using docker-compose {#cuda-using-docker-compose}
It's possible to expose GPUs to a `docker-compose` environment as well, with a `docker-compose.yaml` file like the following: