NVIDIA GPU Driver Relationships
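gogongxt

| Layer | Core components | Example command | Mutability |
|---|---|---|---|
| Physical machine (host) | GPU hardware, NVIDIA display driver (e.g. 570.xx) | `nvidia-smi` (see Driver Version) | Immutable. It cannot be modified from inside a container; only a host upgrade changes it. |
| Physical machine (host) | CUDA API ceiling set by the driver (e.g. 12.8, which supports nvcc up to 12.8) | `nvidia-smi` (see CUDA Version in the upper right) | Immutable. It is a read-only property of the driver, a "ceiling" that changes only when the host driver is upgraded. |
| Image | CUDA Toolkit (nvcc), cuDNN | `nvcc -V` (run inside the container) | The image itself is immutable. |
| Container | The image's nvcc version, torch version, cuda-runtime version | `nvcc -V`, `pip show torch`, etc. | Mutable at runtime. |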
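IMPORTANT
A bottom-up compatibility chain must hold:
- The nvcc version, the CUDA version torch was built against, and the cuda-runtime version should ideally all match, which avoids some odd bugs in CUDA compilation and the runtime libraries.
- These three versions may be lower than the CUDA Version that nvidia-smi reports, because that value is the supported upper limit.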
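The rule above can be sketched as a small check. This is a minimal illustration, not an official tool: the function names `major_minor` and `check_chain` are hypothetical, the comparison is done on major.minor only, and the `torch_cuda` value in the usage line is an assumption inferred from the cuda-runtime package version shown later in this post.

```python
def major_minor(version: str) -> tuple[int, int]:
    """Reduce a version string such as '12.8.90' to (12, 8)."""
    parts = version.strip().split(".")
    return int(parts[0]), int(parts[1])


def check_chain(driver_cuda: str, nvcc: str, torch_cuda: str, cuda_runtime: str) -> list[str]:
    """Return a list of problems; an empty list means the chain holds."""
    problems = []
    ceiling = major_minor(driver_cuda)
    components = {
        "nvcc": nvcc,
        "torch CUDA build": torch_cuda,
        "cuda-runtime": cuda_runtime,
    }
    # Rule 1: nothing may exceed the ceiling reported by nvidia-smi.
    for name, version in components.items():
        if major_minor(version) > ceiling:
            problems.append(f"{name} {version} exceeds driver CUDA ceiling {driver_cuda}")
    # Rule 2: the three versions should agree with each other.
    if len({major_minor(v) for v in components.values()}) > 1:
        problems.append("nvcc, torch CUDA build, and cuda-runtime differ in major.minor")
    return problems


if __name__ == "__main__":
    # driver_cuda, nvcc, and cuda_runtime come from this post's command output;
    # torch_cuda="12.8" is an assumption.
    print(check_chain("12.8", "12.8", "12.8", "12.8.90"))
```

An empty list means both rules are satisfied; each string in a non-empty list names one violation.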
Commands for viewing detailed version information
nvidia-smi
Thu Apr 23 20:50:41 2026
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.148.08 Driver Version: 570.148.08 CUDA Version: 12.8 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA H200 On | 00000000:18:00.0 Off | 0 |
| N/A 32C P0 123W / 700W | 130220MiB / 143771MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA H200 On | 00000000:2A:00.0 Off | 0 |
| N/A 32C P0 125W / 700W | 130308MiB / 143771MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 2 NVIDIA H200 On | 00000000:3A:00.0 Off | 0 |
| N/A 31C P0 115W / 700W | 130308MiB / 143771MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 3 NVIDIA H200 On | 00000000:5D:00.0 Off | 0 |
| N/A 30C P0 114W / 700W | 130308MiB / 143771MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 4 NVIDIA H200 On | 00000000:9A:00.0 Off | 0 |
| N/A 31C P0 114W / 700W | 130308MiB / 143771MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 5 NVIDIA H200 On | 00000000:AB:00.0 Off | 0 |
| N/A 32C P0 114W / 700W | 130308MiB / 143771MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 6 NVIDIA H200 On | 00000000:BA:00.0 Off | 0 |
| N/A 31C P0 115W / 700W | 130308MiB / 143771MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 7 NVIDIA H200 On | 00000000:DB:00.0 Off | 0 |
| N/A 30C P0 114W / 700W | 129828MiB / 143771MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 255235 C sglang::scheduler_TP0 13021... |
| 1 N/A N/A 255236 C sglang::scheduler_TP1 13029... |
| 2 N/A N/A 255237 C sglang::scheduler_TP2 13029... |
| 3 N/A N/A 255238 C sglang::scheduler_TP3 13029... |
| 4 N/A N/A 255239 C sglang::scheduler_TP4 13029... |
| 5 N/A N/A 255240 C sglang::scheduler_TP5 13029... |
| 6 N/A N/A 255241 C sglang::scheduler_TP6 13029... |
| 7 N/A N/A 255242 C sglang::scheduler_TP7 12981... |
+-----------------------------------------------------------------------------------------+

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Fri_Feb_21_20:23:50_PST_2025
Cuda compilation tools, release 12.8, V12.8.93
Build cuda_12.8.r12.8/compiler.35583870_0

pip list | grep torch
torch 2.9.1

pip show torch
Name: torch
Version: 2.9.1
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org
Author:
Author-email: PyTorch Team <packages@pytorch.org>
License: BSD-3-Clause
Location: /usr/local/lib/python3.12/dist-packages
Requires: filelock, fsspec, jinja2, networkx, nvidia-cublas-cu12, nvidia-cuda-cupti-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-runtime-cu12, nvidia-cudnn-cu12, nvidia-cufft-cu12, nvidia-cufile-cu12, nvidia-curand-cu12, nvidia-cusolver-cu12, nvidia-cusparse-cu12, nvidia-cusparselt-cu12, nvidia-nccl-cu12, nvidia-nvjitlink-cu12, nvidia-nvshmem-cu12, nvidia-nvtx-cu12, setuptools, sympy, triton, typing-extensions
Required-by: cache_dit, compressed-tensors, flashinfer-python, outlines, quack-kernels, runai-model-streamer, sglang, st_attn, timm, torch_c_dlpack_ext, torchaudio, torchvision, vllm, vsa, xgrammar

pip show nvidia-cuda-runtime-cu12
Name: nvidia-cuda-runtime-cu12
Version: 12.8.90
Summary: CUDA Runtime native Libraries
Home-page: https://developer.nvidia.com/cuda-zone
Author: Nvidia CUDA Installer Team
Author-email: compute_installer@nvidia.com
License: NVIDIA Proprietary Software
Location: /usr/local/lib/python3.12/dist-packages
Requires:
Required-by: torch
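
To compare these outputs without reading them by eye, the version fields can be pulled out with regular expressions. The sketch below is only an illustration: `driver_info` and `nvcc_release` are hypothetical helper names, the patterns assume the text layout shown in the outputs above, and the sample strings are copied from those outputs rather than captured live.

```python
import re
from typing import Optional


def driver_info(nvidia_smi_output: str) -> Optional[tuple[str, str]]:
    """Extract (driver version, CUDA ceiling) from the nvidia-smi header line."""
    match = re.search(
        r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)",
        nvidia_smi_output,
    )
    return (match.group(1), match.group(2)) if match else None


def nvcc_release(nvcc_output: str) -> Optional[str]:
    """Extract the 'release X.Y' field from `nvcc --version` output."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Sample lines copied from the command output in this post.
    smi_header = "| NVIDIA-SMI 570.148.08 Driver Version: 570.148.08 CUDA Version: 12.8 |"
    nvcc_line = "Cuda compilation tools, release 12.8, V12.8.93"
    print(driver_info(smi_header))
    print(nvcc_release(nvcc_line))
```

Both helpers return `None` when the expected field is missing, so a caller can tell "tool not installed or unexpected format" apart from a real version string.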




