TrueNAS Scale and NVidia cards
Description
Problem/Justification
Impact
is duplicated by
Activity
William Gryzbowski January 20, 2021 at 4:06 PM
I have linked a ticket that seems to be a duplicate; let me know if I got that wrong.
Jack January 19, 2021 at 11:15 AM
Do you need any additional info from my side?
Jack January 18, 2021 at 6:51 PM
Welcome to TrueNAS
sirius# modprobe nvidia-current-drm
sirius# nvidia-smi
Mon Jan 18 21:49:10 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------|
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 105...  Off  | 00000000:06:00.0 Off |                  N/A |
| 38%   43C    P8    N/A /  75W |      0MiB /  4040MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
Kris Moore January 18, 2021 at 3:44 PM
Let's first confirm whether this is an issue with the module not loading or with k3s specifically:
Can you run "modprobe nvidia-current-drm" from the CLI? Does that load the kernel module?
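One way to answer the "does that load the kernel module?" question without reading `lsmod` output by eye is sketched below in Python. It assumes the usual Linux `/proc/modules` layout, where the first field on each line is the module name, and it assumes the loaded NVIDIA modules have names starting with `nvidia`. The function names are made up for illustration.

```python
def loaded_nvidia_modules(proc_modules_text):
    """Return names of loaded kernel modules whose name starts with 'nvidia'."""
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The first whitespace-separated field of each line is the module name.
        if fields and fields[0].startswith("nvidia"):
            names.append(fields[0])
    return names


def read_loaded_nvidia_modules(path="/proc/modules"):
    """Read the kernel's module list from `path` and filter it."""
    with open(path, encoding="ascii", errors="replace") as fh:
        return loaded_nvidia_modules(fh.read())
```

Calling `read_loaded_nvidia_modules()` before and after the `modprobe` shows whether anything was actually loaded; an empty list after the `modprobe` would point at the module rather than at k3s.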
Anda January 18, 2021 at 3:19 PM
@Kris Moore
I'm on TrueNAS-SCALE-21.01-MASTER-20210117-112917 and get these errors in my console:
Jan 18 16:14:54 truenas k3s[21752]: E0118 16:14:54.577137 21752 container.go:677] error occurred while collecting nvidia stats for container /kubepods/besteffort/pod7947e7d4-b617-440e-b6ca-4b7e880825f1/f893d4a683f9bef76db8bb878dcc08601986e3073c0fef6215c219726812cb83: %!s(<nil>)
Jan 18 16:14:54 truenas k3s[21752]: W0118 16:14:54.577156 21752 container.go:549] Failed to update stats for container "/kubepods/besteffort/pod7947e7d4-b617-440e-b6ca-4b7e880825f1/f893d4a683f9bef76db8bb878dcc08601986e3073c0fef6215c219726812cb83": error while getting gpu utilization: nvml: Not Supported
Jan 18 16:15:07 truenas k3s[21752]: E0118 16:15:07.964188 21752 container.go:677] error occurred while collecting nvidia stats for container /kubepods/besteffort/pod7947e7d4-b617-440e-b6ca-4b7e880825f1/f893d4a683f9bef76db8bb878dcc08601986e3073c0fef6215c219726812cb83: %!s(<nil>)
Jan 18 16:15:19 truenas k3s[21752]: E0118 16:15:19.835262 21752 container.go:677] error occurred while collecting nvidia stats for container /kubepods/besteffort/pod7947e7d4-b617-440e-b6ca-4b7e880825f1/f893d4a683f9bef76db8bb878dcc08601986e3073c0fef6215c219726812cb83: %!s(<nil>)
Jan 18 16:15:34 truenas k3s[21752]: E0118 16:15:34.119472 21752 container.go:677] error occurred while collecting nvidia stats for container /kubepods/besteffort/pod7947e7d4-b617-440e-b6ca-4b7e880825f1/f893d4a683f9bef76db8bb878dcc08601986e3073c0fef6215c219726812cb83: %!s(<nil>)
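Since the same two messages repeat every few seconds, a small Python sketch like the one below can condense such a console log into counts per source location and per NVML reason. The regular expressions are written against the line layout quoted above only and are an assumption about the general k3s log format; `summarize` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches lines shaped like the ones quoted above:
#   ... k3s[PID]: E0118 16:14:54.577137 21752 container.go:677] message
LINE_RE = re.compile(
    r"k3s\[\d+\]: (?P<sev>[EWIF])\d{4} \S+\s+\d+ (?P<src>[\w.]+:\d+)\] (?P<msg>.*)"
)
# Captures the reason that follows "nvml:" at the end of a message, if present.
NVML_RE = re.compile(r"nvml: (?P<reason>.+)$")


def summarize(lines):
    """Count GPU-related k3s log lines by (severity, source) and by NVML reason."""
    by_source = Counter()
    reasons = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        msg = m["msg"]
        # Keep only messages that mention the GPU stats collection.
        if ("nvidia" not in msg) and ("gpu" not in msg):
            continue
        by_source[(m["sev"], m["src"])] += 1
        reason = NVML_RE.search(msg)
        if reason:
            reasons[reason["reason"]] += 1
    return by_source, reasons
```

Fed the five lines above, this reports four errors from `container.go:677` and one warning from `container.go:549`, with `Not Supported` as the only NVML reason.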
HW Transcoding not working in Plex with NVidia card in Docker on TrueNAS Scale 20.12
I can see the card in TrueNAS, I have the "nvidia-device-plugin-daemonset-4n5r8" container running, and the Plex container is deployed with the environment variables NVIDIA_VISIBLE_DEVICES=all and NVIDIA_DRIVER_CAPABILITIES=all. I can also see the NVIDIA card inside the container when running nvidia-smi, but HW transcoding is not working.
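To double-check that the two variables actually reach the Plex container, a short Python sketch along these lines could be run inside it. It assumes the NVIDIA container runtime's convention that `NVIDIA_DRIVER_CAPABILITIES` is a comma-separated list in which `video` (or `all`) is what exposes the hardware encode/decode libraries; `check_nvidia_env` is a hypothetical helper, not part of any tool.

```python
import os


def check_nvidia_env(env):
    """Return a list of problems found in the NVIDIA-related environment variables."""
    problems = []
    if not env.get("NVIDIA_VISIBLE_DEVICES"):
        problems.append("NVIDIA_VISIBLE_DEVICES is not set")
    # Split the comma-separated capability list and drop empty entries.
    raw_caps = env.get("NVIDIA_DRIVER_CAPABILITIES", "")
    caps = {c.strip() for c in raw_caps.split(",") if c.strip()}
    if not caps:
        problems.append("NVIDIA_DRIVER_CAPABILITIES is not set")
    elif not caps & {"all", "video"}:
        problems.append("NVIDIA_DRIVER_CAPABILITIES lacks 'video' (or 'all')")
    return problems
```

Inside the container, `check_nvidia_env(os.environ)` should return an empty list for the setup described here; if it does, the variables are in place and the cause lies elsewhere.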