Linux:

1. In-place refresh with watch

watch -n 1 -d nvidia-smi

watch -n 2 -d nvidia-smi
Refreshes every 2 seconds, redrawing in place each time (-d highlights differences between updates).


2. Timed queries

nvidia-smi -l 2
Queries every 2 seconds, but each result is appended below the previous one, so the terminal scrollback grows without bound.


Windows:

Go to C:\Program Files\NVIDIA Corporation\NVSMI

and open cmd there.

nvidia-smi: show current GPU usage
nvidia-smi -L: list information for all GPUs
nvidia-smi -l 2: display GPU usage dynamically, updating every 2 seconds (the interval value can be changed)
nvidia-smi -lms <ms>: loop with a millisecond refresh interval
nvidia-smi dmon: device monitor
nvidia-smi -i n: show only the specified GPU (with multiple GPUs, n is the GPU index)
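For scripting, nvidia-smi also has a query mode that prints selected fields as CSV (--query-gpu=... --format=csv,noheader,nounits). A minimal parsing sketch, using a hypothetical sample line in place of real command output:

```python
import csv
import io

def parse_gpu_csv(text):
    # parse lines as produced by:
    #   nvidia-smi --query-gpu=name,memory.used,memory.total,utilization.gpu \
    #              --format=csv,noheader,nounits
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        name, used, total, util = (f.strip() for f in fields)
        rows.append({"name": name, "used_mib": int(used),
                     "total_mib": int(total), "util_pct": int(util)})
    return rows

# hypothetical sample line; real input comes from the command above
sample = "NVIDIA GeForce RTX 3090, 1024, 24576, 37\n"
print(parse_gpu_csv(sample))
```

With `nounits` the memory fields are plain MiB numbers, which keeps the int() conversion simple.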
————————————————
Copyright notice: This is an original article by CSDN blogger "ShellCollector", licensed under CC 4.0 BY-SA; reposts must include the original link and this notice.
Original link: https://blog.csdn.net/jacke121/article/details/115710809

This one produces garbled output:

    import time
    import subprocess

    cmd = "nvidia-smi"
    interval = 3
    while True:
        ps = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True)
        out, _ = ps.communicate()
        # decode bytes to str properly instead of calling str() and
        # slicing off the "b'" prefix and "\r\n'" suffix by hand
        text = out.decode("utf-8", errors="replace")
        # move the cursor home and clear the screen so the table refreshes in place
        print("\033[H\033[J" + text, end="")
        time.sleep(interval)
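The garbling above comes from converting bytes to str with str() and trimming the b'...' wrapper manually. On Python 3.7+, passing text=True to subprocess decodes stdout to str directly; a minimal sketch, using a portable command as a stand-in for nvidia-smi:

```python
import subprocess
import sys

# text=True makes stdout a decoded str, so no "b'" slicing is needed;
# a portable command stands in here for nvidia-smi
result = subprocess.run([sys.executable, "-c", "print('GPU table here')"],
                        capture_output=True, text=True)
print(result.stdout.strip())  # → GPU table here
```

Substitute ["nvidia-smi"] for the stand-in command to capture the real table.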

This one sometimes raises errors:

import pynvml
import time

pynvml.nvmlInit()

def printNvidiaGPU(gpu_id):
    gpu_device = pynvml.nvmlDeviceGetHandleByIndex(gpu_id)

    # GPU temperature
    temperature = pynvml.nvmlDeviceGetTemperature(gpu_device, pynvml.NVML_TEMPERATURE_GPU)
    # total and used GPU memory, reported in bytes
    memory = pynvml.nvmlDeviceGetMemoryInfo(gpu_device)
    totalMemory = memory.total
    usedMemory = memory.used

    performance = pynvml.nvmlDeviceGetPerformanceState(gpu_device)
    powerUsage = pynvml.nvmlDeviceGetPowerUsage(gpu_device)  # milliwatts
    powerState = pynvml.nvmlDeviceGetPowerState(gpu_device)
    fanSpeed = pynvml.nvmlDeviceGetFanSpeed(gpu_device)
    persistenceMode = pynvml.nvmlDeviceGetPersistenceMode(gpu_device)
    utilization = pynvml.nvmlDeviceGetUtilizationRates(gpu_device)

    print("MemoryInfo: {0:.1f}M/{1:.1f}M, usage: {2:.1f}%".format(
        usedMemory / 1024 / 1024, totalMemory / 1024 / 1024,
        usedMemory / totalMemory * 100))
    print("Temperature: {0} °C".format(temperature))
    print("Performance: {0}".format(performance))
    print("PowerState: {0}".format(powerState))
    print("PowerUsage: {0} W".format(powerUsage / 1000))
    print("FanSpeed: {0}".format(fanSpeed))
    print("PersistenceMode: {0}".format(persistenceMode))
    print("UtilizationRates: {0}%".format(utilization.gpu))

while True:
    printNvidiaGPU(0)  # GPU 0 as an example
    time.sleep(1)
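The MemoryInfo line above does its bytes-to-MiB conversion inline; pulling it into a small helper (hypothetical names) makes the formatting easy to check without a GPU:

```python
def mib(nbytes):
    # nvmlDeviceGetMemoryInfo reports bytes; convert to MiB
    return nbytes / 1024 / 1024

def memory_line(used, total):
    # mirrors the "MemoryInfo" print in the loop above
    return "MemoryInfo: {0:.1f}M/{1:.1f}M, usage: {2:.1f}%".format(
        mib(used), mib(total), used / total * 100)

print(memory_line(2 * 1024**3, 8 * 1024**3))  # → MemoryInfo: 2048.0M/8192.0M, usage: 25.0%
```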
