GPU Performance Testing

Written by foreversmart on 2020-06-13

GPU performance test metrics

GPU burn official site:
gpu-burn performance test

    (base) root@278493cc22a8fd69:~/performance_test/gpu-burn# ./gpu_burn 60
    GPU 0: P102-100 (UUID: GPU-441ecc72-0be5-f954-8d85-61658fbb8d81)
    Initialized device 0 with 10156 MB of memory (9975 MB available, using 8978 MB of it), using FLOATS
    11.7% proc'd: 2795 (9270 Gflop/s) errors: 0 temps: 21 C
    Summary at: Mon Dec 14 15:01:19 CST 2020
    23.3% proc'd: 6708 (9193 Gflop/s) errors: 0 temps: 25 C
    Summary at: Mon Dec 14 15:01:26 CST 2020
    35.0% proc'd: 10621 (9173 Gflop/s) errors: 0 temps: 29 C
    Summary at: Mon Dec 14 15:01:33 CST 2020
    46.7% proc'd: 13975 (9159 Gflop/s) errors: 0 temps: 31 C
    Summary at: Mon Dec 14 15:01:40 CST 2020
    58.3% proc'd: 17888 (9154 Gflop/s) errors: 0 temps: 32 C
    Summary at: Mon Dec 14 15:01:47 CST 2020
    68.3% proc'd: 21242 (9144 Gflop/s) errors: 0 temps: 34 C
    Summary at: Mon Dec 14 15:01:53 CST 2020
    80.0% proc'd: 24596 (9141 Gflop/s) errors: 0 temps: 35 C
    Summary at: Mon Dec 14 15:02:00 CST 2020
    91.7% proc'd: 28509 (9139 Gflop/s) errors: 0 temps: 36 C
    Summary at: Mon Dec 14 15:02:07 CST 2020
    100.0% proc'd: 31863 (9123 Gflop/s) errors: 0 temps: 37 C
    Killing processes.. Freed memory for dev 0
    Uninitted cublas
    done

    Tested 1 GPUs:
    GPU 0: OK
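To turn a gpu_burn run like the one above into a few numbers, the progress lines can be parsed with a regular expression. The following is a minimal sketch, not part of gpu-burn itself: the function name `summarize` and the `SAMPLE` constant are hypothetical, and the pattern assumes the single-GPU line format shown in the log above.

```python
import re

# Matches single-GPU gpu_burn progress lines such as:
#   11.7% proc'd: 2795 (9270 Gflop/s) errors: 0 temps: 21 C
# Assumption: the format is the one shown in the log above.
PROGRESS = re.compile(
    r"(?P<pct>[\d.]+)%\s+proc'd:\s+(?P<procd>\d+)\s+\((?P<gflops>\d+) Gflop/s\)"
    r"\s+errors:\s+(?P<errors>\d+)\s+temps:\s+(?P<temp>\d+) C"
)

# Three progress lines copied from the run above.
SAMPLE = (
    "11.7% proc'd: 2795 (9270 Gflop/s) errors: 0 temps: 21 C\n"
    "23.3% proc'd: 6708 (9193 Gflop/s) errors: 0 temps: 25 C\n"
    "100.0% proc'd: 31863 (9123 Gflop/s) errors: 0 temps: 37 C\n"
)


def summarize(log_text):
    """Summarize throughput, temperature, and errors from gpu_burn output."""
    samples = [m.groupdict() for m in PROGRESS.finditer(log_text)]
    if not samples:
        raise ValueError("no gpu_burn progress lines found")
    gflops = [int(s["gflops"]) for s in samples]
    return {
        "samples": len(samples),
        "mean_gflops": sum(gflops) / len(gflops),
        "min_gflops": min(gflops),
        "max_temp_c": max(int(s["temp"]) for s in samples),
        "had_errors": any(int(s["errors"]) > 0 for s in samples),
    }
```

Calling `summarize(SAMPLE)` returns a dictionary; in practice the whole captured output of `./gpu_burn 60` would be passed in instead of the three-line sample.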

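Environment preparation

GPU driver installation: if the machine has no preinstalled graphics driver, it has to be installed manually.
Run nvidia-smi; if the shell reports that the command does not exist, the NVIDIA driver is not installed.
Download the matching NVIDIA driver and run the following command to install it:
    bash NVIDIA-Linux-x86_64-450.80.02.run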
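The nvidia-smi check described above can also be scripted. This is a small sketch under the assumption that a missing `nvidia-smi` binary on the PATH means the driver is not installed; the function name `driver_status` and its return values are hypothetical.

```python
import shutil
import subprocess


def driver_status(tool="nvidia-smi"):
    """Return 'missing', 'broken', or 'ok' for the given driver tool."""
    path = shutil.which(tool)
    if path is None:
        # Command not found: the driver is most likely not installed.
        return "missing"
    # The tool exists; a non-zero exit code suggests a broken driver setup.
    result = subprocess.run([path], capture_output=True, text=True)
    return "ok" if result.returncode == 0 else "broken"
```

If `driver_status()` returns `"missing"`, the driver installer from the step above still needs to be run.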
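The benchmark README recommends installing with conda.
Conda is a package, dependency, and environment management tool; if it is not preinstalled on the system, install it first.
The conda installation page offers two installers:
Miniconda
Anaconda
Here we use Miniconda to install conda for the first time.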

Installing the benchmark

First, install the Python environment with conda:
    conda install -y python=3.7
Then use conda to install pytorch, torchvision, and torchtext:
    conda install -y pytorch torchtext torchvision -c pytorch-nightly
conda lists the packages that need to be installed:
    package                       |            build
    ------------------------------|-----------------
    cudatoolkit-10.2.89           | hfd86e86_1                       365.1 MB
    ffmpeg-4.2.2                  | h20bf706_0                        59.6 MB
    gnutls-3.6.5                  | h71b1129_1002                      1.6 MB
    lame-3.100                    | h7b6447c_0                         323 KB
    libopus-1.3.1                 | h7b6447c_0                         491 KB
    libuv-1.40.0                  | h7b6447c_0                         736 KB
    libvpx-1.7.0                  | h439df22_0                         1.2 MB
    nettle-3.4.1                  | hbb512f6_0                         5.0 MB
    ninja-1.10.2                  | py37hff7bd54_0                     1.4 MB
    openh264-2.1.0                | hd408876_0                         722 KB
    pytorch-1.8.0.dev20210110     | py3.7_cuda10.2.89_cudnn7.6.5_0   655.4 MB  pytorch-nightly
    torchtext-0.9.0.dev20210110   | py37                              11.3 MB  pytorch-nightly
    torchvision-0.9.0.dev20210110 | py37_cu102                        26.2 MB  pytorch-nightly
    x264-1!157.20191217           | h7b6447c_0                         922 KB
    ------------------------------------------------------------
                                                 Total:         1.10 GB
The installation fails with an error:
    joblib-1.0.0         | 207 KB   | ## | 100%
    torchvision-0.2.1    | 75 KB    | ## | 100%
    fsspec-0.8.3         | 69 KB    | ## | 100%
    sphinxcontrib-qthelp | 26 KB    | ## | 100%
    openssl-1.1.1i       | 3.8 MB   | ## | 100%
    sphinxcontrib-devhel | 24 KB    | ## | 100%
    anaconda-custom      | 3 KB     | ## | 100%
    sphinxcontrib-serial | 25 KB    | ## | 100%
    pytorch-1.8.0.dev202 | 860.8 MB |    |   0%
    pytorch-1.8.0.dev202 | 860.8 MB |    |   0%
    torchtext-0.9.0.dev2 | 11.2 MB  | ## | 100%
    mock-4.0.3           | 27 KB    | ## | 100%
    sphinxcontrib-appleh | 30 KB    | ## | 100%
    tbb-2020.3           | 1.4 MB   | ## | 100%
    InvalidArchiveError('Error with archive /root/anaconda3/pkgs/pytorch-1.8.0.dev20201229-py3.7_cuda11.0.221_cudnn8.0.5_0.tar.bz2. You probably need to delete and re-download or re-create this file. Message from libarchive was:\n\nSeek failed')
Take care to pick the right version. Because PyTorch is used for the GPU test, the package we need is pytorch-1.8.0.dev20201229-py3.7_cuda11.0.221_cudnn8.0.5_0.tar.bz2.
We download the offline package with wget and install it offline with conda:
    wget https://anaconda.org/pytorch-nightly/pytorch/1.8.0.dev20201229/download/linux-64/pytorch-1.8.0.dev20201229-py3.7_cuda11.0.221_cudnn8.0.5_0.tar.bz2
    conda install --use-local pytorch-1.8.0.dev20201229-py3.7_cuda11.0.221_cudnn8.0.5_0.tar.bz2
    Downloading and Extracting Packages | 0%
    InvalidArchiveError('Error with archive /root/anaconda3/pkgs/pytorch-1.8.0.dev20201229-py3.7_cuda11.0.221_cudnn8.0.5_0.tar.bz2. You probably need to delete and re-download or re-create this file. Message from libarchive was:\n\nSeek failed')
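The install still fails. This is caused by files left over from the earlier installation attempt; delete the leftover files and run the installation again.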
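The post does not say exactly which files were deleted. The sketch below assumes the leftovers are the cached archive (and any partially extracted folder) named in the error message, under the `/root/anaconda3/pkgs` directory that the message mentions. The function name `remove_leftovers` is hypothetical, and it defaults to a dry run so nothing is deleted by accident.

```python
import shutil
from pathlib import Path


def remove_leftovers(pkgs_dir, package_prefix, dry_run=True):
    """Find (and optionally delete) cache entries whose name starts with package_prefix."""
    matched = []
    for entry in sorted(Path(pkgs_dir).iterdir()):
        if not entry.name.startswith(package_prefix):
            continue
        matched.append(entry)
        if dry_run:
            continue  # only report what would be removed
        if entry.is_dir():
            shutil.rmtree(entry)  # partially extracted package folder
        else:
            entry.unlink()  # corrupted or partial archive
    return matched
```

A possible call would be `remove_leftovers("/root/anaconda3/pkgs", "pytorch-1.8.0.dev20201229", dry_run=False)`, after checking the dry-run result first.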
    The following packages will be UPDATED:

      ca-certificates    2019.1.23-0 --> 2020.12.8-h06a4308_0
      certifi            2019.3.9-py37_0 --> 2020.12.5-py37h06a4308_0
      openssl            1.1.1b-h7b6447c_1 --> 1.1.1i-h27cfd23_0
      pytorch            <unknown>::pytorch-1.8.0.dev20201230-~ --> pytorch-nightly::pytorch-1.8.0.dev20210103-py3.7_cuda11.0.221_cudnn8.0.5_0
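Once the PyTorch package installs cleanly, it is worth confirming that it can actually see the GPU before going further. A minimal sketch, assuming only the standard `torch.cuda` calls; the function name `gpu_report` is hypothetical, and the function degrades gracefully when PyTorch is not importable.

```python
def gpu_report():
    """Report whether PyTorch is installed and whether it can use CUDA."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}
    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report
```

If `cuda_available` is false even though nvidia-smi works, a mismatch between the driver and the CUDA version of the PyTorch build is a likely cause worth checking.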
Download the benchmark and install the packages it requires:
    git clone https://github.com/pytorch/benchmark
    cd benchmark
    python install.py
    Requirement already satisfied: pytest in /usr/local/anaconda3/lib/python3.7/site-packages (from -r requirements.txt (line 1)) (5.4.3)
    Collecting pytest-benchmark
      Downloading pytest_benchmark-3.2.3-py2.py3-none-any.whl (49 kB)
      |████████████████████████████████| 49 kB 358 kB/s
    Requirement already satisfied: requests in /usr/local/anaconda3/lib/python3.7/site-packages (from -r requirements.txt (line 3)) (2.23.0)
    Collecting tabulate
      Downloading tabulate-0.8.7-py3-none-any.whl (24 kB)
    Requirement already satisfied: wcwidth in /usr/local/anaconda3/lib/python3.7/site-packages (from pytest->-r requirements.txt (line 1)) (0.2.4)
    Requirement already satisfied: attrs>=17.4.0 in /usr/local/anaconda3/lib/python3.7/site-packages (from pytest->-r requirements.txt (line 1)) (19.3.0)
    Requirement already satisfied: py>=1.5.0 in /usr/local/anaconda3/lib/python3.7/site-packages (from pytest->-r requirements.txt (line 1)) (1.8.1)
    Requirement already satisfied: more-itertools>=4.0.0 in /usr/local/anaconda3/lib/python3.7/site-packages (from pytest->-r requirements.txt (line 1)) (8.3.0)
    Requirement already satisfied: packaging in /usr/local/anaconda3/lib/python3.7/site-packages (from pytest->-r requirements.txt (line 1)) (20.4)
    Requirement already satisfied: pluggy<1.0,>=0.12 in /usr/local/anaconda3/lib/python3.7/site-packages (from pytest->-r requirements.txt (line 1)) (0.13.1)
    Requirement already satisfied: importlib-metadata>=0.12; python_version < "3.8" in /usr/local/anaconda3/lib/python3.7/site-packages (from pytest->-r requirements.txt (line 1)) (1.6.1)
    Collecting py-cpuinfo
      Downloading py-cpuinfo-7.0.0.tar.gz (95 kB)
      |████████████████████████████████| 95 kB 81 kB/s
    Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/anaconda3/lib/python3.7/site-packages (from requests->-r requirements.txt (line 3)) (1.25.9)
    Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/anaconda3/lib/python3.7/site-packages (from requests->-r requirements.txt (line 3)) (3.0.4)
    Requirement already satisfied: certifi>=2017.4.17 in /usr/local/anaconda3/lib/python3.7/site-packages (from requests->-r requirements.txt (line 3)) (2020.4.5.2)
    Requirement already satisfied: idna<3,>=2.5 in /usr/local/anaconda3/lib/python3.7/site-packages (from requests->-r requirements.txt (line 3)) (2.9)
    Requirement already satisfied: six in /usr/local/anaconda3/lib/python3.7/site-packages (from packaging->pytest->-r requirements.txt (line 1)) (1.15.0)
    Requirement already satisfied: pyparsing>=2.0.2 in /usr/local/anaconda3/lib/python3.7/site-packages (from packaging->pytest->-r requirements.txt (line 1)) (2.4.7)
    Requirement already satisfied: zipp>=0.5 in /usr/local/anaconda3/lib/python3.7/site-packages (from importlib-metadata>=0.12; python_version < "3.8"->pytest->-r requirements.txt (line 1)) (3.1.0)
    Building wheels for collected packages: py-cpuinfo
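Before going further, the dependencies required by each model under torchbenchmark also need to be installed.
For convenience, I wrote a script to install them:
    # List every entry except ADDING_MODELS.md and this script (install.sh)
    a=$(ls . | grep -v ADDING_MODELS.md | grep -v install.sh)
    for element in $a
    do
        # Run each model's installer in a subshell so the working directory is restored
        (cd "$element" && python install.py)
    done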
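Since several model installers fail later on, it can help to keep going after a failure and collect the names of the models that did not install. The following is a sketch of that variant, not the author's script: the function name `install_models` and the `runner` parameter are hypothetical, and it assumes each model is a subdirectory containing its own `install.py`.

```python
import subprocess
import sys
from pathlib import Path


def install_models(models_dir, runner=subprocess.call):
    """Run install.py in every model directory and return the names that failed."""
    failed = []
    for model_dir in sorted(Path(models_dir).iterdir()):
        # Skip plain files such as ADDING_MODELS.md and folders without an installer.
        if not (model_dir / "install.py").is_file():
            continue
        exit_code = runner([sys.executable, "install.py"], cwd=str(model_dir))
        if exit_code != 0:
            failed.append(model_dir.name)
    return failed
```

Calling `install_models(".")` from the directory that holds the model folders returns a list such as the models whose dependencies still need manual attention; `runner` exists only so the loop can be exercised without launching real installers.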
The installation asks about a proxy:
    Unable to verify https connectivity, required for setup. Do you need to use a proxy?
In install.py, modify the pip command to add the proxy setting:
    subprocess.check_call([sys.executable, '-m', 'pip', '--proxy', 'http://127.0.0.1:10000', 'install', '-r', 'requirements.txt'])
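Hard-coding the proxy address means editing install.py again on every machine. One alternative is to build the pip command from an optional proxy value, as in the sketch below. The function name `pip_install_command` is hypothetical, and falling back to the `HTTPS_PROXY` / `https_proxy` environment variables is an assumption about how the proxy would be configured.

```python
import os
import sys


def pip_install_command(requirements="requirements.txt", proxy=None):
    """Build the pip install command, adding --proxy only when a proxy is known."""
    if proxy is None:
        # Assumption: the proxy, if any, is exported in the usual environment variables.
        proxy = os.environ.get("HTTPS_PROXY") or os.environ.get("https_proxy")
    cmd = [sys.executable, "-m", "pip"]
    if proxy:
        cmd += ["--proxy", proxy]
    cmd += ["install", "-r", requirements]
    return cmd
```

The result can be passed straight to `subprocess.check_call`, for example `subprocess.check_call(pip_install_command(proxy="http://127.0.0.1:10000"))`.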
During installation, running python install.py for the model dependencies fails for many models with package-not-found errors such as the following:
    Installed /root/anaconda3/lib/python3.7/site-packages/cityscapesScripts-2.1.7-py3.7.egg
    Processing dependencies for cityscapesScripts==2.1.7
    Searching for appdirs
    Reading http://mirrors.cloud.aliyuncs.com/pypi/simple/appdirs/
    Couldn't find index page for 'appdirs' (maybe misspelled?)
    Scanning index of all packages (this may take a while)
    Reading http://mirrors.cloud.aliyuncs.com/pypi/simple/
    No local packages or working download links found for appdirs
    error: Could not find suitable distribution for Requirement.parse('appdirs')
    Traceback (most recent call last):
      File "install.py", line 19, in <module>
        install_other_dependencies()
      File "install.py", line 14, in install_other_dependencies
        subprocess.check_call(['bash', 'install_dependencies.sh', tmpdir])
      File "/root/anaconda3/lib/python3.7/subprocess.py", line 363, in check_call
        raise CalledProcessError(retcode, cmd)
    subprocess.CalledProcessError: Command '['bash', 'install_dependencies.sh', '/tmp/tmpfs4b436c']' returned non-zero exit status 1.
The missing package has to be installed by hand by running pip install appdirs.

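Running the tests:

There are two basic test scripts under benchmark:
test.py wraps the simplest tests; they exercise the infrastructure by iterating over all models, installing and running each one.
test_bench.py
Running them produced a variety of version errors.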

GPU testing with Docker

https://pytorch.org/ (PyTorch official site)
Because of these errors, we decided to deploy with Docker instead:
Docker PyTorch git repository
It has to be run through the nvidia-docker command:
    nvidia-docker run --rm -ti nerox8664/pytorch-benchmarks
Check the NVIDIA driver and related information:
    nvidia-smi
    Thu Dec 17 14:35:58 2020
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 430.50       Driver Version: 430.50       CUDA Version: 10.1     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  P102-100            Off  | 00000000:00:08.0 Off |                  N/A |
    |  0%   14C    P0    48W / 225W |      0MiB / 10156MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID   Type   Process name                             Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+
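The table above is meant for reading; for scripts, nvidia-smi can emit the same information as CSV through its `--query-gpu` and `--format` options. A minimal sketch follows; the function names `parse_gpu_csv` and `query_gpus` and the choice of queried fields are illustrative assumptions.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi; the selection here is only an example.
QUERY = "name,driver_version,memory.total,temperature.gpu"


def parse_gpu_csv(text):
    """Turn 'csv,noheader' output from nvidia-smi into one dict per GPU."""
    fields = QUERY.split(",")
    rows = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [dict(zip(fields, (cell.strip() for cell in row))) for row in rows if row]


def query_gpus():
    """Run nvidia-smi and return the parsed per-GPU information."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_csv(result.stdout)
```

For the card shown above, a line such as `P102-100, 430.50, 10156 MiB, 14` would be parsed into a dictionary keyed by the four field names.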
