PyTorch error: the program runs on only one GPU

Article link: https://blog.csdn.net/ResumeProject/article/details/126961710

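During one experiment I found that the program could only run on GPU 0. The cause was that the program set the GPU in two different ways at the same time, and the two settings conflicted.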

import os
# Expose only physical GPU 0 to this process; must run before CUDA is initialized.
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

A global search for cuda with Ctrl+Shift+F turned up the second setting:

***.to("cuda:1")
self.net = Model('***_dict.npy').to(device='cuda:1')
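
The conflict between the two snippets can be modelled without a GPU. The sketch below is a simplified model, not PyTorch's own code: it assumes CUDA_VISIBLE_DEVICES holds only comma-separated integer IDs, and the helper name physical_gpu_for is hypothetical. It shows that the index in "cuda:N" counts positions in the visible list, so with only GPU 0 visible there is no position 1 to select.

```python
def physical_gpu_for(visible_devices: str, logical_index: int) -> int:
    """Map a process-local index (the 1 in "cuda:1") to a physical GPU ID.

    Simplified model: assumes comma-separated integer IDs only.
    """
    physical_ids = [int(item) for item in visible_devices.split(",") if item.strip()]
    if not 0 <= logical_index < len(physical_ids):
        raise IndexError(
            f"cuda:{logical_index} does not exist: "
            f"only {len(physical_ids)} device(s) visible"
        )
    return physical_ids[logical_index]


# With two GPUs exposed, "cuda:1" is the second entry of the list.
print(physical_gpu_for("0,1", 1))

# The visible list is re-indexed from zero, so "cuda:0" here is physical GPU 2.
print(physical_gpu_for("2,3", 0))

# The setup from this post: only GPU 0 is visible, yet the code asks for "cuda:1".
try:
    physical_gpu_for("0", 1)
except IndexError as err:
    print(err)
```

In this model the last call fails, which matches the symptom: once the environment variable hides every card except GPU 0, a hard-coded "cuda:1" elsewhere in the code has nothing to point at.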

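For the difference between the two and how each one works, see the article "os.environ[CUDA_VISIBLE_DEVICES] does not work properly". Fixing the problem only requires making the device settings consistent.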
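
One possible way to keep the device setting consistent is to resolve the device once and pass that single object everywhere, instead of scattering "cuda:1" strings through the code. This is a minimal sketch under assumptions: resolve_device is a hypothetical helper, and the small Linear layer stands in for the real model, which the post does not show. It falls back to the CPU when no GPU is visible.

```python
import torch


def resolve_device(preferred_index: int = 0) -> torch.device:
    """Return one device object for the whole program (hypothetical helper)."""
    if not torch.cuda.is_available():
        return torch.device("cpu")
    visible = torch.cuda.device_count()
    if preferred_index >= visible:
        raise ValueError(
            f"cuda:{preferred_index} requested but only {visible} device(s) visible"
        )
    return torch.device(f"cuda:{preferred_index}")


# Resolve once, then reuse the same object for the model and the data.
DEVICE = resolve_device(0)

model = torch.nn.Linear(4, 2).to(DEVICE)   # stand-in for the real model
inputs = torch.rand(3, 4, device=DEVICE)   # created directly on the same device
outputs = model(inputs)

print(DEVICE, outputs.shape)
```

With this layout, choosing a different card means changing either CUDA_VISIBLE_DEVICES or the single index passed to resolve_device, not both.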

C&G

Also, if no device index is given explicitly, the device returned by torch.cuda.current_device() is used.
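
The difference between a device with and without an explicit index can be inspected directly. The sketch below is illustrative only. Building a torch.device object does not need a GPU, so the first part runs anywhere; the part that allocates a tensor is guarded and only runs when CUDA is available.

```python
import torch

generic = torch.device("cuda")     # no index: resolved to the current device when used
specific = torch.device("cuda:1")  # explicit index 1

print(generic.index)   # None, because no index was specified
print(specific.index)  # 1

if torch.cuda.is_available():
    # A tensor placed on the index-free device lands on the current device.
    tensor = torch.rand(2, device=generic)
    print(tensor.device.index == torch.cuda.current_device())
```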

CPU	to('cpu')	to(torch.device('cpu'))	cpu()
Current GPU	to('cuda')	to(torch.device('cuda'))	cuda()
Specific GPU	to('cuda:1')	to(torch.device('cuda:1'))	cuda(device=1)

Question	Command
Does PyTorch see any GPUs?	torch.cuda.is_available()
Are tensors stored on the GPU by default?	torch.rand(10).device
Set the default tensor type to CUDA:	torch.set_default_tensor_type(torch.cuda.FloatTensor)
Is this tensor a GPU tensor?	my_tensor.is_cuda
Is this model stored on the GPU?	all(p.is_cuda for p in my_model.parameters())

>>> import torch

>>> torch.cuda.is_available()
True

>>> torch.cuda.device_count()
1

>>> torch.cuda.current_device()
0

>>> torch.cuda.device(0)
<torch.cuda.device at 0x7efce0b03be0>

>>> torch.cuda.get_device_name(0)
'GeForce GTX 950M'


https://stackoverflow.com/questions/48152674/how-do-i-check-if-pytorch-is-using-the-gpu