大白呼 2021-05-13 22:17 Acceptance rate: 0%

Assertion `t >= 0 && t < n_classes` failed.

Training fails when I run it from VS Code. Where exactly is the problem?

/opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [3,0,0] Assertion `t >= 0 && t < n_classes` failed.
Traceback (most recent call last):
  File "train_classify.py", line 277, in <module>
    train(args, model, criterion, optimizer, device, train_dataloader, writer, epoch)
  File "train_classify.py", line 42, in train
    running_loss += loss.item()
RuntimeError: CUDA error: device-side assert triggered

3 answers

  • 爱晚乏客游 2021-05-14 00:17

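    This is most likely a label problem: some label's class id falls outside the number of classes you are training with. For example, with only 5 classes the valid ids are 0-4, so a label with id=6 triggers this error.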
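As a rough illustration, the condition in the assertion message, `t >= 0 && t < n_classes`, can be written out in plain Python. The helper name `is_valid_target` is made up for this sketch and is not part of PyTorch. Note that with 5 classes an id of 5 is already out of range, which is what happens when labels are numbered from 1 instead of 0.

```python
def is_valid_target(t, n_classes):
    """Mirror the assertion: a target is valid only if 0 <= t < n_classes."""
    return 0 <= t < n_classes


n_classes = 5  # example value taken from the answer above
for t in (0, 4, 5, 6, -1):
    status = "ok" if is_valid_target(t, n_classes) else "out of range"
    print(f"id={t}: {status}")
```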

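    There are two fixes. One is to write a script that scans the label file, finds any labels outside the class range, and removes them. The other is to validate the label data just before it is passed into train.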
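Both checks could look roughly like the sketch below. It rests on assumptions, since the thread does not show the label format: each non-empty line of the label file is assumed to end with an integer class id (for example `img_001.jpg 3`), and the function names `find_bad_labels` and `check_batch_labels` are hypothetical.

```python
def find_bad_labels(lines, n_classes):
    """Return (line_number, label) pairs whose label is outside 0..n_classes-1.

    Assumption: each non-empty line ends with an integer class id.
    """
    bad = []
    for line_no, line in enumerate(lines, start=1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        label = int(parts[-1])
        if not 0 <= label < n_classes:
            bad.append((line_no, label))
    return bad


def check_batch_labels(labels, n_classes):
    """Raise ValueError if any label in the batch is out of range."""
    bad = sorted({int(t) for t in labels if not 0 <= int(t) < n_classes})
    if bad:
        raise ValueError(f"labels out of range [0, {n_classes - 1}]: {bad}")
```

To scan a file, pass the open file object, e.g. `with open(path) as f: print(find_bad_labels(f, 5))`. For the second check, a PyTorch label tensor can be converted first with `labels.tolist()` and checked before the loss is computed, so the failure shows up as a readable Python exception instead of a device-side assert.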

