[transformers.Trainer pitfall] Inconsistent logits and labels dimensions in a custom compute_metrics

Problem description

While training a model with transformers.Trainer, I defined a custom compute_loss and a custom compute_metrics; the model is a simple binary classifier. At evaluation time, however, the logits and labels handed to compute_metrics do not have consistent dimensions.
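For context, the compute_loss below lives on a Trainer subclass and calls self.loss_func, which the post never defines. The following is a hypothetical sketch of that wiring, with BCEWithLogitsLoss inferred from the labels.float() call further down, not the author's actual setup:

```python
import torch.nn as nn
from transformers import Trainer

class BinaryTrainer(Trainer):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Assumption: a logits-based binary loss, since compute_loss
        # passes raw model outputs together with labels.float().
        self.loss_func = nn.BCEWithLogitsLoss()
```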

My custom compute_loss looks like this:

def compute_loss(self, model, inputs, return_outputs=False):
    """
    Override Trainer.compute_loss:
    1) pull images, bboxes, locs, labels, etc. out of the input dict
    2) run the images through vision_encoder to get features
    3) feed those features to the downstream model for prediction
    4) compute and return the loss
    """
    # Forward pass
    outputs, labels = model(**inputs)  # (bz, num_classes), or (bz*num_frames, num_classes)

    batch_size = inputs['labels'].shape[0]

    outputs = outputs.squeeze()  # (bz*num_frames)

    # squeeze() also drops the batch axis when bz == 1, so restore it
    if batch_size == 1:
        outputs = outputs.unsqueeze(0)

    # Compute the loss
    loss = self.loss_func(outputs, labels.float())

    return (loss, outputs) if return_outputs else loss
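The post breaks off before its own diagnosis, but there is a known Trainer behavior that produces exactly the mismatch in the title, so it is worth spelling out (this is an inference from the transformers source, not the author's text). When compute_loss returns (loss, outputs) with outputs as a bare tensor, Trainer.prediction_step extracts the eval logits as outputs[1:], treating outputs like a Hugging Face (loss, logits, ...) output tuple; on a tensor, that slice silently drops the first example of every eval batch, so compute_metrics receives predictions with fewer rows than labels. Wrapping the tensor in a dict, e.g. return (loss, {"logits": outputs}), sidesteps the slicing.

On top of that, here is a minimal compute_metrics sketch for the binary setup above; it is an assumption-laden reconstruction, not the original author's code. Since compute_loss feeds raw logits and labels.float() to the loss, one sigmoid logit per example is assumed, hence the threshold at zero; the reshape(-1) calls flatten (bz,), (bz, 1), and (bz*num_frames,) shapes alike before comparison:

```python
import numpy as np

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    # Trainer hands over a tuple of arrays when the model emitted several
    # tensors; keep only the first entry (assumed to be the logits).
    if isinstance(predictions, tuple):
        predictions = predictions[0]
    # Flatten both sides so differing trailing singleton dims can't
    # trigger a shape mismatch in the element-wise comparison below.
    predictions = np.asarray(predictions).reshape(-1)
    labels = np.asarray(labels).reshape(-1)
    # Assumption: one sigmoid logit per example, so logit > 0 => class 1.
    preds = (predictions > 0).astype(np.int64)
    return {"accuracy": float((preds == labels).mean())}
```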