Cantonese Speech Recognition Model

Author: 深圳市进化图灵智能科技有限公司

Last modified: 2025-04-20 23:13:35

Category: AI. Tags: artificial intelligence, speech recognition

First published: 2024-12-27 10:31:57

Article link: https://blog.csdn.net/WMX843230304WMX/article/details/144760733

AI study and discussion QQ group	873673497
Website	turingevo.com
Email	wmx@turingevo.com
GitHub	https://github.com/turingevo
Hugging Face	https://huggingface.co/turingevo

Base model

Platform	ID
ModelScope	iic/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch

Fine-tuning dataset

Platform	ID
ModelScope	modelscope/speech_asr_commonvoice_cantonese-CHS_trainsets

Fine-tuned model

Platform	ID
Hugging Face	turingevo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch-lora

Inference

Test audio: common_voice_yue_31189594.wav, reference transcript: 睇我几有礼貌去之前讲返声

from funasr import AutoModel

# Test clip from the Cantonese Common Voice dataset (local path on the author's machine).
audio_path = "/media/wmx/soft1/huggingface_cache/data/speech_asr_commonvoice_cantonese-CHS_trainsets/test/common_voice_yue_31189594.wav"
# audio_path = "/media/wmx/soft1/AI-model/FunASR/asr_example_zh.wav"
# audio_path = "/media/wmx/soft1/AI-model/FunASR/asr_example_en.wav"

# Directory of the fine-tuned model; swap in the commented path to run the base model instead.
model_dir = "/media/wmx/soft1/huggingface_cache/out_models/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch-lora"
# model_dir = "/media/wmx/soft1/huggingface_cache/hub/iic/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch"

model = AutoModel(model=model_dir)

# The result is a list with one dict per input, holding 'key', 'text' and 'timestamp'.
res = model.generate(input=audio_path)
print(res)

Result:

[
{'key': 'common_voice_yue_31189594', 
'text': '睇 我 几 有 礼 貌 去 之 前 返 声', 
'timestamp': [[1410, 1650], [1730, 1970], [2050, 2270], [2270, 2470], [2470, 2690], [2690, 2930], [3230, 3470], [3550, 3770], [3770, 4010], [4010, 4250], [4270, 4490]]}
]
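
To make the result easier to read, here is a minimal sketch that pairs each recognized character with its timestamp and scores the output against the reference transcript using character error rate (CER). The result is pasted in as a literal so the snippet runs without FunASR. The helper names `align_tokens`, `edit_distance` and `char_error_rate` are illustrative and not part of FunASR, and treating the timestamps as milliseconds is an assumption.

```python
# Result copied from above; in practice this would be the return value of model.generate().
res = [
    {
        "key": "common_voice_yue_31189594",
        "text": "睇 我 几 有 礼 貌 去 之 前 返 声",
        "timestamp": [[1410, 1650], [1730, 1970], [2050, 2270], [2270, 2470],
                      [2470, 2690], [2690, 2930], [3230, 3470], [3550, 3770],
                      [3770, 4010], [4010, 4250], [4270, 4490]],
    }
]
# Reference transcript of the test clip, as given in the Inference section.
reference = "睇我几有礼貌去之前讲返声"


def align_tokens(item):
    """Pair each space-separated token with its [start, end] timestamp."""
    tokens = item["text"].split()
    stamps = item["timestamp"]
    if len(tokens) != len(stamps):
        raise ValueError("token and timestamp counts differ")
    return [(tok, start, end) for tok, (start, end) in zip(tokens, stamps)]


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (two-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_error_rate(ref, hyp):
    """CER = character-level edit distance divided by reference length."""
    return edit_distance(ref, hyp) / len(ref)


aligned = align_tokens(res[0])
hypothesis = "".join(tok for tok, _, _ in aligned)

for tok, start, end in aligned:
    # Timestamps are assumed to be in milliseconds.
    print(f"{tok}\t{start}\t{end}")
print(f"CER = {char_error_rate(reference, hypothesis):.3f}")
```

On this clip the recognized text differs from the reference by one missing character (讲), so the sketch reports one deletion over 12 reference characters, a CER of about 0.083. A single clip says little about overall accuracy; a meaningful figure would need the same calculation over the whole test split.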