Contents

- Whisper
- Faster Whisper
  - Installation and usage
  - Attempting a WSL deployment
  - Attempting a Jetson deployment
  - Word timestamps
- Real-time transcription

## Whisper

Whisper is a general-purpose speech recognition model. Trained on a large and diverse audio dataset, it is also a multitask model that can perform multilingual speech recognition, speech translation, and language identification.

For testing I generated a clip with ChatTTS: "四川美食确实以辣闻名，但也有不辣的选择。比如甜水面、赖汤圆、蛋烘糕、叶儿粑等，这些小吃口味温和，甜而不腻，也很受欢迎。" (Sichuan food is indeed famous for being spicy, but there are non-spicy options too, such as tianshui noodles, Lai tangyuan, egg pancake, and ye'erba; these snacks are mild in flavor, sweet but not cloying, and also very popular.)

```
$ pip install -U openai-whisper
$ sudo apt update && sudo apt install ffmpeg
$ pip install setuptools-rust
$ whisper ../audio.wav --model tiny
100%|█████████████████████████████████████| 72.1M/72.1M [00:36<00:00, 2.08MiB/s]
/home/jetson/.local/lib/python3.8/site-packages/whisper/__init__.py:146: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(fp, map_location=device)
/home/jetson/.local/lib/python3.8/site-packages/whisper/transcribe.py:115: UserWarning: FP16 is not supported on CPU; using FP32 instead
  warnings.warn("FP16 is not supported on CPU; using FP32 instead")
Detecting language using up to the first 30 seconds.
Use --language to specify the language
Detected language: Chinese
[00:00.000 --> 00:03.680] 四川美時確實以辣文明 但以有不辣的選擇
[00:03.680 --> 00:07.200] 比如潛水面 賴湯圓 再轟高夜熱八等
[00:07.200 --> 00:11.560] 這些小市口維溫和 然後甜而不膩也很受歡迎
```

All of this ran on the CPU; the GPU never even broke a sweat.

## Faster Whisper

faster-whisper is a reimplementation of OpenAI's Whisper model using CTranslate2, a fast inference engine for Transformer models.

FunASR has a serious drawback: its real-time transcription runs on the CPU and is slow, while its GPU mode supports offline speech-to-text but not real-time use. faster-whisper turned out to support real-time GPU transcription, Chinese included.

Reference: Faster-Whisper real-time recognition of PC audio to text, with the faster-whisper-large-v3 model.

### Installation and usage

```
pip install faster-whisper
```

```python
from faster_whisper import WhisperModel

model_size = "large-v3"

# Run on GPU with FP16
# model = WhisperModel(model_size, device="cuda", compute_type="float16")
# or run on GPU with INT8
model = WhisperModel(model_size, device="cuda", compute_type="int8_float16")
# or run on CPU with INT8
# model = WhisperModel(model_size, device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.mp3", beam_size=5)

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
```

### Attempting a WSL deployment

CUDA 12.6, cuDNN 9.2. Running it directly fails with:

```
Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory
```

The cuBLAS and cuDNN Python packages are needed:

```
pip install nvidia-cublas-cu12 nvidia-cudnn-cu12

export LD_LIBRARY_PATH=`python3 -c 'import os; import nvidia.cublas.lib; import nvidia.cudnn.lib; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__))'`
```

Even so, it still would not run, because: "Version 9 of nvidia-cudnn-cu12 appears to cause issues due its reliance on cuDNN 9 (Faster-Whisper does not currently support cuDNN 9). Ensure your version of the Python package is for cuDNN 8."

Then just install cuDNN 8, right? I promptly downloaded cuDNN 8 for CUDA 12.x, yet every install kept landing on cuDNN 9.4; short of downgrading CUDA, there was no way back to cuDNN 8.

### Attempting a Jetson deployment

CUDA 11.4, cuDNN 8.6.0: practically tailor-made. First, try installing the cuDNN Python package:

```
$ pip3 install faster-whisper -i https://mirrors.aliyun.com/pypi/simple/
```

The docs thoughtfully remind us:

> For all these methods below, keep in mind the above note regarding CUDA versions.
> Depending on your setup, you may need to install the CUDA 11 versions of libraries that correspond to the CUDA 12 libraries listed in the instructions below.

```
$ pip install --extra-index-url https://pypi.nvidia.com nvidia-cudnn-cu11
...
The installation of nvidia-cudnn-cu11 for version 9.0.0.312 failed.

This is a special placeholder package which downloads a real wheel package
from https://pypi.nvidia.com. If https://pypi.nvidia.com is not reachable, we
cannot download the real wheel file to install.

You might try installing this package via
$ pip install --extra-index-url https://pypi.nvidia.com nvidia-cudnn-cu11

Here is some debug information about your platform to include in any bug
report:

Python Version: CPython 3.8.10
Operating System: Linux 5.10.104-tegra
CPU Architecture: aarch64
nvidia-smi command not found. Ensure NVIDIA drivers are installed.
```

So that was it: nvidia-cudnn-cu11 has no aarch64 (Arm) build, while nvidia-cudnn-cu12 does.

What now, install CUDA 12.2? A Jetson is flashed offline, and JetPack 6 does support CUDA 12.2 with cuDNN 8. I was already about to buy a new SSD and reflash, but that is a lot of trouble: set up a VM and install the flashing SDK, open the case and move the jumper, reconfigure the SSH network connection, and, crucially, spend money.

Still, how would I know without trying? Refusing to give up, I installed the cu12 Python package anyway:

```
$ pip install --extra-index-url https://pypi.nvidia.com nvidia-cudnn-cu12
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://pypi.nvidia.com
Collecting nvidia-cudnn-cu12
  Downloading nvidia_cudnn_cu12-9.4.0.58-py3-none-manylinux2014_aarch64.whl.metadata (1.6 kB)
Collecting nvidia-cublas-cu12 (from nvidia-cudnn-cu12)
  Downloading https://pypi.nvidia.com/nvidia-cublas-cu12/nvidia_cublas_cu12-12.6.1.4-py3-none-manylinux2014_aarch64.whl (376.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 376.7/376.7 MB 12.9 MB/s eta 0:00:00
Downloading nvidia_cudnn_cu12-9.4.0.58-py3-none-manylinux2014_aarch64.whl (572.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 572.7/572.7 MB 1.1 MB/s eta 0:00:00
Installing collected packages: nvidia-cublas-cu12, nvidia-cudnn-cu12
Successfully installed nvidia-cublas-cu12-12.6.1.4 nvidia-cudnn-cu12-9.4.0.58
```

Now run the demo, test.py:
```
preprocessor_config.json: 100%|████████████████████████████████| 340/340 [00:00<00:00, 118kB/s]
config.json: 100%|█████████████████████████████████████████████| 2.39k/2.39k [00:00<00:00, 1.03MB/s]
vocabulary.json: 100%|█████████████████████████████████████████| 1.07M/1.07M [00:00<00:00, 1.13MB/s]
tokenizer.json: 100%|██████████████████████████████████████████| 2.48M/2.48M [00:01<00:00, 2.14MB/s]
model.bin: 100%|███████████████████████████████████████████████| 3.09G/3.09G [03:18<00:00, 9.89MB/s]
Traceback (most recent call last):
  File "test.py", line 9, in <module>
    model = WhisperModel(model_size, device="cuda", compute_type="int8_float16")
  File "/home/jetson/.local/lib/python3.8/site-packages/faster_whisper/transcribe.py", line 145, in __init__
    self.model = ctranslate2.models.Whisper(
ValueError: This CTranslate2 package was not compiled with CUDA support
```

Holy... what is it this time? Searching turns up the issue "This CTranslate2 package was not compiled with CUDA support #1306". Skipping over the discussion there and combining it with this note from the faster-whisper README:

> Note: Latest versions of ctranslate2 support CUDA 12 only. For CUDA 11, the current workaround is downgrading to the 3.24.0 version of ctranslate2 (This can be done with `pip install --force-reinstall ctranslate2==3.24.0` or specifying the version in a requirements.txt).

CUDA 11 acting up yet again. It says to downgrade:

```
$ pip install --force-reinstall ctranslate2==3.24.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
mediapipe 0.8.4 requires opencv-contrib-python, which is not installed.
onnx-graphsurgeon 0.3.12 requires onnx, which is not installed.
d2l 0.17.6 requires numpy==1.21.5, but you have numpy 1.24.4 which is incompatible.
d2l 0.17.6 requires requests==2.25.1, but you have requests 2.32.3 which is incompatible.
```
```
faster-whisper 1.0.3 requires ctranslate2<5,>=4.0, but you have ctranslate2 3.24.0 which is incompatible.
```

Pah. Let me try compiling a CUDA build myself: https://opennmt.net/CTranslate2/installation.html#compile-the-c-library

```
$ pip3 uninstall ctranslate2 whisper-ctranslate2
$ git clone --recursive https://github.com/OpenNMT/CTranslate2.git
$ mkdir build && cd build
$ cmake ..
...
CMake Error at CMakeLists.txt:294 (message):
  Intel OpenMP runtime libiomp5 not found
-- Configuring incomplete, errors occurred!
```

Where does Intel come into this? A look at the docs explains: "By default, the library is compiled with the Intel MKL backend which should be installed separately. See the Build options to select or add another backend." So switch off the Intel bits:

```
# Watch closely, old Zhang: one unbroken take. I only perform it once.
$ cmake .. -DOPENMP_RUNTIME=COMP -DWITH_MKL=OFF -DWITH_CUDA=ON -DWITH_CUDNN=ON
$ make -j32
$ sudo make install
$ sudo ldconfig
$ cd ../python
$ pip install -r install_requirements.txt
$ python setup.py bdist_wheel
$ pip install dist/*.whl
```

It works. Time to celebrate!

### Word timestamps

```python
from faster_whisper import WhisperModel

model_size = "large-v3"

# Run on GPU with FP16
# model = WhisperModel(model_size, device="cuda", compute_type="float16")
# or run on GPU with INT8
model = WhisperModel(model_size, device="cuda", compute_type="int8_float16")
# or run on CPU with INT8
# model = WhisperModel(model_size, device="cpu", compute_type="int8")

segments, _ = model.transcribe("audio.wav", word_timestamps=True)

for segment in segments:
    for word in segment.words:
        print("[%.2fs -> %.2fs] %s" % (word.start, word.end, word.word))
```

```
[0.00s -> 0.24s] 四
[0.24s -> 0.44s] 川
[0.44s -> 0.58s] 美
[0.58s -> 0.78s] 食
[0.78s -> 1.10s] 确
..
[9.72s -> 9.96s] 腻
[9.96s -> 10.42s] 也
[10.42s -> 10.68s] 很
[10.68s -> 10.82s] 受
[10.82s -> 11.04s] 欢
[11.04s -> 11.22s] 迎
```

## Real-time transcription

Whisper real-time streaming for long-form speech-to-text transcription and translation. Whisper is one of the recent state-of-the-art multilingual speech recognition and translation models; however, it was not designed for real-time transcription. Whisper-Streaming builds on top of Whisper to implement real-time transcription and translation for Whisper-like models. It uses the local agreement policy with self-adaptive latency to enable streaming transcription: its authors report high quality with 3.3 second latency on an unsegmented long-form speech transcription test set, and demonstrate its robustness and practical usability as a component of a live transcription service at a multilingual conference.

```
$ git clone git@github.com:ufal/whisper_streaming.git
$ cd whisper_streaming
$ python3 whisper_online.py ../audio.wav --language zh --min-chunk-size 1
INFO Audio duration is: 11.68 seconds
INFO Loading Whisper large-v2 model for zh...
INFO done. It took 14.19 seconds.
DEBUG PROMPT:
DEBUG CONTEXT:
DEBUG transcribing 1.00 seconds from 0.00
DEBUG COMPLETE NOW: (None, None, )
DEBUG INCOMPLETE: (0.0, 0.98, 四川美食群)
DEBUG len of buffer now: 1.00
DEBUG ## last processed 1.00 s, now is 5.30, the latency is 4.29
DEBUG PROMPT:
DEBUG CONTEXT:
DEBUG transcribing 5.30 seconds from 0.00
DEBUG COMPLETE NOW: (0.0, 0.88, 四川美食)
DEBUG INCOMPLETE: (0.88, 5.26, 确实以辣为名,但也有不辣的选择,比如甜水面赖淘宝。)
DEBUG len of buffer now: 5.30
11643.5227 0 880 四川美食
11643.5227 0 880 四川美食
DEBUG ## last processed 5.30 s, now is 11.64, the latency is 6.35
DEBUG PROMPT:
DEBUG CONTEXT: 四川美食
DEBUG transcribing 11.64 seconds from 0.00
DEBUG COMPLETE NOW: (None, None, )
DEBUG INCOMPLETE: (0.88, 11.24, 確實以辣聞名,但也有不辣的選擇,比如甜水麵、瀨湯圓、炸烘糕、葉子粑等,這些小吃口味溫和,然後甜而不膩,也很受歡迎。)
DEBUG len of buffer now: 11.64
DEBUG ## last processed 11.64 s, now is 21.61, the latency is 9.96
DEBUG PROMPT:
DEBUG CONTEXT: 四川美食
DEBUG transcribing 11.68 seconds from 0.00
DEBUG COMPLETE NOW: (None, None, )
DEBUG INCOMPLETE: (0.88, 11.32, 确实以辣闻名,但也有不辣的选择,比如甜水面、赖汤圆、炸烘糕叶、热巴等,这些小吃口味温和,然后甜而不腻,也很受欢迎。)
DEBUG len of buffer now: 11.68
DEBUG ## last processed 21.61 s, now is 31.53, the latency is 9.92
DEBUG last, noncommited: (0.88, 11.32, 确实以辣闻名,但也有不辣的选择,比如甜水面、赖汤圆、炸烘糕叶、热巴等,这些小吃口味温和,然后甜而不腻,也很受欢迎。)
31528.1091 880 11320 确实以辣闻名,但也有不辣的选择,比如甜水面、赖汤圆、炸烘糕叶、热巴等,这些小吃口味温和,然后甜而不腻,也很受欢迎。
```
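The COMPLETE/INCOMPLETE split in the log above reflects Whisper-Streaming's local agreement policy: the growing audio buffer is re-transcribed, and only the prefix on which two consecutive hypotheses agree gets committed. Here is a minimal sketch of that idea over plain word lists; this is my own simplification, and the real implementation additionally tracks word timestamps and trims the audio buffer:

```python
def commit_by_agreement(prev_hyp, new_hyp):
    """Return (committed, pending): the longest common word prefix of two
    successive hypotheses is committed; the rest stays in the buffer."""
    committed = []
    for a, b in zip(prev_hyp, new_hyp):
        if a != b:
            break
        committed.append(a)
    return committed, new_hyp[len(committed):]

# Two successive hypotheses for the same growing audio, as in the log:
h1 = ["四川", "美食", "群"]
h2 = ["四川", "美食", "确实", "以辣", "闻名"]
done, pending = commit_by_agreement(h1, h2)
print(done)     # ['四川', '美食'] -> stable, safe to emit
print(pending)  # ['确实', '以辣', '闻名'] -> still unstable, kept for the next pass
```

This is why 四川美食 is committed at the 5.30 s step in the log while the rest of the sentence stays INCOMPLETE until later hypotheses stabilize.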
```
31528.1091 880 11320 确实以辣闻名,但也有不辣的选择,比如甜水面、赖汤圆、炸烘糕叶、热巴等,这些小吃口味温和,然后甜而不腻,也很受欢迎。
```

Note: on changing the model quantization:

```python
# this worked fast and reliably on NVIDIA L40
# model = WhisperModel(model_size_or_path, device="cuda", compute_type="float16", download_root=cache_dir)

# or run on GPU with INT8
# tested: the transcripts were different, probably worse than with FP16, and it was slightly (appx 20%) slower
model = WhisperModel(model_size_or_path, device="cuda", compute_type="int8_float16")
```
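The quantization notes above can be folded into a small chooser so the same script degrades gracefully across machines. The device and compute_type strings below are the ones passed to faster-whisper's WhisperModel throughout this post; the selection policy itself is only my assumption, not something faster-whisper provides:

```python
def pick_model_config(has_cuda, prefer_speed=True):
    """Pick a (device, compute_type) pair for WhisperModel.

    Policy (an assumption, mirroring the notes above): FP16 on GPU when
    transcript quality and speed matter most, INT8+FP16 on GPU to save
    memory at a small quality/speed cost, and INT8 on CPU as the fallback.
    """
    if has_cuda:
        return ("cuda", "float16") if prefer_speed else ("cuda", "int8_float16")
    return ("cpu", "int8")

print(pick_model_config(True))                      # ('cuda', 'float16')
print(pick_model_config(True, prefer_speed=False))  # ('cuda', 'int8_float16')
print(pick_model_config(False))                     # ('cpu', 'int8')
```

With faster-whisper installed, the result would feed straight into `WhisperModel(model_size, device=device, compute_type=compute_type)`.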