当前位置：首页 > news >正文

长沙专业做网站公司做互联网的网站

news 2026/4/15 12:40:16

长沙专业做网站公司,做互联网的网站,电商ui设计师的发展前景,wordpress服务Baichuan2是百川智能推出的新一代开源大语言模型#xff0c;采用 2.6 万亿 Tokens 的高质量语料训练。在多个权威的中文、英文和多语言的通用、领域 benchmark 上取得同尺寸最佳的效果。包含有 7B、13B 的 Base 和 Chat 版本#xff0c;并提供了 Chat 版本的 4bits 量化。模…Baichuan2是百川智能推出的新一代开源大语言模型采用 2.6 万亿 Tokens 的高质量语料训练。在多个权威的中文、英文和多语言的通用、领域 benchmark 上取得同尺寸最佳的效果。包含有 7B、13B 的 Base 和 Chat 版本并提供了 Chat 版本的 4bits 量化。模型下载基座模型 Baichuan2-7B-Base https://huggingface.co/baichuan-inc/Baichuan2-7B-Basehttps://huggingface.co/baichuan-inc/Baichuan2-7B-BaseBaichuan2-13B-Base https://huggingface.co/baichuan-inc/Baichuan2-13B-Basehttps://huggingface.co/baichuan-inc/Baichuan2-13B-Base 对齐模型 Baichuan2-7B-Chat https://huggingface.co/baichuan-inc/Baichuan2-7B-Chathttps://huggingface.co/baichuan-inc/Baichuan2-7B-ChatBaichuan2-13B-Chat https://huggingface.co/baichuan-inc/Baichuan2-13B-Chathttps://huggingface.co/baichuan-inc/Baichuan2-13B-Chat 对齐模型 4bits 量化 Baichuan2-7B-Chat-4bits https://huggingface.co/baichuan-inc/Baichuan2-7B-Chat-4bitshttps://huggingface.co/baichuan-inc/Baichuan2-7B-Chat-4bitsBaichuan2-13B-Chat-4bits https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat-4bitshttps://huggingface.co/baichuan-inc/Baichuan2-13B-Chat-4bits 拉取代码 git clone https://github.com/baichuan-inc/Baichuan2 安装依赖 pip install -r requirements.txt 调用方式 Python代码调用 Chat 模型推理方法示例 import torch from transformers import AutoModelForCausalLM, AutoTokenizer from transformers.generation.utils import GenerationConfig tokenizer AutoTokenizer.from_pretrained(baichuan-inc/Baichuan2-13B-Chat, use_fastFalse, trust_remote_codeTrue) model AutoModelForCausalLM.from_pretrained(baichuan-inc/Baichuan2-13B-Chat, device_mapauto, torch_dtypetorch.bfloat16, trust_remote_codeTrue) model.generation_config GenerationConfig.from_pretrained(baichuan-inc/Baichuan2-13B-Chat) messages [] messages.append({role: user, content: 解释一下“温故而知新”}) response model.chat(tokenizer, messages) print(response) Base 模型推理方法示范 from transformers import AutoModelForCausalLM, AutoTokenizer tokenizer AutoTokenizer.from_pretrained(baichuan-inc/Baichuan2-13B-Base, trust_remote_codeTrue) model AutoModelForCausalLM.from_pretrained(baichuan-inc/Baichuan2-13B-Base, device_mapauto, trust_remote_codeTrue) inputs tokenizer(登鹳雀楼-王之涣\n夜雨寄北-, return_tensorspt) inputs inputs.to(cuda:0) pred model.generate(**inputs, max_new_tokens64, repetition_penalty1.1) print(tokenizer.decode(pred.cpu()[0], skip_special_tokensTrue)) 模型加载指定 device_mapauto会使用所有可用显卡。如需指定使用的设备可以使用类似 export CUDA_VISIBLE_DEVICES0,1使用了0、1号显卡的方式控制。命令行方式 python cli_demo.py 本命令行工具是为 Chat 场景设计不支持使用该工具调用 Base 模型。网页 demo 方式依靠 streamlit 运行以下命令会在本地启动一个 web 服务把控制台给出的地址放入浏览器即可访问。 streamlit run web_demo.py 本网页demo工具是为 Chat 场景设计不支持使用该工具调用 Base 模型。量化方法 Baichuan2支持在线量化和离线量化两种模式。在线量化对于在线量化baichuan2支持 8bits 和 4bits 量化使用方式和 Baichuan-13B 项目中的方式类似只需要先加载模型到 CPU 的内存里再调用quantize()接口量化最后调用 cuda()函数将量化后的权重拷贝到 GPU 显存中。实现整个模型加载的代码非常简单以 Baichuan2-7B-Chat 为例 8bits 在线量化: model AutoModelForCausalLM.from_pretrained(baichuan-inc/Baichuan2-7B-Chat, torch_dtypetorch.float16, trust_remote_codeTrue) model model.quantize(8).cuda() 4bits 在线量化: model AutoModelForCausalLM.from_pretrained(baichuan-inc/Baichuan2-7B-Chat, torch_dtypetorch.float16, trust_remote_codeTrue) model model.quantize(4).cuda() 需要注意的是在用 from_pretrained 接口的时候用户一般会加上 device_mapauto在使用在线量化时需要去掉这个参数否则会报错。离线量化为了方便用户的使用baichuan2提供了离线量化好的 4bits 的版本 Baichuan2-7B-Chat-4bits供用户下载。用户加载 Baichuan2-7B-Chat-4bits 模型很简单只需要执行: model AutoModelForCausalLM.from_pretrained(baichuan-inc/Baichuan2-7B-Chat-4bits, device_mapauto, trust_remote_codeTrue) 对于 8bits 离线量化baichuan2没有提供相应的版本因为 Hugging Face transformers 库提供了相应的 API 接口可以很方便的实现 8bits 量化模型的保存和加载。用户可以自行按照如下方式实现 8bits 的模型保存和加载 model AutoModelForCausalLM.from_pretrained(model_id, load_in_8bitTrue, device_mapauto, trust_remote_codeTrue) model.save_pretrained(quant8_saved_dir) model AutoModelForCausalLM.from_pretrained(quant8_saved_dir, device_mapauto, trust_remote_codeTrue) CPU 部署 Baichuan2 模型支持 CPU 推理但需要强调的是CPU 的推理速度相对较慢。需按如下方式修改模型加载的方式 model AutoModelForCausalLM.from_pretrained(baichuan-inc/Baichuan2-7B-Chat, torch_dtypetorch.float32, trust_remote_codeTrue) 模型微调依赖安装 git clone https://github.com/baichuan-inc/Baichuan2.git cd Baichuan2/fine-tune pip install -r requirements.txt 如需使用 LoRA 等轻量级微调方法需额外安装 peft 如需使用 xFormers 进行训练加速需额外安装 xFormers 单机训练 hostfile deepspeed --hostfile$hostfile fine-tune.py \--report_to none \--data_path data/belle_chat_ramdon_10k.json \--model_name_or_path baichuan-inc/Baichuan2-7B-Base \--output_dir output \--model_max_length 512 \--num_train_epochs 4 \--per_device_train_batch_size 16 \--gradient_accumulation_steps 1 \--save_strategy epoch \--learning_rate 2e-5 \--lr_scheduler_type constant \--adam_beta1 0.9 \--adam_beta2 0.98 \--adam_epsilon 1e-8 \--max_grad_norm 1.0 \--weight_decay 1e-4 \--warmup_ratio 0.0 \--logging_steps 1 \--gradient_checkpointing True \--deepspeed ds_config.json \--bf16 True \--tf32 True 轻量化微调代码已经支持轻量化微调如 LoRA如需使用仅需在上面的脚本中加入以下参数 --use_lora True LoRA 具体的配置可见 fine-tune.py 脚本。使用 LoRA 微调后可以使用下面的命令加载模型 from peft import AutoPeftModelForCausalLM model AutoPeftModelForCausalLM.from_pretrained(output, trust_remote_codeTrue)

查看全文

http://www.hkea.cn/news/14274565/