LMFlow project page: https://github.com/OptimalScale/LMFlow

LMFlow

An extensible, convenient, and efficient toolbox for finetuning large machine learning models, designed to be user-friendly, speedy and reliable, and accessible to the entire community.

Large Language Model for All: building a large-model community together so that everyone can afford to train large models. See our vision.

Latest News

[2023-04-02] Web service is online!
[2023-04-01] Release Chinese checkpoints in model zoo: Hu (湖羊), Dongshan (东山羊), and Hetian (和田羊).
[2023-04-01] Release English checkpoints in model zoo: LLaMA7B-medical, LLaMA13B-medical, and LLaMA33B-medical.
[2023-03-27] Support full tuning and LoRA tuning for all decoder models.
[2023-03-27] Task-tuned model beats ChatGPT on medical domain.
[2023-03-27] Release code and checkpoints – version 0.0.1

Demos

Our checkpoint download service is currently at capacity, and we have allocated one more server to support it. If you encounter the error "too many HTTP requests", please wait several minutes and try again. Thanks for your understanding. 🙏

We provide four kinds of demos:

Online Service: If you don’t want to run any code and just want to try our models, we have deployed our instruction-tuned LLaMA-7B and LLaMA-33B for you to try.
Colab Chatbot (shell): An interactive shell-based chatbot that you can easily deploy on Colab.
Colab Chatbot (web): An interactive web-based chatbot that you can easily deploy on Colab.
Local Deploy: We also provide a way to deploy your model/chatbot locally, which means you can deploy a much larger model than with the previous three methods if you have enough resources.

Online Service

You are welcome to visit our web service. We have deployed Hu (湖羊) and Hetian (和田羊) online for preview. Because of high traffic, the website may sometimes fail to respond. You can also deploy the chatbot yourself by following Local Deploy.

Colab Chatbot (shell)

We provide a simple shell demo of a chatbot that runs on Google Colab’s T4/P100/V100 GPUs. Note that the provided gpt-neo-2.7b model is rather weak: it only supports English and may sometimes generate unsatisfactory responses. To improve performance, you can finetune on your own dataset with LMFlow to obtain a better model. You can also try other decoder-only models available on 🤗 Hugging Face by running

./scripts/run_chatbot.sh {another-model-name}

Colab Chatbot (web)

We provide a simple web demo of a chatbot that runs on Google Colab’s T4/P100/V100 GPUs. Note that the provided gpt-neo-2.7b model is rather weak: it only supports English and may sometimes generate unsatisfactory responses.

Local Deploy

If you have the resources and want to deploy your own model locally, we provide an easy way to run a Flask server that launches a backend (which can serve other frontends) and an interactive web frontend (which lets you chat with the model directly):

cd ./service
python app.py

Medical Performance

	PubMedQA (ID)	MedQA-USMLE (OOD)	MedMCQA (ID)	Average
Human (pass)	60.0	50.0
Human (expert)	78.0	87.0	90.0	85.0

InstructGPT 175B	73.2	46.0	44.0	54.4
ChatGPT	63.9	57.0	44.7	55.2
LLaMA 7B	5.2	27.1	24.3	18.9
LLaMA 33B	1.8	43.4	30.3	25.2

Task-tuned LLaMA 7B (Full)	75.1	44.5	49.9	56.5
Task-tuned LLaMA 33B (LoRA)	74.0	51.3	50.2	58.5

The LLaMA 33B (LoRA) performance is achieved with only ~16 hours of finetuning on the training splits of PubMedQA and MedMCQA on a single 8 * A100 server. For more results, including instruction tuning results, please refer to our Documentation.
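As a quick sanity check on the table, the Average column appears to be the unweighted mean of the three benchmark scores. The sketch below recomputes it from the values listed above; the dictionary and function names are illustrative and not part of LMFlow.

```python
# Scores copied from the table above:
# (PubMedQA, MedQA-USMLE, MedMCQA, reported Average).
ROWS = {
    "InstructGPT 175B": (73.2, 46.0, 44.0, 54.4),
    "ChatGPT": (63.9, 57.0, 44.7, 55.2),
    "LLaMA 7B": (5.2, 27.1, 24.3, 18.9),
    "LLaMA 33B": (1.8, 43.4, 30.3, 25.2),
    "Task-tuned LLaMA 7B (Full)": (75.1, 44.5, 49.9, 56.5),
    "Task-tuned LLaMA 33B (LoRA)": (74.0, 51.3, 50.2, 58.5),
}


def mean_score(scores):
    """Unweighted mean of the per-benchmark accuracies, rounded to one decimal."""
    return round(sum(scores) / len(scores), 1)


if __name__ == "__main__":
    for name, (*scores, reported) in ROWS.items():
        print(f"{name}: computed {mean_score(scores)}, reported {reported}")
```

Because the mean is unweighted, the out-of-domain MedQA-USMLE score counts as much as each in-domain benchmark.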

Model Zoo

We open-sourced the trained checkpoints to everyone for further training and inference.

Instruct-tuned Models	Base Model	Download
Hu (湖羊)	LLaMA-7B	Google Drive
Dongshan (东山羊)	LLaMA-13B	Google Drive
Hetian (和田羊)	LLaMA-33B	Google Drive
Altay (阿勒泰羊)	LLaMA-65B	Google Drive
LLaMA7B-medical	LLaMA-7B	Google Drive
LLaMA13B-medical	LLaMA-13B	Google Drive
LLaMA33B-medical	LLaMA-33B	Google Drive
LLaMA65B-medical	LLaMA-65B	Google Drive

Supported Pipelines

Pipelines	Status
Task Tuning	✅ Supported
Instruction Tuning	✅ Supported
Parameter-Efficient Tuning	✅ Supported
Large Model Inference	✅ Supported
Alignment Tuning	🔧 Developing

Supported Models

All decoder models in 🤗 Hugging Face are supported seamlessly. LLaMA, GPT2, GPT-Neo, and Galactica have been fully tested. We will support encoder models soon.

1. Setup

git clone https://github.com/OptimalScale/LMFlow.git
cd LMFlow
conda create -n lmflow python=3.9 -y
conda activate lmflow
conda install mpi4py
pip install -e .

2. Prepare Dataset

You can download the example training and test datasets by running

cd data
bash download.sh all
cd -

If you cannot access Google Drive, you can download the data from BaiduNetDisk instead.

You can also use your own dataset by converting it to one of the following formats:

{
  "type": "text2text",
  "instances": [
    {
      "input": "Question: The Transformer architecture [START_REF]",
      "output": "N/A"
    },
    ...
  ]
}

{
  "type": "text_only",
  "instances": [
    {
      "text": "Defintion: In this task, we ask you to write an answer to a question that involves events that may be stationary (not changing over time) or transient (changing over time). For example, the sentence \"he was born in the U.S.\" contains a stationary event since it will last forever; however, \"he is hungry\" contains a transient event since it will remain true for a short period of time. Note that a lot of the questions could have more than one correct answer. We only need a single most-likely answer. Please try to keep your \"answer\" as simple as possible. Concise and simple \"answer\" is preferred over those complex and verbose ones. \n Input: Question: Sentence: It's hail crackled across the comm, and Tara spun to retake her seat at the helm. \nQuestion: Will the hail storm ever end? \n Output: NA \n\n"
    },
    ...
  ]
}
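The two layouts above can be produced from in-memory data with a few lines of Python. This is a minimal sketch: the helper names (`to_text2text`, `to_text_only`, `save_dataset`) and the sample question are hypothetical, and only the "type", "instances", "input", "output", and "text" keys come from the formats shown above.

```python
import json


def to_text2text(pairs):
    """Wrap (input, output) string pairs in the "text2text" layout."""
    return {
        "type": "text2text",
        "instances": [{"input": q, "output": a} for q, a in pairs],
    }


def to_text_only(texts):
    """Wrap plain strings in the "text_only" layout."""
    return {"type": "text_only", "instances": [{"text": t} for t in texts]}


def save_dataset(dataset, path):
    """Write the dataset as UTF-8 JSON so non-ASCII text stays readable."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(dataset, f, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    # Hypothetical sample data, used only to show the resulting structure.
    dataset = to_text2text([("Question: What does LoRA adapt?", "Model weights")])
    print(json.dumps(dataset, indent=2))
```

The resulting file can then be placed in a directory and passed to the finetuning script through --dataset_path.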

3. Run Scripts

3.1 Run Finetuning

You can run scripts/run_finetune.sh to finetune a GPT-2 base model:

./scripts/run_finetune.sh

If you need deepspeed arguments that reflect your machine settings, pass them to the script as a single quoted string. For example,

./scripts/run_finetune.sh "--num_gpus=8 --master_port 10001"

To enable LoRA finetuning, use

./scripts/run_finetune_with_lora.sh

which can be run in a similar manner.

For detailed configuration, you can modify these scripts directly. They simply call the Python script examples/finetune.py, which can be run as follows:

deepspeed ${deepspeed_args} \
  examples/finetune.py \
    --deepspeed configs/ds_config_zero3.json \
    --bf16 \
    --run_name finetune_with_lora \
    --model_name_or_path facebook/galactica-1.3b \
    --num_train_epochs 0.01 \
    --learning_rate 2e-5 \
    --dataset_path ${dataset_path} \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --validation_split_percentage 0 \
    --logging_steps 20 \
    --block_size 512 \
    --do_train \
    --output_dir output_models/finetune \
    --overwrite_output_dir \
    --ddp_timeout 72000 \
    --save_steps 5000 \
    --dataloader_num_workers 1

Here we set the number of epochs, --num_train_epochs, to 0.01 so that the finetuning process finishes quickly. If you wish to obtain a model with better performance, feel free to adjust these hyperparameters. You may run

python examples/finetune.py -h

to view all possible finetuning arguments. The finetuned model checkpoint will be saved in the directory specified by --output_dir, which is output_models/finetune in the above example.
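To see why --num_train_epochs 0.01 finishes quickly, it helps to estimate the number of optimizer steps. The sketch below assumes the script follows the usual Hugging Face Trainer arithmetic (steps per epoch derived from the dataloader length, then scaled by the fractional epoch count and rounded up); the function name and the dataset size are hypothetical, so treat the result as a rough estimate rather than the exact step count LMFlow will report.

```python
import math


def estimated_optimizer_steps(num_blocks, per_device_batch_size, num_gpus,
                              num_train_epochs, grad_accum_steps=1):
    """Rough step count for a run, assuming Trainer-style rounding.

    num_blocks is the number of training sequences after the text has been
    grouped into --block_size chunks (a hypothetical input here).
    """
    batches_per_epoch = math.ceil(num_blocks / (per_device_batch_size * num_gpus))
    updates_per_epoch = max(batches_per_epoch // grad_accum_steps, 1)
    return math.ceil(num_train_epochs * updates_per_epoch)


if __name__ == "__main__":
    # Hypothetical setting: 100,000 blocks, batch size 1 per GPU, 8 GPUs.
    full = estimated_optimizer_steps(100_000, 1, 8, num_train_epochs=1)
    quick = estimated_optimizer_steps(100_000, 1, 8, num_train_epochs=0.01)
    print(f"1 epoch: {full} steps; 0.01 epoch: {quick} steps")
```

Under these assumptions the quick run performs about one hundredth of the updates of a full epoch, which is enough to check that the pipeline works but not to train a useful model.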

3.2 Run Evaluation

You can run evaluation directly with an existing Hugging Face model. For example, to evaluate GPT2 large, execute

./scripts/run_evaluation.sh

or run the corresponding Python script:

CUDA_VISIBLE_DEVICES=0 \
    deepspeed examples/evaluate.py \
    --answer_type medmcqa \
    --model_name_or_path gpt2-large \
    --test_file data/MedQA-USMLE/validation/valid_1273.json \
    --deepspeed examples/ds_config.json

To load the finetuned model, specify --model_name_or_path with the saved model checkpoint directory path.

For LoRA-finetuned models, refer to

./scripts/run_evaluation_with_lora.sh

These scripts invoke the examples in examples/*.py, which are built on our APIs. For more API-related examples, refer to the methods in the unit tests under tests.

4. Additional Notes

4.1 LLaMA Checkpoint

First, you need to obtain access to the LLaMA model from facebookresearch/llama. Download the official checkpoints and save them into ${llama-path}.
Second, convert the official checkpoints in ${llama-path} to Hugging Face-compatible checkpoints in ${llama-hf-path} by running

python ./scripts/convert_llama_weights_to_hf.py --input_dir ${llama-path} --model_size 7B --output_dir ${llama-hf-path}/llama-7b-hf

Then you are good to go: set the checkpoint path to ${llama-hf-path}/llama-7b-hf. Enjoy it!
(optional) Now you have the original llama-7b-hf pretrained model. By running

cd output_models && ./download.sh all && cd -

you can obtain the model difference produced by our finetuning. Then, in a way similar to ./scripts/run_evaluation_with_lora.sh, run

CUDA_VISIBLE_DEVICES=0 \
    deepspeed examples/evaluate.py \
    --answer_type text \
    --model_name_or_path ${llama-hf-path}/llama-7b-hf \
    --lora_model_path output_models/${llama-model-diff-path} \
    --test_file data/alpaca/test/test_252.json \
    --deepspeed examples/ds_config.json

You can now evaluate with the finetuned LLaMA model.
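The "model difference" is passed through --lora_model_path, so it is presumably a LoRA adapter: a pair of small matrices whose product is added to a frozen base weight. The toy sketch below illustrates that idea in plain Python on a 2x2 weight. It is not LMFlow's loading code, and the function names and the `scale` parameter are illustrative assumptions.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(base_weight, lora_b, lora_a, scale=1.0):
    """Return W + scale * (B @ A), the standard LoRA weight update.

    base_weight is d_out x d_in, lora_b is d_out x r, lora_a is r x d_in,
    where the rank r is much smaller than d_out and d_in in practice.
    """
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base_weight, delta)
    ]


if __name__ == "__main__":
    base = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weight (toy values)
    lora_b = [[1.0], [2.0]]           # rank-1 factors (toy values)
    lora_a = [[0.5, 0.0]]
    print(apply_lora(base, lora_b, lora_a))
```

Because only the low-rank factors need to be stored, the downloaded difference is much smaller than the base checkpoint, which is why the converted LLaMA weights are still required at evaluation time.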

4.2 DeepSpeed Config

You can configure DeepSpeed through the files under configs. For details, refer to the DeepSpeed Configuration documentation.
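For orientation, a DeepSpeed configuration is a JSON file. The sketch below builds a small ZeRO config as a Python dictionary and prints it as JSON. It is an assumption-laden illustration, not the contents of configs/ds_config_zero3.json: the helper name is hypothetical, and the "auto" values rely on the Hugging Face Trainer integration filling them in from the command-line arguments.

```python
import json


def build_ds_config(zero_stage=3, use_bf16=True):
    """Build a minimal DeepSpeed config dictionary (illustrative values only)."""
    return {
        # ZeRO stage 3 partitions parameters, gradients, and optimizer state.
        "zero_optimization": {"stage": zero_stage},
        # Matches the --bf16 flag used in the finetuning command.
        "bf16": {"enabled": use_bf16},
        # "auto" lets the Hugging Face integration copy these from its own args.
        "train_micro_batch_size_per_gpu": "auto",
        "gradient_accumulation_steps": "auto",
    }


if __name__ == "__main__":
    print(json.dumps(build_ds_config(), indent=2))
```

Lower ZeRO stages partition less state and usually need more memory per GPU, so the stage is the first setting to revisit when a model does not fit.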

5. Model Release

5.1 Medical Model Checkpoints

You can run the following script to download our medical model checkpoints:

cd output_models
bash download.sh medical_ckpt
cd -

You can also download our model directly via the Google Drive link: medical_ckpt.tar.gz

5.2 Instruction Model Checkpoints

Similarly, you can run the following script to download our instruction model checkpoints:

cd output_models
bash download.sh instruction_ckpt
cd -

You can also download our model directly via the Google Drive link: instruction_ckpt.tar.gz

5.3 Reproduce the Results

After downloading the model checkpoints, edit LMFlow/scripts/run_evaluation_with_lora.sh: set --lora_model_path to output_models/instruction_ckpt/llama7b-lora (the example for the instruction-tuned llama-7b) and set --model_name_or_path to your converted LLaMA model. Then run the shell script to reproduce the result.

You can then compare against the model performance reported in our Documentation.

Documentation

Please refer to our Documentation for more API reference and experimental results.

Vision

Hello there! We are excited to announce the upcoming release of our code repository that includes a complete LLM training process, enabling users to quickly build their own language models and train them effectively.

Our code repository is not just a simple model; it includes the complete training workflow, model optimization, and testing tools. You can use it to build various types of language models, including conversation models, question-answering models, and text generation models, among others.

Moreover, we aim to create an open and democratic LLM sharing platform where people can share their checkpoints and experiences to collectively improve the skills of the community. We welcome anyone who is interested in LLM to participate and join us in building an open and friendly community!

Whether you are a beginner or an expert, we believe that you can benefit from this platform. Let’s work together to build a vibrant and innovative LLM community!

Disclaimer

This package aims to provide a streamlined and user-friendly pipeline for large model tuning. Its functionalities serve as a reference and are intended for use by the user. However, it is important to note that the responsibility for the preparation of the data and pretrained models lies solely with the user. This package does not guarantee the accuracy, completeness, applicability, or legality of the components from the user’s preparation. Users must be aware of and assume all risks and liabilities associated with the preparation of the models and data, and obtain legal, commercial, and technical advice before utilizing this package. The pipeline shall not be held responsible for any direct, indirect, special, incidental, or consequential damages resulting from the user’s improper preparation of the data and pretrained models.

Our checkpoints, which include both English and Chinese versions, are provided solely for research purposes. The training data contained within these checkpoints includes generated results from the ChatGPT language model. We do not endorse or encourage the distribution or usage of these checkpoints for commercial purposes. Users of these checkpoints are solely responsible for ensuring that they are used correctly and appropriately.

It is also crucial to highlight that the results generated by the model are based on probabilistic models and not directly related to this pipeline. The accuracy, reliability, applicability, and legality of the results are not guaranteed by this pipeline. Therefore, users must also be aware of the risks and liabilities associated with the results and seek legal, commercial, and technical advice before relying on the model-generated outcomes. This pipeline shall not be accountable for any direct, indirect, special, incidental, or consequential damages resulting from the user’s reliance on the model-generated results.

Reference: https://mp.weixin.qq.com/s/LCGQyNA6sHcdfIIARSNlww