Progress

In October 2024, AutoPlan2 was released. It significantly reduces the cost of constructing data for AutoPlan and focuses on building complex agent data for specialized domains in the cold-start setting.

In September 2024, construction of high-quality cold-start function-call data was completed.

AutoPlan

This project uses large models for complex task planning and execution in the military domain, with an improved ReAct technique for long-chain agent execution. We are grateful to Mr. Yin Junxi for his strong support and significant contribution to this project.

Basic Principles:

The principle of the first version is shown in the following diagram.

Principle Diagram

The second version distills the data produced by the first version and merges the two models (task planning and task execution) into a single smaller one. This addresses the shortcomings of the first version: no multi-turn dialogue capability, excessive GPU memory usage, low inference efficiency, overcomplication of simple tasks, and no support for everyday conversation.

Usage

There are two ways to use this project: train on the task planning dataset (first version) or train on the fully distilled dataset (second version).

Usage 1: Training Task Planning Dataset

train_plan.json and test_plan.json are the training and test datasets for task planning. They can be used to train qwen1/qwen1.5, after which the qwen model can plan tasks.

Training Data Example

In main.py, change the default value of allparams_split_task_chain to the path of the trained task planning qwen model, set execute_model_path to the path of the qwen72b model, and set execute_reflexion to false. Keep the other parameters unchanged and run main.py to get both task planning and execution.
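Before training, it can help to check that the dataset files load and to see which fields the records carry. The following is a minimal sketch, not part of the repository: the helper name summarize_dataset is hypothetical, and the code assumes each file is a single JSON document whose top level is a list of records. It deliberately makes no assumption about the field names.

```python
import json
from pathlib import Path


def summarize_dataset(path):
    """Return (record_count, sorted field names of the first record).

    Assumption: the file holds one JSON document whose top level is a list.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, list):
        raise ValueError(
            f"{path}: expected a top-level JSON list, got {type(data).__name__}"
        )
    first = data[0] if data else {}
    # Only report field names when the first record is a JSON object.
    fields = sorted(first) if isinstance(first, dict) else []
    return len(data), fields


if __name__ == "__main__":
    for name in ("train_plan.json", "test_plan.json"):
        if Path(name).exists():
            count, fields = summarize_dataset(name)
            print(f"{name}: {count} records, first record fields: {fields}")
        else:
            print(f"{name}: not found in the current directory")
```

If the files turn out to use a different layout (for example, one JSON object per line), the loader would need to be adapted accordingly.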

Usage 2: Training Distilled Task Execution Dataset

train_react.json and test_react.json are datasets distilled from the task planning and task execution models, with manual annotations added. They cover both task planning and execution steps. Train qwen1/qwen1.5 on train_react.json to integrate task planning and execution into a single model; qwen1.5 14b is recommended.

After training, set the default value of allparams_split_task_chain in main.py to false, change execute_model_path to the path of the trained model, and set execute_reflexion to false. At inference time, the single model then handles both task planning and execution.
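The two usages differ only in how three parameters in main.py are set. The sketch below puts them side by side. It is illustrative only: the RunSettings class and the helper functions are hypothetical and do not exist in the repository, and how main.py actually declares its defaults is not shown in this README. Only the three parameter names and their values come from the text above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class RunSettings:
    # Field names mirror the main.py parameters named in this README.
    # The class itself is a hypothetical container for illustration.
    allparams_split_task_chain: Union[str, bool]  # planner model path, or False
    execute_model_path: str
    execute_reflexion: bool = False


def two_model_settings(planner_path, executor_path):
    """Usage 1: a trained planning model plus a separate execution model."""
    return RunSettings(planner_path, executor_path, False)


def single_model_settings(distilled_path):
    """Usage 2: one distilled model handles planning and execution."""
    return RunSettings(False, distilled_path, False)


def describe(settings):
    """Summarize which models a given configuration would use."""
    if settings.allparams_split_task_chain is False:
        return "single model: " + settings.execute_model_path
    return "planner: {} / executor: {}".format(
        settings.allparams_split_task_chain, settings.execute_model_path
    )


if __name__ == "__main__":
    # Placeholder paths; substitute the locations of your own checkpoints.
    print(describe(two_model_settings("/path/to/planner", "/path/to/qwen72b")))
    print(describe(single_model_settings("/path/to/distilled-model")))
```

In both cases execute_reflexion stays false, as the instructions above state.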

Demonstration of Results

Task Planning Phase Results:

Task Planning Result

Task Execution Phase Results:

Task Execution Result 1

Task Execution Result 2