The command above did not work in Anaconda Prompt, but this version did:

(chatdev_conda_env) C:\chatdev>set OPENAI_API_BASE=http://127.0.0.1:5001/v1

(chatdev_conda_env) C:\chatdev>set OPENAI_API_KEY=123456

(chatdev_conda_env) C:\chatdev>python run.py --task "Hello world in python" --name "HelloWorld"
**[Preprocessing]**

**ChatDev Starts** (20230913191808)

**Timestamp**: 20230913191808

**config_path**: C:\chatdev\CompanyConfig\Default\ChatChainConfig.json

**config_phase_path**: C:\chatdev\CompanyConfig\Default\PhaseConfig.json

**config_role_path**: C:\chatdev\CompanyConfig\Default\RoleConfig.json

**task_prompt**: Hello world in python

**project_name**: HelloWorld

**Log File**: C:\chatdev\WareHouse\HelloWorld_DefaultOrganization_20230913191808.log

**ChatDevConfig**:
 ChatEnvConfig.clear_structure: True
ChatEnvConfig.brainstorming: False


**ChatGPTConfig**:
 ChatGPTConfig(temperature=0.2, top_p=1.0, n=1, stream=False, stop=None, max_tokens=None, presence_penalty=0.0, frequency_penalty=0.0, logit_bias={}, user='')

I am having a problem using it with local api:

It looks like all that the API returns is 1 token:

Text-generation-webui side:

llm_load_print_meta: model size     = 13.02 B
llm_load_print_meta: general.name   = openassistant_llama2-13b-orca-8k-3319
llm_load_print_meta: BOS token = 1 '<s>'
llm_load_print_meta: EOS token = 2 '</s>'
llm_load_print_meta: UNK token = 0 '<unk>'
llm_load_print_meta: LF token  = 13 '<0x0A>'
llm_load_tensors: ggml ctx size =    0.12 MB
llm_load_tensors: using CUDA for GPU acceleration
llm_load_tensors: mem required  =  128.35 MB (+ 1600.00 MB per state)
llm_load_tensors: offloading 40 repeating layers to GPU
llm_load_tensors: offloading non-repeating layers to GPU
llm_load_tensors: offloading v cache to GPU
llm_load_tensors: offloading k cache to GPU
llm_load_tensors: offloaded 43/43 layers to GPU
llm_load_tensors: VRAM used: 11656 MB
...................................................................................................
llama_new_context_with_model: kv self size  = 1600.00 MB
llama_new_context_with_model: compute buffer total size =  191.47 MB
llama_new_context_with_model: VRAM scratch buffer: 190.00 MB
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 0 | VSX = 0 |
2023-09-13 19:11:27 INFO:Loaded the model in 7.52 seconds.

Warning: $This model maximum context length is 2048 tokens. However, your messages resulted in over 498 tokens and max_tokens is 15937.

llama_print_timings:        load time =   955.37 ms
llama_print_timings:      sample time =     0.23 ms /     1 runs   (    0.23 ms per token,  4424.78 tokens per second)
llama_print_timings: prompt eval time =   955.31 ms /   498 tokens (    1.92 ms per token,   521.30 tokens per second)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per token,      inf tokens per second)
llama_print_timings:       total time =   957.88 ms
Output generated in 1.41 seconds (0.00 tokens/s, 0 tokens, context 498, seed 1828391196)
127.0.0.1 - - [13/Sep/2023 19:11:42] "POST /v1/chat/completions HTTP/1.1" 200 -
Warning: $This model maximum context length is 2048 tokens. However, your messages resulted in over 551 tokens and max_tokens is 15885.
Llama.generate: prefix-match hit

llama_print_timings:        load time =   955.37 ms
llama_print_timings:      sample time =     0.16 ms /     1 runs   (    0.16 ms per token,  6410.26 tokens per second)
llama_print_timings: prompt eval time =   835.88 ms /   489 tokens (    1.71 ms per token,   585.01 tokens per second)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per token,      inf tokens per second)
llama_print_timings:       total time =   836.73 ms
Output generated in 1.24 seconds (0.00 tokens/s, 0 tokens, context 551, seed 192786861)
127.0.0.1 - - [13/Sep/2023 19:11:46] "POST /v1/chat/completions HTTP/1.1" 200 -
Warning: $This model maximum context length is 2048 tokens. However, your messages resulted in over 521 tokens and max_tokens is 15907.
Llama.generate: prefix-match hit

llama_print_timings:        load time =   955.37 ms
llama_print_timings:      sample time =     0.13 ms /     1 runs   (    0.13 ms per token,  7633.59 tokens per second)
llama_print_timings: prompt eval time =   884.39 ms /   459 tokens (    1.93 ms per token,   519.00 tokens per second)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per token,      inf tokens per second)
llama_print_timings:       total time =   885.63 ms
Output generated in 1.26 seconds (0.00 tokens/s, 0 tokens, context 521, seed 1288396660)
127.0.0.1 - - [13/Sep/2023 19:11:53] "POST /v1/chat/completions HTTP/1.1" 200 -
Warning: $This model maximum context length is 2048 tokens. However, your messages resulted in over 574 tokens and max_tokens is 15854.
Llama.generate: prefix-match hit

ChatDev side



Chief Executive Officer: **Chief Product Officer<->Chief Executive Officer on : DemandAnalysis, turn 0**

[ChatDev is a software company powered by multiple intelligent agents, such as chief executive officer, chief human resources officer, chief product officer, chief technology officer, etc, with a multi-agent organizational structure and the mission of "changing the digital world through programming".
You are Chief Product Officer. we are both working at ChatDev. We share a common interest in collaborating to successfully complete a task assigned by a new customer.
You are responsible for all product-related matters in ChatDev. Usually includes product design, product strategy, product vision, product innovation, project management and product marketing.
Here is a new customer's task: Hello world in python.
To complete the task, you must write a response that appropriately solves the requested instruction based on your expertise and customer's needs.]



**[OpenAI_Usage_Info Receive]**
prompt_tokens: 521
completion_tokens: 1
total_tokens: 522


**[OpenAI_Usage_Info Receive]**
prompt_tokens: 574
completion_tokens: 1
total_tokens: 575


Chief Product Officer: **Chief Product Officer<->Chief Executive Officer on : DemandAnalysis, turn 1**

[ChatDev is a software company powered by multiple intelligent agents, such as chief executive officer, chief human resources officer, chief product officer, chief technology officer, etc, with a multi-agent organizational structure and the mission of "changing the digital world through programming".
You are Chief Executive Officer. Now, we are both working at ChatDev and we share a common interest in collaborating to successfully complete a task assigned by a new customer.
Your main responsibilities include being an active decision-maker on users' demands and other key policy issues, leader, manager, and executor. Your decision-making role involves high-level decisions about policy and strategy; and your communicator role can involve speaking to the organization's management and employees.
Here is a new customer's task: Hello world in python.
To complete the task, I will give you one or more instructions, and you must help me to write a specific solution that appropriately solves the requested instruction based on your expertise and my needs.]



Chief Executive Officer: **Chief Product Officer<->Chief Executive Officer on : DemandAnalysis, turn 1**

[ChatDev is a software company powered by multiple intelligent agents, such as chief executive officer, chief human resources officer, chief product officer, chief technology officer, etc, with a multi-agent organizational structure and the mission of "changing the digital world through programming".
You are Chief Product Officer. we are both working at ChatDev. We share a common interest in collaborating to successfully complete a task assigned by a new customer.
You are responsible for all product-related matters in ChatDev. Usually includes product design, product strategy, product vision, product innovation, project management and product marketing.
Here is a new customer's task: Hello world in python.
To complete the task, you must write a response that appropriately solves the requested instruction based on your expertise and customer's needs.]



**[OpenAI_Usage_Info Receive]**
prompt_tokens: 544
completion_tokens: 1
total_tokens: 545

Local Models integration #27

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions