Trying to run eval in agent_rag branch #128

Open

heyjustinai opened this issue Nov 26, 2024 · 1 comment

heyjustinai commented Nov 26, 2024

System Info

```
llama_models                             0.0.55
llama_stack                              0.0.55      /Users/justinlee/local/llama-stack
llama_stack_client                       0.0.55
```

```yaml
version: "2"
built_at: "2024-11-11T21:59:52.074753"
image_name: fireworks
docker_image: null
conda_env: fireworks
apis:
  - inference
  - telemetry
  - datasetio
  - eval
  - scoring
  - inference
  - memory
  - safety
  - agents
  - telemetry
providers:
  safety: []
  scoring:
    - provider_id: basic-0
      provider_type: inline::basic
      config: {}
    - provider_id: llm-as-judge-0
      provider_type: inline::llm-as-judge
      config: {}
    - provider_id: braintrust-0
      provider_type: inline::braintrust
      config: {}
  datasetio:
    - provider_id: huggingface-0
      provider_type: remote::huggingface
      config: {}
    - provider_id: localfs-0
      provider_type: inline::localfs
      config: {}
  eval:
    - provider_id: meta-reference-0
      provider_type: inline::meta-reference
      config: {}
  inference:
    # - provider_id: fireworks-0
    #   provider_type: remote::together
    #   config:
    #     url: https://api.together.xyz/v1
    # api_key: <ENTER_YOUR_API_KEY>
    - provider_id: fireworks-0
      provider_type: remote::fireworks
      config:
        url: https://api.fireworks.ai/inference
        api_key:
  telemetry:
    - provider_id: meta-reference-0
      provider_type: inline::meta-reference
      config: {}
  memory:
    - provider_id: chromadb-0
      provider_type: remote::chromadb
      config:
        host: localhost
        port: 8000
        protocol: http
  agents:
    - provider_id: meta-reference-0
      provider_type: inline::meta-reference
      config:
        persistence_store:
          namespace: null
          type: sqlite
          db_path: /Users/xiyan/.llama/runtime/kvstore.db
metadata_store: null
models:
  - metadata: {}
    model_id: meta-llama/Llama-3.1-8B-Instruct
    provider_id: null
    provider_model_id: fireworks/llama-v3p1-8b-instruct
  - metadata: {}
    model_id: meta-llama/Llama-3.1-70B-Instruct
    provider_id: null
    provider_model_id: fireworks/llama-v3p1-70b-instruct
  - metadata: {}
    model_id: meta-llama/Llama-3.1-405B-Instruct-FP8
    provider_id: null
    provider_model_id: fireworks/llama-v3p1-405b-instruct
  - metadata: {}
    model_id: meta-llama/Llama-3.2-1B-Instruct
    provider_id: null
    provider_model_id: fireworks/llama-v3p2-1b-instruct
  - metadata: {}
    model_id: meta-llama/Llama-3.2-3B-Instruct
    provider_id: null
    provider_model_id: fireworks/llama-v3p2-3b-instruct
  - metadata: {}
    model_id: meta-llama/Llama-3.2-11B-Vision-Instruct
    provider_id: null
    provider_model_id: fireworks/llama-v3p2-11b-vision-instruct
  - metadata: {}
    model_id: meta-llama/Llama-3.2-90B-Vision-Instruct
    provider_id: null
    provider_model_id: fireworks/llama-v3p2-90b-vision-instruct
  - metadata: {}
    model_id: meta-llama/Llama-Guard-3-8B
    provider_id: null
    provider_model_id: fireworks/llama-guard-3-8b
  - metadata: {}
    model_id: meta-llama/Llama-Guard-3-11B-Vision
    provider_id: null
    provider_model_id: fireworks/llama-guard-3-11b-vision
datasets:
  - dataset_id: mmlu
    provider_id: huggingface-0
    url:
      uri: https://huggingface.co/datasets/llamastack/evals
    metadata:
      path: llamastack/evals
      name: evals__mmlu__details
      split: train
    dataset_schema:
      input_query:
        type: string
      expected_answer:
        type: string
      chat_completion_input:
        type: string
  - dataset_id: simpleqa
    provider_id: huggingface-0
    url:
      uri: https://huggingface.co/datasets/llamastack/evals
    metadata:
      path: llamastack/evals
      name: evals__simpleqa
      split: train
    dataset_schema:
      input_query:
        type: string
      expected_answer:
        type: string
      chat_completion_input:
        type: string
eval_tasks:
  - eval_task_id: meta-reference-mmlu
    provider_id: meta-reference-0
    dataset_id: mmlu
    scoring_functions:
      - basic::regex_parser_multiple_choice_answer
  - eval_task_id: meta-reference-simpleqa
    provider_id: meta-reference-0
    dataset_id: simpleqa
    scoring_functions:
      - llm-as-judge::405b-simpleqa
```
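For reference, the datasets and eval tasks declared in this run config should be visible from the client once the server is up. A minimal sketch (not from the issue; the port and the availability of these lister methods in this client version are assumptions):

```python
from llama_stack_client import LlamaStackClient

# Assumed address of the running fireworks distro; use whatever `llama stack run` prints at startup.
client = LlamaStackClient(base_url="http://localhost:5000")

# With the run config above, these listings should include the "mmlu"/"simpleqa"
# datasets and the two meta-reference eval tasks once registration succeeds.
print(client.datasets.list())
print(client.eval_tasks.list())
```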

Information

  • The official example scripts
  • My own modified scripts

🐛 Describe the bug

Steps to reproduce:

  1. Install llama stack from source.
  2. Build the together llama stack distro.
  3. Changed the yaml file based on discussion with @yanxi0830 (included above).
  4. Changed the agent config (a fuller client-side sketch follows after this list):

     AGENT_CONFIG = AgentConfig(
         model="meta-llama/Llama-3.2-3B-Instruct",
         instructions="You are a helpful assistant",
     )

  5. Ran generate.py in evals/rag [success].
  6. But I get a 404 when running llama-stack-client scoring_functions list, and also when running llama-stack-client eval run_scoring with:

     --dataset-path <path-to-local-dataset> \
     --output-dir ./
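For context, here is a minimal sketch of how the agent config above is typically wired up on the client side. This is not from the issue: the port, session name, and the enable_session_persistence flag are assumptions, and the import paths reflect the llama-stack-client layout from this release line.

```python
from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.agents.agent import Agent
from llama_stack_client.types.agent_create_params import AgentConfig

# Assumed server address; point this at wherever the fireworks distro is running.
client = LlamaStackClient(base_url="http://localhost:5000")

agent_config = AgentConfig(
    model="meta-llama/Llama-3.2-3B-Instruct",
    instructions="You are a helpful assistant",
    enable_session_persistence=False,  # assumption; not shown in the issue
)

agent = Agent(client, agent_config)
session_id = agent.create_session("rag-eval-session")  # session name is made up
```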

Error logs

It seems like llama-stack-client scoring_functions list is not hitting the server at all.
Client-side error:

```
llama-stack-client eval run_scoring braintrust::answer-correctness \
--dataset-path ./rag/data/input_llamastack_generated.csv \
--output-dir ./rag/data/results
   0% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/2  [ 0:00:16 < -:--:-- , ? it/s ]
Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniforge/base/envs/llamastack-fireworks/bin/llama-stack-client", line 8, in <module>
    sys.exit(main())
  File "/opt/homebrew/Caskroom/miniforge/base/envs/llamastack-fireworks/lib/python3.10/site-packages/llama_stack_client/lib/cli/llama_stack_client.py", line 80, in main
    cli()
  File "/opt/homebrew/Caskroom/miniforge/base/envs/llamastack-fireworks/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/llamastack-fireworks/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/llamastack-fireworks/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/homebrew/Caskroom/miniforge/base/envs/llamastack-fireworks/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/homebrew/Caskroom/miniforge/base/envs/llamastack-fireworks/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/llamastack-fireworks/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/llamastack-fireworks/lib/python3.10/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/llamastack-fireworks/lib/python3.10/site-packages/llama_stack_client/lib/cli/eval/run_scoring.py", line 100, in run_scoring
    score_res = client.scoring.score(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/llamastack-fireworks/lib/python3.10/site-packages/llama_stack_client/resources/scoring.py", line 78, in score
    return self._post(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/llamastack-fireworks/lib/python3.10/site-packages/llama_stack_client/_base_client.py", line 1261, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
  File "/opt/homebrew/Caskroom/miniforge/base/envs/llamastack-fireworks/lib/python3.10/site-packages/llama_stack_client/_base_client.py", line 953, in request
    return self._request(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/llamastack-fireworks/lib/python3.10/site-packages/llama_stack_client/_base_client.py", line 1056, in _request
    raise self._make_status_error_from_response(err.response) from None
llama_stack_client.NotFoundError: Error code: 404 - {'detail': 'Not Found'}
```

Expected behavior

Should be able to run eval on the dataset generated by generate.py.


yanxi0830 commented Nov 26, 2024

Wondering if you have the server side logs? The llama_stack_client.NotFoundError: Error code: 404 - {'detail': 'Not Found'} suggests that the endpoint is not found. What is printed out during server startup?
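One way to narrow this down from the client side (a sketch, not from the thread; the port is an assumption, use the address the server printed at startup):

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:5000")

# A 404 on both calls would mean the client is not reaching the stack server at all
# (wrong base_url/port); a 404 only on the second would mean the scoring API is not
# among the routes this distro build is serving.
print(client.models.list())
print(client.scoring_functions.list())
```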
