
Eval bug: output from gpt-oss-20b-F16 model does not compare to the output of the same prompt on gpt-oss.com #15190

@jgforbes

Description

Name and Version

$ ./llama-cli --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
version: 6121 (e54d41b)
built with cc (GCC) 14.3.1 20250523 (Red Hat 14.3.1-1) for x86_64-redhat-linux

Sending the simple prompt "What is bugonia?" to the 20b model on gpt-oss.com gives a perfect response.
With llama-cli the model tries to reason its way to an answer but never comes close to the correct answer from gpt-oss.com.

Neither of these invocations gives an acceptable answer:
$ ./llama.cpp/llama-cli -hf unsloth/gpt-oss-20b-GGUF:F16 --jinja -ngl 99 --threads -1 --ctx-size 16384 --temp 1.0 --top-p 1.0 --top-k 0

$ ./llama.cpp/llama-cli -hf ggml-org/gpt-oss-20b-GGUF -c 0 -fa --jinja --reasoning-format none
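For a like-for-like comparison, the model can also be served with llama-server and queried through its OpenAI-compatible chat endpoint. A minimal sketch; the port and the sampling values in the request body are assumptions, not taken from the runs above:

$ ./llama.cpp/llama-server -hf ggml-org/gpt-oss-20b-GGUF --jinja -ngl 99 --port 8080
$ curl http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"messages": [{"role": "user", "content": "What is bugonia?"}], "temperature": 1.0, "top_p": 1.0}'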

Operating systems

Linux

GGML backends

CUDA

Hardware

Intel(R) Core(TM) i7-5820K CPU
RTX 3090

Models

No response

Problem description & steps to reproduce

Run either command above with the prompt "What is bugonia?" and compare the output with the response from gpt-oss.com. A scripted reproduction is sketched below.
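For a non-interactive, single-turn reproduction, the prompt can be passed directly with -p. A sketch assuming a recent llama-cli build; the --single-turn flag may not be available in older versions:

$ ./llama.cpp/llama-cli -hf ggml-org/gpt-oss-20b-GGUF --jinja -ngl 99 --temp 1.0 --top-p 1.0 --top-k 0 --single-turn -p "What is bugonia?"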

First Bad Commit

New with this model

Relevant log output

NA
