Replies: 4 comments 1 reply
-
Good research 👍. I've also found that the alpaca-lora models get easily stuck in a loop. However, using the palpaca-7B (non-lora, q4_0) model, it hasn't gotten stuck in a loop, not once, when running in instruct mode with default settings and
-
For reference, here's the same with palpaca-7B q4_0:
Your poem seems a lil better tbh 😄 Neither is particularly amazing 😆
-
The output definitely changed with
-
@blackle I just pushed a draft version of a new repeat penalization heuristic that might help with sampling at low temperatures: #331 (comment)
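For context, the classic repeat-penalty scheme (what `--repeat_penalty` implements, and what a new heuristic would improve on) pushes down the logits of recently seen tokens so they become less likely. A minimal sketch, assuming a plain logits vector; the function name is hypothetical and this is not the actual patch from #331:

```cpp
#include <unordered_set>
#include <vector>

// Classic repeat penalty: push down the logits of tokens that already
// appeared in the recent context window. penalty > 1.0 discourages
// repeats; 1.0 is a no-op. Hypothetical helper, not the #331 draft.
void apply_repeat_penalty(std::vector<float>& logits,
                          const std::vector<int>& last_tokens,
                          float penalty) {
    std::unordered_set<int> seen(last_tokens.begin(), last_tokens.end());
    for (int tok : seen) {
        float& l = logits[tok];
        // Divide positive logits, multiply negative ones: either way
        // the token moves toward "less likely".
        l = l > 0.0f ? l / penalty : l * penalty;
    }
}
```

One known weakness of this scheme, and a motivation for new heuristics, is that it penalizes every repeated token equally, including ones (punctuation, common words) that legitimately repeat.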
-
If you use `--top_p 0.0 --top_k 1`, then the sampler will always pick the most likely next token. For the 7b alpaca model I have, this works extremely well and rarely gets caught in loops. As an example, here's a prompt:
And the output:
Although the network still hallucinates sometimes, I find that greedy sampling makes it less common, and the model also sticks to the prompt better. This is especially useful if you have a specific task, like extracting information from some unstructured input.
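Why this works: with `--top_k 1` the candidate set is trimmed to the single highest-logit token before any probabilistic sampling happens, so `--top_p` no longer matters and decoding reduces to a plain argmax. A minimal sketch of the equivalent greedy pick (the helper name is hypothetical):

```cpp
#include <algorithm>
#include <vector>

// Greedy (argmax) decoding: pick the index of the highest logit.
// Equivalent to sampling with top_k = 1, which makes the output
// deterministic for a given prompt. Hypothetical helper.
int sample_greedy(const std::vector<float>& logits) {
    return static_cast<int>(
        std::max_element(logits.begin(), logits.end()) - logits.begin());
}
```

The trade-off is the usual one for greedy decoding: deterministic output that stays on task, at the cost of some variety.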