Commit 0357c16

ljmwaugh and waugh authored
correct spelling of document in context row of table (#48)
Signed-off-by: waugh <[email protected]> Co-authored-by: waugh <[email protected]>
1 parent e8f674c commit 0357c16

1 file changed, +1 -1 lines changed

docs/taxonomy/knowledge/file_structure.md

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ Key | Type | Required | Constraints | Value | Notes
 `created_by` | string | Y | no spaces | Your GitHub username (for the upstream taxonomy) or your name with no spaces (for general intructlab use) | -
 `domain` | string | Y | - | Knowledge sub-category | The knowledge domain which is used in prompts to the teacher model during synthetic data generation. The domain should be brief such as the title to a textbook chapter or section.
 `seed_examples` | Y | array | at least 5 sets | null | This is a collection of questions and answers with context from the knowledge document that InstructLab uses to generate data synthetically.
-`context` | string | Y | < 500 tokens | A chunk of the knowledge document showing off the different **unique** content to help guide the teacher model. If the knowledge documents have only text, all context would be text. If the knowledge documnets have tables or other content formats, ensure samples of those formats are all used. | This should be a copy-paste from the Markdown version of your document
+`context` | string | Y | < 500 tokens | A chunk of the knowledge document showing off the different **unique** content to help guide the teacher model. If the knowledge documents have only text, all context would be text. If the knowledge documents have tables or other content formats, ensure samples of those formats are all used. | This should be a copy-paste from the Markdown version of your document
 `questions_and_answers` | Y | array | at least 3 pairs per context | null | This is a collection of questions and answers.
 `question` | Y | string | \> 250 tokens | A question related to and grounded in the relevant context | Questions are things you'd expect someone to ask the model based on the context given. This will be used for synthetic data generation.
 `answer` | Y | string | \> 250 tokens | An answer for the question, longer than a one-word or one-number answer | Answers are what you'd like the model to give as an answer. It will not be an exact answer the model always gives.
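
The structural constraints in the table rows touched by this hunk can be sketched as a small check. This is an illustrative sketch only, not part of the commit or of InstructLab's own tooling. The function name `check_knowledge_fields` is hypothetical, and the nesting (each `seed_examples` entry holding a `context` plus a `questions_and_answers` list of `question`/`answer` pairs) is assumed from the row descriptions. Token limits are left out because they would require a tokenizer.

```python
def check_knowledge_fields(doc):
    """Return a list of problems; an empty list means every check passed.

    Hypothetical helper: `doc` is assumed to be the already-parsed file
    contents as plain Python dicts and lists.
    """
    problems = []

    # `created_by`: required string, no spaces.
    created_by = doc.get("created_by")
    if not isinstance(created_by, str) or " " in created_by:
        problems.append("created_by must be a string with no spaces")

    # `domain`: required string.
    if not isinstance(doc.get("domain"), str):
        problems.append("domain must be a string")

    # `seed_examples`: required array with at least 5 sets.
    seed_examples = doc.get("seed_examples")
    if not isinstance(seed_examples, list):
        problems.append("seed_examples must be an array")
        seed_examples = []
    elif len(seed_examples) < 5:
        problems.append("seed_examples needs at least 5 sets")

    for i, example in enumerate(seed_examples):
        if not isinstance(example, dict):
            problems.append(f"seed_examples[{i}] must be a mapping")
            continue
        # `context`: required string (token limit not checked here).
        if not isinstance(example.get("context"), str):
            problems.append(f"seed_examples[{i}].context must be a string")
        # `questions_and_answers`: at least 3 pairs per context.
        pairs = example.get("questions_and_answers")
        if not isinstance(pairs, list) or len(pairs) < 3:
            problems.append(
                f"seed_examples[{i}].questions_and_answers needs at least 3 pairs"
            )
            continue
        for j, pair in enumerate(pairs):
            for key in ("question", "answer"):
                if not isinstance(pair, dict) or not isinstance(pair.get(key), str):
                    problems.append(
                        f"seed_examples[{i}].questions_and_answers[{j}].{key} "
                        "must be a string"
                    )
    return problems
```

Calling the function on a parsed document and printing the returned list gives one line per violated constraint. Collecting problems instead of raising on the first one is a design choice so that a contributor can fix everything in a single pass.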
