Merge branch 'openai:main' into main

mjuetz · Jan 4, 2024 · 40fc016
2 parents 4fe3d78 + b7316a1
commit 40fc016
Showing 39 changed files with 30,276 additions and 2,032 deletions.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -31,16 +31,18 @@ For additional advice on writing good documentation, refer to [What Makes Docume
 | Uniqueness   | Does the content offer new insights or unique information compared to existing documentation?       |       |
 | Clarity      | Is the language easy to understand? Are things well-explained? Is the title clear?                  |       |
 | Correctness  | Are the facts, code snippets, and examples correct and reliable? Does everything execute correctly? |       |
+| Conciseness  | Is the content concise? Are all details necessary? Can it be made shorter?                          |       |
 | Completeness | Is the content thorough and detailed? Are there things that weren’t explained fully?                |       |
 | Grammar      | Are there grammatical or spelling errors present?                                                   |       |
 
 ### Breakdown
 
-| Criteria     | 4                                      | 3                                        | 2                                             | 1                                     |
-| ------------ | -------------------------------------- | ---------------------------------------- | --------------------------------------------- | ------------------------------------- |
-| Relevance    | Relevant and useful.                   | Relevant but not very useful.            | Tangentially relevant.                        | Not relevant.                         |
-| Uniqueness   | Completely unique with fresh insights. | Unique with minor overlaps.              | Some unique aspects, but significant overlap. | Many similar guides/examples.         |
-| Clarity      | Clear language and structure.          | Clear language, unclear structure.       | Some sections unclear.                        | Confusing and unclear.                |
-| Correctness  | Completely error free.                 | Code works, minor improvements needed.   | Few errors and warnings.                      | Many errors, code doesn't execute.    |
-| Completeness | Complete and detailed.                 | Mostly complete, minor additions needed. | Lacks some explanations.                      | Missing significant portions.         |
-| Grammar      | Perfect grammar.                       | Correct grammar, few typos.              | Some spelling/grammatical errors.             | Numerous spelling/grammatical errors. |
+| Criteria     | 4                                             | 3                                         | 2                                             | 1                                          |
+| ------------ | --------------------------------------------- | ----------------------------------------- | --------------------------------------------- | ------------------------------------------ |
+| Relevance    | Relevant and useful.                          | Relevant but not very useful.             | Tangentially relevant.                        | Not relevant.                              |
+| Uniqueness   | Completely unique with fresh insights.        | Unique with minor overlaps.               | Some unique aspects, but significant overlap. | Many similar guides/examples.              |
+| Clarity      | Clear language and structure.                 | Clear language, unclear structure.        | Some sections unclear.                        | Confusing and unclear.                     |
+| Correctness  | Completely error free.                        | Code works, minor improvements needed.    | Few errors and warnings.                      | Many errors, code doesn't execute.         |
+| Conciseness  | Cannot be reduced in any section, or overall. | Mostly short, but could still be reduced. | Some long sections, and/or long overall.      | Very long sections and overall, redundant. |
+| Completeness | Complete and detailed.                        | Mostly complete, minor additions needed.  | Lacks some explanations.                      | Missing significant portions.              |
+| Grammar      | Perfect grammar.                              | Correct grammar, few typos.               | Some spelling/grammatical errors.             | Numerous spelling/grammatical errors.      |
diff --git a/articles/how_to_work_with_large_language_models.md b/articles/how_to_work_with_large_language_models.md
@@ -15,9 +15,9 @@ The magic of large language models is that by being trained to minimize this pre
 * how to code
 * etc.
 
-None of these capabilities are explicitly programmed in—they all emerge as a result of training.
+They do this by “reading” a large amount of existing text and learning how words tend to appear in context with other words. They then use what they have learned to predict the most likely next word in response to a user request, and each subsequent word after that.
 
-GPT-3 powers [hundreds of software products][GPT3 Apps Blog Post], including productivity apps, education apps, games, and more.
+GPT-3 and GPT-4 power [many software products][OpenAI Customer Stories], including productivity apps, education apps, games, and more.
 
 ## How to control a large language model
 
@@ -27,6 +27,7 @@ Large language models can be prompted to produce output in a few ways:
 
 * **Instruction**: Tell the model what you want
 * **Completion**: Induce the model to complete the beginning of what you want
+* **Scenario**: Give the model a situation to play out
 * **Demonstration**: Show the model what you want, with either:
   * A few examples in the prompt
   * Many hundreds or thousands of examples in a fine-tuning training dataset
@@ -35,7 +36,7 @@ An example of each is shown below.
 
 ### Instruction prompts
 
-Instruction-following models (e.g., `text-davinci-003` or any model beginning with `text-`) are specially designed to follow instructions. Write your instruction at the top of the prompt (or at the bottom, or both), and the model will do its best to follow the instruction and then stop. Instructions can be detailed, so don't be afraid to write a paragraph explicitly detailing the output you want.
+Write your instruction at the top of the prompt (or at the bottom, or both), and the model will do its best to follow the instruction and then stop. Instructions can be detailed, so don't be afraid to write a paragraph explicitly detailing the output you want; just stay aware of how many [tokens](https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them) the model can process.
 
 Example instruction prompt:
 
@@ -71,6 +72,24 @@ Output:
  Ted Chiang
 ```
 
+### Scenario prompt example
+
+Giving the model a scenario to follow or a role to play out can be helpful for complex queries or when seeking imaginative responses. When using a scenario prompt, you set up a situation, problem, or story, and then ask the model to respond as if it were a character in that scenario or an expert on the topic.
+
+Example scenario prompt:
+```text
+Your role is to extract the name of the author from any given text
+
+“Some humans theorize that intelligent species go extinct before they can expand into outer space. If they're correct, then the hush of the night sky is the silence of the graveyard.”
+― Ted Chiang, Exhalation
+```
+
+Output:
+
+```text
+ Ted Chiang
+```
+
 ### Demonstration prompt example (few-shot learning)
 
 Similar to completion-style prompts, demonstrations can show the model what you want it to do. This approach is sometimes called few-shot learning, as the model learns from a few examples provided in the prompt.
@@ -118,35 +137,33 @@ Output:
 
 ## Code Capabilities
 
-Large language models aren't only great at text - they can be great at code too. OpenAI's specialized code model is called [Codex].
+Large language models aren't only great at text - they can be great at code too. OpenAI's [GPT-4][GPT-4 and GPT-4 Turbo] model is a prime example.
 
-Codex powers [more than 70 products][Codex Apps Blog Post], including:
+GPT-4 powers [numerous innovative products][OpenAI Customer Stories], including:
 
-* [GitHub Copilot] (autocompletes code in VS Code and other IDEs)
-* [Pygma](https://pygma.app/) (turns Figma designs into code)
-* [Replit](https://replit.com/) (has an 'Explain code' button and other features)
-* [Warp](https://www.warp.dev/) (a smart terminal with AI command search)
-* [Machinet](https://machinet.net/) (writes Java unit test templates)
+* [GitHub Copilot] (autocompletes code in Visual Studio and other IDEs)
+* [Replit](https://replit.com/) (can complete, explain, edit and generate code)
+* [Cursor](https://cursor.sh/) (build software faster in an editor designed for pair-programming with AI)
 
-Note that unlike instruction-following text models (e.g., `text-davinci-002`), Codex is *not* trained to follow instructions. As a result, designing good prompts can take more care.
+GPT-4 is more advanced than previous models like `text-davinci-002`. Still, to get the best out of GPT-4 for coding tasks, it's important to give clear and specific instructions, so designing good prompts can take some care.
 
 ### More prompt advice
 
 For more prompt examples, visit [OpenAI Examples][OpenAI Examples].
 
 In general, the input prompt is the best lever for improving model outputs. You can try tricks like:
 
-* **Give more explicit instructions.** E.g., if you want the output to be a comma separated list, ask it to return a comma separated list. If you want it to say "I don't know" when it doesn't know the answer, tell it 'Say "I don't know" if you do not know the answer.'
-* **Supply better examples.** If you're demonstrating examples in your prompt, make sure that your examples are diverse and high quality.
-* **Ask the model to answer as if it was an expert.** Explicitly asking the model to produce high quality output or output as if it was written by an expert can induce the model to give higher quality answers that it thinks an expert would write. E.g., "The following answer is correct, high-quality, and written by an expert."
-* **Prompt the model to write down the series of steps explaining its reasoning.** E.g., prepend your answer with something like "[Let's think step by step](https://arxiv.org/pdf/2205.11916v1.pdf)." Prompting the model to give an explanation of its reasoning before its final answer can increase the likelihood that its final answer is consistent and correct.
+* **Be more specific.** E.g., if you want the output to be a comma-separated list, ask it to return a comma-separated list. If you want it to say "I don't know" when it doesn't know the answer, tell it 'Say "I don't know" if you do not know the answer.' The more specific your instructions, the better the model can respond.
+* **Provide context.** Help the model understand the bigger picture of your request. This could be background information, examples/demonstrations of what you want, or an explanation of the purpose of your task.
+* **Ask the model to answer as if it was an expert.** Explicitly asking the model to produce high quality output or output as if it was written by an expert can induce the model to give higher quality answers that it thinks an expert would write. Phrases like "Explain in detail" or "Describe step-by-step" can be effective.
+* **Prompt the model to write down the series of steps explaining its reasoning.** If understanding the 'why' behind an answer is important, prompt the model to include its reasoning. This can be done by simply adding a line like "[Let's think step by step](https://arxiv.org/abs/2205.11916)" before each answer.
 
 
 
-[Fine Tuning Docs]: https://beta.openai.com/docs/guides/fine-tuning
-[Codex Apps Blog Post]: https://openai.com/blog/codex-apps/
-[Large language models Blog Post]: https://openai.com/blog/better-language-models/
-[GitHub Copilot]: https://copilot.github.com/
-[Codex]: https://openai.com/blog/openai-codex/
+[Fine Tuning Docs]: https://platform.openai.com/docs/guides/fine-tuning
+[OpenAI Customer Stories]: https://openai.com/customer-stories
+[Large language models Blog Post]: https://openai.com/research/better-language-models
+[GitHub Copilot]: https://github.com/features/copilot/
+[GPT-4 and GPT-4 Turbo]: https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo
 [GPT3 Apps Blog Post]: https://openai.com/blog/gpt-3-apps/
-[OpenAI Examples]: https://beta.openai.com/examples
+[OpenAI Examples]: https://platform.openai.com/examples
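
Note on the article changes above: the prompt styles the revised article lists (instruction, scenario, demonstration) and the "Let's think step by step" tip can be sketched as plain string builders. This is a minimal illustration only. The function names, the instruction wording, and the `Quote:` / `Author:` layout of the few-shot prompt are assumptions made for this sketch and are not taken from the article; only the scenario wording and the Ted Chiang quotation come from the diff. No API call is made, so the sketch shows prompt construction and nothing about model behavior.

```python
# Hypothetical helpers: none of these names come from the article.
QUOTE = (
    "“Some humans theorize that intelligent species go extinct before they can "
    "expand into outer space. If they're correct, then the hush of the night sky "
    "is the silence of the graveyard.”\n"
    "― Ted Chiang, Exhalation"
)


def instruction_prompt(text: str) -> str:
    """Instruction style: state the task, then supply the input.

    The instruction wording here is an assumption chosen for illustration.
    """
    return f"Extract the name of the author from the quotation below.\n\n{text}\n"


def scenario_prompt(text: str) -> str:
    """Scenario style: assign the model a role, then supply the input.

    The role sentence is the one used in the diff's scenario example.
    """
    return f"Your role is to extract the name of the author from any given text\n\n{text}\n"


def demonstration_prompt(examples: list[tuple[str, str]], text: str) -> str:
    """Few-shot style: worked (quote, author) pairs, then an unanswered query.

    The "Quote:" / "Author:" layout is an assumed format, not the article's.
    """
    parts = [f"Quote:\n{quote}\nAuthor: {author}\n" for quote, author in examples]
    parts.append(f"Quote:\n{text}\nAuthor:")
    return "\n".join(parts)


def with_step_by_step(prompt: str) -> str:
    """Append the reasoning cue mentioned in the prompt advice list."""
    return prompt.rstrip("\n") + "\n\nLet's think step by step."


if __name__ == "__main__":
    print(scenario_prompt(QUOTE))
    print(with_step_by_step(instruction_prompt(QUOTE)))
```

Keeping each style in its own small function makes it easy to send the same input through different prompt formats and compare the outputs side by side.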
diff --git a/articles/related_resources.md b/articles/related_resources.md
@@ -11,6 +11,7 @@ People are writing great tools and papers for improving outputs from GPT. Here a
 - [Guardrails.ai](https://shreyar.github.io/guardrails/): A Python library for validating outputs and retrying failures. Still in alpha, so expect sharp edges and bugs.
 - [Guidance](https://github.com/microsoft/guidance): A handy looking Python library from Microsoft that uses Handlebars templating to interleave generation, prompting, and logical control.
 - [Haystack](https://github.com/deepset-ai/haystack): Open-source LLM orchestration framework to build customizable, production-ready LLM applications in Python.
+- [HoneyHive](https://honeyhive.ai): An enterprise platform to evaluate, debug, and monitor LLM apps.
 - [LangChain](https://github.com/hwchase17/langchain): A popular Python/JavaScript library for chaining sequences of language model prompts.
 - [LiteLLM](https://github.com/BerriAI/litellm): A minimal Python library for calling LLM APIs with a consistent format.
 - [LlamaIndex](https://github.com/jerryjliu/llama_index): A Python library for augmenting LLM apps with data.
@@ -34,13 +35,15 @@ People are writing great tools and papers for improving outputs from GPT. Here a
 - [Lil'Log Prompt Engineering](https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/): An OpenAI researcher's review of the prompt engineering literature (as of March 2023).
 - [OpenAI Cookbook: Techniques to improve reliability](https://cookbook.openai.com/articles/techniques_to_improve_reliability): A slightly dated (Sep 2022) review of techniques for prompting language models.
 - [promptingguide.ai](https://www.promptingguide.ai/): A prompt engineering guide that demonstrates many techniques.
+- [Xavi Amatriain's Prompt Engineering 101 Introduction to Prompt Engineering](https://amatriain.net/blog/PromptEngineering) and [202 Advanced Prompt Engineering](https://amatriain.net/blog/prompt201): A basic but opinionated introduction to prompt engineering and a follow-up collection of many advanced methods, starting with CoT.
 
 ## Video courses
 
 - [Andrew Ng's DeepLearning.AI](https://www.deeplearning.ai/short-courses/chatgpt-prompt-engineering-for-developers/): A short course on prompt engineering for developers.
 - [Andrej Karpathy's Let's build GPT](https://www.youtube.com/watch?v=kCc8FmEb1nY): A detailed dive into the machine learning underlying GPT.
 - [Prompt Engineering by DAIR.AI](https://www.youtube.com/watch?v=dOxUroR57xs): A one-hour video on various prompt engineering techniques.
 - [Scrimba course about Assistants API](https://scrimba.com/learn/openaiassistants): A 30-minute interactive course about the Assistants API.
+- [LinkedIn course: Introduction to Prompt Engineering: How to talk to the AIs](https://www.linkedin.com/learning/prompt-engineering-how-to-talk-to-the-ais/talking-to-the-ais?u=0): A short video introduction to prompt engineering.
 
 
 ## Papers on advanced prompting to improve reasoning

diff --git a/authors.yaml b/authors.yaml
@@ -37,13 +37,28 @@ ibigio:
   name: "Ilan Bigio"
   website: "https://twitter.com/ilanbigio"
   avatar: "https://pbs.twimg.com/profile_images/1688302223250378752/z-99TOMH_400x400.jpg"
-  
+
 jhills20:
   name: "James Hills"
-  website: "https://twitter.com/jimmerhills"
+  website: "https://twitter.com/jamesmhills"
   avatar: "https://pbs.twimg.com/profile_images/1722092156691902464/44FGj7VT_400x400.jpg"
 
 colin-openai:
   name: "Colin Jarvis"
   website: "https://twitter.com/colintjarvis"
-  avatar: "https://pbs.twimg.com/profile_images/1647875178947387393/aUc7D9m-_400x400.jpg"
+  avatar: "https://pbs.twimg.com/profile_images/1727207339034345472/IM8v8tlC_400x400.jpg"
+
+prakul:
+  name: "Prakul Agarwal"
+  website: "https://www.linkedin.com/in/prakulagarwal"
+  avatar: "https://media.licdn.com/dms/image/D5603AQEUug83qKgRBg/profile-displayphoto-shrink_800_800/0/1675384960197?e=1706140800&v=beta&t=qxkDbBr-Bk2ASpcwbR5JVPD6yS-vzmIwNHAa8ApyDq4"
+
+nghiauet:
+  name: "Nghia Pham"
+  website: "https://www.linkedin.com/in/deptraicucmanh/"
+  avatar: "https://media.licdn.com/dms/image/D5603AQFpoi0l1lRQ7g/profile-displayphoto-shrink_200_200/0/1680006955681?e=1707350400&v=beta&t=1J1DAVRGXL7LgFUBnIooI-fsDRkdzFzCDwwxMze9N8c"
+
+ggrn:
+  name: "Greg Richardson"
+  website: "https://twitter.com/ggrdson"
+  avatar: "https://pbs.twimg.com/profile_images/1371549909820657664/ZG-HDNlI_400x400.jpg"