Definitions
A centralized glossary of terms and their meanings, to clarify the language used throughout the Top 10.
- LLM - Large language model. A type of artificial intelligence (AI) that is trained on a massive dataset of text and code. LLMs use natural language processing to interpret requests and generate responses.
- LLM Agent - A piece of code that formulates prompts to an LLM and parses the output in order to perform an action or a series of actions (typically by calling one or more plugins/tools).
- LLM Plugin - A piece of code that exposes external functionality to an LLM Agent; e.g., reading a file, fetching the contents of a URL, querying a database, etc.
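  For illustration, a minimal sketch of an agent dispatching to a plugin; every name here (the fake LLM, the plugin registry, the `CALL` output format) is hypothetical, not a real library API:

  ```python
  # Sketch: an LLM agent formulates a prompt, "calls" an LLM, parses the
  # output, and dispatches to a plugin. All names are hypothetical.

  def read_file_plugin(path: str) -> str:
      """Plugin: exposes local file reading to the agent."""
      with open(path) as f:
          return f.read()

  PLUGINS = {"read_file": read_file_plugin}

  def fake_llm(prompt: str) -> str:
      """Stand-in for a real LLM call; returns a canned 'tool call'."""
      return 'CALL read_file "notes.txt"'

  def run_agent(task: str) -> str:
      # 1. Formulate a prompt describing the task and available tools.
      prompt = f"Task: {task}\nTools: {', '.join(PLUGINS)}"
      # 2. Ask the LLM what to do, then parse its output.
      output = fake_llm(prompt)
      # 3. Invoke the requested plugin with the parsed argument.
      if output.startswith("CALL"):
          _, name, arg = output.split(maxsplit=2)
          return PLUGINS[name](arg.strip('"'))
      return output

  if __name__ == "__main__":
      with open("notes.txt", "w") as f:  # set up a file for the demo
          f.write("remember to review the Top 10")
      print(run_agent("Read my notes"))
  ```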
- NLP (Natural Language Processing) - The branch of computer science focused on enabling computers to understand, interpret, and generate human language.
- Transformer - A type of neural network architecture that is commonly used to train LLMs. Transformers are able to learn long-range dependencies between words, which makes them well-suited for natural language processing tasks.
- Self-supervised learning - A type of machine learning in which the model is trained to learn from unlabeled data. In the case of LLMs, self-supervised learning is often used to train the model to predict the next word in a sequence.
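  As a concrete illustration, a short sketch of how next-word prediction turns raw, unlabeled text into training pairs (the whitespace tokenization is a deliberate simplification):

  ```python
  # Sketch: self-supervised learning derives (context, target) training
  # pairs directly from unlabeled text -- no human labels required.
  text = "the cat sat on the mat"
  tokens = text.split()  # simplified whitespace tokenization

  pairs = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]
  for context, target in pairs:
      print(f"context={' '.join(context)!r} -> predict {target!r}")
  # e.g. context='the cat sat' -> predict 'on'
  ```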
- Foundation Model - A large language model that is trained on a broad set of diverse data to operate across a wide range of use cases.
- Fine-tuning - The process of further refining a Foundation Model to improve its performance on a specific task. Common methods of fine-tuning include (see the sketch after this list):
  - Training on task-specific datasets
  - Embeddings
  - Engineering task-specific system prompts
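  A hedged sketch of the first and third methods, building a task-specific dataset and engineering a system prompt; the JSONL record format and file name are hypothetical, since each fine-tuning API defines its own schema:

  ```python
  # Sketch: preparing a task-specific fine-tuning dataset plus an
  # engineered system prompt. The record schema and file name are
  # illustrative, not any particular vendor's format.
  import json

  system_prompt = "You are a support assistant. Answer in one sentence."

  examples = [
      {"prompt": "How do I reset my password?",
       "completion": "Use the 'Forgot password' link on the sign-in page."},
      {"prompt": "Where can I download invoices?",
       "completion": "Invoices are under Billing > History in your account."},
  ]

  with open("task_dataset.jsonl", "w") as f:
      for ex in examples:
          f.write(json.dumps({"system": system_prompt, **ex}) + "\n")
  ```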
- Transfer learning - The process of using a model that has been trained on one task to improve the performance of a model on a different task. Transfer learning is often used to save time and resources when training new models.
- Supervised learning - A machine learning approach defined by its use of labeled datasets. These datasets are designed to train or "supervise" algorithms into classifying data or predicting outcomes accurately. Using labeled inputs and outputs, the model can measure its accuracy and learn over time.
- Unsupervised learning - A machine learning approach in which algorithms analyze and cluster unlabeled datasets, discovering hidden patterns in the data without the need for human intervention (hence, "unsupervised").
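  A minimal sketch contrasting the two approaches, assuming scikit-learn is installed (the toy data is illustrative):

  ```python
  # Sketch: the same features, with and without labels.
  from sklearn.cluster import KMeans
  from sklearn.linear_model import LogisticRegression

  X = [[0.1, 0.2], [0.2, 0.1], [5.0, 5.1], [5.2, 4.9]]  # features
  y = [0, 0, 1, 1]                                      # labels

  # Supervised: learns from labeled (X, y) pairs, so accuracy is measurable.
  clf = LogisticRegression().fit(X, y)
  print(clf.predict([[0.15, 0.15]]))  # -> [0]

  # Unsupervised: clusters X with no labels at all.
  km = KMeans(n_clusters=2, n_init=10).fit(X)
  print(km.labels_)  # two discovered clusters, e.g. [0 0 1 1]
  ```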
- Inference - The process of using a trained model to generate predictions or responses, usually as an API or web service.
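  A hedged sketch of inference exposed as a web service; the endpoint URL, payload, and response field are hypothetical:

  ```python
  # Sketch: querying a deployed model over HTTP. The endpoint and JSON
  # schema are hypothetical; real inference APIs define their own.
  import json
  import urllib.request

  payload = {"prompt": "Summarize: LLMs generate text.", "max_tokens": 32}
  req = urllib.request.Request(
      "https://example.com/v1/generate",  # hypothetical endpoint
      data=json.dumps(payload).encode("utf-8"),
      headers={"Content-Type": "application/json"},
  )
  with urllib.request.urlopen(req) as resp:
      print(json.load(resp)["text"])      # hypothetical response field
  ```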
- Hallucinate - In the context of LLMs, to generate text that is not grounded in any real-world input or facts. This can happen for a variety of reasons. When an LLM hallucinates, the output text may be nonsensical, factually incorrect, offensive, or even dangerous.