
Commit

Merge pull request #31 from lordzuko/post-ss
updated summary description of post
lordzuko authored Sep 22, 2023
2 parents e9f3c2b + 0305078 commit 3828d63
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion content/posts/lm_sharding.md
@@ -5,7 +5,7 @@ draft: false
ShowToc: true
category: [ai]
tags: ["llms", "ai", "inference"]
description: "A guide to fine-tuning GPT-X models with DeepSpeed"
description: "Techniques to load LLMs on smaller GPUs and enable parallel inference using Hugging Face Accelerate"
---

*With the rise of deep learning and the development of increasingly powerful models, pre-trained language models have grown in size. While these models deliver impressive performance in various natural language processing (NLP) tasks, their sheer magnitude poses challenges for inference on resource-constrained devices and large-scale distributed systems. Enter sharding, a technique that divides large models into smaller, more manageable parts, offering an efficient and faster approach to distributed inference.*
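The core idea described above — splitting one large checkpoint into smaller, size-bounded pieces so no single device or file has to hold the whole model — can be sketched in plain Python. This is a minimal illustration of the partitioning logic only, not the actual Hugging Face implementation; the parameter names and the `shard_params` helper are hypothetical.

```python
def shard_params(param_sizes, max_shard_bytes):
    """Greedily partition parameters into shards, each kept under max_shard_bytes.

    param_sizes: dict mapping parameter name -> size in bytes (insertion order
    is preserved, mirroring how weights appear in a checkpoint).
    A parameter larger than the budget still gets its own shard.
    """
    shards = [[]]
    current = 0
    for name, size in param_sizes.items():
        # Start a new shard when adding this tensor would exceed the budget,
        # unless the current shard is still empty.
        if current + size > max_shard_bytes and shards[-1]:
            shards.append([])
            current = 0
        shards[-1].append(name)
        current += size
    return shards

# Hypothetical per-parameter sizes, in bytes, for a toy model.
params = {"wte": 400, "h.0.attn": 300, "h.0.mlp": 500, "ln_f": 100}
print(shard_params(params, max_shard_bytes=800))
# [['wte', 'h.0.attn'], ['h.0.mlp', 'ln_f']]
```

In practice this partitioning is what produces checkpoint files like `pytorch_model-00001-of-00002.bin`, and a loader can then map each shard to a different device for distributed inference.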
