Skip to content

A repository to hold different NLP/LLM related codes which utilize various new models, frameworks, and tools in the space.

License

Notifications You must be signed in to change notification settings

aryankargwal/genai-tutorials

Repository files navigation

NLP Tutorials using Large Language Models (LLMs)

made-with-python
License
YouTube

Welcome to the NLP Tutorials repository! This repository currently hosts multiple projects focusing on Large Language Models (LLMs) and Natural Language Processing (NLP) workflows, including MultiHop Question Answering (QA), Marketing Campaign Generation with AI Assistants, Multimodal AI with Janus, and Advanced Parsing with Omniparser.

📺 Watch on YouTube

Check out video walkthroughs of these projects and more on my YouTube channel!

🚀 Projects in this Repository

Tutorial Name Description Link to Tutorial
MultiHopQA with DSPy Implements multi-hop question answering using DSPy, ColBERT, Qwen 2.5 72B, and HotPotQA. multihopqa-dspy
VLM Stress Testing Compares the performance of multiple Vision-Language Models for multi-modal inferences using Streamlit. vlm-comparison
Multi-Turn Assistant Walks through creating a multi-turn assistant application to automate tasks like market research, strategy generation, and image creation using AI assistants. marketing-campaign
Janus 1.3B Multimodal AI Provides a tutorial on deploying Janus 1.3B for multimodal tasks, including text understanding and image generation. janus-tutorial
Omniparser Demonstrates parsing and structuring complex documents for NLP workflows with Omniparser. omniparser-tutorial
Qwen 2.5 Coder Transforms UI layouts into HTML/CSS using Qwen 2.5 vision-language model for rapid prototyping and design automation. qwen-coder

MultiHopQA with DSPy, ColBERT, and Qwen 2.5 72B

This project demonstrates how to perform MultiHop Question Answering (QA), a task that requires synthesizing information from multiple sources to answer complex questions. The solution integrates several state-of-the-art tools like DSPy, ColBERT, HotPotQA, TuneAPI, and Qwen 2.5 72B.

Future Plans

  • Additional tutorials on text generation, retrieval-augmented generation, and more will be added as part of this repository.

VLM Stress Testing with Llama 3.2, Qwen 2 VL, and GPT 4o

This project provides a framework for evaluating Vision-Language Models (VLMs) by comparing their performance on multi-modal question-answering tasks. It allows users to upload images, input questions, and test responses across multiple state-of-the-art models like Llama 3.2, Qwen 2 VL, and GPT 4o. The application tracks key metrics like response quality, latency, and token usage to determine the best-performing model for each task.

Future Plans

Upcoming enhancements include expanded model comparisons, fine-tuning tutorials, and automated dataset generation for advanced multi-modal tasks.

AI Assistant-Led Marketing Campaign Generator

In this tutorial, we build a multi-turn assistant application using Tune Studio to automate various marketing-related tasks. The app performs market research, analyzes numerical data, and generates campaign strategies and sample posters using AI assistants powered by models like Claude Sonnet and GPT4o. This tutorial covers how AI assistants can collaborate to streamline workflows, such as content creation, strategy planning, and image generation.

Future Plans

  • Further enhancements, including advanced assistant workflows for more complex marketing and automation tasks.

Janus 1.3B Multimodal AI

This tutorial guides you through deploying Janus 1.3B, a compact yet powerful vision-language model. You will learn how to leverage Janus for both text understanding and image generation with a straightforward implementation. The tutorial covers the architecture of Janus, including its tokenization methods, multimodal processing, and practical usage examples.

Future Plans

  • Additional insights on tuning and optimizing Janus for specific use cases and performance benchmarks.

Omniparser for Document Parsing

The Omniparser tutorial focuses on handling complex document parsing tasks, extracting and structuring data for NLP workflows. This project is ideal for anyone needing to process diverse document types, as it covers setup, parsing strategies, and methods to ensure clean, organized data for downstream applications.

Future Plans

  • Expansion to cover additional data extraction methods and integration with text and image processing modules.

Qwen 2.5 Coder - UI to Code Transformation

The Qwen 2.5 Coder tutorial demonstrates the use of Qwen 2.5 to transform UI layouts into HTML/CSS code. It offers a streamlined solution for automating design-to-code workflows, enabling rapid prototyping and design implementation.

Future Plans

  • Additional examples with more complex UI layouts and improvements in customization for UI component parsing.

📜 License

This repository is licensed under the Apache 2.0 License.

📺 Watch More

Don't forget to subscribe and check out more tutorials on my YouTube channel.

About

A repository to hold different NLP/LLM related codes which utilize various new models, frameworks, and tools in the space.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published