Merge pull request #2 from BU-Spark/project-outline

Project outline
BU-Spark · Feb 16, 2025 · 6408334
2 parents 421a47c + 0167b14
commit 6408334

Showing 3 changed files with 239 additions and 4 deletions.
diff --git a/README.md b/README.md
@@ -1,7 +1,107 @@
-# MET Autograder
+# MET BU Autograder 🚀
 
-Create a new branch from dev, add changes on the new branch you just created.
+**A Boston University SPARK Project**  
+**For Boston University’s Metropolitan College Office of Education Technology and Innovation (MET ETI)**
 
-Open a Pull Request to dev. Add your PM and TPM as reviewers. 
+---
+
+## 📖 Table of Contents  
+1. [Overview](#overview)  
+2. [✨ Key Features](#key-features)  
+3. [🎯 Goals](#goals)  
+4. [🛠️ Tech Stack](#tech-stack)  
+5. [📌 Development Roadmap](#development-roadmap)  
+6. [📊 Workflow Diagram](#workflow-diagram)  
+7. [👥 Team](#team)  
+8. [📜 License](#license)  
+
+---
+
+## 🌍 Overview  
+
+**MET BU Autograder** is a web-based REST API for AI-assisted grading of written and “complex” assignments. It aims to improve grading quality by combining multiple Large Language Models (LLMs) with context management strategies.  
+
+Developed as part of a **Boston University SPARK** project for **BU MET ETI**, this tool is designed to integrate seamlessly with multiple LLM backends and provide a robust, well-documented API for clients seeking to enhance their grading workflows.
+
+---
+
+## ✨ Key Features  
+
+✔️ **Context Management Strategies** - Ensures the AI retains necessary context across requests over otherwise stateless APIs.
+
+✔️ **Retrieval-Augmented Generation** - Uses a vector database to store supplemental data like documents, videos, images, and graphs.
+
+✔️ **Web Crawling** - Gathers assignment-relevant information with optional automatic update checking.
+
+✔️ **Prompt Engineering** - Uses zero-shot, few-shot, and self-consistency prompting, along with instruction tuning.
+
+✔️ **File Conversion & Extraction** - Supports multiple formats (CSV, PDF, diagrams, PowerPoints) to feed into LLM APIs.
+
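A minimal sketch of the self-consistency prompting listed among the features above, assuming the grader is sampled several times for the same answer and the most frequent score is kept. The function `self_consistent_score` and the stub sampler are hypothetical names for illustration and are not part of this commit; a real sampler would call an LLM backend.

```python
from collections import Counter
from typing import Callable, List


def self_consistent_score(sample_grade: Callable[[], int], n_samples: int = 5) -> int:
    """Sample a grade n_samples times and return the most common score.

    Ties go to the score that was sampled first.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    scores: List[int] = [sample_grade() for _ in range(n_samples)]
    # Counter.most_common keeps first-seen order among equal counts.
    return Counter(scores).most_common(1)[0][0]


# Hypothetical stand-in for repeated LLM calls: replays fixed scores.
_fake_samples = iter([4, 5, 4, 3, 4])
final_score = self_consistent_score(lambda: next(_fake_samples), n_samples=5)
```

Majority voting is only one way to aggregate samples; a median or a mean may suit fractional rubric scores better.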
+---
+
+## 🎯 Goals  
+
+🎯 **Future-Proof Design**: Integrate with multiple text-based or vision-based LLM backends.  
+🎯 **Consistent Grading**: Standardized grading approach for improved fairness and reliability.  
+🎯 **Well-Documented API**: Clear and accessible documentation for clients and contributors.  
+🎯 **Efficiency**: Minimize unnecessary external API calls to reduce costs while maintaining high accuracy.
+
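One way to approach the efficiency goal above is to avoid re-grading identical inputs. The sketch below assumes an in-memory cache keyed by a hash of the rubric and the student answer; `GradeCache` and the grading callable are hypothetical and stand in for whatever LLM client the project ends up using.

```python
import hashlib
from typing import Callable, Dict


class GradeCache:
    """In-memory cache keyed by a hash of the rubric and the student answer."""

    def __init__(self, grade_fn: Callable[[str, str], str]) -> None:
        self._grade_fn = grade_fn  # hypothetical function that calls an LLM
        self._store: Dict[str, str] = {}
        self.external_calls = 0  # how many times grade_fn was actually invoked

    @staticmethod
    def _key(rubric: str, answer: str) -> str:
        # The separator byte keeps ("ab", "c") distinct from ("a", "bc").
        payload = rubric.encode("utf-8") + b"\x00" + answer.encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def grade(self, rubric: str, answer: str) -> str:
        key = self._key(rubric, answer)
        if key not in self._store:
            self.external_calls += 1
            self._store[key] = self._grade_fn(rubric, answer)
        return self._store[key]


# Stub grader used only to make the example runnable.
cache = GradeCache(lambda rubric, answer: f"graded:{len(answer)}")
first = cache.grade("Explain TCP.", "TCP is reliable.")
second = cache.grade("Explain TCP.", "TCP is reliable.")
```

Here the second call is served from the cache, so `cache.external_calls` stays at 1. A production cache would also need an eviction policy and a key that includes the model and prompt version.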
+---
+
+## 🛠️ Tech Stack  
+
+🟡 **Language**: Python 🐍  
+🟢 **Framework**: FastAPI ⚡  
+🔵 **Others**:  
+   - LLM integration (multiple providers)  
+   - Vector databases (for retrieval-augmented generation)  
+   - Web crawling utilities  
+
+---
+
+## 📌 Development Roadmap  
+
+🚀 **Phase 0:** Project Vision & Goals ✅
+
+🚀 **Phase 1:** Project Setup & Initial API Development ⏳
+
+🚀 **Phase 2:** LLM Integration & Context Management ⏳
+
+🚀 **Phase 3:** Web Crawling & Vector Database Implementation ⏳  
+
+🚀 **Phase 4:** Performance Optimization & API Documentation ⏳  
+
+🚀 **Phase 5:** Deployment & User Testing ⏳  
+
+---
+
+## 📊 Workflow Diagram
+
+Below is a visual representation of our current workflow for the MET BU Autograder:
+
+![proposed-workflow](proposed-workflow.png)
+
+---
+
+## 👥 Team  
+
+| 👤 **First Name**  | **Last Name**  | ✉️ **Email Address**  | 🖥️ **GitHub Username**  |
+|:------------------|:--------------|:----------------------|:-----------------------|
+| Fahim            | Uddin         | [email protected]      | [fahimuddin/fahimuddin1](https://github.com/fahimuddin/fahimuddin1) |
+| Zach             | Gentile       | [email protected]      | [zgentile](https://github.com/zgentile) |
+| Josh             | Yip           | [email protected]      | [joshyipp](https://github.com/joshyipp) |
+| Muhammad Aseef   | Imran         | [email protected]         | [Aseeef](https://github.com/Aseeef) |
+
+---
+
+## 📜 License  
+
+This project is licensed under the **GNU General Public License (GPL)**. See the [LICENSE.txt](LICENSE.txt) file for more details.
+
+---
+
+> ⚠️ **Note**: This project is in active development. For more details on installation, usage, or contributing, please refer to the project’s documentation and issue tracker.  
+
+---
+
+<sub>_If you have any questions or feedback, feel free to open an issue or reach out via email._</sub>
 
-At the end of the semester during project wrap up open a final Pull Request to main from dev branch.
diff --git a/project-document-template.md b/project-document-template.md
@@ -0,0 +1,135 @@
+# MET BU Autograder - Technical Project Document 🚀
+
+## *Josh Yip, Zach Gentile, Muhammad Aseef Imran, Fahim Uddin – 2025-February-15 vx.x.x-dev*
+
+## 📝 Overview
+
+The **AI-Assisted Grading Tool** for written answers and complex assignments is a project for **Boston University’s Metropolitan College Office of Education Technology and Innovation (MET ETI)**. This tool is designed to enhance grading consistency, accuracy, and alignment with instructor expectations for **CS 581 quizzes and assignments** using:
+
+✅ **Azure AI Studio**\
+✅ **GPT-4o**\
+✅ **Retrieval-Augmented Generation (RAG)**
+
+The AI model will be capable of:
+
+- Evaluating student responses.
+- Processing supplemental course material.
+- Supporting file-based grading.
+- Ensuring cost-efficient API usage while maintaining high accuracy.
+
+---
+
+## A. Provide a solution in terms of human actions to confirm if the task is within the scope of automation through AI.
+
+The task **is** within the scope of AI automation. MET ETI staff have already developed an AI model that achieves **93% grading consistency**, which demonstrates that the approach is feasible. Our goal is to **improve this model's** reliability and alignment with instructor expectations.
+
+Further confirmation can be achieved by:
+
+1. **Testing LLMs** with entire prompts and rubrics to validate their ability to grade accurately.
+2. **Comparing AI-graded results** with human-graded results for benchmarking.
+3. **Running pilot testing** to assess grading stability and fairness across multiple assignments.
+
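The comparison of AI-graded and human-graded results in the list above can be sketched as a simple agreement rate. This is an assumption about the metric: the document does not say how grading consistency is measured, and `agreement_rate` and its `tolerance` parameter are hypothetical.

```python
from typing import Sequence


def agreement_rate(
    ai_scores: Sequence[float],
    human_scores: Sequence[float],
    tolerance: float = 0.0,
) -> float:
    """Fraction of responses where AI and human scores differ by at most tolerance."""
    if len(ai_scores) != len(human_scores):
        raise ValueError("score lists must have the same length")
    if not ai_scores:
        raise ValueError("at least one graded response is required")
    matches = sum(
        1 for ai, human in zip(ai_scores, human_scores) if abs(ai - human) <= tolerance
    )
    return matches / len(ai_scores)


# Made-up scores for four responses, for illustration only.
ai = [5, 4, 3, 5]
human = [5, 4, 2, 5]
exact = agreement_rate(ai, human)                  # 3 of 4 match exactly
within_one = agreement_rate(ai, human, tolerance=1)  # all within one point
```

Exact agreement is strict; allowing a tolerance or reporting a chance-corrected statistic may give a fairer picture across graders.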
+### ✅ Current Process (Manual Grading):
+
+1. A CS 581 student submits a quiz or assignment in **Blackboard**.
+2. The **instructor or TA manually grades** responses using rubrics and sample correct answers.
+3. The **grade is entered** into Blackboard.
+4. A **review is conducted** for consistency across multiple graders.
+
+### 🤖 AI-Assisted Process:
+
+1. A student submits a quiz or assignment.
+2. The response is **sent via API to the AI model**.
+3. The AI **grades the response** using:
+   - Predefined **rubrics** 📜
+   - Sample answers ✅
+   - Supplemental **course material** 📂
+4. The AI **returns a structured response**, including a **score & explanation**.
+5. The **instructor reviews and confirms** the AI’s evaluation before **finalizing the grade**.
+6. AI-graded responses are **logged for consistency analysis** 📊.
+
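Step 4 of the process above, the structured response with a score and explanation, could look like the following sketch. It assumes the model is asked to reply with a JSON object containing `score` and `explanation` keys; that schema, `GradeResult`, and `parse_grade` are illustrative assumptions rather than the project's actual API.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class GradeResult:
    score: float
    max_score: float
    explanation: str


def parse_grade(raw_reply: str, max_score: float) -> GradeResult:
    """Parse and validate a JSON reply before it reaches the instructor."""
    data = json.loads(raw_reply)  # raises ValueError on malformed JSON
    score = float(data["score"])
    explanation = str(data["explanation"]).strip()
    if not 0 <= score <= max_score:
        raise ValueError(f"score {score} is outside 0..{max_score}")
    if not explanation:
        raise ValueError("explanation must not be empty")
    return GradeResult(score=score, max_score=max_score, explanation=explanation)


# Hypothetical model reply, for illustration only.
reply = '{"score": 4, "explanation": "Covers reliability but omits flow control."}'
result = parse_grade(reply, max_score=5)
```

Rejecting out-of-range scores and empty explanations early keeps malformed model output away from the instructor review in step 5.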
+---
+
+## B. Problem Statement
+
+The primary challenge is ensuring **consistency, accuracy, and reliability** in AI-assisted grading for **short-answer quizzes and file-based assignments**. The AI model must:
+
+- Extract **clear scores and justifications**.
+- Reference **structured supplemental data** (e.g., rubrics, external materials, PDFs, and slides).
+- Support **various file types**.
+- Potentially **retrieve relevant external information** (e.g., web browsing and document parsing capabilities).
+
+---
+
+## C. Checklist for Project Completion
+
+To define the **successful completion** of this project, we aim to deliver:
+
+### 🎯 Core Deliverables:
+
+- **Optimal AI platform** for MET ETI’s use case.
+  - 🔹 **Documentation**: Setup instructions, environment access, API usage.
+- **Optimal AI model** tailored for grading.
+  - 🔹 **Documentation**: API usage, fine-tuning instructions.
+- **Efficient method for adding course materials** (RAG-based document retrieval & storage).
+  - 🔹 **Guidelines** on integrating **PDFs, slides, videos**.
+- **Performance metrics & evaluation reports** 📈.
+  - 🔹 **Improvement summary** of AI auto-grading performance for CS 581.
+
+---
+
+## D. Outline a Path to Operationalization
+
+The goal is to deliver a **production-ready API**. The final deployment strategy includes:
+
+### 🌐 **Integration with Blackboard & LMS systems**
+
+- API endpoints to **automate quiz & assignment grading**.
+  - FastAPI has built-in tools for generating API documentation with little effort, which we intend to use.
+- (Time Permitting) Web-based dashboard for **grading logs & analytics**.
+
+### ☁️ **Deployment Strategy**
+
+- **Cloud-based API**: a **FastAPI backend** hosted on **Azure**.
+- **Database & vector storage** for **retrieval-augmented grading**.
+  - *Note: which vector database to use, and whether one is needed at all, still requires more research.*
+- **API Authentication & security layers** to protect student data.
+
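The retrieval step behind the retrieval-augmented grading mentioned in the deployment list above can be sketched without committing to any vector database. The example assumes documents are already embedded as vectors; the three-dimensional vectors, document names, and the `top_k` helper are made up for illustration, and a real system would obtain embeddings from a model.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float], index: Dict[str, Sequence[float]], k: int = 2
) -> List[Tuple[str, float]]:
    """Return the k documents most similar to the query, best first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy index with hand-written vectors standing in for real embeddings.
index = {
    "rubric.pdf": [1.0, 0.0, 0.0],
    "lecture-3-slides": [0.0, 1.0, 0.0],
    "sample-answers": [0.9, 0.1, 0.0],
}
hits = top_k([1.0, 0.0, 0.0], index, k=2)
```

A brute-force scan like this is fine for a single course's materials; a dedicated vector store becomes worthwhile only if the corpus grows, which matches the open question noted in the list above.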
+### 🔗 **Long-Term Maintenance Plan**
+
+- Clear **documentation** for using the Grading Tool's API.
+- Pipeline for **future AI model upgrades**.
+- **Feedback loop** for improving grading accuracy.
+
+---
+
+## 📊 Workflow Diagram
+
+Below is a visual representation of the MET BU Autograder workflow:
+
+![proposed-workflow](proposed-workflow.png)
+
+This diagram provides a (preliminary) step-by-step breakdown of how requests flow through the system, from initial submission to AI-assisted grading.
+
+---
+
+## 📂 Resources
+
+### 📊 Data Sets
+
+- 📝 **Student responses** from CS 581 quizzes & assignments.
+- 📑 **Instructor-provided rubrics & sample answers**.
+- 📚 **Supplementary course materials** (PDFs, slides, videos).
+
+### 📖 References
+
+- 📄 **MET ETI AI-Assisted Grading Requirements Document**.
+- 📌 **Azure AI Studio Documentation**.
+- 🤖 **GPT-4o, Claude, LLaMA API Docs**.
+
+---
+
+## 🗓️ Weekly Meeting Updates
+
+Ongoing meetings and updates will be tracked in the **Project Description Document** prepared by **Spark staff** for this project.
+
diff --git a/proposed-workflow.png b/proposed-workflow.png