A LeetCode-style platform for practicing prompt engineering with LLMs. Users solve challenges by writing prompts that extract structured data from JSON datasets or unstructured text.
- Challenge-Based Learning: Solve prompt engineering challenges with varying difficulty levels
- Real-time Evaluation: Submit prompts and get instant feedback on GPT-4o responses
- Progress Tracking: View your attempt history and success rates
- JSON-Aware Validation: Parses the model output as JSON and checks that each expected object appears as an exact match
- PostgreSQL Integration: Robust data storage for questions, attempts, and user progress
- Backend: Flask with SQLAlchemy ORM
- Database: PostgreSQL with JSONB support
- LLM: OpenAI GPT-4o for prompt evaluation
- Validation: Custom JSON-aware comparison logic
pip install -r requirements.txt
# Create database
createdb llm_leetcode
# Run setup script
psql -d llm_leetcode -f setup_db.sql
Create a .env file:
OPENAI_API_KEY=your_openai_api_key_here
DATABASE_URL=postgresql://localhost/llm_leetcode
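As a rough illustration of how these two settings might be checked at startup, the sketch below reads them from the process environment and fails early if either is missing. It assumes the values from .env have already been loaded into the environment by some other means; `load_config` and `REQUIRED_VARS` are hypothetical names, not part of the project.

```python
import os

# Names of the settings listed in the .env file above.
REQUIRED_VARS = ("OPENAI_API_KEY", "DATABASE_URL")


def load_config(env=None):
    """Return the required settings, raising if any is missing or empty.

    `env` defaults to os.environ; passing a plain dict makes testing easy.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Failing at startup with the names of the missing variables tends to be easier to debug than a connection error that surfaces on the first request.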
python app.py
POST /submit-prompt
Submit a user prompt for evaluation against a specific question.
{
"user_id": "user123",
"question_id": "q1_employee_salary",
"user_prompt": "You are a data extraction specialist. Extract the requested information and format it exactly as specified."
}
Response:
{
"attempt_id": 1,
"success": true,
"score": 1.0,
"model_response": "[{\"name\": \"Sarah Johnson\", \"employee_id\": \"EMP002\"}]",
"missing_entries": [],
"extra_entries": [],
"format_issues": [],
"tokens_used": 150,
"created_at": "2024-01-15T10:30:00Z"
}
GET /get-question/<question_id>
Retrieve question details including dataset and expected output.
Response:
{
"id": "q1_employee_salary",
"title": "Employee Salary Filter",
"description": "Return the names and employee_id of people who earn more than 100k...",
"dataset": [...],
"difficulty": "easy",
"category": "data_extraction",
"created_at": "2024-01-15T10:30:00Z"
}
GET /get-results/<user_id>?page=1&per_page=10
Retrieve paginated attempt history for a user.
Response:
{
"results": [
{
"id": 1,
"question_id": "q1_employee_salary",
"user_prompt": "...",
"llm_response": "...",
"score": 1.0,
"success": true,
"model": "gpt-4o",
"tokens_used": 150,
"created_at": "2024-01-15T10:30:00Z"
}
],
"total": 25,
"pages": 3,
"current_page": 1,
"per_page": 10
}
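The pagination fields in this response relate to each other in a simple way: pages is total divided by per_page, rounded up. The sketch below shows that arithmetic on an in-memory list. It is only an illustration; the application presumably paginates at the database level, and `paginate` is a hypothetical helper.

```python
import math


def paginate(items, page=1, per_page=10):
    """Slice `items` into one page and report the same fields as the API response."""
    total = len(items)
    pages = math.ceil(total / per_page) if total else 0
    start = (page - 1) * per_page  # pages are 1-indexed
    return {
        "results": items[start:start + per_page],
        "total": total,
        "pages": pages,
        "current_page": page,
        "per_page": per_page,
    }
```

With 25 attempts and per_page=10, this yields 3 pages, and the last page holds the remaining 5 attempts, which matches the total and pages values in the example response.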
GET /questions
Get all available challenges.
Response:
{
"questions": [
{
"id": "q1_employee_salary",
"title": "Employee Salary Filter",
"description": "...",
"difficulty": "easy",
"category": "data_extraction"
}
]
}
GET /health
Check if the service is running.
- id: Primary key
- user_id: User identifier
- question_id: Question identifier
- user_prompt: The prompt submitted by the user
- dataset: JSON dataset used for the question
- expected_output: Expected JSON output
- llm_response: Raw response from GPT-4o
- score: Success score (0.0 to 1.0)
- success: Boolean indicating if the attempt passed
- model: LLM model used (e.g., "gpt-4o")
- tokens_used: Number of tokens consumed
- created_at: Timestamp of the attempt
- id: Primary key (question identifier)
- title: Question title
- description: Question description and format requirements
- dataset: JSON dataset for the challenge
- expected_output: Expected JSON output
- difficulty: Challenge difficulty (easy, medium, hard)
- category: Question category (data_extraction, text_extraction, etc.)
- created_at: Timestamp of question creation
The platform validates model responses with a JSON-aware comparison:
- JSON Parsing: Extracts and parses JSON from model responses
- Object Matching: Checks if each expected object is present in the response
- Format Validation: Ensures responses follow the requested structure
- Scoring: Calculates the score from the number of expected entries found in the response versus the number expected
- Error Handling: Provides detailed feedback for format issues
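The steps above can be sketched roughly as follows. This is not the project's actual validation code: `extract_json` and `evaluate` are hypothetical names, the regex fallback for responses that wrap JSON in extra text is an assumption, and so is the rule that an attempt passes only when nothing is missing and nothing is extra. The result keys mirror the /submit-prompt response fields.

```python
import json
import re


def extract_json(text):
    """Parse the response as JSON, falling back to the first bracketed span."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Assumption: models often wrap JSON in prose or code fences.
    match = re.search(r"(\[.*\]|\{.*\})", text, re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None


def evaluate(model_response, expected):
    """Compare parsed objects against the expected list of objects."""
    parsed = extract_json(model_response)
    if isinstance(parsed, dict):
        parsed = [parsed]  # treat a single object as a one-element list
    if not isinstance(parsed, list):
        return {
            "success": False,
            "score": 0.0,
            "missing_entries": list(expected),
            "extra_entries": [],
            "format_issues": ["Response does not contain a JSON array or object"],
        }
    # Dict equality ignores key order, so this is an exact object match.
    missing = [entry for entry in expected if entry not in parsed]
    extra = [entry for entry in parsed if entry not in expected]
    found = len(expected) - len(missing)
    score = found / len(expected) if expected else 1.0
    return {
        "success": not missing and not extra,  # assumed pass rule
        "score": score,
        "missing_entries": missing,
        "extra_entries": extra,
        "format_issues": [],
    }
```

Because the comparison works on parsed objects rather than raw strings, differences in whitespace or key order in the model output do not affect the result.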
- Dataset: JSON array of employee records
- Task: Filter employees earning >$100k
- Format: {name: ..., employee_id: ...}
- Dataset: Unstructured text paragraph
- Task: Extract sales representatives and amounts
- Format: {name: ..., sales_amount: ...}
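To make the employee salary challenge concrete, the sketch below shows how an expected output of that shape could be derived from a dataset with plain Python. The records, the `salary` field name, and the `expected_output` helper are hypothetical; the real dataset is not reproduced here.

```python
# Hypothetical sample records; the real dataset and its field names may differ.
employees = [
    {"name": "John Smith", "employee_id": "EMP001", "salary": 85000},
    {"name": "Sarah Johnson", "employee_id": "EMP002", "salary": 120000},
]


def expected_output(records, threshold=100_000):
    """Keep employees earning more than `threshold`, in the requested format."""
    return [
        {"name": record["name"], "employee_id": record["employee_id"]}
        for record in records
        if record["salary"] > threshold
    ]
```

For the sample records above, `expected_output(employees)` returns a single entry for Sarah Johnson (EMP002). A successful prompt has to get the model to produce that same list.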
Test the endpoints using curl:
# List questions
curl http://localhost:5000/questions
# Get question details
curl http://localhost:5000/get-question/q1_employee_salary
# Submit a prompt
curl -X POST http://localhost:5000/submit-prompt \
-H "Content-Type: application/json" \
-d '{
"user_id": "test_user",
"question_id": "q1_employee_salary",
"user_prompt": "Extract the requested information in the exact format specified."
}'
# Get user results
curl http://localhost:5000/get-results/test_user
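The same submission can be made from Python. The sketch below uses only the standard library and assumes the server is running on localhost:5000 as in the curl examples; `build_submit_request` and `send` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # same host and port as the curl examples


def build_submit_request(user_id, question_id, user_prompt, base_url=BASE_URL):
    """Build the POST /submit-prompt request without sending it."""
    payload = {
        "user_id": user_id,
        "question_id": question_id,
        "user_prompt": user_prompt,
    }
    return urllib.request.Request(
        base_url + "/submit-prompt",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `send(build_submit_request("test_user", "q1_employee_salary", "..."))` should return a dictionary with the fields shown in the /submit-prompt response.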
- Database Migrations: Use Flask-Migrate for schema changes
- Logging: Configured for debugging and monitoring
- Error Handling: Comprehensive error responses with logging
- Performance: Database queries run against indexed columns