A FastAPI application that generates SQL queries with AWS Bedrock LLMs, using the database schema and query history as context. The application validates SQL syntax, checks queries against the schema, and produces clean, consistently formatted query output.
- SQL Query Generation using AWS Bedrock LLMs
- Schema-aware Query Generation
- Query History Context
- Proper Column Qualification and Table Aliasing
- SQL Validation and Formatting
- Case-insensitive String Handling
- Python 3.x
- PDM (Python Development Manager)
- AWS Credentials
- Docker (optional)
- Clone the repository:
git clone https://github.com/yourusername/athena-with-aws-bedrock.git
cd athena-with-aws-bedrock
- Install dependencies:
pdm install
- Set up environment variables:
cp .env.example .env
# Add your AWS credentials and other configurations
- Run the application:
make run
athena-with-aws-bedrock/
├── src/
│ ├── builder/ # LLM builder implementations
│ ├── executor/ # Query execution logic
│ ├── generator/ # Query and schema generators
│ │ ├── query/ # SQL query generation
│ │ └── schema/ # Schema management
│ ├── main.py # FastAPI application
│ ├── models.py # Data models
│ ├── prompt.txt # LLM prompts
│ ├── resolvers.py # Query resolvers
│ └── schema_def.json # Schema definitions
├── tests/
├── Makefile
└── README.md
Format code:
make format
Run development server:
make run
- Builder
  - Handles LLM integration with AWS Bedrock
  - Manages model configurations and parameters
  - Provides the interface for query generation
- Generator
  - Query Generator: Creates SQL queries from natural language
  - Schema Generator: Manages and validates database schemas
  - Ensures proper table aliasing and column qualification
- Executor
  - Executes generated SQL queries
  - Handles query validation
  - Manages query context and history
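The Bedrock integration described above could be sketched roughly as follows. This is a minimal sketch, not the project's actual builder: the function name invoke_bedrock is hypothetical, the client is assumed to be a boto3 "bedrock-runtime" client, and the payload schema is left to the caller because it differs between Bedrock models.

```python
import json
from typing import Any


def invoke_bedrock(client: Any, model_id: str, payload: dict) -> dict:
    """Send a JSON payload to a Bedrock model and return the decoded JSON reply.

    `client` is assumed to be a boto3 "bedrock-runtime" client. The payload
    schema is model-specific, so this helper does not try to build it.
    """
    response = client.invoke_model(
        modelId=model_id,
        body=json.dumps(payload),
        contentType="application/json",
        accept="application/json",
    )
    # The response body is a stream; read it fully before decoding.
    return json.loads(response["body"].read())
```

With credentials configured, the client would typically be created with boto3.client("bedrock-runtime", region_name=...) and passed in. Keeping the client as a parameter also makes the helper easy to exercise with a stub in tests.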
- Table Aliasing
  - Meaningful table aliases (e.g., tf for transaction_fact)
  - Every column must be fully qualified (e.g., tf.column_name)
  - Consistent alias usage across queries
- String Handling
  - Case-insensitive comparisons using LOWER()
  - String values in uppercase
  - Special handling for tenant_id
- Query Structure
  - Single line output without line breaks
  - Proper WHERE clause construction
  - Appropriate JOIN handling
  - Date/time handling using EXTRACT()
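To make the aliasing and string-handling rules concrete, here is a small sketch. The helper names alias_for and case_insensitive_eq are hypothetical and not taken from the project. Deriving an alias from initials is only one possible convention that happens to give tf for transaction_fact, and it can collide when two tables share initials.

```python
def alias_for(table_name: str) -> str:
    """Derive an alias from the initials of an underscore-separated table name."""
    return "".join(part[0] for part in table_name.lower().split("_") if part)


def case_insensitive_eq(alias: str, column: str, value: str) -> str:
    """Build a fully qualified, case-insensitive equality condition."""
    # Upper-case the value and double single quotes so the literal stays well-formed.
    literal = value.upper().replace("'", "''")
    return f"LOWER({alias}.{column}) = LOWER('{literal}')"


alias = alias_for("transaction_fact")
condition = case_insensitive_eq(alias, "status", "completed")
print(alias)      # tf
print(condition)  # LOWER(tf.status) = LOWER('COMPLETED')
```

String interpolation is used here only to illustrate the output shape; code that handles user input would be better served by parameterized queries where the execution layer supports them.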
Example Query (wrapped here for readability; generated output is a single line):
SELECT
tf.transaction_id,
tf.amount
FROM transaction_fact tf
WHERE
tf.tenant_id = '123'
AND LOWER(tf.status) = LOWER('COMPLETED')
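The single-line output rule can be illustrated with a whitespace-collapsing helper. This is a minimal sketch with a hypothetical function name, and it assumes string literals in the query contain no meaningful runs of whitespace, since those would be collapsed too.

```python
def to_single_line(sql: str) -> str:
    """Collapse all runs of whitespace, including newlines, into single spaces."""
    return " ".join(sql.split())


query = """
SELECT
tf.transaction_id,
tf.amount
FROM transaction_fact tf
WHERE
tf.tenant_id = '123'
AND LOWER(tf.status) = LOWER('COMPLETED')
"""

print(to_single_line(query))
```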
When running locally, access:
- API Documentation: http://localhost:8000/docs
- Alternative Documentation: http://localhost:8000/redoc
- /generate-sql: Generate SQL from natural language
- /validate-sql: Validate generated SQL
- /execute-sql: Execute validated SQL queries
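A client call to the generation endpoint might be prepared as in the sketch below. The HTTP method (POST) and the request body field name ("question") are assumptions, since the request schema is not documented here; the interactive docs at /docs show the real one.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # local development server


def build_generate_request(question: str) -> urllib.request.Request:
    """Prepare (but do not send) a request to the /generate-sql endpoint."""
    # Assumption: the endpoint accepts a JSON body with a "question" field.
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/generate-sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_generate_request("Total amount of completed transactions")
print(request.get_method(), request.full_url)
```

With the server running, urllib.request.urlopen(request) would send the request and return the response.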
# AWS Configuration
AWS_ACCESS_KEY_ID=your_access_key
AWS_SECRET_ACCESS_KEY=your_secret_key
AWS_REGION=your_region
# Bedrock Configuration
MODEL_ID=your_bedrock_model_id
MODEL_PARAMETERS={"temperature": 0, "top_p": 1}
# Application Configuration
DEBUG=True
LOG_LEVEL=INFO
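One way these variables could be read at startup is sketched below. The Settings class and load_settings function are hypothetical, and the sketch assumes the variables are already present in the process environment (for example exported by the shell or loaded from .env by a separate tool). AWS credentials are omitted because AWS SDKs usually read them from the environment themselves.

```python
import json
import os
from dataclasses import dataclass, field
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    aws_region: str
    model_id: str
    model_parameters: dict = field(default_factory=dict)
    debug: bool = False
    log_level: str = "INFO"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast on missing required keys."""
    return Settings(
        aws_region=env["AWS_REGION"],
        model_id=env["MODEL_ID"],
        # MODEL_PARAMETERS holds a JSON object, e.g. {"temperature": 0, "top_p": 1}.
        model_parameters=json.loads(env.get("MODEL_PARAMETERS", "{}")),
        debug=env.get("DEBUG", "False").strip().lower() in {"1", "true", "yes"},
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )
```

Accepting the environment as a mapping keeps the loader testable: a plain dict can stand in for os.environ.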
format:
	pdm run black .
run:
	pdm run uvicorn src.main:app --reload
The application handles various error scenarios:
- Schema Validation Errors
- SQL Syntax Errors
- Query Execution Errors
- AWS Bedrock API Errors
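The four error categories above could be mapped to HTTP responses along the following lines. This is a sketch only: the exception class names and the chosen status codes are assumptions for illustration, not the application's actual behavior.

```python
from typing import Dict, Tuple, Type


# Hypothetical exception types, one per documented error category.
class SchemaValidationError(Exception): ...
class SqlSyntaxError(Exception): ...
class QueryExecutionError(Exception): ...
class BedrockApiError(Exception): ...


# Assumed status codes; the real service may choose differently.
STATUS_BY_ERROR: Dict[Type[Exception], int] = {
    SchemaValidationError: 422,
    SqlSyntaxError: 400,
    QueryExecutionError: 500,
    BedrockApiError: 502,
}


def to_error_response(exc: Exception) -> Tuple[int, dict]:
    """Translate an exception into a (status code, JSON body) pair."""
    for exc_type, status in STATUS_BY_ERROR.items():
        if isinstance(exc, exc_type):
            return status, {"error": exc_type.__name__, "detail": str(exc)}
    # Unknown errors are reported generically to avoid leaking internals.
    return 500, {"error": "InternalError", "detail": "Unexpected error"}
```

In a FastAPI application, a function like this would typically be called from a registered exception handler so that every endpoint reports errors in the same shape.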
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Format your code (make format)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- Use Black for code formatting
- Add tests for new features
- Update documentation as needed
- Follow semantic versioning
Run tests:
pdm run pytest
Generate coverage report:
pdm run pytest --cov=src
- AWS Bedrock team for LLM capabilities
- FastAPI framework
- Python PDM community