ASF Projects Visualizer

This project creates a visual map of Apache projects and allows filtering based on user queries.

Setup

Clone the repository:

git clone https://github.com/sergehuber/asf-projects-visualizer.git
cd asf-projects-visualizer

Create and activate a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

Install the required packages:
```
pip install -r requirements.txt
```

Create a .env file in the root directory and add your configuration:

LLM_PROVIDER=openai  # or 'local'
OPENAI_API_KEY=your_api_key_here
OPENAI_MODEL=gpt-4o  # or another OpenAI model
LOCAL_MODEL_NAME=your_local_model_name  # if using a local LLM
HUGGINGFACE_TOKEN=your_huggingface_token  # if using Hugging Face models
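
To illustrate how these settings might be read at startup, here is a minimal sketch of a `.env` parser that uses only the standard library. The function names `parse_env_text` and `load_env_file` are hypothetical and are not taken from `src/config.py`; the project may well rely on a library such as python-dotenv instead. The inline-comment handling is a simplification: a value that itself contains ` #` would be cut short, and quoted values are not unquoted.

```python
import os


def parse_env_text(text):
    """Parse KEY=VALUE lines, ignoring blank lines and '#' comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "openai  # or 'local'".
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def load_env_file(path=".env"):
    """Read the file at `path` and copy its values into os.environ.

    Variables already set in the real environment take precedence.
    """
    with open(path, encoding="utf-8") as handle:
        values = parse_env_text(handle.read())
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values


sample = "LLM_PROVIDER=openai  # or 'local'\nOPENAI_MODEL=gpt-4o\n"
print(parse_env_text(sample))
```

Calling `load_env_file()` from the repository root would then make the values available through `os.environ`.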

Getting your OpenAI API Key:
- Go to https://platform.openai.com/signup and sign up for an account if you don't have one.
- After logging in, navigate to https://platform.openai.com/account/api-keys
- OpenAI will recommend creating a project API key, but for the moment this project only works with the older secret API key
- Click on "Create new secret key"
- Copy the generated key (you won't be able to see it again)
- Paste this key as the value for OPENAI_API_KEY in your .env file

Run the initial data collection script:
```
python src/data_collector.py --collect
```

(Optional) If you are using a local LLM, fine-tune it on the collected data:
```
python src/fine_tune_model.py
```

Run the data enhancement step using the configured LLM:
```
python src/data_collector.py --enhance
```

Start the Flask server:
```
python src/app.py
```

Open http://127.0.0.1:5000 in a web browser.
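
The order of the commands above depends on the chosen provider, since fine-tuning only applies to the local LLM. The sketch below captures that order in a small helper. `setup_commands` is a hypothetical name used for illustration and is not part of the repository; the script paths and flags are the ones listed in this README.

```python
def setup_commands(provider):
    """Return the setup commands, in order, for the given LLM_PROVIDER value."""
    if provider not in ("openai", "local"):
        raise ValueError(f"unknown provider: {provider!r}")
    commands = [["python", "src/data_collector.py", "--collect"]]
    if provider == "local":
        # Fine-tuning is only needed for the local LLM.
        commands.append(["python", "src/fine_tune_model.py"])
    commands.append(["python", "src/data_collector.py", "--enhance"])
    # The last command starts the Flask server and keeps running.
    commands.append(["python", "src/app.py"])
    return commands


for command in setup_commands("local"):
    print(" ".join(command))
```

Each entry is an argument list that could be passed to `subprocess.run(command, check=True)` from the repository root.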

Usage

- Use the dimension selector to choose how projects are grouped (category, key features, refined category, or programming language).
- Enter your requirements in the input field and click "Query" to find relevant Apache projects.
- Use the checkboxes to filter projects by their groupings.
- Click on a project to view more details, including its description, features, and latest release information.
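
As a rough illustration of grouping by a dimension and then filtering by checked groups, consider the sketch below. The record fields (`name`, `category`, `programming_language`) and both function names are assumptions made for this example; the real data schema is not documented here, and in the application this logic may live in the frontend JavaScript.

```python
from collections import defaultdict


def group_projects(projects, dimension):
    """Group project names by the values found under `dimension`.

    A project with a list value (e.g. several features) appears in each group.
    """
    groups = defaultdict(list)
    for project in projects:
        value = project.get(dimension)
        labels = value if isinstance(value, list) else [value]
        for label in labels:
            # Projects without a value for this dimension go to "Unknown".
            groups[label or "Unknown"].append(project["name"])
    return dict(groups)


def filter_groups(groups, selected):
    """Keep only the groups whose label is checked in the UI."""
    return {label: names for label, names in groups.items() if label in selected}


# Hypothetical sample records, for illustration only.
projects = [
    {"name": "Kafka", "category": "big-data", "programming_language": "Java"},
    {"name": "Airflow", "category": "workflow", "programming_language": "Python"},
    {"name": "Spark", "category": "big-data", "programming_language": "Scala"},
]
groups = group_projects(projects, "category")
print(filter_groups(groups, {"big-data"}))
```

Switching the dimension selector corresponds to calling `group_projects` with a different field name.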

LLM Configuration

This project supports two LLM providers: OpenAI and a local LLM. You can configure which one to use by setting the LLM_PROVIDER environment variable in the .env file.

Using OpenAI (Recommended)

Set LLM_PROVIDER to openai and provide your OPENAI_API_KEY in the .env file. This is currently the recommended option because it produces better results than the local LLM.

Using Local LLM (Experimental)

Set LLM_PROVIDER to local and specify your LOCAL_MODEL_NAME in the .env file.

Note: The local LLM option is currently experimental and does not yet match the output quality of the OpenAI backend. The fine-tuning process and training algorithm need further improvement to close that gap. We welcome contributions from the community to enhance the local LLM training and performance.
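
The provider rules described in this section could be checked at startup along the following lines. This is a sketch only: `REQUIRED_VARS` and `resolve_provider` are hypothetical names rather than code from `src/config.py`, and falling back to `openai` when LLM_PROVIDER is unset is an assumption, not documented behavior.

```python
# Variables each provider needs, as described in this README.
REQUIRED_VARS = {
    "openai": ["OPENAI_API_KEY"],
    "local": ["LOCAL_MODEL_NAME"],
}


def resolve_provider(env):
    """Return the provider name after checking its required variables."""
    # Assumption: default to the recommended provider when unset.
    provider = env.get("LLM_PROVIDER", "openai").strip().lower()
    if provider not in REQUIRED_VARS:
        raise ValueError(f"LLM_PROVIDER must be 'openai' or 'local', got {provider!r}")
    missing = [name for name in REQUIRED_VARS[provider] if not env.get(name)]
    if missing:
        raise ValueError(f"{provider} provider needs: {', '.join(missing)}")
    return provider


print(resolve_provider({"LLM_PROVIDER": "local", "LOCAL_MODEL_NAME": "my-model"}))
```

In practice the argument would be `os.environ` after the .env file has been loaded.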

Project Structure

- src/data_collector.py: Handles data collection and enhancement for Apache projects.
- src/app.py: Flask server that provides API endpoints for the frontend.
- src/llms.py: Contains the LLM interface for querying project information.
- src/config.py: Manages configuration and environment variables.
- src/fine_tune_model.py: Script for fine-tuning a local LLM (if used).
- static/: Contains the frontend files (HTML, CSS, JavaScript).

Contributing

Contributions are welcome! Here are some areas where we particularly need help:

- Improving the fine-tuning process for the local LLM to enhance its performance.
- Developing better training algorithms for the local model to improve the quality of its outputs.
- Expanding the dataset used for training to cover a wider range of Apache projects and their characteristics.

If you're interested in contributing to these areas or have other ideas for improvement, please submit a pull request or open an issue for discussion.

License

This project is licensed under the Apache License, Version 2.0. See the LICENSE file for details.
