Hugging Face Datasets Library to SingularityNET Pipeline

Project Overview

This project implements a foundational pipeline that integrates the Hugging Face Datasets library with SingularityNET. It retrieves Knowledge Graph Question-Answer datasets from Hugging Face's NLP resources, maps them to a graph schema, and loads them into Neo4j, where they can be queried through a RESTful API. The framework serves as a base for integrating further Hugging Face datasets in support of Artificial General Intelligence (AGI) research.

Features

Data retrieval from Hugging Face Datasets library
Data transformation and schema mapping
Integration with Neo4j graph database
RESTful API for dataset access and querying
Support for multiple datasets:
- Tree of Knowledge
- HotpotQA
- TimeQA
Security measures including API key authentication
Deployment on AWS infrastructure

Installation

Clone the repository: git clone https://github.com/singnet/HFDLSP.git
Install the required dependencies: pip install -r requirements.txt
Set up the environment variables:

Copy the .env.sample file to .env
Fill in the necessary environment variables in the .env file

Set up the Neo4j database (see Database section for details)
Run database migrations: python manage.py migrate

Usage

To start the development server: python manage.py runserver

The API will be available at http://localhost:8000/.

API Endpoints

/answer/: Get answers from the dataset
/fetch_dataset/: Fetch and insert datasets into Neo4j
/schema/: OpenAPI schema
/swagger-ui/: Swagger UI for API documentation

For detailed API documentation, visit the Swagger UI at /swagger-ui/ when the server is running.
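
As a rough illustration of calling the /answer/ endpoint from a client, the sketch below builds a request with Python's standard library. The query parameter name (question) and the exact format of the Authorization header value are assumptions; the Swagger UI is the authoritative source for the real parameters.

```python
import urllib.parse
import urllib.request


def build_answer_request(base_url: str, api_key: str, question: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the /answer/ endpoint.

    Assumptions: the endpoint accepts a `question` query parameter and
    the API key is sent as the raw Authorization header value.
    """
    query = urllib.parse.urlencode({"question": question})
    url = f"{base_url.rstrip('/')}/answer/?{query}"
    return urllib.request.Request(url, headers={"Authorization": api_key}, method="GET")


request = build_answer_request("http://localhost:8000/", "my-secret-key", "Who wrote Hamlet?")
print(request.full_url)
# To send it against a running server:
#   with urllib.request.urlopen(request) as response:
#       print(response.read().decode("utf-8"))
```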

Configuration

The project uses environment variables for configuration. Key settings include:

SECRET_KEY: Django secret key
DEBUG: Debug mode (set to 0 for production)
DJANGO_ALLOWED_HOSTS: Allowed hosts for Django
NEO4J_DATABASE_URL: URL for the Neo4j database
API_KEY: API key for authentication

Refer to settings.py for all available configuration options.
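
The sketch below shows one way these variables could be read and validated at startup. It is not a copy of settings.py: the function name load_settings, the choice of which variables are required, the "1"/"0" convention for DEBUG, and whitespace-separated hosts in DJANGO_ALLOWED_HOSTS are all assumptions.

```python
import os

# Assumption: these three variables have no safe default.
REQUIRED = ("SECRET_KEY", "NEO4J_DATABASE_URL", "API_KEY")


def load_settings(env=None):
    """Read the documented variables from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "SECRET_KEY": env["SECRET_KEY"],
        # Assumption: "1" enables debug mode, anything else disables it.
        "DEBUG": env.get("DEBUG", "0") == "1",
        # Assumption: hosts are separated by whitespace.
        "ALLOWED_HOSTS": env.get("DJANGO_ALLOWED_HOSTS", "").split(),
        "NEO4J_DATABASE_URL": env["NEO4J_DATABASE_URL"],
        "API_KEY": env["API_KEY"],
    }


example_env = {
    "SECRET_KEY": "change-me",
    "DEBUG": "0",
    "DJANGO_ALLOWED_HOSTS": "localhost 127.0.0.1",
    "NEO4J_DATABASE_URL": "bolt://neo4j:password@localhost:7687",
    "API_KEY": "my-secret-key",
}
settings = load_settings(example_env)
print(settings["DEBUG"], settings["ALLOWED_HOSTS"])
```

Failing fast on a missing variable surfaces configuration mistakes at startup rather than on the first request.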

Database

This project uses Neo4j as its primary database. Ensure you have Neo4j installed and running. Update the NEOMODEL_NEO4J_BOLT_URL in settings.py or set the NEO4J_DATABASE_URL environment variable to point to your Neo4j instance.
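
Before starting the server, it can help to sanity-check the connection URL. The following sketch parses a Neo4j URL with the standard library only; the helper name describe_neo4j_url and the example credentials are hypothetical, and the check does not open a connection.

```python
from urllib.parse import urlsplit

# URI schemes accepted by the official Neo4j drivers.
SUPPORTED_SCHEMES = {"bolt", "bolt+s", "bolt+ssc", "neo4j", "neo4j+s", "neo4j+ssc"}
DEFAULT_BOLT_PORT = 7687


def describe_neo4j_url(url: str) -> dict:
    """Validate the shape of a Neo4j URL and return its parts."""
    parts = urlsplit(url)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"Unsupported scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("The URL does not contain a host name")
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or DEFAULT_BOLT_PORT,
        "has_credentials": parts.username is not None and parts.password is not None,
    }


info = describe_neo4j_url("bolt://neo4j:password@localhost:7687")
print(info)
```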

Security

The API is secured using API key authentication. Ensure you set the API_KEY environment variable and include it in the Authorization header when making requests to the API.
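
On the server side, a key check of this kind usually comes down to comparing the header value with the configured key. The sketch below is an assumption about how such a check could look, not the project's implementation: the function name is_authorized is hypothetical, and it assumes the raw key is sent as the Authorization header value.

```python
import hmac


def is_authorized(headers: dict, expected_key: str) -> bool:
    """Return True when the Authorization header matches the configured key."""
    if not expected_key:
        # Refuse all requests when no key is configured.
        return False
    provided = headers.get("Authorization", "")
    # compare_digest reduces timing differences between matching and
    # non-matching keys.
    return hmac.compare_digest(provided.encode("utf-8"), expected_key.encode("utf-8"))


print(is_authorized({"Authorization": "my-secret-key"}, "my-secret-key"))
print(is_authorized({"Authorization": "wrong-key"}, "my-secret-key"))
```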

Deployment

The project is deployed on AWS but is designed to run on any cloud service. Key components of the deployment include:

EC2 instances for hosting the application
Load balancing and auto-scaling configurations
CI/CD pipeline for automated testing and deployment

Refer to the deployment documentation for detailed instructions on setting up the AWS infrastructure.

Contributing

Contributions to this project are welcome. Please follow these steps:

Fork the repository
Create a new branch for your feature
Commit your changes
Push to the branch
Create a new Pull Request

License

Apache-2.0 License: http://www.apache.org/licenses/
