Open-source Data integration platform

OpenHEXA is an open-source data integration platform developed by Bluesquare.

Its goal is to facilitate data integration and analysis workflows, in particular in the context of public health projects.

Please refer to the OpenHEXA wiki for more information about OpenHEXA.

This repository contains the code of the following components:

  • The backend component, which mostly offers a GraphQL API and an infrastructure to run data pipelines.
  • The frontend component, which is a NextJS application that allows you to interact with the OpenHEXA platform.

Other components are available in separate repositories:

Changelog

For details please refer to the CHANGELOG.md file.

You can also refer to the backend & frontend README files for more details on versions prior to v1.0.2.

Release workflow

This project follows Semantic Versioning. Tags and releases are managed by release-please, which creates and maintains a pull request for the next release based on the messages of the new commits.

On creation of a new release, the following actions are performed:

  • The changelog is updated with the new changes.
  • The version in the pyproject.toml file is updated.
  • The version in the package.json file is updated.
  • A new release is created in GitHub.
  • Docker images are built and pushed to Docker Hub: blsq/openhexa-app & blsq/openhexa-frontend.

This process can also run on release branches named release/* (ex: release/0.81) for maintaining older versions. When working on a release branch:

  1. Create a new release branch from the tag of the related version: git checkout -b release/0.81 0.81.1
  2. The same conventional commit format should be used
  3. Release-please will create and maintain a PR for the next patch version on that branch
  4. Changes can be cherry-picked or implemented directly on the release branch
  5. Merging the PR created by release-please will trigger a new patch release for that version

This approach allows us to maintain multiple versions simultaneously while ensuring proper semantic versioning for each release line.
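
The branch naming convention above can be sketched as a small helper. This is only an illustration: release_branch_for is a hypothetical function name, and it assumes plain MAJOR.MINOR.PATCH tags such as 0.81.1, as in the example above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the maintenance branch name for a released tag.
# Assumes tags look like MAJOR.MINOR.PATCH (for example 0.81.1).
release_branch_for() {
  local tag="$1"
  # Drop the trailing ".PATCH" component: 0.81.1 -> 0.81
  printf 'release/%s\n' "${tag%.*}"
}

# Possible usage inside a clone of the repository:
#   git checkout -b "$(release_branch_for 0.81.1)" 0.81.1
#   git cherry-pick <commit-sha>   # bring a fix over from the main branch
#   git push -u origin "$(release_branch_for 0.81.1)"
```

Once the branch is pushed, release-please is expected to open its pull request for the next patch version, as described in the steps above.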

Code style

Our backend code is linted using ruff, which also handles code formatting and import sorting.

We currently target the Python 3.13 syntax.

We use a pre-commit hook to lint the code before committing. Make sure that pre-commit is installed, and run pre-commit install the first time you check out the code. Linting will again be checked when submitting a pull request.

You can run the lint tools manually using pre-commit run --all.

Our frontend code is linted using eslint as provided by Next.js.

Lint and format the code using the following command:

npm run lint && npm run format

Getting started

The Installation instructions section of our wiki gives an overview of the local development setup required to run OpenHEXA locally.

To ease the setup of the environment and management of dependencies, we are using containerization, in particular Docker. As such, we provide a docker-compose.yaml file for local development.

Backend

When running the backend component using docker compose, the code of this repository is mounted as a volume within the container, so that any change you make in your local copy of the codebase is directly reflected in the running container.

  1. Prepare the .env file:
       cp .env.dist .env
       # Set WORKSPACE_STORAGE_LOCATION to a local directory to use a local storage backend for workspaces (ex: /Users/yolanfery/Desktop/data/openhexa)
       # Set the OPENHEXA_BACKEND_URL to the URL of the backend (ex: http://localhost:8000)
  2. Navigate to the backend directory:
    cd backend
  3. Run the backend server:
    docker network create openhexa
    docker compose build
    docker compose run app fixtures
    docker compose up
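
If you prefer to script step 1 rather than edit .env by hand, the sketch below shows one possible way to do it. The set_env_var helper is hypothetical (it is not part of the repository), and the storage path in the usage comment is only an example value.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set KEY=VALUE in an env file, replacing any
# existing assignment of the same key.
set_env_var() {
  local file="$1" key="$2" value="$3"
  local tmp
  tmp="$(mktemp)"
  # Copy every line except an existing assignment of this key
  # (grep exits non-zero when no line is left, which is fine here).
  grep -v "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back in place so the file keeps its permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Possible usage after "cp .env.dist .env":
#   set_env_var .env OPENHEXA_BACKEND_URL http://localhost:8000
#   set_env_var .env WORKSPACE_STORAGE_LOCATION "$HOME/openhexa-data"  # example path
```

The helper moves the updated key to the end of the file; the order of the other lines is preserved.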

Frontend

  1. Prepare the .env file:
       cp .env.dist .env
       # Set WORKSPACE_STORAGE_LOCATION to a local directory to use a local storage backend for workspaces (ex: /Users/yolanfery/Desktop/data/openhexa)
       # Set the OPENHEXA_BACKEND_URL to the URL of the backend (ex: http://localhost:8000)
  2. Navigate to the frontend directory:
    cd frontend
  3. Install the required Node.js dependencies:
    npm install
  4. Run the frontend development server:
    npm run dev

The backend steps above configure the environment variables, fill the database with some initial data and start the base db and app services. The app is then exposed on localhost:8000. Two main paths are available:

Anything else will be redirected to the frontend served at http://localhost:3000.

You can then log in with the following credentials: [email protected]/root

Python requirements are handled with pip-tools, which you will need to install. To add a requirement, update requirements.in and run pip-compile in the root directory, then rebuild the Docker image.

Backend & Frontend

You can run the frontend along with the backend in a single command:

  docker compose --profile frontend up

Pipelines

If you need the pipelines or want to work on them, there are 2 optional services to run: pipelines_runner and/or pipelines_scheduler. You can run them with the following command instead of docker compose up:

docker compose --profile pipelines up

The Writing OpenHEXA pipelines section of the wiki contains the instructions needed to build and deploy a data pipeline on OpenHEXA.

To deploy and run data pipelines locally, you will need to:

  1. Create a workspace on your local instance
  2. Configure the SDK to use your local instance as the backend
openhexa config set_url http://localhost:8000

You can now deploy your pipelines to your local OpenHEXA instance.

Please refer to the SDK documentation for more information.

Dataset worker

File sample generation and metadata calculation are done in a separate worker. To run it locally, add the dataset_worker profile to the list of enabled profiles:

docker compose --profile dataset_worker up
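
Docker Compose accepts the --profile flag several times, so the profiles mentioned in this README (frontend, pipelines, dataset_worker) can be enabled together. The helper below is a hypothetical sketch that only builds and prints the command line. It assumes the profiles can be combined freely in this project's docker-compose.yaml.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the "docker compose ... up" command for a
# list of profile names. It only prints the command and does not run it.
compose_up_cmd() {
  local cmd="docker compose"
  local profile
  for profile in "$@"; do
    cmd+=" --profile ${profile}"
  done
  printf '%s up\n' "$cmd"
}

# Possible usage:
#   compose_up_cmd pipelines dataset_worker
#   eval "$(compose_up_cmd pipelines dataset_worker)"   # run the printed command
```

Printing the command first makes it easy to check which profiles will be enabled before starting the services.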

Running commands on the container

The app Docker image contains an entrypoint. You can use the following to list the available commands:

docker compose run app help

As an example, use the following command to run the migrations:

docker compose run app migrate

Analytics

We use Mixpanel to track users and their actions. If you want to enable it, set the MIXPANEL_TOKEN environment variable with the token from your Mixpanel project and restart the application.

Debugging

If you want to run the backend app in debugger mode, you can override the default command by adding a docker-compose.debug.yaml file that starts the app with your favorite debugger package and, optionally, waits for a debugger to attach.

Using debugpy for VSCode

# docker-compose.debug.yaml

services:
  app:
    entrypoint: []
    command:
      - "sh"
      - "-c"
      # If you want to wait for the debugger client to be attached before running the server
      # - |
      #   pip install debugpy \
      #   && python -m debugpy --listen 0.0.0.0:5678 --wait-for-client /code/manage.py runserver 0.0.0.0:8000
      - |
        pip install debugpy \
        && python -m debugpy --listen 0.0.0.0:5678 /code/manage.py runserver 0.0.0.0:8000
    ports:
      - "8000:8000"
      - "5678:5678"

You can then add a new configuration in VSCode to run the app in debugger mode:

# .vscode/launch.json

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Attach OpenHEXA Debugger",
            "type": "debugpy",
            "request": "attach",
            "connect": {
                "host": "localhost",
                "port": 5678
            },
            "pathMappings": [
                {
                    "localRoot": "${workspaceFolder}",
                    "remoteRoot": "/code"
                }
            ],
            "django": true,
            "justMyCode": false
        }
    ]
}

Run the app with docker compose -f docker-compose.yaml -f docker-compose.debug.yaml up, then start the debugger from VSCode.

Using PyCharm

# docker-compose.debug.yaml

services:
  app:
    entrypoint: []
    # Used when running in normal mode.
    command: ["/code/docker-entrypoint.sh", "manage", "runserver", "0.0.0.0:8000"]
    ports:
      - "8000:8000"

Create a new interpreter configuration in PyCharm with the following settings:

[Screenshot: PyCharm interpreter configuration]

Create a new Django server run configuration by setting the following options:

  • Python interpreter: The one you just created
  • In the "Docker Compose" section, Command and options: -f docker-compose.yaml -f docker-compose.debug.yaml up

Run the configuration in debug mode.

PgAdmin as dev tool

For development purposes, you can define a pgAdmin service as a Docker container, for example in a docker-compose.dev.yaml file.

# docker-compose.dev.yaml

services:
  pgadmin:
    image: dpage/pgadmin4
    environment:
      PGADMIN_DEFAULT_EMAIL: ${PGADMIN_DEFAULT_EMAIL:[email protected]}
      PGADMIN_DEFAULT_PASSWORD: ${PGADMIN_DEFAULT_PASSWORD:-root}
      PGADMIN_CONFIG_SERVER_MODE: "False"
      PGADMIN_CONFIG_MASTER_PASSWORD_REQUIRED: "False"
    ports:
      - "${PGADMIN_PORT:-5050}:80"
    depends_on:
      - db
    networks:
      - openhexa
    volumes:
      - pgadmin_data:/var/lib/pgadmin

volumes:
  pgadmin_data:

Next run the following command:

docker compose -f docker-compose.yaml -f docker-compose.dev.yaml [-f docker-compose.debug.yaml] up

In the browser, go to http://localhost:5050 and log in using the credentials defined in the docker-compose.dev.yaml file.

Finally create a new connection to the server:

[Screenshots: pgAdmin dev tool]

The server address must be the gateway address of the database container, on port 5434.

Running the tests

Backend

Running the tests is as simple as:

docker compose run app test --settings=config.settings.test

Some tests call external resources (such as the public DHIS2 API) and will slow down the suite. You can exclude them when running the test suite for unrelated parts of the codebase:

docker compose run app test --exclude-tag=external --settings=config.settings.test

You can run a specific test as follows:

docker compose run app test hexa.core.tests.CoreTest.test_ready_200 --settings=config.settings.test

There are many other options. To find out more, see the documentation of the Django test harness, which is what we use.

Frontend

Jest is used for the frontend tests.

cd frontend
npm run test

I18N

Backend

You can extract the strings to translate with the following command:

docker compose run app manage makemessages -l fr # Where fr is the language code

You can then translate the strings in the hexa/locale folder.

To compile the translations, run the following command:

docker compose run app manage compilemessages

Frontend

Translations are stored in frontend/public/locales/[lang]/[ns].json. To extract new strings from the frontend/src/ directory, run the extract command:

cd frontend
npm run i18n:extract

To translate the strings using DeepL, run the translate command after setting the DEEPL_API_KEY environment variable:

npm run i18n:translate fr # translate to French
# OR
npm run i18n:translate fr --overwrite # translate to French and overwrite all the strings

You can validate that all the strings are translated using the following command:

npm run i18n:validate

License

This project is licensed under the MIT License. See the LICENSE.md file for details.

FAQ

How should I write my commits?

This project assumes you are using Conventional Commit messages.

The most important prefixes to keep in mind are:

  • fix: which represents bug fixes, and correlates to a SemVer patch.
  • feat: which represents a new feature, and correlates to a SemVer minor.
  • feat!:, or fix!:, refactor!:, etc., which represent a breaking change (indicated by the !) and will result in a SemVer major.
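
The mapping above can be illustrated with a small sketch. It is a simplification and not release-please's actual logic (which, for example, also reads BREAKING CHANGE footers and its own configuration). The function name semver_bump_for is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a Conventional Commit subject line to the
# SemVer bump described above. Only the subject line is inspected.
semver_bump_for() {
  local subject="$1"
  # type, optional "(scope)", then "!:" marks a breaking change
  local breaking='^[a-z]+(\([^)]*\))?!:'
  local feature='^feat(\([^)]*\))?:'
  local fix='^fix(\([^)]*\))?:'
  if [[ $subject =~ $breaking ]]; then
    echo major
  elif [[ $subject =~ $feature ]]; then
    echo minor
  elif [[ $subject =~ $fix ]]; then
    echo patch
  else
    # Other types (docs:, chore:, ...) do not imply a bump in this sketch.
    echo none
  fi
}

# Possible usage:
#   semver_bump_for "feat(pipelines): add retry option"
```

The breaking-change pattern is checked first so that feat!: and fix!: are not mistaken for a plain feature or fix.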