Commit
Merge pull request #12 from ai-cfia/issue8-missing-chunks
issue #8: install doc and missing_chunk_queries
Showing 55 changed files with 1,148 additions and 548 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
LOUIS_DSN= | ||
PGBASE= | ||
PGUSER= | ||
USER= | ||
PGHOST= | ||
POSTGRES_PASSWORD= | ||
PGPASSWORD= | ||
OPENAI_API_KEY= | ||
AZURE_OPENAI_SERVICE= | ||
LOUIS_SCHEMA= | ||
DB_SERVER_CONTAINER_NAME= | ||
PGDATA= |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
.env | ||
.env** | ||
.pgpassfile | ||
dumps/** | ||
reports/** | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
# Development guidelines for louis-db | ||
|
||
## Making changes to the database schema | ||
|
||
### Run latest schema locally | ||
|
||
* Set up the .env environment variables: | ||
* LOUIS_DSN: Data Source Name (DSN) used for configuring a database connection in Louis's system. | ||
|
||
* PGBASE: the base directory where PostgreSQL related files or resources are stored or accessed. | ||
|
||
* PGUSER: the username or role required to authenticate and access a PostgreSQL database. | ||
|
||
* USER: the username required for validation and access | ||
|
||
* PGHOST: the hostname or IP address of the server where the PostgreSQL database is hosted. | ||
|
||
* PGPASSWORD: the password for the user authentication when connecting to the PostgreSQL database. | ||
|
||
* POSTGRES_PASSWORD: the database password, used for authentication when connecting to the PostgreSQL database. | ||
|
||
* PGDATA: path to the directory where PostgreSQL data files are stored. | ||
|
||
* OPENAI_API_KEY: the API key required for authentication when making requests to the OpenAI API. | ||
|
||
* AZURE_OPENAI_SERVICE: information related to an Azure-based service for OpenAI. | ||
|
||
* LOUIS_SCHEMA: the Louis schema within the database. | ||
|
||
* DB_SERVER_CONTAINER_NAME: name of your database server container. | ||
|
||
* Run database locally (see bin/postgres.sh) | ||
* Restore latest schema dump | ||
|
||
### Before every change | ||
|
||
* dump the schema with pg_dump using ```bin/backup-db-docker.sh``` | ||
|
||
### Create change | ||
|
||
* first create a GitHub issue (issue #X) describing the work to be done | ||
* create a branch ```issueX-descriptive-name``` | ||
* add a new SQL file named YYYY-mm-dd-issueX-descriptive-name | ||
* explain the changes to be made in a header comment at the top of the file | ||
* provide original DDL of files to be modified | ||
* create a test case in tests/test_db.py | ||
* load your new SQL file within a transaction (that will be rolled back) | ||
* ensure you have an assert that checks the expected result | ||
* once your test passes, commit the change to the database by running your script with bin/psql.sh | ||
* you should now be able to remove the SQL file loading step from the test and still run it successfully | ||
* re-run the test suite and fix any exposed database functions that fail because of your changes | ||
* dump the new schema as louis_v00X, with X incremented by 1 | ||
* test new schema with your client apps. |
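
The naming conventions and the rolled-back load described above can be sketched in shell. This is only an illustration: the helper names are hypothetical, the `.sql` extension is an assumption, and it is a shell-level dry run rather than the Python test case in tests/test_db.py that the guidelines ask for.

```shell
#!/bin/bash
# Sketch only: these helper names are hypothetical, not part of louis-db.

# Branch name following the issueX-descriptive-name convention.
change_branch_name () {
    local issue=$1 slug=$2
    echo "issue${issue}-${slug}"
}

# SQL file name following YYYY-mm-dd-issueX-descriptive-name.
# The date is an optional third argument (defaults to today).
# Assumption: the file lives in sql/ and carries a .sql extension.
change_sql_filename () {
    local issue=$1 slug=$2 day=${3:-$(date +%Y-%m-%d)}
    echo "sql/${day}-issue${issue}-${slug}.sql"
}

# Print a psql script that loads the SQL file inside a transaction and
# then rolls it back, leaving the database unchanged.
rollback_wrapper () {
    local sql_file=$1
    printf 'BEGIN;\n\\i %s\nROLLBACK;\n' "$sql_file"
}
```

The wrapper output could then be piped to psql, for example `rollback_wrapper "$(change_sql_filename 8 missing-chunks)" | psql -v ON_ERROR_STOP=1 -d "$PGBASE"`, to check that the script applies cleanly before committing it.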
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,6 +2,20 @@ | |
|
||
## Installing python package | ||
|
||
If you need to interface with the database, use this to install: | ||
|
||
``` | ||
pip install git+https://github.com/ai-cfia/[email protected] | ||
``` | ||
|
||
You'll often want to add, move or modify existing database layer functions found in louis-db from a client repository. | ||
|
||
To edit them, you can install an editable version of the package: | ||
|
||
``` | ||
pip install -e git+https://github.com/ai-cfia/louis-db#egg=louis_db | ||
``` | ||
|
||
This checks out the latest source into a local git repository in src/louis-db, so edits in that directory are immediately available to louis-crawler. | ||
|
||
Don't forget to create a PR with your changes once you're done! |
This file was deleted.
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,64 @@ | ||
# creating a new schema | ||
|
||
## environment | ||
|
||
This assumes: | ||
|
||
* you are running WSL | ||
* you are running a dockerized version of PostgreSQL 15 under WSL | ||
* you are running louis-db in a DevContainer under Visual Studio Code | ||
* your source is on WSL under ~/src | ||
|
||
## configuration | ||
|
||
database connection parameters are set in the .env file | ||
|
||
you can create multiple .env.NAME files and symlink the one you need: | ||
|
||
working on local source: | ||
|
||
``` | ||
ln -sf .env.louis_v004_local .env | ||
``` | ||
|
||
switching to target: | ||
|
||
``` | ||
ln -sf .env.louis_v005_azure .env | ||
``` | ||
|
||
## Running the database server locally | ||
|
||
* use Dockerfile in postgres directory | ||
* use ```bin/postgres.sh``` script as your startup script (symlink) | ||
|
||
## Editing | ||
|
||
* Create ad hoc modifications as scripts in sql/ with the proper YYYY-mm-dd prefix | ||
* Create tests that apply these SQL scripts in a transaction and test them | ||
* Once satisfied, commit the changes to the database | ||
|
||
|
||
|
||
## backing up schema and data | ||
|
||
in this example, the modified louis_v004 becomes the louis_v005 schema: | ||
|
||
``` | ||
./bin/dump-versioned-schema.sh louis_v004 louis_v005 | ||
./bin/dump-versioned-data.sh louis_v004 louis_v005 | ||
``` | ||
|
||
## loading schema | ||
|
||
first change your .env symlink to point to your target database: | ||
|
||
``` | ||
./bin/load-versioned-schema.sh louis_v005 | ||
``` | ||
|
||
manually validate that the schema is as expected (for example with a DBeaver ERD diagram) before loading the data: | ||
|
||
``` | ||
./bin/load-versioned-data.sh louis_v005 | ||
``` |
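
Switching between .env.NAME files, as described in the configuration section, could be wrapped in a small helper that refuses to point .env at a file that does not exist. A minimal sketch, assuming the files sit in the project root; `switch_env` is a hypothetical name, not one of the scripts in bin/.

```shell
#!/bin/bash
# Sketch only: switch_env is a hypothetical helper, not a script in bin/.
# Usage: switch_env NAME [PROJECT_DIR]
switch_env () {
    local name=$1 project_dir=${2:-.}
    if [ ! -f "$project_dir/.env.$name" ]; then
        echo "no such file: $project_dir/.env.$name" >&2
        return 1
    fi
    # Use a relative link target so the link stays valid if the checkout moves.
    ( cd "$project_dir" && ln -sf ".env.$name" .env )
}
```

For the example in this document, `switch_env louis_v005_azure` would be run before `./bin/load-versioned-schema.sh louis_v005`.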
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
#!/bin/bash | ||
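DIRNAME=$(dirname "$0") | ||
. "$DIRNAME/lib.sh" | ||
|
||
# copy the backup script and its library into the database container | ||
docker cp "$DIRNAME/backup-db.sh" louis-db-server:backup-db.sh | ||
docker cp "$DIRNAME/lib.sh" louis-db-server:lib.sh | ||
# run the dump inside the container and stream it to the dump file on the host | ||
docker exec -it -e PGDUMP_FILENAME=/dev/stdout --env-file "$ENV_FILE" louis-db-server ./backup-db.sh > "$PGDUMP_FILENAME" |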
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
#!/bin/bash | ||
DIRNAME=$(dirname "$0") | ||
. "$DIRNAME/lib.sh" | ||
|
||
if [ ! -f "$PGDUMP_FILENAME" ]; then | ||
echo "preparing to dump $PGBASE.$LOUIS_SCHEMA to $PGDUMP_FILENAME" | ||
# PGBASE is not a libpq variable (unlike PGDATABASE), so name the database explicitly with -d | ||
pg_dump -d "$PGBASE" --schema="$LOUIS_SCHEMA" --no-owner --no-privileges --file "$PGDUMP_FILENAME" | ||
else | ||
echo "File $PGDUMP_FILENAME already exists" | ||
fi | ||
|
||
if [ -f "$PGDUMP_FILENAME" ]; then | ||
if [ ! -f "$PGDUMP_FILENAME.zip" ]; then | ||
zip "$PGDUMP_FILENAME.zip" "$PGDUMP_FILENAME" | ||
else | ||
echo "File $PGDUMP_FILENAME.zip already exists" | ||
fi | ||
fi |
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list' | ||
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add - | ||
sudo apt update | ||
sudo apt install postgresql-client-15 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
#!/bin/bash | ||
DIRNAME=$(dirname "$(realpath "$0")") | ||
PARENT_DIR=$DIRNAME/.. | ||
PROJECT_DIR=$(realpath "$PARENT_DIR") | ||
ENV_FILE=$PROJECT_DIR/.env | ||
|
||
if [ -f "$ENV_FILE" ]; then | ||
# shellcheck source=/dev/null | ||
. "$ENV_FILE" | ||
else | ||
echo "WARNING: File $ENV_FILE does not exist, relying on environment variables" | ||
fi | ||
|
||
check_environment_variables_defined () { | ||
variable_not_set=0 | ||
for VARIABLE in "$@"; do | ||
if [ -z "${!VARIABLE}" ]; then | ||
echo "Environment variable $VARIABLE is not set" | ||
variable_not_set=1 | ||
fi | ||
done | ||
|
||
if [ $variable_not_set -eq 1 ]; then | ||
echo "One or more variables are not defined, the program cannot continue" | ||
exit 1 | ||
fi | ||
} | ||
|
||
export PGOPTIONS="--search_path=$LOUIS_SCHEMA" | ||
export PGBASE | ||
export PGDATABASE | ||
export PGHOST | ||
export PGUSER | ||
export PGPORT | ||
export PGPASSFILE | ||
export PGPASSWORD | ||
|
||
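# match the major version only; an unescaped "15." would also match e.g. a 16.15 client | ||
VERSION15=$(psql --version | grep -E '^psql \(PostgreSQL\) 15\.') | ||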
|
||
if [ -z "$VERSION15" ]; then | ||
echo "postgresql-client-15 required" | ||
exit 1 | ||
fi | ||
|
||
TODAY=$(date +%Y-%m-%d) | ||
|
||
if [ -z "$PGDUMP_FILENAME" ]; then | ||
PGDUMP_FILENAME=$PROJECT_DIR/dumps/$TODAY.$PGBASE.pg_dump | ||
fi | ||
|
||
export PSQL_ADMIN="psql -v ON_ERROR_STOP=1 --single-transaction -d $PGBASE" |
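
The client version check in this script could also be written as a reusable function that parses the major version out of the `psql --version` line. This is a hedged sketch rather than what the commit does: `psql_major_version` and `require_psql_major` are hypothetical names, and the parsing assumes the usual `psql (PostgreSQL) X.Y` output format.

```shell
#!/bin/bash
# Sketch only: both helpers are hypothetical and not part of lib.sh.

# Print the major version found in a `psql --version` line,
# e.g. "psql (PostgreSQL) 15.4" gives 15. Prints nothing if the line
# does not look like psql output.
psql_major_version () {
    echo "$1" | sed -n 's/^psql (PostgreSQL) \([0-9][0-9]*\)\..*$/\1/p'
}

# Succeed only when the reported major version equals the wanted one.
require_psql_major () {
    local wanted=$1 version_line=$2
    [ "$(psql_major_version "$version_line")" = "$wanted" ]
}
```

A caller might use it as `require_psql_major 15 "$(psql --version)" || { echo "postgresql-client-15 required"; exit 1; }`. Comparing the parsed number avoids false positives from minor versions or package suffixes that happen to contain 15.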
File renamed without changes.