Merge pull request #9 from unicef/develop
dev2master
Showing 38 changed files with 743 additions and 58 deletions.
@@ -2,3 +2,5 @@
~*

!.github

__pycache__
@@ -0,0 +1,19 @@
HOPE Documentation
==================

This repo is used to manage the official HOPE documentation.

## Install

    $ pdm install
    $ pdm venv activate in-project

#### Using .envrc

Add to your .envrc:

    eval $(pdm venv activate in-project)

#### Start

    $ mkdocs build
    $ mkdocs serve
@@ -1,4 +1,3 @@
 nav:
 - index.md
-- Contributing: setup
-- Usage: usage.md
+- setup.md
@@ -1,4 +1,4 @@
-# Contributing
+# Setup

Prerequisites:

@@ -83,3 +83,28 @@ echo "unset PS1" >> .envrc
The first time after you have created or modified the _.envrc_ file, you will have to authorize it using:

    direnv allow

# Run

To start working with Aurora you can:

### Build and use your docker

After you have cloned the repo, make sure a Redis server and a PostgreSQL server are running on your machine.

    export [email protected]
    export ADMIN_PASSWORD=password
    export DATABASE_URL=postgres://postgres:@127.0.0.1:5432/aurora
    export CACHE_URL="redis://127.0.0.1:6379/1?client_class=django_redis.client.DefaultClient"
    cd docker
    make build run

### Use provided compose.yml

    docker compose up

Navigate to http://localhost:8000/admin/ and log in using `[email protected]/password`.
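
Before running `make build run`, it can help to confirm that the services named in `DATABASE_URL` and `CACHE_URL` are actually listening. The following is a minimal sketch, not part of the Aurora repository: the helper names `endpoint_from_url` and `is_reachable` are hypothetical, and the check only opens a TCP connection (it does not authenticate or verify that the `aurora` database exists).

```python
import os
import socket
from urllib.parse import urlsplit

# Standard default ports for PostgreSQL and Redis; used only when the URL
# does not carry an explicit port.
DEFAULT_PORTS = {"postgres": 5432, "postgresql": 5432, "redis": 6379}


def endpoint_from_url(url: str) -> tuple:
    """Extract (host, port) from a service URL such as DATABASE_URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port


def is_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Reads the same variables exported in the steps above.
    for name in ("DATABASE_URL", "CACHE_URL"):
        url = os.environ.get(name)
        if not url:
            print(f"{name} is not set")
            continue
        host, port = endpoint_from_url(url)
        status = "reachable" if is_reachable(host, port) else "NOT reachable"
        print(f"{name}: {host}:{port} is {status}")
```

Run it in the same shell where the variables were exported; a "NOT reachable" line usually means the corresponding server is not started or is bound to a different port.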
This file was deleted.
@@ -0,0 +1,5 @@
nav:
- index.md
- setup.md
- tmp.md
- tmp2.md
@@ -1,3 +1,5 @@
# Deduplication Engine

## Repository

<https://github.com/unicef/hope-dedup-engine>
@@ -0,0 +1 @@
# Setup
@@ -0,0 +1,259 @@
Deduplication

RDI

Fuzzy

Threshold by BA

Programme based deduplication.

Checks on documents happen after merge.

CHANGE REQUEST 168410

Customizable deduplication

Real cases:

Need adjudication: same "Bank Statement" number

Duplicate

Not Duplicate

Withdrawn

Not Withdrawn

Deduplication

What is the flag "Postpone deduplication"?

status pending

Document type flags

- is_identity_document
- valid_for_deduplication (changes the signature ==> 2 valid documents with the same ID)

document number

Batch

Golden Records

Threshold -> need adjudication

deduplication_batch_results

deduplication_golden_record_results

Questions

A) What is the flag "Postpone deduplication"?

What would it stop?

- fuzzy match
- bank account
- document number

Disable ES

B) Document type: is the valid_for_deduplication flag used?

OK, so deduplication works on all documents with:

- status pending
- flag is_identity_document set to true

The valid_for_deduplication flag only changes the signature ==> 2 valid documents with the same ID
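
The eligibility rule stated above (status pending and `is_identity_document` set to true) can be sketched as a simple filter. This is an illustration under assumptions, not the HOPE implementation: the `Document` and `DocumentType` classes, their field names, and the `"PENDING"` status string are hypothetical stand-ins for whatever the real models use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentType:
    key: str
    is_identity_document: bool = False
    valid_for_deduplication: bool = False


@dataclass(frozen=True)
class Document:
    number: str
    country: str
    type: DocumentType
    status: str  # assumed values: "PENDING", "VALID", ...


def eligible_for_deduplication(documents):
    """Keep only pending documents whose type is an identity document."""
    return [
        doc
        for doc in documents
        if doc.status == "PENDING" and doc.type.is_identity_document
    ]
```

Note that `valid_for_deduplication` plays no role in this filter; per the notes it only affects how the signature is built.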

DEDUPLICATION

Name, gender, date of birth

Flexible deduplication checks (users decide which fields should be used to deduplicate)

No, depends on filters, index

Requires redesign of flex fields

Redesign deduplication (?)

Instructions:

List: subset (aka sessions)

is identity document

If this is set to true, we use this document to deduplicate and create a ticket.

unique for individual

You cannot create more than one document of this type for an individual.

valid for deduplication

Ignores the document type during deduplication, and deduplicates between different types that have this flag set to true.
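
One way to read the "valid for deduplication" description is that the flag changes the key (the "signature") used to compare documents. The sketch below assumes the signature is built from document number and country, plus the document type unless the flag is set; the class, field names, and the choice of signature fields are assumptions for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    individual_id: str
    number: str
    country: str
    type_key: str
    valid_for_deduplication: bool  # copied from the document type


def signature(doc):
    """Comparison key; the type is dropped when the flag is set (assumed)."""
    if doc.valid_for_deduplication:
        return (doc.number, doc.country)
    return (doc.type_key, doc.number, doc.country)


def find_duplicate_groups(documents):
    """Group documents by signature and return groups with 2+ members."""
    groups = defaultdict(list)
    for doc in documents:
        groups[signature(doc)].append(doc)
    return [group for group in groups.values() if len(group) > 1]
```

With the flag set on both types, a passport and a national ID sharing the same number and country land in the same group; with the flag unset they are compared only against documents of their own type.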

There are two different rules for document uniqueness:

- The "unique for individual" flag indicates whether we should validate uniqueness of the document type per individual, so that an Individual can have only 1 VALID document of this document_type+country.
- Document data uniqueness inside a Program, so that there cannot be more than 1 VALID document with the same set of values (document_number+type+country) in a Program.
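
The two uniqueness rules above can be checked by counting VALID documents per key. This is a hedged sketch rather than the actual validation code: the `Document` class, its field names, the `"VALID"` status string, and the rule labels returned by `uniqueness_violations` are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    individual_id: str
    program_id: str
    number: str
    country: str
    type_key: str
    unique_for_individual: bool  # copied from the document type
    status: str  # assumed values: "PENDING", "VALID", ...


def uniqueness_violations(documents):
    """Return (rule, key) pairs for every key held by more than one VALID document."""
    valid = [doc for doc in documents if doc.status == "VALID"]

    # Rule 1: at most one VALID document per individual + type + country,
    # enforced only for types flagged "unique for individual".
    per_individual = Counter(
        (doc.individual_id, doc.type_key, doc.country)
        for doc in valid
        if doc.unique_for_individual
    )
    # Rule 2: at most one VALID document per number + type + country in a Program.
    per_program = Counter(
        (doc.program_id, doc.number, doc.type_key, doc.country) for doc in valid
    )

    violations = [
        ("unique_for_individual", key)
        for key, count in per_individual.items()
        if count > 1
    ]
    violations += [
        ("unique_in_program", key) for key, count in per_program.items() if count > 1
    ]
    return violations
```

The two rules are independent: the first looks at who holds the document, the second at the document data itself, so a single pair of documents can violate either one without violating the other.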