Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify mapping logic and components #506

Open
6 tasks
SferaDev opened this issue Apr 17, 2020 · 2 comments
Open
6 tasks

Simplify mapping logic and components #506

SferaDev opened this issue Apr 17, 2020 · 2 comments

Comments

@SferaDev
Copy link
Contributor

SferaDev commented Apr 17, 2020

Tasks:

  • Reorder code and clean-up components logic
  • Everything typed
  • Reason of a conflict
  • Detect issues with mapping and log them after a sync on history
  • Testing: Create some high-level integration test and unit tests for each mapping method
  • Add visual step for disambiguation (refactor MappingTable so that accepts multiple mappings with priority by order as props and has an update method forwarded to the parent)

Original comments:

  • The requirements for mapping changed a lot during its implementation
  • Now that we have clear expectations of the feature we should probably simplify the resulting code
  • The MappingTable and its utility methods hold too much logic for a component
  • There're no unit tests on this part yet

Requirements:

  • There are two kinds of mappings:
    • Local: Specific mapping for an item and its related metadata
    • Global: Shared mapping for some metadata types (CO, CC, O, DE, etc...)
  • During synchronization both mappings are taken into account, first local mapping and, if not set, global mapping.
  • The UI is a combination of a landing page that redirects to other pages with a MappingTable. The component is similar to a MetadataTable but with some advanced contextual actions. Some of these contextual actions open modals that include a recursive MappingTable inside a Wizard.
  • The data structure holding the mapping is a dictionary by mappingType key (a superset of a metadata type, see: dataElements and aggregatedDataElements). Each dictionary entry is a dictionary itself holding the origin identifiers as keys.
  • For each mapping entry we store the destination id, name and code with the origin code, a conflicts flag and global flag. Additionally any entry can store a nested mapping dictionary (this is commonly referred as related metadata). Although theoretically possible we do not support more than one level of nesting.
  • In the UI we disaggregate the data elements with their programs and dataSets, and a same data element can hold different mappings for different dataSets/programs.
  • To accomplish that the combination of ids are joined by a dash -. This allows to map a certain metadata element only when combined with other metadata. We currently support only two variants of this eventDataElements with programId-programStageId-dataElementId and category options with categoryId-categoryOptionId. It always follow the outer to inner rule and the last element in a split by dash array would be the metadata element we apply any operation.
  • There's a feature called auto-map that looks up the selected metadata on destination instance and tries to find a similar match and offer it to the user. If a given piece of metadata is related with another piece of metadata already mapped, we enforce the two are also related on destination (ie. you map a program and auto-map its data elements, the result of auto-map will only look up for data elements in the destination program).
  • The user can also clear or reset the mapping for one or more entries, this means the related subpart of the mapping tree will be removed from the data structure. If it's a nested subpart (related metadata) it will not remove anything, and just remove the destination information stored.
  • The user can also mark a piece of metadata to be excluded, this is accomplished by setting the destination id as "DISABLED". The synchronization logic already knows what to filter when this happens.

Improvements:

  • Reduce logic in the components and utility methods (considering MappingTable is a recursive component)
  • Create a class hieararchy (Mapping -> DataElementMapping, GlobalIndicatorMapping, etc...)
  • Try to abstract the implementation for edge cases such as programs with data elements and other parent-children structures
  • Reduce the complexity of the current autoMap feature.
  • Remove coupling with DHIS2 to allow mapping between DHIS2 and other data sources (such as a JSON file)
  • Abstract the complex ids: id1-id2-id3 structure and add methods to it as helpers depending on the metadata type they're.
@SferaDev
Copy link
Contributor Author

@adrianq This was the card related to mapping refactor, almost everything applies but now we would create a mapping entity (or maybe even repository) and its use-cases

@SferaDev
Copy link
Contributor Author

SferaDev commented Sep 9, 2020

Changes to the mapping during its implementation

  1. We should be able to have a relation between origin instance and its destinations.
    • Map id1 to id2

  1. We should be able to disable sync for a given metadata on a destination instance.
    • Map id1 to DISABLED

  1. Mapping listing should be offline, table should always have information of destination
    • Store name and code with the mapped id.

  1. We should be able to map related metadata (data element -> category option combos)
    • Add a nested mapping dictionary inside the data element mapping

  1. Working with category option combos is not feasible
    • Add related metadata mapping on a category option and category combo level

  1. Most of the times related metadata relationships are common
    • Add a global resolution page for category options and category combos

  1. User should be able to view and identify errors during mapping
    • Add a conflicts boolean calculated during mapping that shows a warning

  1. For certain pieces of metadata (programs) we want to group them by a parent and make them behave different depending on their parent
    • Add nested ids idA-idX-id1 that maps to id2

  1. Category options can exist in multiple categories and their relationship is not 1:1
    • Add nested ids idA-id1 that maps to id2

  1. Since the mapping is overly complicated, user should not perform all the mappings manually
    • Create an auto-map feature that informs of conflicts that the user is able to review and solve

Current data model structure

{
    "aggregatedDataElements": {
      "idA1": {
        "mappedId": "idB",
        "mappedName": "(DEL) Total number of reports actually received in (current reporting year) (c)  - private",
        "code": "MAL_PREREF_ART_SUPP_COMMUNITY_DISC",
        "conflicts": false,
        "global": false,
        "mapping": {
          "categoryCombos": {
            "JzvGfLYkX17": {
              "mappedId": "JzvGfLYkX17",
              "mappedName": "default",
              "mapping": {},
              "conflicts": false
            }
          },
          "categoryOptions": {
            "DMyxTSpvKOp-Y7fcspgsU43": {
              "mappedId": "DISABLED",
              "mappedCode": "DISABLED",
              "code": "default",
              "global": false,
              "conflicts": true
            }
          }
        }
      }
    }

Biggest problem with current data model

  • During sync we usually perform the following mapping[mappingModel][id1] (simplified for concept)
  • We are not doing origin -> destination syncs anymore, we currently support origin <- destination
  • With pull syncs we do not know id1 but we do know the mapped id2
  • Our dictionary can have N:1 mappings making it non-deterministic on reversal

Proposal

  • Build a more robust entity with methods that retrieve mapping and enforce union types on retrieval.

  • Move out of the mapping dictionary the code/name and add it to a cache like storage for the instance but not inside the mapping itself. This cache of names/codes can be refreshed once in a while with a single metadata call so that the table is always consistent yet still offline if the instance is down.

  • Make the conflicts virtual instead of persisted (can be calculated by visiting its children)

  • This would leave us a dictionary like the one we have but only with ids that can be reversed (even though it has multiple options)

  • Add a visual step during sync that allows to review the applied mapping and/or change it for a given manual sync or sync rule.

@SferaDev SferaDev removed their assignment Mar 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant