Minimize what's stored in ui_metadata #511

Open
ohltyler opened this issue Dec 6, 2024 · 0 comments
Labels: feature (A new feature for the plugin), v3.0.0 (Issues targeting release v3.0.0)

Comments

ohltyler (Member) commented Dec 6, 2024

Currently, the UI config (stored in the backend under the Workflow's ui_metadata field) stores more data than necessary. This data is used to dynamically generate the UI form components, validation schema, default values, etc. Some of it should be offloaded and generated on the fly from logic that already exists within the plugin.

For example, given some rerank processor, we persist a lot of information:

{
  "name": "Rerank Processor",
  "id": "rerank_processor_22217734f829a74a",
  "fields": [
    {
      "id": "target_field",
      "type": "string",
      "value": "abc"
    }
  ],
  "type": "rerank",
  "optionalFields": [
    {
      "id": "remove_target_field",
      "type": "boolean",
      "value": false
    },
    {
      "id": "keep_previous_score",
      "type": "boolean",
      "value": false
    },
    {
      "id": "tag",
      "type": "string",
      "value": ""
    },
    {
      "id": "description",
      "type": "string",
      "value": ""
    },
    {
      "id": "ignore_failure",
      "type": "boolean",
      "value": false
    }
  ]
}

But suppose a user fills out only the required target_field. Technically, we should only have to persist that field and its value, the processor type, and a unique processor ID, something like:

{
  "type": "rerank",
  "id": "rerank_processor_22217734f829a74a",
  "inputs": {
    "target_field": "abc"
  }
}

The rest of the data, such as the frontend-displayed name, the field types, and the optional fields, doesn't need to be persisted. It can be generated on the fly, based on the processor type, for example.
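To make the idea concrete, here is a minimal sketch of how the full UI config could be rebuilt at load time from the minimal stored form. Everything in it is an assumption for illustration: the type names, the `PROCESSOR_DEFS` registry, and `hydrateProcessor` are hypothetical and not taken from the plugin; only the field IDs, types, and default values are copied from the rerank example above.

```typescript
// Hypothetical shapes for illustration; none of these names come from the plugin.
type FieldType = "string" | "boolean";
type FieldValue = string | boolean;

interface FieldDef {
  id: string;
  type: FieldType;
  defaultValue: FieldValue;
  required: boolean;
}

interface ProcessorDef {
  name: string; // frontend-displayed name
  fields: FieldDef[];
}

// The minimal persisted form proposed in this issue.
interface StoredProcessor {
  type: string;
  id: string;
  inputs: Record<string, FieldValue>;
}

// The full form the UI renders from (what is persisted today).
interface UiField {
  id: string;
  type: FieldType;
  value: FieldValue;
}

interface UiProcessor {
  name: string;
  id: string;
  type: string;
  fields: UiField[];
  optionalFields: UiField[];
}

// Assumed in-plugin registry, keyed by processor type.
const PROCESSOR_DEFS: Record<string, ProcessorDef> = {
  rerank: {
    name: "Rerank Processor",
    fields: [
      { id: "target_field", type: "string", defaultValue: "", required: true },
      { id: "remove_target_field", type: "boolean", defaultValue: false, required: false },
      { id: "keep_previous_score", type: "boolean", defaultValue: false, required: false },
      { id: "tag", type: "string", defaultValue: "", required: false },
      { id: "description", type: "string", defaultValue: "", required: false },
      { id: "ignore_failure", type: "boolean", defaultValue: false, required: false },
    ],
  },
};

// Rebuild the full UI config from the stored type, ID, and user inputs.
function hydrateProcessor(stored: StoredProcessor): UiProcessor {
  const def = PROCESSOR_DEFS[stored.type];
  if (def === undefined) {
    throw new Error(`Unknown processor type: ${stored.type}`);
  }
  // Use the stored value when present; otherwise fall back to the default.
  const toUiField = (f: FieldDef): UiField => ({
    id: f.id,
    type: f.type,
    value: Object.prototype.hasOwnProperty.call(stored.inputs, f.id)
      ? stored.inputs[f.id]
      : f.defaultValue,
  });
  return {
    name: def.name,
    id: stored.id,
    type: stored.type,
    fields: def.fields.filter((f) => f.required).map(toUiField),
    optionalFields: def.fields.filter((f) => !f.required).map(toUiField),
  };
}
```

With this shape, adding a field to the registry entry is enough for previously stored processors to show it on the next load, with its default value.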

This has many benefits:

  • from a security perspective, it is best practice to store as little user data as possible
  • from a storage perspective, this drastically reduces persisted data in the system indices
  • from a maintainability perspective, this keeps lower-level processor configurations (certain optional fields and their defaults) from becoming outdated/stale over time. Suppose the rerank processor adds a new field tomorrow: when loading a stored config in the UI, we are unable to pick up that change, because the UI simply renders what was previously stored. If generated on the fly, we always have the most up-to-date configuration for any dependency, including processors.
  • also from a maintainability perspective, this simplifies/minimizes the schema and lowers the chance of BWC issues or of requiring schema version updates
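The storage point can be illustrated from the save direction as well. The sketch below reduces a full processor config to the minimal form, always keeping required fields and dropping optional fields whose value still equals the default. The names (`FullProcessor`, `MinimalProcessor`, `minimizeProcessor`) and the idea of passing the optional-field defaults in as a map are assumptions for illustration, not the plugin's actual code.

```typescript
// Hypothetical shapes for illustration; none of these names come from the plugin.
type Value = string | boolean;

interface FullField {
  id: string;
  type: string;
  value: Value;
}

// The verbose form that is persisted today.
interface FullProcessor {
  name: string;
  id: string;
  type: string;
  fields: FullField[];
  optionalFields: FullField[];
}

// The minimal form proposed in this issue.
interface MinimalProcessor {
  type: string;
  id: string;
  inputs: Record<string, Value>;
}

// Keep every required field; keep an optional field only if the user
// changed it away from its default.
function minimizeProcessor(
  full: FullProcessor,
  optionalDefaults: Record<string, Value>
): MinimalProcessor {
  const inputs: Record<string, Value> = {};
  for (const field of full.fields) {
    inputs[field.id] = field.value;
  }
  for (const field of full.optionalFields) {
    if (field.value !== optionalDefaults[field.id]) {
      inputs[field.id] = field.value;
    }
  }
  return { type: full.type, id: full.id, inputs };
}
```

Applied to the rerank example above with its defaults (`false` for the boolean fields, `""` for `tag` and `description`), this yields only the type, the ID, and `target_field` under `inputs`.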
ohltyler added the feature (A new feature for the plugin) and v3.0.0 (Issues targeting release v3.0.0) labels on Dec 6, 2024
ohltyler removed the untriaged label on Dec 6, 2024