Internal Data Model

René Radoi edited this page Jul 17, 2024 · 21 revisions

Backward compatibility can break during minor version upgrades when the relation data changes.

This document is intended only for charm developers.

We aim to document the internal data models currently used in order to carefully assess any breaking changes in future releases of the OpenSearch charm.

Secrets:

a. Application level:

  • Generic:

    • Label prefix: <app_name>:app:
    • Content:
      • admin-password: str: password of the admin user, the operator user of the charm, created when the charm is first deployed. Must be persisted in the internal_users.yml file on each unit.
      • admin-password-hash: str: bcrypt hash of the admin user's password. Must be persisted in the internal_users.yml file on each unit.
      • kibanaserver-password: str: password of the kibanaserver user, used when integrating with the opensearch-dashboards charm; this user configures the OpenSearch Dashboards internal config files.
      • kibanaserver-password-hash: str: bcrypt hash of the kibanaserver user's password. Must be persisted in the internal_users.yml file on each unit.
      • monitor-password: str: password of the monitor user needed when integrating with COS for reporting metrics.
  • TLS:

    • Label prefix: <app_name>:app:
    • Content:
      • app-admin: Dict[str, str]: various TLS resources used internally by the charm, notably by the admin user above.
        • cert: certificate content.
        • chain: Full chain of the certificate.
        • ca-cert: CA issuer certificate.
        • key: private key used to issue the CSR (generated or passed through the set-tls-private-key action).
        • key-password: private key password if passed by the user on the set-tls-private-key action.
        • csr: CSR used to request this certificate.
        • subject: set to admin, subject used to issue the CSR.
        • truststore-password: trust store password (of the CA)
        • keystore-password: key store password of the admin TLS resources (key / certificate)
  • On Large deployments (by consumer applications = non "main-orchestrator"):

    • Label prefix: <app_name>:app:
    • Content:
      • s3-creds: Dict[str, str]: S3 credentials for the backup module, retrieved from the main-orchestrator related to the data-integrator.
        • access-key: access key used to write in the s3 bucket.
        • secret-key: secret key used to write in the s3 bucket.

b. Unit level:

  • TLS:

    • Label prefix: <app_name>:app:<unit_id>:
    • Content:
      • unit-transport: Dict[str, str]: various TLS resources used for the Transport layer / node-to-node communications (data transfer encryption).
        • cert: certificate content.
        • chain: Full chain of the certificate.
        • ca-cert: CA issuer certificate.
        • key: private key used to issue the CSR (generated or passed through the set-tls-private-key action).
        • key-password: private key password if passed by the user on the set-tls-private-key action.
        • csr: CSR used to request this certificate.
        • subject: set to admin, subject used to issue the CSR.
        • keystore-password: key store password of the Transport Layer TLS resources (key / certificate)
      • unit-http: Dict[str, str]: various TLS resources used for the HTTP layer / client communications (https with the API).
        • cert: certificate content.
        • chain: Full chain of the certificate.
        • ca-cert: CA issuer certificate.
        • key: private key used to issue the CSR (generated or passed through the set-tls-private-key action).
        • key-password: private key password if passed by the user on the set-tls-private-key action.
        • csr: CSR used to request this certificate.
        • subject: set to admin, subject used to issue the CSR.
        • keystore-password: key store password of the HTTP Layer TLS resources (key / certificate)
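
The label prefixes listed above can be read as a naming scheme. The following is a minimal sketch, assuming the full label is the documented prefix followed directly by the content key; the helper names `app_secret_label` and `unit_secret_label` are hypothetical and are not taken from the charm code.

```python
def app_secret_label(app_name: str, key: str) -> str:
    """Build an application-level label: <app_name>:app:<key> (assumed layout)."""
    return f"{app_name}:app:{key}"


def unit_secret_label(app_name: str, unit_id: int, key: str) -> str:
    """Build a unit-level label from the unit prefix documented above (assumed layout)."""
    return f"{app_name}:app:{unit_id}:{key}"


# Hypothetical application name and unit number, for illustration only.
print(app_secret_label("opensearch", "admin-password"))
print(unit_secret_label("opensearch", 0, "unit-transport"))
```

Keeping label construction in one place makes it easier to spot a breaking change if a prefix is ever renamed between charm revisions.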

Model:

Complex objects stored in the relation data are modeled as Pydantic model classes.

class App(Model):
    """Data class representing an application."""

    id: Optional[str] = None
    short_id: Optional[str] = None
    name: Optional[str] = None
    model_uuid: Optional[str] = None


class Node(Model):
    """Data class representing a node in a cluster."""

    name: str
    roles: List[str]
    ip: str
    app: App
    unit_number: int
    temperature: Optional[str] = None


class DeploymentType(BaseStrEnum):
    """Nature of a sub cluster deployment."""

    MAIN_ORCHESTRATOR = "main-orchestrator"
    FAILOVER_ORCHESTRATOR = "failover-orchestrator"
    OTHER = "other"


class StartMode(BaseStrEnum):
    """Mode of start of units in this deployment."""

    WITH_PROVIDED_ROLES = "start-with-provided-roles"
    WITH_GENERATED_ROLES = "start-with-generated-roles"


class Directive(BaseStrEnum):
    """Directive indicating what the pending actions for the current deployments are."""

    NONE = "none"
    SHOW_STATUS = "show-status"
    WAIT_FOR_PEER_CLUSTER_RELATION = "wait-for-peer-cluster-relation"
    INHERIT_CLUSTER_NAME = "inherit-name"
    VALIDATE_CLUSTER_NAME = "validate-cluster-name"
    RECONFIGURE = "reconfigure-cluster"


class State(BaseStrEnum):
    """State of a deployment, directly mapping to the juju statuses."""

    ACTIVE = "active"
    BLOCKED_WAITING_FOR_RELATION = "blocked-waiting-for-peer-cluster-relation"
    BLOCKED_WRONG_RELATED_CLUSTER = "blocked-wrong-related-cluster"
    BLOCKED_CANNOT_START_WITH_ROLES = "blocked-cannot-start-with-current-set-roles"
    BLOCKED_CANNOT_APPLY_NEW_ROLES = "blocked-cannot-apply-new-roles"


class DeploymentState(Model):
    """Full state of a deployment, along with the juju status."""

    value: State
    message: str = Field(default="")


class PeerClusterConfig(Model):
    """Model class for the multi-clusters related config set by the user."""

    cluster_name: str
    init_hold: bool
    roles: List[str]
    data_temperature: Optional[str] = None


class DeploymentDescription(Model):
    """Model class describing the current state of a deployment / sub-cluster."""

    app: App
    config: PeerClusterConfig
    start: StartMode
    pending_directives: List[Directive]
    typ: DeploymentType
    state: DeploymentState = DeploymentState(value=State.ACTIVE)
    promotion_time: Optional[float]


class S3RelDataCredentials(Model):
    """Model class for credentials passed on the PCluster relation."""

    access_key: str = Field(alias="access-key")
    secret_key: str = Field(alias="secret-key")


class PeerClusterRelDataCredentials(Model):
    """Model class for credentials passed on the PCluster relation."""

    admin_username: str
    admin_password: str
    admin_password_hash: str
    kibana_password: str
    kibana_password_hash: str
    monitor_password: str
    admin_tls: Dict[str, Optional[str]]
    s3: Optional[S3RelDataCredentials]


class PeerClusterApp(Model):
    """Model class for representing an application part of a large deployment."""

    app: App
    planned_units: int
    units: List[str]


class PeerClusterFleetApps(Model):
    """Model class for all applications in a large deployment as a dict."""

    __root__: Dict[str, PeerClusterApp]


class PeerClusterRelData(Model):
    """Model class for the PCluster relation data."""

    cluster_name: str
    cm_nodes: List[Node]
    credentials: PeerClusterRelDataCredentials
    deployment_desc: Optional[DeploymentDescription]


class PeerClusterRelErrorData(Model):
    """Model class for the PCluster relation data."""

    cluster_name: Optional[str]
    should_sever_relation: bool
    should_wait: bool
    blocked_message: str
    deployment_desc: Optional[DeploymentDescription]


class PeerClusterOrchestrators(Model):
    """Model class for the PClusters registered main/failover clusters."""

    _TYPES = Literal["main", "failover"]

    main_rel_id: int = -1
    main_app: Optional[App]
    failover_rel_id: int = -1
    failover_app: Optional[App]
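
Relation databags only hold string values, so a nested object such as PeerClusterApp has to be serialized before it is written and parsed when it is read back. The sketch below illustrates that round trip. It uses standard-library dataclasses and JSON as a dependency-free stand-in for the Pydantic classes above; the helpers `to_databag` and `from_databag` are hypothetical, and the actual encoding used by the charm is an assumption here.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class App:
    """Stand-in for the App model above."""

    id: Optional[str] = None
    short_id: Optional[str] = None
    name: Optional[str] = None
    model_uuid: Optional[str] = None


@dataclass
class PeerClusterApp:
    """Stand-in for the PeerClusterApp model above."""

    app: App
    planned_units: int
    units: List[str]


def to_databag(obj: PeerClusterApp) -> str:
    """Serialize the nested object to a JSON string (assumed encoding)."""
    return json.dumps(asdict(obj))


def from_databag(raw: str) -> PeerClusterApp:
    """Rebuild the nested object from its JSON string."""
    data = json.loads(raw)
    return PeerClusterApp(
        app=App(**data["app"]),
        planned_units=data["planned_units"],
        units=data["units"],
    )


# Hypothetical values, for illustration only.
original = PeerClusterApp(
    app=App(id="model-uuid/opensearch", name="opensearch"),
    planned_units=3,
    units=["opensearch/0", "opensearch/1", "opensearch/2"],
)
restored = from_databag(to_databag(original))
print(restored == original)
```

A round trip like this is a cheap way to check whether a renamed or removed field would break units that still run the previous revision.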

We distinguish between the different types of relations.

1. Peer relation:

a. Application data:

  • Generic:

    • Relation endpoint: opensearch-peers
    • Content:
      • security_index_initialised: bool: whether the security index has been initialized through the opensearch securityadmin script.
      • admin_user_initialized: bool: indicates that the admin user has been created and set.
      • bootstrap_contributors_count: int: count of bootstrap process contributors.
      • deployment_description: DeploymentDescription: description of the current deployment.
      • allocation-exclusions-to-delete: str: comma-separated list of node names to be removed from the allocation exclusions (filled when deleting an allocation exclusion failed for some reason).
      • delete-voting-exclusions: bool: whether to delete voting exclusions, if the initial attempt failed.
      • nodes_config: List[Node]: full list of current nodes configured by the charm.
        • to be deprecated once it is confirmed that OpenSearch can hold an election with an even number of cluster-manager-eligible nodes
      • update-ts: int: current time in nanoseconds since epoch (used to trigger a peer-rel-changed event by the leader).
        • to be deprecated in favor of the same flag in unit data
  • On large deployments:

    • Relation endpoint: opensearch-peers
    • Content:
      • orchestrators: PeerClusterOrchestrators: List of registered orchestrators in this application.
      • cluster_fleet_apps: PeerClusterFleetApps: Mapping of full application id and PeerClusterApp (full descriptor of a juju opensearch app)
      • cluster_fleet_apps_rels: PeerClusterApp: (Only on the orchestrators side) Mapping of related application full names to large deployment relation ids.
  • Locking (rolled operations):

    • Relation endpoint: node-lock-fallback
    • Content:
      • unit-with-lock: str: full name of the unit holding the lock, when peer relation is used for locking.
      • leader-acquired-lock-after-juju-event-id: str: indicates the juju event id where unit-with-lock was set.
  • Upgrades:

    • Relation endpoint: upgrade-version-a
    • Content:
      • versions: Dict[str, str] / {"charm": "", "workload": ""}: descriptor of the current charm and workload (opensearch) versions.
      • upgrade-resumed: bool: whether the upgrade procedure resumed after confirming the first unit to upgrade is healthy.
      • -unused-timestamp-upgrade-resume-last-updated: str: set current time to trigger a relation changed event.
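
Two of the application data fields above follow simple string conventions: `allocation-exclusions-to-delete` is a comma-separated list, and `update-ts` is a nanosecond timestamp written to trigger a relation-changed event. The sketch below models the databag as a plain `Dict[str, str]`; the helper names are hypothetical, and storing the timestamp as a decimal string is an assumption.

```python
import time
from typing import Dict, List

EXCLUSIONS_KEY = "allocation-exclusions-to-delete"


def parse_exclusions(databag: Dict[str, str]) -> List[str]:
    """Split the comma-separated node names, ignoring empty entries."""
    raw = databag.get(EXCLUSIONS_KEY, "")
    return [name.strip() for name in raw.split(",") if name.strip()]


def store_exclusions(databag: Dict[str, str], names: List[str]) -> None:
    """Write the node names back, or drop the key when nothing is pending."""
    if names:
        databag[EXCLUSIONS_KEY] = ",".join(names)
    else:
        databag.pop(EXCLUSIONS_KEY, None)


def touch(databag: Dict[str, str]) -> None:
    """Set update-ts to the current time in nanoseconds since the epoch."""
    databag["update-ts"] = str(time.time_ns())


# Hypothetical node names, for illustration only.
bag: Dict[str, str] = {}
store_exclusions(bag, ["opensearch-0", "opensearch-1"])
touch(bag)
print(parse_exclusions(bag))
```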

b. Unit data:

  • Generic:

    • Relation endpoint: opensearch-peers
    • Content:
      • started: bool: whether this unit has fully started and the node is up.
      • tls_configured: bool: flag set when TLS is fully configured in a unit (the TLS secrets / certificates and keys have been set and stored on disk)
      • bootstrap_contributor: bool: whether a cluster_manager eligible node has been part of the bootstrapping process (initial_cluster_manager in opensearch.yml)
      • certs_exp_checked_at: str: expiration date of the certificates (date_format %Y-%m-%d %H:%M:%S)
      • update-ts: int: current time in nanoseconds since epoch (used to trigger a peer-rel-changed event by any unit).
      • allocation-exclusions-to-delete: str: comma-separated list of node names to be removed from the allocation exclusions (filled when deleting an allocation exclusion failed for some reason).
      • delete-voting-exclusions: bool: whether to delete voting exclusions, if the initial attempt failed.
  • Locking (rolled operations):

    • Relation endpoint: node-lock-fallback
    • Content:
      • lock-requested: bool: whether this unit requested the lock.
      • -trigger: str: set the current juju context id to trigger a rel changed event on the leader (from a non leader unit).
  • Upgrades:

    • Relation endpoint: upgrade-version-a
    • Content:
      • snap_revision: str: current revision of the installed snap.
      • workload_version: str: current version of the opensearch workload.
      • state: UnitState: current state of the upgraded/upgrading unit (healthy, restarting, upgrading, outdated)
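
The `certs_exp_checked_at` value in the unit data uses the date format `%Y-%m-%d %H:%M:%S`. As a minimal sketch, the string can be converted to and from a `datetime` with the standard library; the function names below are hypothetical.

```python
from datetime import datetime

# Format documented for certs_exp_checked_at.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_certs_exp_checked_at(value: str) -> datetime:
    """Parse the databag string into a naive datetime."""
    return datetime.strptime(value, DATE_FORMAT)


def format_certs_exp_checked_at(moment: datetime) -> str:
    """Render a datetime in the documented format."""
    return moment.strftime(DATE_FORMAT)


# Hypothetical timestamp, for illustration only.
parsed = parse_certs_exp_checked_at("2024-07-17 09:30:00")
print(format_certs_exp_checked_at(parsed))
```

The format carries no timezone information, so both sides need to agree on the timezone by convention.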

2. Large deployment relations:

a. Application data:

  • Provider:

    • Relation endpoint: peer-cluster-orchestrator
    • Content:
      • orchestrators: PeerClusterOrchestrators: List of orchestrators the Main/Failover provider computed and broadcast to all related applications.
      • cluster_fleet_apps: PeerClusterFleetApps: The aggregated list of all applications and their descriptions in this fleet (along with their planned units count, etc.)
      • data: PeerClusterRelData: Success data set by the orchestrators / providers on the relation.
      • error_data: PeerClusterRelErrorData: Error data set by the orchestrators / providers on the relation.
  • Consumer:

    • Relation endpoint: peer-cluster
    • Content:
      • app: PeerClusterApp: current detailed app to be reported to the orchestrators.
      • is_candidate_failover_orchestrator: bool: whether the current application is eligible to be elected as a failover orchestrator.

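
The provider side exposes either `data` or `error_data`, so a consumer has to decide what to do based on which one is present. The sketch below shows one possible way to read them. It assumes both values are JSON-encoded strings and that `should_sever_relation` takes precedence over `should_wait`; the function `next_action` and its return values are hypothetical and do not reflect the charm's actual control flow.

```python
import json
from typing import Dict


def next_action(provider_databag: Dict[str, str]) -> str:
    """Pick a consumer action from the provider's application data."""
    raw_error = provider_databag.get("error_data")
    if raw_error:
        error = json.loads(raw_error)
        # Assumed precedence: severing the relation wins over waiting.
        if error.get("should_sever_relation"):
            return "sever-relation"
        if error.get("should_wait"):
            return "wait"
        return "blocked"
    if provider_databag.get("data"):
        return "proceed"
    # Nothing published yet by the orchestrator.
    return "wait"


# Hypothetical error payload, for illustration only.
example = {
    "error_data": json.dumps(
        {
            "cluster_name": "logs",
            "should_sever_relation": False,
            "should_wait": True,
            "blocked_message": "waiting for the main orchestrator",
            "deployment_desc": None,
        }
    )
}
print(next_action(example))
```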
3. Client relations:

a. Application data:

  • Provider:

    • Relation endpoint: opensearch-client
    • Content:
      • version: str: OpenSearch version.
      • index: str: Index name requested by the client application.
      • credentials: Dict[str, str] / {"username": "", "password": ""}: user name and password created for the newly established relation.
      • tls-ca: str: The ca-chain to be used for https by the client.
      • endpoints: List[str]: List of endpoints offered by the opensearch charm.
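
A client application typically combines `endpoints` and `index` from the fields above to build the URLs it will query. The sketch below assumes that the list value is stored as a JSON-encoded string, that each endpoint has the form `host:port`, and that HTTPS is used (consistent with the `tls-ca` field); the helper `client_urls` is hypothetical.

```python
import json
from typing import Dict, List


def client_urls(databag: Dict[str, str]) -> List[str]:
    """Build one HTTPS URL per endpoint for the requested index."""
    endpoints = json.loads(databag["endpoints"])  # assumed JSON-encoded list
    index = databag["index"]
    return [f"https://{endpoint}/{index}" for endpoint in endpoints]


# Hypothetical databag content, for illustration only.
example_client_databag = {
    "version": "2.14.0",
    "index": "app-logs",
    "endpoints": json.dumps(["10.0.0.1:9200", "10.0.0.2:9200"]),
}
print(client_urls(example_client_databag))
```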