
coder_metadata gets incorrectly assigned to resources #306

Open
ethanndickson opened this issue Aug 15, 2024 · 9 comments
ethanndickson commented Aug 15, 2024

Currently, the Terraform provisioner relies on the output of terraform graph to determine what resource coder_metadata instances should be applied to when building the template or workspace.

Unfortunately, it's very reasonable for a user to write a template where the output of terraform graph does not meet what's required to correctly map the metadata to the resource, such as when using the contents of another resource to populate a field. This results in the UI showing metadata intended for one resource on another.

Here's a minimal example of this using null_resources:

resource "coder_agent" "main" {
  os   = "linux"
  arch = "amd64"
}

// Agent resource
resource "null_resource" "about" {
  depends_on = [
    coder_agent.main,
  ]
  triggers = {
    name = "Null About"
  }
}

// Some secondary resource, such as a volume, pointing to the agent resource
resource "null_resource" "other" {
  triggers = {
    about_name = null_resource.about.triggers.name
  }
}

// Metadata pointing to the agent resource and the volume
resource "coder_metadata" "about_info" {
  resource_id = null_resource.about.id
  item {
    key = "other-reference"
    value = null_resource.other.triggers.about_name
  }
}

The graph output of this template is then:
[image: terraform graph output, showing coder_metadata.about_info connected to null_resource.other, which in turn connects to null_resource.about]

Since null_resource.other is the closest resource to coder_metadata.about_info, the metadata gets assigned to it, instead of null_resource.about, as was indicated in the template.

The same graph and graph traversal algorithm is used to determine which resource the agent exists within, so it's possible (but not proven) that this same issue affects assigning the agent to a resource.

Regardless, this should be a purely cosmetic bug, and shouldn't have any bearing on anything outside of how workspace metadata is shown in the UI.
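The failure mode described above can be sketched with a toy graph traversal. This is a hypothetical Python illustration, not the provisioner's actual Go code, and it assumes the direct coder_metadata.about_info -> null_resource.about edge is absent from the graph (as the graph output suggests), leaving only the path through null_resource.other:

```python
from collections import deque

# Toy dependency graph mirroring the template above. Hypothetical
# simplification: the direct coder_metadata.about_info ->
# null_resource.about edge is missing, so only the path through
# null_resource.other remains.
edges = {
    "coder_metadata.about_info": ["null_resource.other"],
    "null_resource.other": ["null_resource.about"],
    "null_resource.about": ["coder_agent.main"],
    "coder_agent.main": [],
}

def closest_resource(graph, start):
    """Breadth-first search for the nearest non-coder resource."""
    queue = deque(graph[start])
    seen = set()
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        if not node.startswith("coder_"):
            return node  # first plain resource reached wins
        queue.extend(graph[node])
    return None

# The metadata gets assigned to null_resource.other, not null_resource.about:
print(closest_resource(edges, "coder_metadata.about_info"))
```

With the resource_id edge gone, no distance-based heuristic can recover the intended target; the information simply isn't in the graph anymore.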

@matifali

coder/coder#6517 could be related.

@matifali

#133 also looks related.


aslilac commented Jan 7, 2025

unassigning myself for now in case @Emyrk can pick it up, but I might circle back if he doesn't

@Emyrk self-assigned this Jan 7, 2025

ethanndickson commented Jan 8, 2025

One thing to keep in mind when solving this is that a Terraform resource ID isn't a protected attribute or anything special. There's nothing stopping popular providers from using the same id attribute for multiple resources. The AWS provider shares an ID between
ec2_instance_state and instance, as seen in the example for the first. I wouldn't be surprised if we can't handle that edge case :(


Emyrk commented Jan 21, 2025

I was unable to get to this, but I'm pretty sure the code is located here in coderd: https://github.com/coder/coder/blob/389768b48f8d0ab9cbd28bc66f7bae308011062d/provisioner/terraform/resources.go#L563-L578, and it should be fixed there.


mafredri commented Feb 3, 2025

I've been looking into this issue and I have a pretty good idea of what's going on and what's going wrong.

As I see it, we have the following problems:

  1. The graph generated by terraform is lossy: we lose information about the association expressed in the Terraform configuration (resource_id -> specific resource)
  2. The graph from terraform plan has less detail than the one from terraform apply
  3. IDs are not guaranteed to be unique across resources

For 1 and 2, we can utilize the tfplan and tfstate JSON outputs; for 3, however, I have yet to figure out a 100% solution.

Proposal: For plan we use the tfplan "configuration" field to figure out which resource is being referenced and for apply we simply use the resource_id -> id association.

Does this seem like a sufficient solution for the problem? See below for more details on what data is available to us.


Thankfully, the tfplan actually contains everything we need (see resource_id, JSON has been trimmed):

{
  "configuration": {
    "root_module": {
      "resources": [
        {
          "address": "coder_metadata.about_info",
          "mode": "managed",
          "type": "coder_metadata",
          "name": "about_info",
          "provider_config_key": "coder",
          "expressions": {
            "item": [
              {
                "key": {
                  "constant_value": "other-reference"
                },
                "value": {
                  "references": [
                    "null_resource.other.triggers.about_name",
                    "null_resource.other.triggers",
                    "null_resource.other"
                  ]
                }
              }
            ],
            "resource_id": {
              "references": [
                "null_resource.about.id",
                "null_resource.about"
              ]
            }
          },
          "schema_version": 0
        }
      ]
    }
  }
}

As can be seen in this JSON output, the exact resource and field being referenced are present in the configuration. We've yet to utilize this part of the tfplan, which seems like a shame; it could probably also be used to make a better resource<->agent association.
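As a sketch of how that configuration block could be used (hypothetical Python, not the coderd implementation): walk the tfplan "configuration" section and resolve each coder_metadata resource to the bare resource address referenced by its resource_id expression. The input below mirrors the trimmed tfplan JSON above.

```python
# Hypothetical sketch of the proposal above: resolve coder_metadata
# targets from the tfplan "configuration" section rather than the graph.
# The structure mirrors the trimmed `terraform show -json` output above.
plan = {
    "configuration": {
        "root_module": {
            "resources": [
                {
                    "address": "coder_metadata.about_info",
                    "type": "coder_metadata",
                    "expressions": {
                        "resource_id": {
                            "references": [
                                "null_resource.about.id",
                                "null_resource.about",
                            ]
                        }
                    },
                }
            ]
        }
    }
}

def metadata_targets(tfplan):
    """Map each coder_metadata address to the resource its resource_id references."""
    targets = {}
    for res in tfplan["configuration"]["root_module"]["resources"]:
        if res["type"] != "coder_metadata":
            continue
        refs = res["expressions"]["resource_id"]["references"]
        # Keep only bare "type.name" addresses, dropping attribute
        # references like "null_resource.about.id".
        bare = [r for r in refs if r.count(".") == 1]
        targets[res["address"]] = bare[0] if bare else None
    return targets

print(metadata_targets(plan))
# {'coder_metadata.about_info': 'null_resource.about'}
```

Real configurations add wrinkles this sketch ignores (count/for_each indexes, child modules), so actual address handling would need to be more careful.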

However, if we assume that the uniqueness of IDs is unreliable, the tfstate actually ends up being more lossy for our use case, as it references the potentially non-unique ID instead of the unique resource name. (JSON has been trimmed.)

{
  "values": {
    "root_module": {
      "resources": [
        {
          "address": "coder_metadata.about_info",
          "mode": "managed",
          "type": "coder_metadata",
          "name": "about_info",
          "provider_name": "registry.terraform.io/coder/coder",
          "schema_version": 0,
          "values": {
            "daily_cost": null,
            "hide": null,
            "icon": null,
            "id": "f0c9ff7b-48fa-425a-ac82-8bc4759802db",
            "item": [
              {
                "is_null": false,
                "key": "other-reference",
                "sensitive": false,
                "value": "Null About"
              }
            ],
            "resource_id": "8302293550006744338"
          },
          "sensitive_values": {
            "item": [
              {}
            ]
          },
          "depends_on": [
            "coder_agent.main",
            "null_resource.about",
            "null_resource.other"
          ]
        },
        {
          "address": "null_resource.about",
          "mode": "managed",
          "type": "null_resource",
          "name": "about",
          "provider_name": "registry.terraform.io/hashicorp/null",
          "schema_version": 0,
          "values": {
            "id": "8302293550006744338",
            "triggers": {
              "name": "Null About"
            }
          },
          "sensitive_values": {
            "triggers": {}
          },
          "depends_on": [
            "coder_agent.main"
          ]
        }
      ]
    }
  }
}

AFAICT, the best we can do here is rely on the resource ID (without passing on information from the plan). My guess is that we could further reduce the number of candidate resources by checking that the resource with said ID exists in depends_on, but this would need to be verified, and it's still a half-measure.
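That apply-time fallback might look roughly like this (hypothetical Python, assuming the trimmed tfstate shape above): match resource_id against resource id values, and when IDs collide, keep only candidates that appear in the metadata's depends_on list.

```python
# Hypothetical sketch of the apply-time matching described above, using
# the trimmed tfstate structure from this comment.
def resolve_by_id(resources, metadata):
    rid = metadata["values"]["resource_id"]
    candidates = [r for r in resources
                  if r is not metadata and r["values"].get("id") == rid]
    if len(candidates) > 1:
        # Non-unique IDs: narrow down via the metadata's depends_on list.
        deps = set(metadata.get("depends_on", []))
        candidates = [r for r in candidates if r["address"] in deps]
    return candidates[0]["address"] if candidates else None

metadata = {
    "address": "coder_metadata.about_info",
    "values": {"resource_id": "8302293550006744338"},
    "depends_on": ["coder_agent.main", "null_resource.about",
                   "null_resource.other"],
}
resources = [
    metadata,
    {"address": "null_resource.about",
     "values": {"id": "8302293550006744338"}},
]
print(resolve_by_id(resources, metadata))  # null_resource.about
```

As noted, this remains a half-measure: two resources could share both an ID and membership in depends_on, in which case the plan-side configuration data is the only reliable signal.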


Emyrk commented Feb 4, 2025

I just had a good chat with @mafredri

In summary, we believe using terraform graph might not be the best method of extracting the relationship for coder_metadata resources. @mafredri explains some of the lossy behavior above.

To recap in my own words:

  • terraform graph does not retain the underlying hcl expressions. This might not be an issue; however, it is lossy, especially if there are cases of nested references (A -> B -> C)
  • terraform graph relates nodes based on the references contained in the item blocks, but the only argument of importance is resource_id. In the example above, the item block reference is closer than the resource_id reference.

In the long term (2mo+), I proposed leveraging the terraform evaluation engine that is in the works to determine the relationships instead.

We can do this by looking at the static hcl expression for the resource_id argument, meaning we can likely do this entirely before a terraform plan/apply. There is still work to verify this proposal, and come up with the tricky edge cases with the expected output.

@mafredri given my proposal above, do you think you have a cheap fix that can solve some of the more obvious errors?


mafredri commented Feb 5, 2025

@Emyrk thanks for the great chat and writing the summary!

While I'm a fan of consolidating our Terraform logic in one place, and would like to see us use the static analysis you're working on @Emyrk, we can fairly simply fix this issue by utilizing the configuration/resource ID of tf plan/apply and probably cover 99.9% of cases.

I'd suggest we fall back to the current graph-based matching in the case of duplicate resource IDs, and pick the closest match among those IDs.

@bpmct I'm guessing a 2mo+ for a fix is not ideal here, but I'll defer to you whether or not we should fix this now or wait.


bpmct commented Feb 5, 2025

we can fairly simply fix this issue by utilizing the configuration/resource ID of tf plan/apply and probably cover 99.9% of cases

Let's just do this.

I'm guessing a 2mo+ for a fix is not ideal here

It wouldn't take 2 months to implement something special, right? It would be 2 months until we have static analysis (for other purposes), then we can repurpose to do this assignment slightly better?

I'm in favor of the quick fix now and then evaluating using better methods once we have them.
