Workload Identity in Attestation Results #17

Open
thomas-fossati opened this issue Sep 12, 2023 · 14 comments
Labels
[track] info and data models [Track] Information & Data models for Attestation

Comments

@thomas-fossati
Contributor

thomas-fossati commented Sep 12, 2023

@gkostal 09/12/2023 SIG meeting:

How is "identity" represented for an attested environment? Can it be generalized?

  • Especially if the attested environment is a confidential compute attested environment.
  • And in a way that's tractable for a relying party to authorize against.

@gkostal 10/10/2023 additional details:

I am looking for an abstraction (similar to what AR4SI does for trustworthiness of an attested environment) for the "code identity" in an attested environment that:

  • can be expressed in attestation results
  • can be referenced simply by relying parties
  • is independent of underlying TEE technology
  • is stable over time (i.e., OS updates, new builds of application executable/binary/container, etc. do not change "code identity")

In essence, I'd like to look at this from the relying party perspective and figure out what's the ideal model for them, and then work backwards to see if/how it's implementable. For example, I could envision a relying party wanting to express a "code identity" as something like "The secret formula application authored by Coca-Cola" versus "The secret formula application authored by Pepsi".

@thomas-fossati thomas-fossati added the new topic ideas for new discussion topics label Sep 12, 2023
@thomas-fossati
Contributor Author

thomas-fossati commented Sep 12, 2023

How is "identity" represented for an attested environment?

@gkostal there may be a couple of different (entangled) identities involved:

  • Workload identity (e.g., a public key bound to the static and possibly dynamic state of the workload)
  • Platform identity (e.g., DICE chain, IDevID, EAT's UEID/SUEID and associated public key; see Appendix C of EAT for a partial comparison)

Both need to be endorsed by the relevant (typically different) supply chain entit{y,ies}.

Are you thinking of one of these two identities in particular? Or their combination? Or something else?

@gkostal
Collaborator

gkostal commented Sep 12, 2023

I believe the former.

I'm thinking of the "identity" from the perspective of a relying party that needs to express a local "appraisal policy for attestation results" that says something like "it's OK to release my secret to the attested environment that's running workload X" where X is a well-known software component.

As you point out there can be a distinction between "identity" being just the code that's running, or "identity" being the code that's running plus its dynamic state (e.g. running on behalf of Coca-Cola, running on behalf of Pepsi, etc.).

I believe EAT suggests the software manifests claim ("manifests") for this purpose. Off the cuff, this seems extremely unwieldy to use for authorization policy in a relying party. Ideally, the relying party captures a policy like "I trust component X running on behalf of party Y" or some such. As the software changes within expectation (e.g., new builds of X are allowed, new dependent libraries brought in by X are OK, etc.), the relying party policy doesn't need to change. IOW, the hash of all the binaries in a protected environment may be an "identity" per definition but it's potentially not an "identity" that works very well for real world remote attestation policy in a relying party.
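To make the "I trust component X running on behalf of party Y" idea concrete, here is a minimal sketch of what such a relying-party policy check might look like. The claim names (`workload`, `workload-name`, `on-behalf-of`, `measurement`) are hypothetical and are not defined by EAT or AR4SI; the sketch also assumes the attestation result has already been signature-checked against the verifier's key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadPolicy:
    """Relying-party policy: which component, acting for which party."""
    component: str      # hypothetical stable component name
    on_behalf_of: str   # hypothetical party the workload runs for


def may_release_secret(attestation_result: dict, policy: WorkloadPolicy) -> bool:
    """Return True only if the verifier-asserted workload matches the policy.

    Assumes the attestation result was already validated (signature, freshness).
    """
    workload = attestation_result.get("workload", {})
    return (
        workload.get("workload-name") == policy.component
        and workload.get("on-behalf-of") == policy.on_behalf_of
    )


policy = WorkloadPolicy("secret-formula-app", "coca-cola")

# Two builds of the same component: the measurement differs, the identity does not.
build_1 = {"workload": {"workload-name": "secret-formula-app",
                        "on-behalf-of": "coca-cola",
                        "measurement": "aa11"}}
build_2 = {"workload": {"workload-name": "secret-formula-app",
                        "on-behalf-of": "coca-cola",
                        "measurement": "bb22"}}
# Same code, different party: must not satisfy the policy.
other_party = {"workload": {"workload-name": "secret-formula-app",
                            "on-behalf-of": "pepsi",
                            "measurement": "aa11"}}
```

The policy object never mentions a hash, so a new build of X leaves it untouched; the measurement is still carried along for auditing.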

@thomas-fossati
Contributor Author

I believe the former.

I'm thinking of the "identity" from the perspective of a relying party that needs to express a local "appraisal policy for attestation results" that says something like "it's OK to release my secret to the attested environment that's running workload X" where X is a well-known software component.

OK, thanks for the clarification. Maybe we should rename the issue to "Workload Identity"?

[...] Ideally, the relying party captures a policy like "I trust component X running on behalf of party Y" or some such. As the software changes within expectation (e.g., new builds of X are allowed, new dependent libraries brought in by X are OK, etc.), the relying party policy doesn't need to change. IOW, the hash of all the binaries in a protected environment may be an "identity" per definition but it's potentially not an "identity" that works very well for real world remote attestation policy in a relying party.

Yes, pure hashes are probably too low-level to be generally usable. A more abstract "version identification" claim (e.g., SVN) is easier to build policies against. FWIW we have that in CoRIM (see here), and I guess it'd be easy to back-port it to EAT.
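As a rough illustration of why a version-style claim is easier to write policy against than a digest, the sketch below compares a hash allow-list with a minimum-SVN check. The claim layout (`component`, `digest`, `svn`) is an assumption made for the example and is not the CoRIM or EAT encoding.

```python
def hash_policy_ok(claims: dict, allowed_digests: set) -> bool:
    """Digest allow-list: must be updated for every new build."""
    return claims.get("digest") in allowed_digests


def svn_policy_ok(claims: dict, component: str, min_svn: int) -> bool:
    """Minimum security version number: survives new builds at or above min_svn."""
    svn = claims.get("svn")
    return (
        claims.get("component") == component
        and isinstance(svn, int)
        and svn >= min_svn
    )


# Hypothetical claims for two successive builds of the same workload.
old_build = {"component": "workload-x", "digest": "sha256:1111", "svn": 3}
new_build = {"component": "workload-x", "digest": "sha256:2222", "svn": 4}
allowed = {"sha256:1111"}
```

With these inputs the hash policy accepts only the old build until `allowed` is edited, while `svn_policy_ok(..., "workload-x", 3)` accepts both.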

@SimonFrost-Arm

The RP definitely needs a rolled up view, but the verifier role can be involved in taking on the complexity of checking hashes and resolving (probably) a set of them to an app identity.
The question is whether there should be some standardised expression of app identity which appraisal policy could produce and RPs expect. If so, what granularity can be expected: can the 'app' portion of the workload be reliably identified distinctly from the OS portion? Traditional VM models make that complicated, but there is the potential for future FaaS-like deployments to be more distinct.
As noted above, the other part of the workload to be identified is any data bundle made available pre-attestation. CCA realm state includes a personalization-value claim intended to deliver this role (without mandated implementation).
There are also proposals to keep external definitions of workloads, with a proof delivered into the environment to be presented alongside evidence, e.g., https://queue.acm.org/detail.cfm?id=3623460

@gkostal gkostal changed the title Identity of an attesting environment Workload Identity in Attestation Results Oct 10, 2023
@thomas-fossati
Contributor Author

@gkostal, specifically on this point:

is stable over time (i.e., OS updates, new builds of application executable/binary/container, etc. do not change "code identity")

is it a signed statement from the software author (in a general sense) over a bunch of metadata associated with the software that you have in mind here?

@gkostal
Collaborator

gkostal commented Oct 10, 2023

@gkostal, specifically on this point:

is stable over time (i.e., OS updates, new builds of application executable/binary/container, etc. do not change "code identity")

is it a signed statement from the software author (in a general sense) over a bunch of metadata associated with the software that you have in mind here?

@thomas-fossati, yes, maybe and/or no. :-)

From the relying party perspective (which is where I'm starting), what does the model look like? Ideally the only signed statement the relying party consumes is the attestation result, and the only signer they need to validate is the verification service's signing key.

From a design perspective, the only immediately obvious way I see to implement this is via something like a software endorsement as you describe. Ideally the mechanics of the software endorsement are not necessary for the relying party (e.g., they really don't need to be aware of and verify the signing key for the endorsement).

So, I foresee that there might be two levels to reach consensus on:

  • the logical model/schema for describing a workload identity in attestation results
  • the mechanism(s) that enable a verifier to populate this logical model/schema (e.g., CoRIM based endorsement?)
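The two levels could be sketched roughly as follows: a small logical model that the relying party sees, and a verifier-side lookup that fills it in. Everything here is hypothetical. `WorkloadIdentity`, the `workload-identity` claim, and the in-memory `ENDORSEMENTS` table are stand-ins (the table plays the role a CoRIM-style endorsement store might play), and endorsement signature checking is omitted.

```python
from dataclasses import dataclass
from typing import Optional


# Level 1: the logical model exposed to relying parties in attestation results.
@dataclass(frozen=True)
class WorkloadIdentity:
    name: str     # stable, human-meaningful component name
    author: str   # party vouching for the component


# Level 2: the mechanism the verifier uses to populate that model.
# Stand-in for an endorsement store; assumes entries were already
# authenticated against the author's signing key.
ENDORSEMENTS = {
    "sha256:1111": WorkloadIdentity("secret-formula-app", "coca-cola"),
    "sha256:2222": WorkloadIdentity("secret-formula-app", "coca-cola"),
    "sha256:9999": WorkloadIdentity("secret-formula-app", "pepsi"),
}


def resolve_identity(measurement: str) -> Optional[WorkloadIdentity]:
    """Map a low-level measurement to an endorsed workload identity, if any."""
    return ENDORSEMENTS.get(measurement)


def build_attestation_result(evidence: dict) -> dict:
    """Produce the relying-party view; the measurement itself is not exposed."""
    identity = resolve_identity(evidence.get("measurement", ""))
    if identity is None:
        return {"workload-identity": None}
    return {"workload-identity": {"name": identity.name,
                                  "author": identity.author}}
```

Two different endorsed measurements of the same application resolve to the same `workload-identity`, so the relying party only ever validates the verifier's signature and never handles the endorsement keys.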

@thomas-fossati
Contributor Author

thomas-fossati commented Oct 11, 2023

So, I foresee that there might be two levels to reach consensus on:

  • the logical model/schema for describing a workload identity in attestation results

there is an abundance of SBOM formats (SPDX, SWID/CoSWID, CycloneDX, in-toto, SLSA) which are worth a look as they seem to address exactly this point.

  • the mechanism(s) that enable a verifier to populate this logical model/schema (e.g., CoRIM based endorsement?)

(not surprisingly) +1 :-)

@thomas-fossati
Contributor Author

@gkostal a possibly related talk at LPC's CC micro-conference

@thomas-fossati
Contributor Author

thomas-fossati commented Nov 14, 2023

It'd be good to break down "identity shapes" by attestation scheme, i.e., show how identity is/can be represented in CCA, vTPM, DICE, TDX, SEV, etc. and see if there are common patterns that can be extracted.

@gkostal
Collaborator

gkostal commented Dec 5, 2023

there is an abundance of SBOM formats (SPDX, SWID/CoSWID, CycloneDX, in-toto, SLSA) which are worth a look as they seem to address exactly this point.

I don't believe this is what I'm looking to discuss.

These describe a detailed single physical manifestation that will change over time as the details of the app binaries change (e.g., code changes, dependent library changes, build tooling changes, etc.). These definitions don't satisfy the needs of a relying party, specifically the requirements I mentioned earlier in the discussion (copy/pasted again here):

  • (met) can be expressed in attestation results
  • (not met) can be referenced simply by relying parties
  • (not met) is independent of underlying TEE technology
  • (not met) is stable over time (i.e., OS updates, new builds of application executable/binary/container, etc. do not change "code identity")

@dcmiddle
Contributor

dcmiddle commented Dec 5, 2023

The ACON project takes the identity of an application as the measurement of its container.
https://github.com/intel/acon
That measurement is independent from the underlying OS and middleware (though a chain of measurements is provided in the attestation for security).
So the application identity is stable over time with respect to changes in the layers underneath it. However, changes to the application itself will change its identity.
In order to address volatility in dependencies, those artifacts can be indirectly identified by their signer, e.g., I will always trust a storage library from my CSP. In that sense dependency updates (optionally) won't change the app id.

Without relying on a specific hash of the application, an alternative is to set a policy based on the signer and a vendor supplied identifier for the application. So for an SGX attestation you would look at fields like ISVPRODID and MRSIGNER.
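A signer-plus-product-ID policy of this kind might look like the sketch below. MRENCLAVE, MRSIGNER and ISVPRODID are real SGX report fields, but the `SgxReportFields` and `SignerPolicy` types, the hex values, and the assumption that a verifier has already validated the quote and extracted the fields are all illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SgxReportFields:
    """Fields assumed to be extracted from an already-verified SGX quote."""
    mrenclave: str   # hex enclave measurement; changes with every build
    mrsigner: str    # hex hash of the enclave signer's public key
    isvprodid: int   # product ID assigned by the enclave vendor


@dataclass(frozen=True)
class SignerPolicy:
    mrsigner: str
    isvprodid: int


def matches(report: SgxReportFields, policy: SignerPolicy) -> bool:
    """Accept any build of the named product from the named signer."""
    return (
        report.mrsigner.lower() == policy.mrsigner.lower()
        and report.isvprodid == policy.isvprodid
    )


# Placeholder values, not real measurements.
VENDOR_SIGNER = "ab" * 32
policy = SignerPolicy(mrsigner=VENDOR_SIGNER, isvprodid=7)

build_a = SgxReportFields(mrenclave="11" * 32, mrsigner=VENDOR_SIGNER, isvprodid=7)
build_b = SgxReportFields(mrenclave="22" * 32, mrsigner=VENDOR_SIGNER, isvprodid=7)
other_vendor = SgxReportFields(mrenclave="11" * 32, mrsigner="cd" * 32, isvprodid=7)
```

Because MRENCLAVE is never consulted, `build_a` and `build_b` are indistinguishable to the policy. A real deployment would likely also bound the security version (ISVSVN) so that old, vulnerable builds can be rejected.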

@OR13

OR13 commented Feb 13, 2024

Regarding "workload identity", you may find this proposed charter interesting:

https://datatracker.ietf.org/doc/charter-ietf-wimse/

@gkostal
Collaborator

gkostal commented Feb 27, 2024

Regarding "workload identity", you may find this proposed charter interesting:

https://datatracker.ietf.org/doc/charter-ietf-wimse/

Thanks for pointing this out, @OR13!

@gkostal gkostal added [track] info and data models [Track] Information & Data models for Attestation and removed new topic ideas for new discussion topics labels Jul 16, 2024
@bobbiec

bobbiec commented Sep 10, 2024

Here are a couple of different perspectives from other worlds that I think are more closely aligned with @gkostal's concerns:

In the Backstage developer platform framework, there is a concept of a Component:

A component is a piece of software, for example a mobile feature, web site, backend service or data pipeline (list not exhaustive). A component can be tracked in source control, or use some existing open source or commercial software.

In SPIFFE/SPIRE (and related to WIMSE, above), there is a concept of Workload:

A workload is a single piece of software, deployed with a particular configuration for a single purpose; it may comprise multiple running instances of software, all of which perform the same task. The term “workload” may encompass a range of different definitions of a software system, including:

  • A web server running a Python web application, running on a cluster of virtual machines with a load-balancer in front of it.
  • An instance of a MySQL database.
  • A worker program processing items on a queue.

I want to be able to make decisions like, WebService is allowed to talk to AccountService and Database. And generally, the identity of WebService should be stable even if:

  • WebService adds some new API endpoints
  • WebService moves from Python 3.11 to 3.12 (or even, gets rewritten in Go)
  • WebService upgrades a dependency

And for that reason, I think it's very difficult to do this with a pure software measurement of any component. In my opinion, a useful identity is something like:

  • A human-readable name associated with a component/workload
  • Accompanying metadata about the rest (measurements, etc.)

But this does require something like an endorsement from the author.
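A minimal sketch of that shape, assuming an identity is just a name plus free-form metadata and that authorization looks only at the name. The `ALLOWED_CALLS` table and the `identity` and `may_call` helpers are hypothetical, and the sketch assumes the name was already bound to the running software through an author endorsement checked elsewhere.

```python
# Authorization edges keyed by stable workload name (hypothetical policy table).
ALLOWED_CALLS = {
    "WebService": {"AccountService", "Database"},
}


def identity(name: str, **metadata) -> dict:
    """Human-readable name plus accompanying metadata (measurements, runtime, ...)."""
    return {"name": name, "metadata": metadata}


def may_call(caller: dict, callee: dict) -> bool:
    """Decide purely on names; metadata is carried along but not consulted."""
    return callee["name"] in ALLOWED_CALLS.get(caller["name"], set())


# The same logical workload before and after a rewrite.
web_py = identity("WebService", runtime="python3.11", measurement="aa11")
web_go = identity("WebService", runtime="go", measurement="bb22")
account = identity("AccountService", measurement="cc33")
database = identity("Database", measurement="dd44")
```

Both `web_py` and `web_go` are allowed to call `account` and `database`, since the rewrite changes only the metadata.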

6 participants