Skip to content
This repository has been archived by the owner on Nov 19, 2024. It is now read-only.

feat: add support for canister properties #13

Closed
wants to merge 9 commits into from

Conversation

jplevyak
Copy link
Contributor

This PR adds support for arbitrary properties which are stored as canister information in the state tree and which are set and updated as canister settings. The purpose of these properties is to enable or disable system features without having the change the canister WASM. The immediate use cases are:

  • disable the service worker on ic0.app to enable custom service workers or opt out e.g. to enable HTML preview
  • allow Internet Identity aliases for domains and/or canister ids

Signed-off-by: John Plevyak [email protected]

@jplevyak jplevyak requested a review from Dfinity-Bjoern March 14, 2022 18:01
@nomeata
Copy link
Contributor

nomeata commented Mar 14, 2022

allow Internet Identity aliases for domains and/or canister ids

😞 for tying a front-end feature not to the appropriate concept (the origin), which would be consistent with the design of the Internet Identity so far (which talks to and thinks of fontends only, never canisters), would not artificially restrict the Internet Identity to canister-hosted sites. But maybe that is the point and goal – the DFINITY singularity.

Anyways, this rant is off-topic here, as this is a general mechanism.

spec/index.adoc Outdated
Comment on lines 443 to 445
Users can set arbitrary canister properties via settings, see <<ic-create_canister>> and <<ic-update_settings>>. The (blob) is optional. There is a maximum combined size limit for all properties of 4KB. Properties may be used to enable or disable system features, e.g.:
* disable_ic0_app_service_worker
* allow_internet_identity_aliases where (blob) is a comma separated of canister ids and/or domains.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Editorial comment: The Spec has no notion about service workers or Internet Identity, and so far we try to keep the document (mostly) concise and to the point. So maybe move the examples into a Note? (although my suggested edit above shrinks this anyways)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added some clarification of the roles of canister administrator vs canister developer WRT properties vs metadata. PTAL

spec/index.adoc Outdated Show resolved Hide resolved
spec/index.adoc Outdated Show resolved Hide resolved
jplevyak and others added 2 commits March 14, 2022 12:03
Co-authored-by: Joachim Breitner <[email protected]>
Co-authored-by: Joachim Breitner <[email protected]>
@nomeata
Copy link
Contributor

nomeata commented Mar 14, 2022

Also I wonder: How do settings like the list of controllers and the memory allocation differ from these new properties?

The allocation isn’t publicly visible. Do we know that all future properties are ok to be publicly visible?

The controller has more structure. Do we know that all future properties are indeed just simple blobs?

What I am getting at is that a generic mechanism like this may not actual fit all the upcoming use cases. And since we maybe can’t tell them already, a generic mechanism is maybe not the best thing?

Have you considered just adding more settings for the few use cases that come up?

This would also allow us to use the “right” candid types for them, and support more structure types (lists of aliases etc.) easily, and we wouldn’t have to come up with a new encoding every time.

I conjecture that any usecase that is so far from the core system that this would be odd is also probably too restrictive to not allow dynamic (canister-controlled) settings. These might better be handled by a canister-level API (query with certified variables).

@jplevyak
Copy link
Contributor Author

@nomeata interestingly, this text differentiates between "the system" and "the platform", but it seems this is the first time we are making that distinction. Perhaps we should define these as well as canister administrator ? On one hand this is more of a formal spec, but it does need commentary e.g. The definition of Standard ML vs the Commentary on Standard ML.

spec/index.adoc Outdated
@@ -438,6 +438,15 @@ If the canister has a https://webassembly.github.io/spec/core/binary/modules.htm
+
It is recommended for the canister to have a custom section called "icp:public candid:service", which contains the UTF-8 encoding of https://github.com/dfinity/candid/blob/master/spec/Candid.md#core-grammar[the Candid interface] for the canister.

* `/canister/<canister_id>/property/<name>` (blob):
+
The systems maintains for each canister a generic key-value store, called the “canister properties”. These properties are provided by canister aministrators and while the system does not interpret them other platform components (e.g. the HTTP gateway protocol or Internet Identity) may. In contrast, metadata is provided by canister developers and may be interpreted by the system.
Copy link
Contributor

@chenyan-dfinity chenyan-dfinity Mar 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Metadata is also not interpreted by the system, and can be thought of as key-value store. So I am not sure the difference between metadata and property. The current spec only specifies Wasm custom section as metadata, but it can contain other metadata as well. System or custom properties, such as controllers, module_hash and disable_ic0_app_service_worker can all be part of the metadata (some are immutable and some are mutable). It would be nice if we have a unified way of controlling these metadata/property.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could simplify this proposal by enabling metadata to be set via create_canister and changed via update_settings in addition to being set via custom sections, but the current WasmMetadata proto for example is specific to WasmCustomSection. We also have issues with what happens if the user attempts to overwrite something specified in the metadata via update_settings. Then there is the question of what is the provenance of the canister developer vs canister administrator. I agree that perhaps the distinction between whether or not something effects the "system" is not clear and perhaps should be removed at least from the spec. However the distinction that metadata comes from the wasm binary and properties come from canister settings seems pretty clear to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current WasmMetadata distinguishes CUSTOM_SECTION_TYPE_PUBLIC from CUSTOM_SECTION_TYPE_PRIVATE. Is this distinction currently in use? Is this something that we should/should reflect in properties ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the distinction between metadata and properties, as they exist right now (including this PR) is rather clear to me as well:

  • Metadata is tied to a canister module. Before you install code, there is no metadata. You can change it only together with the code. And if you uninstall the code, it’s gone.
  • Properties are tied to the canister instance. They exist even when the canister is empty (no code installed), and can be modified independently of the code.

It’s a good thing that “metadata” has a clear focus here – it’s metadata about the code – and I think that is why we kept it separate from other canister settings back then.

The naming is always a bit artificial, and maybe it helps to sometimes say “canister code metadata” to clarify, but I see no urgent need to unify things here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We were using custom_section as the endpoint name initial, but @rossberg wants something more conceptual. Quoting from earlier PR:

Not sure I'm too fond of the name /custom_section/ in this path, since that's more of an implementation detail than a > conceptual notion. Maybe one day we want to have other/additional ways to provide such data. Abstracting this a little also allows to embed naming conventions and additional information in the physical section names.

How about /metadata/ instead?

I understand the current distinction, but we have four endpoints now: controllers, module_hash, metadata and property. It's a bit wired to single out controllers and module_hash as top-level endpoints. By the current definition, controllers will be inside property and module_hash will be inside metadata.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could rename WasmCustomSection to MetadataValue and CustomSectionType to MetadataValueType and add a CANISTER_SETTING_TYPE_XXX type, then we could use metadata but only allow canister settings to see/update those bits of metadata with CANISTER_SETTINGS_TYPE_XXXX. However, it wouldn't be clear in the state tree whether this was coming from a custom section or from canister settings.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use metadata/module to represent data from Wasm module, and metadata/setting for canister settings.

The current WasmMetadata distinguishes CUSTOM_SECTION_TYPE_PUBLIC from CUSTOM_SECTION_TYPE_PRIVATE. Is this distinction currently in use? Is this something that we should/should reflect in properties ?

We specify the custom section visibility in the Wasm, but it's not visible in the HTTP endpoint. Non-controller user gets a 403 when accessing a private custom section.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the current definition, controllers will be inside property

I agreee (hence my suggestion to add more settings instead of introducing properties in addition to settings).

and module_hash will be inside metadata.

There is a crucial difference: Metadata can be arbitrarily set by the developer. It comes with the code, but the developer has control.

The module has is derived from the actual code, and it would be horrible if the developer or the admin could manually set it. Here it really is the system making a statement. So while it shares one property (heh) with the metadata (only exists when the canister is not empty), it also differs in crucial ways from metadata.

@nomeata
Copy link
Contributor

nomeata commented Mar 14, 2022

@nomeata interestingly, this text differentiates between "the system" and "the platform", but it seems this is the first time we are making that distinction.

Indeed. I thought we had that in the preamble somewhere, but we don’t. To me, “system“ is everything below the interface described by the Interface Spec, “platform” more vaguely includes stuff “around it” (libraries, motoko). Edge nodes are a bit of an interesting case, because the request routing they do is clearly part of the system, but the HTTP gateway feature is (at the moment, at least) clearly an platform feature – it’s optional, in a sense, and could be provided by a third party.

Perhaps we should define these as well as canister administrator ?

Not sure the distinction between canister developer and canister admin needs to be discussed here. It is just something we have to keep in mind as we play out the use cases in our heads – for the purposes of this document the developer doesn’t exist, because direction interaction with the spec is always through the admin. But we do the admins a favor if we don’t force them (or their tools) to deal with things better dealt by the developer (or their tools).

@nomeata
Copy link
Contributor

nomeata commented Mar 14, 2022

More thoughts; sorry for the brain dump, but this touched many things we have dealt with before, so maybe this is still a useful perspective.


For a generic key-value store like this, designed to be open to multiple future uses, don’t we expect that at some point, one of these use cases really wants the canister to possibly dynamically set these values. (Imagine a canister that hosts many different websites – maybe it wants to dynamically set the Internet Identity alias. Not too unlikely). Sure, no problem, it’s very simple to add a setter and getter to the system API.

But what we have then really is just a generalization of the certified_data feature; generalized to a set of named certified variables (rather than our current feature, restricted to one), and with the ability to set it as the controller.

When we designed the certified variable feature, we considered allowing the canister to maintain a set of them, named. This would have had quite a few benefits – canisters wouldn’t have to implement merkle structures for many simple use-cases, different protocols based on certified data wouldn’t have to use a common merkle structure, easier to have interoperability – but it was rejected because of the complexity it entails on the system to maintain a key-value store rather than a single value (state tree mapping, migrations, accounting and charging). This seems to have changed (because this feature requires most of these). Should we thus maybe do this now:

  • generalize certified variables from one to many named ones (with the old system API addressing a specific one, e.g. certified_data)
  • add the ability to set these fields as the controller (to satisfy the requirements behind this PR)

OTOH, there are wrinkles (certified_data gets reset upon canister installation, but these properties don't…). Hmm, not sure.


Another aspect: Why the size restriction? Surely it wouldn’t be much trouble to just add the size of the properties to the general canister footprint (which already consists of a fair number of things; memories, globals, call contexts, settings…). If we – hypothetically – allow that we’d have the feature that was discussed when we were talking about “system-supported static assets”, which could have been the answer back then for how to put HTML files etc. into (or next to) canisters. There we eventually left them in the canister and built things around it on the higher layer (HTTP Gateway) etc. But it’s worth wondering whether and how properties differ from such static assets, and where that thought leads us.

If we’d lift the size restriction and added canister access to it, we are slowly moving towards what would become a (certified!) filesystem for the canisters. Intriguing!


Oh, and of course the Gretchenfrage: Why does it have to be a system feature at all? Why can’t this be done in canister space, like – see the previous paragraph – HTTP assets? Just define a well-known query function, throw in a little bit of certification based on certified variables, and you are done. No need to bugger this document, or any of the layers below, with it. Happy to help with that.

And at least the example use-cases target mostly advanced canister developers, who’d not have a lot of trouble using the feature this way (especially if cdk-rs has some convenience functions).


Anyways, after these detours my gut feeling is that the proposed solution is, if I may say so, too general. It invites all kinds of use-cases that will invite all kinds of extensions (encoding structured data, dynamic access, large data) which in turn may not greatly fit this extension. So my suggestion at the moment is to either

  • leave non-system issues out of the system, and use canister-level protocols, or
  • extend the set of supported settings one-by-one, as use cases come up (with the benefit of proper typing in the Candid interface!)

@jplevyak
Copy link
Contributor Author

I am now thinking that we should just make this part of the existing metadata which has WasmCustomSectionPrivate and WasmCustomSectionPublic types and just add CanisterSettingsPrivate and CanisterSettingsPublic types to metadata. This would be a comparatively small change and would leverage the existing spec.

My objections to your two suggestions are:

  1. canister level protocols do not solve all the issues. For example what if I want to change some critical behavior (e.g. enforcing certification on the raw.* domain or disabling the raw.* domain entirely for a canister) and I want to grandfather in existing canisters. I can have the system add a property to all canisters created before a specific date, then change the default behavior for all new canisters. Given that we want canisters to live indefinitely, we need some way to tag grandfathered canister behavior and this would be a good way.
  1. having on-offs is extremely expensive in terms of engineering effort and calendar time. This discussion is likely to go on for days, imagine if every little feature had to add this overhead?

@nomeata
Copy link
Contributor

nomeata commented Mar 14, 2022

enforcing certification on the raw.* domain

that sounds like an odd thing to do, isn’t the point of raw to not require certification? :)

Otherwise your objection number 1 does of course have merits.

About 2, also true. But even then, they would still be one-off features, it’s just shoved out of sight of the replica devs, but you still need to have these discussions – what’s the name? What’s the encoding? How to handle complex data? etc.

Yes, maybe it’s different people, but it’s just moving complexity around.

And by doing a too ad-hoc job here, we make those people’s live worse, so I hope we have the time to try, as best as we can, to anticipate their needs and design the feature so that it works well even given likely changes in requirements (e.g. not just flags or binary data, but complex data; possibly ways to modify one value without risking to override all others; dynamic access from canisters).

In other discussions, I think on the forum, some of us have already agreed that it would be useful to have a generic way to map structured candid-typed data onto the state tree, to certify more that just blobs (pinging @roman-kashitsyn). If we have that, maybe also with dfinity/candid#245, then this feature could be (almost) as simple for the system as you want it, while still supporting custom data.


If the current level of abstraction doesn’t seem right, and higher level is unwanted, then we could also try to make the cut at an even lower conceptual level: Instead of allowing the controller to control one branch of the per-canister space of the state tree, but then limiting it to just one level, can we maybe allow them to put arbitrary labeled tree nodes there (i.e. possible multiple levels), and provide a slightly more flexible API to modify them (create a node given by a path, delete a node given by a path, set leaf data at a path). The interpretation of entries (information for II, assets, etc.) is then up to higher level protocols, and the system and this spec will not have to deal with more stuff. It is a strict generalization of this PR, adding more flexibility in the hope that there is less to regret in the future.

This creates a nice parallel to the in-canister hash tree, with the same expressive power, same encoding protocols etc. We could call them “controller-controlled certified variables” in opposition to “canister-controlled certified variables”.

@chenyan-dfinity
Copy link
Contributor

For example what if I want to change some critical behavior (e.g. enforcing certification on the raw.* domain or disabling the raw.* domain entirely for a canister) and I want to grandfather in existing canisters. I can have the system add a property to all canisters created before a specific date, then change the default behavior for all new canisters.

Can these properties be embedded in the Wasm metadata? With the current use case, we don't seem to need the ability to change these properties on the fly, e.g. performing an upgrade to enable/disable certificates is a reasonable requirement.

About grandfather existing canisters, if certain property/metadata is missing, the consumer of this property can define a default behavior, but I don't think there is a generic way of defining default behaviors.

@jplevyak
Copy link
Contributor Author

OK, this has been very fruitful. My take home is that we should start with the existing metadata facility and only if we find some compelling use case which it can not satisfy should we consider adding properties.

I am closing this PR.

Thanx again for the discussion.

@jplevyak jplevyak closed this Mar 14, 2022
@jplevyak jplevyak reopened this Mar 22, 2022
@jwiegley jwiegley closed this Apr 8, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants