-
Notifications
You must be signed in to change notification settings - Fork 551
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
config-linux: add schemata field to IntelRdt #1230
base: main
Are you sure you want to change the base?
Conversation
(disclaimer: I only very briefly glanced over, so apologies for any mistake in interpreting); IIUC, the new field effectively is to replace the other field(s), but those already shipped in a release. Should we consider deprecating the other fields, or add wording to discourage use of those going forward (only for backward compatibility)? |
Thanks @thaJeztah for the quick feedback
Yes, I would say so. This is effectively to make those useless/unneeded, without breaking backwards compatibility. The current model of one field per "schema" is infeasible as we can already see that there are many features (L2 and CDP) not covered by the OSI spec.
I think we could consider that, at least suggest people use the new field. I dunno how deprecation is reacted to in OCI spec. |
@thaJeztah et al, one more consideration is that I don't believe any runtime does any parsing/filtering of the existing |
@@ -654,15 +655,17 @@ The following rules on parameters MUST be applied: | |||
|
|||
* If either `l3CacheSchema` or `memBwSchema` is set, runtimes MUST write the value to the `schemata` file in the that sub-directory discussed in `closID`. | |||
|
|||
* If neither `l3CacheSchema` nor `memBwSchema` is set, runtimes MUST NOT write to `schemata` files in any `resctrl` pseudo-filesystems. | |||
* If `schemata` field is set, runtimes MUST write the value to the `schemata` file in the that sub-directory discussed in `closID`. If also l3CacheSchema` or `memBwSchema` is set the value of `schemata` field must be written last, after the values from `l3CacheSchema` and `memBwSchema` has been written. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll write a longer comment, but I'm considering that we should document that if schemata
is set, runtimes MUST
ignore both the l3CacheSchema
and memBwSchema
fields, and instead to use the schemata
field instead (and write its content "as-is" (that should probably also be documented on the l3CacheSchema
and memBwSchema
fields)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This field also needs to be added to the JSON-schema;
runtime-spec/schema/config-linux.json
Lines 259 to 278 in c0e9043
"intelRdt": { | |
"type": "object", | |
"properties": { | |
"closID": { | |
"type": "string" | |
}, | |
"l3CacheSchema": { | |
"type": "string" | |
}, | |
"memBwSchema": { | |
"type": "string", | |
"pattern": "^MB:[^\\n]*$" | |
}, | |
"enableCMT": { | |
"type": "boolean" | |
}, | |
"enableMBM": { | |
"type": "boolean" | |
} | |
} |
As well as the go implementation;
runtime-spec/specs-go/config.go
Line 795 in c0e9043
type LinuxIntelRdt struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Woops 🙄 Added
"closID": "guaranteed_group", | ||
"l3CacheSchema": "L3:0=7f0;1=1f", | ||
"memBwSchema": "MB:0=20;1=70" | ||
"schemata": "L3:0=7f0;1=1f\nL2:0=f;1=f;2=f;3=f\nMB:0=20;1=70" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for my understanding, none of the other fields (ClosID
, enableCMT
, enableMBM
) are to be included in the schemata
file?
runtime-spec/specs-go/config.go
Lines 796 to 814 in c0e9043
// The identity for RDT Class of Service | |
ClosID string `json:"closID,omitempty"` | |
// The schema for L3 cache id and capacity bitmask (CBM) | |
// Format: "L3:<cache_id0>=<cbm0>;<cache_id1>=<cbm1>;..." | |
L3CacheSchema string `json:"l3CacheSchema,omitempty"` | |
// The schema of memory bandwidth per L3 cache id | |
// Format: "MB:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;..." | |
// The unit of memory bandwidth is specified in "percentages" by | |
// default, and in "MBps" if MBA Software Controller is enabled. | |
MemBwSchema string `json:"memBwSchema,omitempty"` | |
// EnableCMT is the flag to indicate if the Intel RDT CMT is enabled. CMT (Cache Monitoring Technology) supports monitoring of | |
// the last-level cache (LLC) occupancy for the container. | |
EnableCMT bool `json:"enableCMT,omitempty"` | |
// EnableMBM is the flag to indicate if the Intel RDT MBM is enabled. MBM (Memory Bandwidth Monitoring) supports monitoring of | |
// total and local memory bandwidth for the container. | |
EnableMBM bool `json:"enableMBM,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct 👍
Yeah, not sure that's great, as that would effectively make both an "anything goes" field, but they would give the impression they're only to be used for a specific purpose. I think the existing fields are a bit of a leaky abstraction already; they give the impression that it's an abstraction, but in the end (but I'm not deeply familiar with this matter) it looks like it's just artificially splitting So I could see two possible directions;
For the
For the
|
@thaJeztah appreciate your thoughtful review. My comments in-lined below
I agree
Exactly my thinking/feeling. We could possibly use something like type LinuxIntelRdt struct {
...
Schemata []LinuxIntelRdtSchema
...
}
type LinuxIntelRdtSchema struct {
Type String // "L2", "L3", "MB" etc
Schema String // Depends on the Type, e.g. "0=f;1=f"
} But I don't think that makes anyone's life any easier.
I think this would be a maintenance nightmare. Even with the current very limited validation/filtering specified I think this hasn't been implemented. Plus for a full validation the runtime will need virtually to replicate the validation code in linux kernel. Who knows what the actual format for future "FOOBAR" technology will look like. And e.g. in the case of RDT the min/max width of the bitmask varies between HW etc.
I'm a strong proponent of this approach. Let kernel do the validation because it can do it correctly 😇
I'd strongly suggest to keep it simple. Any schema validation will be a tricky business to do right with all the corner cases like
In any case the "portability" of the schemata is very poor even between reboots/remount of the resctrlfs as the format and accepted fields in the schemata file depends e.g. on the mount options.
The documentation is not Intel's but in Linux kernel (https://docs.kernel.org/arch/x86/resctrl.html). But 👍 for adding a reference. I would refer to that directly instead of describing it here. With new kernel versions there may be (backwards-compatible) extensions and new technologies enabled that the OCI spec would then be missing. I think that also applies to any regular expression we tried to add here. RDT does provide information about the currently active setup (what features are enabled and in which configuration) under |
💯 I fully agree. My main intent here was to mention options we considered (for future visitors, including "future self"), and rejected (for reasons mentioned). It's good to try to provide an abstraction, but in some case, such as this one, I don't think that's possible, and we just need to acknowledge that some parts of the OCI just are very platform-specific and tightly coupled to kernel/platform features.
Yes, that's what I somewhat expected the case to be; I guess the recommendation would be "use the new option", and only in exceptional cases consider using "both" (if you like schrödingers cat).
Heh, yes, let myself go on the "canonical reference" side; definitely "+1" on keeping docs in this spec as minimal as possible, and refer to the kernel reference as source of truth.
If it's well-defined and a regex is possible (either in docs and/or the JSON-schema), I guess that would be good to have.
Yes, that's for sure not a blocker (from my point of view) for this PR. My main motivation for some of these comments is to explore what amount of validation can reasonably be performed by (higher-level) runtimes before "the spec hits the kernel". My general rule of thumb is to keep that as minimal as possible for most cases (unless we can absolutely be sure), but a "well-formedness" check (where possible) can help. Having some amount of early error-handling can reduce burden on maintainers (as all OCI / kernel / runtime errors result in a "OCI runtime" error, and those tend to be piled on by users "I have this same issue!" (no you don't 😂)) |
My feeling is that it's not well-defined enough. The currently supported format for currently supported technologies is described (https://docs.kernel.org/arch/x86/resctrl.html#l3-schemata-file-details-code-and-data-prioritization-disabled) but I wouldn't count on that. Say next kernel version will start supporting id ranges like |
@thaJeztah I think I'll sleep over this and push an update after the weekend, trying to capture all the review comments |
For sure! Mostly exploring here if it's possible (within reason). Having a quick look at examples on that link;
Which would translate to;
and;
Based on that the format "roughly" is a newline-delimited set of features;
So I guess the blanks to fill in is what are accepted characters for Whether |
Yeah, the currently availabe technologies/features follow this format. My fear is that we could get bitten by future extensions (like One option would be to validate the currently known features and just not validate unknown stuff. Another is to follow the currently ubiquitous format In any case, we could make the schemata field an array of schemas Schemata []string |
config-linux.md
Outdated
* if `closID` directory in a mounted `resctrl` pseudo-filesystem doesn't exist, the runtimes MUST create it. | ||
* if `closID` directory in a mounted `resctrl` pseudo-filesystem exists, runtimes MUST compare `l3CacheSchema` and/or `memBwSchema` value with `schemata` file, and [generate an error](runtime.md#errors) if doesn't match. | ||
* if `closID` directory in a mounted `resctrl` pseudo-filesystem exists, runtimes MUST compare `l3CacheSchema` and/or `memBwSchema` and/or `schemata` value with `schemata` file, and [generate an error](runtime.md#errors) if doesn't match. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
too late too relax it to "SHOULD"? It is not so obvious how the comparison should be done from a runtime
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@giuseppe good point. For the old fields (l3CacheSchema
and memBwSchema
) let's just keep it as is. But for the new schemata
field we could just drop this requirement. As you said it's not at all clear how this comparison should be implemented (I'd consider that as internal implementation details of the Linux kernel).
This might actually even be desirable. If you think that the higher-level runtime would be managing the CLOS configuration. With the old behavior you cannot actually change the configuration of existing groups, you'd need to delete it and re-create it (and if there were some running containers there you'd need to freeze them, list the pids, delete CLOS, re-create CLOS with new config, assign the pids to the new group, unfreeze...).
Add a new "schemata" field to the Linux IntelRdt configuration. This addresses the complexity of separate schema fields and resolves the issue of supporting currently uncovered RDT features like L2 cache allocation and CDP (Code and Data Prioritization). The new field is for specifying the complete schemata (all schemas) to be written to the schemata file in Linux resctrl fs. The aim is for simple usage and runtime implementation (by not requiring any parsing/filtering of data or otherwise re-implement parsing or validation of the Linux resctrl interface) and also to support all RDT features now and in the future (i.e. schemas like L2, L2CODE, L2DATA, L3CODE and L3DATA and who knows L4 or something else in the future). Behavior of existing fields is not changed but it is required that the new schemata field is applied last. Signed-off-by: Markus Lehtonen <[email protected]>
Add a new "schemata" field to the Linux IntelRdt configuration. This addresses the complexity of separate schema fields and resolves the issue of supporting currently uncovered RDT features like L2 cache allocation and CDP (Code and Data Prioritization).
The new field is for specifying the complete schemata (all schemas) to be written to the schemata file in Linux resctrl fs. The aim is for simple usage and runtime implementation (by not requiring any parsing/filtering of data or otherwise re-implement parsing or validation of the Linux resctrl interface) and also to support all RDT features now and in the future (i.e. schemas like L2, L2CODE, L2DATA, L3CODE and L3DATA and who knows L4 or something else in the future).
Behavior of existing fields is not changed but it is required that the new schemata field is applied last.