Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add directory_layouts to definitions [RHELDST-6065] #47

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

negillett
Copy link
Member

No description provided.

Copy link
Member

@rohanpm rohanpm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you checked whether all the directory layouts in use today can be fit into this scheme?

This is a case where I think the changes to real data (cdn-definitions-private) have to be developed at the same time as the changes to schema. Otherwise it seems very risky that there will be some data for which this schema (or even the entire approach) does not end up being usable.

src/cdn_definitions/data.yaml Outdated Show resolved Hide resolved
# "/content/<type>/<platform>/<rhui>/<variant>/<major>.<minor>/<arch>/<rest>"
pattern: "/(?:content|shadow)/(?P<type>aus|dist|beta|(?:retired-)?els|eus|htb|rhb|tus|e4s)/(?P<platform>rhes|rhs|cf-me)/(?P<rhui>rhui/)?(?P<variant>[^/]+)/(?P<major_version>[0-9])\\.(?P<minor_version>[0-9]+)/(?P<arch>[^/]+)/(?P<rest>.+)$"
type: true
platform: true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that I'm understanding exactly what this means.

Does that mean: "if the value is True, then the value should be extracted from the regex capture group of the same name" ? While "if the value is a str, then the str value should be used as-is" ?

If this is the meaning, I'm wondering why we need the true values. Is the presence of the named capture groups not enough? e.g. are there cases where there is a capture group named type but you don't want to use its value as type later on? If so, why capture it?

Also, I think there should be at least one layout in the sample data which is using the str case. Right now they are all true.

Copy link
Member Author

@negillett negillett Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that mean: "if the value is True, then the value should be extracted from the regex capture group of the same name" ? While "if the value is a str, then the str value should be used as-is" ?

Yes

If this is the meaning, I'm wondering why we need the true values. Is the presence of the named capture groups not enough? e.g. are there cases where there is a capture group named type but you don't want to use its value as type later on? If so, why capture it?

Thats a good question. The only cases I see are ChannelDumpsLayout where major is hardcoded ("6") and RhelAlt layouts where platform is hardcoded ("rhel"), and the patterns written for these do capture those segments. I'm not sure why that is but am trying to keep things as similar as possible to how it was written in cdn-utils.
That aside, I chose this scheme for its flexibility; it allows us to set/unset values manually if needed, despite what might be captured by the regex pattern. Again, not sure of the use case for that but it seemed better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be cleaner to drop the concept of True means take the value out of a capture group.
There is already one special case ...except for rhelalt, where True just means True.
There could be other boolean attributes to be introduced in the future, leading to more special cases.

I would rather suggest this behavior:

  • for each attribute, the default value is null.
  • a value of null means that a capture group of the same name should be used, when present.

Also some of the attributes are probably mandatory, meaning that a value has to be provider either directly (non-null attribute in the layout), or from a capture group, or else a fatal error occurs. We should somehow document which of them will be mandatory in this way.

Copy link
Member Author

@negillett negillett Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A "true" value is now only necessary for rhelalt. For RhelAlt layouts, the platform 'rhel-alt' is captured but overridden to 'rhel'. Maybe the presence of '/rhel-alt/' in the path could be used to set rhelalt to True rather than relying on the definitions. But RhelAltContainersLayout doesn't seem to be using the rhelalt attribute, unlike the other two RhelAlt layouts--would need to understand the logic behind that first. WDYT?

As for mandatory attributes, a scan through cdsutils shows type, platform and platform_major_version are rather important but I don't think their absence would cause a fatal error. It looks like directory layout attributes just open code paths and aren't strictly required by anything. Either way, I think it would be good to at least log to debug the objects filled when a path is matched.

src/cdn_definitions/data.yaml Outdated Show resolved Hide resolved
@negillett
Copy link
Member Author

Have you checked whether all the directory layouts in use today can be fit into this scheme?

This is a case where I think the changes to real data (cdn-definitions-private) have to be developed at the same time as the changes to schema. Otherwise it seems very risky that there will be some data for which this schema (or even the entire approach) does not end up being usable.

Yeah, I translated the classes to real data first and developed the schema in cdn-definitions from that.

rohanpm
rohanpm previously approved these changes Nov 14, 2024
Copy link
Member

@rohanpm rohanpm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can not pick out any remaining problems, but I am still nervous about the change overall.

If at all possible, I do recommend fully completing the corresponding cdn-utils change locally and passing it through all the tests in that repo with all the real data converted to cdn-definitions, before releasing this.

src/cdn_definitions/data.yaml Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants