Add origin_url field #94

Open
wants to merge 1 commit into
base: main

Collaborator

@spbnick spbnick commented Jul 15, 2020

Add an "origin_url" field to every schema object, pointing to the object
within, and served by, the origin CI system.

Fixes #93
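
As a rough illustration only (this is not the actual diff), the sketch below shows what an optional "origin_url" property might look like and how a submitted object could be sanity-checked. The `ORIGIN_URL_PROPERTY` fragment, the `has_valid_origin_url` helper, and the example objects are hypothetical, and treating the field as optional is an assumption.

```python
from urllib.parse import urlsplit

# Hypothetical schema fragment: a string property in URI format.
# Not copied from the actual KCIDB schema.
ORIGIN_URL_PROPERTY = {
    "type": "string",
    "format": "uri",
    "description": "The URL of the object in the origin CI system",
}


def has_valid_origin_url(obj):
    """Return True if obj has no "origin_url", or an absolute http(s) one."""
    url = obj.get("origin_url")
    if url is None:
        return True  # Assumption: the field is optional
    if not isinstance(url, str):
        return False
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Hypothetical submitted objects, for illustration only
with_url = {"origin": "kernelci",
            "origin_url": "https://kernelci.org/soc/allwinner/"}
without_url = {"origin": "kernelci"}
relative_url = {"origin": "kernelci", "origin_url": "soc/allwinner/"}
```

An origin that has no web view for an object would simply leave the field out.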

Collaborator Author

spbnick commented Jul 15, 2020

@gctucker, here's the addition of the URL I was talking about at the meeting yesterday.

Contributor

@gctucker gctucker left a comment

I wonder what the value of adding this is, given that we don't have the basic data schema really used in production yet. It seems that having links to where the data came from might be useful, but should not really be necessary in order to make a functional database. Would the submitted data not speak for itself?

"format": "uri",
"description":
"The URL of the environment in the origin CI system",
"examples": ["https://kernelci.org/soc/allwinner/"],
Contributor

This is not really an environment, it's a family of devices from the same vendor.

I don't think there is anything that matches the field description on the kernelci.org dashboard. A link to the test platform in a LAVA lab would probably be a bit more relevant, for example:
https://lava.collabora.co.uk/scheduler/device/bcm2836-rpi-2-b-cbg-0

although that's not stored in the kernelci-backend database. What is stored is the name of the test lab and the name of the platform i.e. bcm2836-rpi-2-b for this Raspberry Pi 2b, which is more generic than a specific instance in a test lab.

Does CKI have some view to show details of a runtime environment or test platform?

Collaborator Author

I see the "environment" as a way to identify where a test executed, with some precision. As much precision as the submitter can afford. Its only purpose for KCIDB itself is to determine which tests executed in a similar-enough environment, so that, e.g., we can say the results should be the same, and can group them in the report or the dashboard, or take them into account when locating the breaking commit. I still don't know how exactly I would implement or organize this, though.

If KernelCI doesn't expose the reported environment on the dashboard with similar precision, then it can choose not to provide a link here, or provide this link, even though it's of lower precision, just to have something. Or it can provide the LAVA link you posted.

For the purpose of the example, I think the link here is OK. The LAVA link would be better, though. Would you mind me using it even though KernelCI wouldn't provide it?

Collaborator Author

CKI only has hostnames, I think. We can always link to Beaker, which has a very detailed description of the host. That might never be public, though, so we will probably not be using it, instead providing as much information as we can in the environment object itself (once we have the fields described).

Contributor

KernelCI uses device types, which are basically names for "execution environments". At least that covers the immutable part of the environment, i.e. a hardware board with some firmware or a virtual device with a particular configuration. Then each test has some extra parts of the environment such as a root file system or a Docker image with test suites, which change sometimes but are still part of the environment. The real moving part is the kernel.

So if different labs have the same Raspberry Pi, or a lab has several of them, results for any of them will appear as for the same device type. There just isn't a view on the current dashboard to show all the information specific to a particular device type, or any particular device instance.
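
To make the grouping concrete, here is a minimal sketch of how results from different labs and device instances collapse under one device type. The field names, lab names, and the `group_by_device_type` helper are hypothetical and are not taken from the kernelci-backend database.

```python
from collections import defaultdict

# Hypothetical test results; the field names are illustrative only.
results = [
    {"lab": "lab-a", "device_type": "bcm2836-rpi-2-b",
     "instance": "bcm2836-rpi-2-b-cbg-0", "status": "PASS"},
    {"lab": "lab-b", "device_type": "bcm2836-rpi-2-b",
     "instance": "rpi-2-b-01", "status": "FAIL"},
    {"lab": "lab-a", "device_type": "qemu-x86_64",
     "instance": "qemu-0", "status": "PASS"},
]


def group_by_device_type(results):
    """Group result statuses by device type, ignoring lab and instance."""
    groups = defaultdict(list)
    for result in results:
        groups[result["device_type"]].append(result["status"])
    return dict(groups)


grouped = group_by_device_type(results)
```

Both Raspberry Pi results end up under "bcm2836-rpi-2-b" even though they came from different labs and boards.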

> I see the "environment" as a way to identify where a test executed, with some precision. As much precision as the submitter can afford.

That's the lab name and device type name as far as we're concerned at the moment. I believe the actual instance name is also stored in the database although not shown on the dashboard; at least I think the field for it is still there.

I see the value of this kind of meta-data. But to me, that's rather different to a URL on a web interface.

So rather than using origin_url fields, maybe something like origin_metadata could be used with more arbitrary fields depending on the submitter? For LAVA labs it will be the lab name and the device type, and maybe device instance name. For your CKI/Beaker results, it will be the hostname or whatever works from your point of view.

Collaborator Author

I see. Thank you for explaining how KernelCI identifies the devices; it will help me come up with a schema to actually support it. I think we are more or less in agreement on what the essence of an environment is, and its importance.

Now, origin_url has nothing to do with identifying the environment. That's a job for yet-to-be-defined fields.

Regarding origin_metadata, we have misc exactly for that, in environments as well.
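
A rough sketch of what that could look like, assuming "misc" is a free-form object inside an environment: only the "misc" name comes from this discussion, while the keys inside it, the values, and the `describe_environment` helper are hypothetical.

```python
# Hypothetical environment objects. Each origin puts whatever identifies
# its environment into "misc"; none of these keys are defined by a schema.
lava_environment = {
    "misc": {
        "lab": "example-lab",
        "device_type": "bcm2836-rpi-2-b",
        "device_instance": "bcm2836-rpi-2-b-cbg-0",
    },
}
beaker_environment = {
    "misc": {"hostname": "host.example.com"},
}


def describe_environment(environment):
    """Render whatever free-form metadata the submitter put into "misc"."""
    misc = environment.get("misc", {})
    return ", ".join(f"{key}={value}" for key, value in sorted(misc.items()))
```

Since nothing constrains the keys, such data can be displayed but not reliably correlated across origins.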

The origin_url is just an escape hatch, for humans to reach the origin's representation of the same object (if available), with more data and more features than the implementation-in-progress can afford. In this way, it is similar to misc, which actually worries me, because it would be easier to just plop the link to your own web UI instead of submitting the data we might need to store and correlate. That would be an argument against it, IMO, and one I'm starting to find more and more weighty. Hmm...

Collaborator Author

spbnick commented Jul 16, 2020

> I wonder what the value of adding this is, given that we don't have the basic data schema really used in production yet. It seems that having links to where the data came from might be useful, but should not really be necessary in order to make a functional database. Would the submitted data not speak for itself?

This is just to provide a way to reach more data and functionality at the origin while we ramp up support for it. Missing fields, fancy graphs, links to other data/objects, etc. We gotta start offering people our reports and dashboard before we have full parity, and these fields could smooth the transition.

I.e. this is not to make the database functional, but to make it easier for both report submitters and developers to make the decision to try our system, even though it might not have all the features they're used to yet. At least they'll be able to reach the original data, if something is missing.

Base automatically changed from master to main January 13, 2021 10:04
Successfully merging this pull request may close these issues.

Add a field linking to origin's web-representation of the object
2 participants