Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Emit events in the involved objects namespace #2360

Merged
merged 16 commits into from
Mar 25, 2023

Conversation

ctrought
Copy link
Contributor

What this PR does / why we need it:

Events are currently emitted in the Gatekeeper namespace. It would be beneficial to have events emitted in the namespace where the involved object is so that users maintaining those namespaces are aware of them whether it is generated during the admission or in the audit process. It might also be standard/expected that events are emitted in the same namespace as the involved object.

https://pkg.go.dev/k8s.io/client-go/tools/record#EventRecorder

// The resulting event will be created in the same namespace as the reference object.

When there is no namespace (e.g. cluster resources) the event is emitted in the default namespace which is expected.

Which issue(s) this PR fixes:
fixes #2230

Signed-off-by: Craig Trought [email protected]

@ctrought ctrought force-pushed the fix-emit-events-namespace branch 2 times, most recently from 9109de0 to 47b8f44 Compare October 26, 2022 00:31
@maxsmythe maxsmythe requested review from ritazh and sozercan October 26, 2022 04:29
@maxsmythe
Copy link
Contributor

@ritazh or @sozercan are probably best positioned to have an opinion on this one

@sozercan
Copy link
Member

@ritazh do you remember why we are using gatekeeper's namespace instead of involved object's?

@maxsmythe
Copy link
Contributor

maxsmythe commented Oct 27, 2022

I'm trying to remember this myself.

I think there may have been some notion of security concerns... e.g. it may be possible for a malicious namespace admin to delete the event, which would change the security posture of events as a source-of-truth (not that they're great for that anyway due to throttling).

Or maybe it was making it easy to see all events in one space?

@ritazh
Copy link
Member

ritazh commented Oct 28, 2022

I think we were following the current rbac for violations as part of an audit (constraint is cluster level resource) and as part of the Gatekeeper pod logs (Gatekeeper namespace). Both are only accessible to Gatekeeper policy and cluster admins.

It would be beneficial to have events emitted in the namespace where the involved object is so that users maintaining those namespaces are aware of them whether it is generated during the admission or in the audit process.

This makes sense. Do we want to generate events in both the GK namespace (for the policy/cluster admins) and in the involved object namespace?

@ctrought
Copy link
Contributor Author

ctrought commented Oct 28, 2022

This makes sense. Do we want to generate events in both the GK namespace (for the policy/cluster admins) and in the involved object namespace?

Hmmm, I think it would be redundant to have the events duplicated. For clusters with thousands or even tens of thousands of events I personally wouldn't trade for the convenience of viewing all events from one namespace at the cost of having them duplicated in etcd... but to make it simpler to continue to view all violations a common gatekeeper label could be added to the events so policy admins or cluster-admins can list all gatekeeper events in the cluster using a label selector.

which would change the security posture of events as a source-of-truth (not that they're great for that anyway due to throttling).

I agree with @maxsmythe that events are not really a great source of truth for known violations anyways. But if that is the only way to view all policy violations right now (provided the user enables emitting violations as events) and there is concern about users removing events, users could always create a GK policy to prevent them from deleting events with the proposed GK labels 😆

@ritazh
Copy link
Member

ritazh commented Oct 31, 2022

I personally wouldn't trade for the convenience of viewing all events from one namespace at the cost of having them duplicated in etcd... but to make it simpler to continue to view all violations a common gatekeeper label could be added to the events so policy admins or cluster-admins can list all gatekeeper events in the cluster using a label selector.

+1 on label selector to view all gatekeeper events. Curious how others feel about allowing users to delete the violating events in their namespace.

But if that is the only way to view all policy violations right now

@ctrought Have you evaluated the audit pod logs? Audit pod emits JSON-formatted audit logs to stdout https://open-policy-agent.github.io/gatekeeper/website/docs/audit#audit-logs that should have all the policy violations.

@maxsmythe
Copy link
Contributor

Curious how others feel about allowing users to delete the violating events in their namespace.

I think it depends on the intended purpose of the events. If they are merely informational, then the only concerns about who has access where would be around preventing unintentional data leakage.

@ctrought
Copy link
Contributor Author

ctrought commented Nov 2, 2022

@ctrought Have you evaluated the audit pod logs? Audit pod emits JSON-formatted audit logs to stdout https://open-policy-agent.github.io/gatekeeper/website/docs/audit#audit-logs that should have all the policy violations.

I have and I think that's great! It's a bit cumbersome to extract the current violations though (since last audit run). Need to ship these to an external sink of some kind (elasticsearch, alertmanager, etc.). This is actually one of the reasons I submitted the PR to change the namespace because if we can get the events moved to the involved objects namespace I'm planning to use an event exporter to ship these to alertmanager or some other 'sink' where I should get a compiled feed of all the violations and can send them to slack, email, or whatever receiver we'd like. Having these in the involved namespace will allow us to route the violation/alert accordingly based on source namespace, in addition to providing visibility of violations inside kubernetes to the users managing those namespaces.

+1 on label selector to view all gatekeeper events.

Sounds good, I will update the PR. 😃

@ritazh
Copy link
Member

ritazh commented Nov 2, 2022

@ctrought Per the 11/2/2022 community call, we think it would help to add a flag to allow operators to configure to create the events in the violating resource namespace instead of the default gatekeeper namespace.

@ctrought
Copy link
Contributor Author

ctrought commented Nov 3, 2022

@ctrought Per the 11/2/2022 community call, we think it would help to add a flag to allow operators to configure to create the events in the violating resource namespace instead of the default gatekeeper namespace.

The below is how the involvedObject currently looks for a violation event, the namespace is set to gatekeeper-system instead of the namespace of the involvedObject in order to pass validation. I guess my thought is based on the event api and validation that is enforced it suggests that the event is intended to be emitted in the involvedObjects namespace, and the only reason it currently works is because the involvedObjects namespace is intentionally not set to the actual namespace of the involvedObject.

  kind: Event
  involvedObject:
    kind: PodDisruptionBudget
    name: redis-pdb
    namespace: gatekeeper-system
    uid: HorizontalPodAutoscaler/redis-namespace/redis-pdb/K8sPodDisruptionBudget//pod-disruption-budget
  metadata:
    namespace: gatekeeper-system

@ritazh
Copy link
Member

ritazh commented Nov 4, 2022

The flag should indicate all namespaces should be set to the violating resource's namespace instead of gatekeeper-system.
What about cluster scoped resources?

@ritazh
Copy link
Member

ritazh commented Dec 19, 2022

@ctrought Thanks for opening this PR and discussing options. Are you still working on this PR or blocked by anything?

@ctrought
Copy link
Contributor Author

@ctrought Thanks for opening this PR and discussing options. Are you still working on this PR or blocked by anything?

Hey @ritazh , I will look at this again shortly. 👍

@ctrought ctrought changed the title fix: Emit events in the involved objects namespace feat: Emit events in the involved objects namespace Jan 6, 2023
@ctrought
Copy link
Contributor Author

ctrought commented Jan 6, 2023

@ritazh was there any preference on a distinct flag for each audit & webhook and helm, or distinct flag for audit and webhook with a shared option in helm that sets both?

@ritazh
Copy link
Member

ritazh commented Jan 6, 2023

@ritazh was there any preference on a distinct flag for each audit & webhook and helm, or distinct flag for audit and webhook with a shared option in helm that sets both?

+1 for distinct flags for audit and webhook to make this more flexible.

@ctrought ctrought force-pushed the fix-emit-events-namespace branch from 47b8f44 to 2bb1b59 Compare January 16, 2023 07:49
@ctrought
Copy link
Contributor Author

I've updated the PR, in addition to the feedback made a couple other tweaks

  • Added distinct flags for audit/webhook to enable event creation in the involved objects namespace
    • if there is no namespace for the object with the violation (ie. cluster resources) the event is emitted in the namespace gatekeeper is installed in
  • Remove resource namespace from event message if the event is emitted in the involved namespace or it's a cluster scoped resource as it's redundant
  • Use the actual involved objects ObjectReference metadata in the event
    • I'm not too sure if there was any reason why the old custom generated UID was used (rkind + "/" + rnamespace + "/" + rname + "/" + ckind + "/" + cnamespace + "/" + cname), perhaps it just wasn't as relevant since the events were only ever emitted in the GK namespace. The constraint information is already included in annotations in the event for additional context so the information is still available with it removed from the uid. Kubernetes will correlate events with the object when the metadata of the events involved object matches one in the cluster. For example with the change if the option to emit audit/webhook events in the involved namespace is enabled violations for a deployment will be listed under events for the object when you run kubectl describe deploy my-deployment which is valuable for end users

@ctrought
Copy link
Contributor Author

+1 on label selector to view all gatekeeper events

Looked into this in the process, not currently possible with EventRecorder interface but there is a PR open. If it's implemented I can come back to it.

kubernetes/kubernetes#115058

@ctrought ctrought force-pushed the fix-emit-events-namespace branch from 2bb1b59 to 92d70a0 Compare January 22, 2023 21:25
@ctrought ctrought force-pushed the fix-emit-events-namespace branch from 92d70a0 to 1ba8916 Compare January 22, 2023 21:44
@ctrought
Copy link
Contributor Author

@ritazh anything you need from me to get this merged? thx

@ctrought
Copy link
Contributor Author

ctrought commented Mar 2, 2023

@maxsmythe @sozercan hoping this can be included in the 3.12 milestone?

@ritazh ritazh added this to the v3.12.0 RC milestone Mar 3, 2023
@sozercan
Copy link
Member

sozercan commented Mar 8, 2023

@ctrought looks like tests are failing, can you ptal?

@sozercan
Copy link
Member

@ctrought ptal the failing tests. we are aiming to release rc on 3/24.

@ritazh
Copy link
Member

ritazh commented Mar 22, 2023

@ctrought bump so we can get this PR reviewed and merged for the upcoming release

* move emit namespace logic to getViolationRef for tests

Signed-off-by: Craig Trought <[email protected]>
* add tests for using emit involved namespace flags

Signed-off-by: Craig Trought <[email protected]>
@eshaanm25
Copy link
Contributor

The unit tests pass on the fork and a local machine. The job may need to be re-run for this PR.

Copy link
Member

@ritazh ritazh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ritazh ritazh requested a review from a team March 24, 2023 04:30
Copy link
Contributor

@maxsmythe maxsmythe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ritazh ritazh merged commit 48be4ab into open-policy-agent:master Mar 25, 2023
davis-haba pushed a commit to davis-haba/gatekeeper that referenced this pull request Mar 31, 2023
…t#2360)

Signed-off-by: Craig Trought <[email protected]>
Signed-off-by: Eshaan Mathur <[email protected]>
Co-authored-by: Sertaç Özercan <[email protected]>
Co-authored-by: Eshaan Mathur <[email protected]>
Co-authored-by: Rita Zhang <[email protected]>
salaxander pushed a commit to salaxander/gatekeeper that referenced this pull request Apr 5, 2023
…t#2360)

Signed-off-by: Craig Trought <[email protected]>
Signed-off-by: Eshaan Mathur <[email protected]>
Co-authored-by: Sertaç Özercan <[email protected]>
Co-authored-by: Eshaan Mathur <[email protected]>
Co-authored-by: Rita Zhang <[email protected]>
Signed-off-by: Xander Grzywinski <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Admission Events InvolvedObject Namespace
6 participants