AAP-45875 Runtime Feature Flags #875

fao89 · 2025-10-24T19:45:45Z

Description

https://issues.redhat.com/browse/AAP-45875

What is being changed?
This change is the foundation for enabling runtime platform feature flags for AAP. It makes django-ansible-base the central location where all platform flags are defined. Components can include the ansible_base.feature_flags application to inherit all platform feature flag definitions.
Why is this change needed?
To enable runtime platform feature flags in AAP.
How does this change address the issue?
This change addresses the issue by defining a database flag source, which contains all the feature flags along with their associated metadata. These feature flags are installed into each component's database at install time and kept in sync via resource sync (Gateway is the provider).
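
A minimal sketch of the install-then-sync idea described above, in plain Python rather than Django. `FlagRecord`, `InMemoryFlagStore`, and the flag name `FEATURE_EXAMPLE_ENABLED` are hypothetical names made up for illustration; they are not the models or flags in this PR.

```python
from dataclasses import dataclass


@dataclass
class FlagRecord:
    """Hypothetical stand-in for one feature flag row and its metadata."""
    name: str
    state: bool = False
    description: str = ""


class InMemoryFlagStore:
    """Hypothetical stand-in for a component's feature flag table."""

    def __init__(self):
        self._rows = {}

    def install(self, definitions):
        # Install-time step: create a row for each definition that is missing,
        # leaving rows that already exist untouched.
        for d in definitions:
            self._rows.setdefault(d.name, FlagRecord(d.name, d.state, d.description))

    def apply_sync(self, name, state):
        # Sync step: the provider's value overwrites the local value for known flags.
        if name in self._rows:
            self._rows[name].state = state

    def is_enabled(self, name):
        # Unknown flags are treated as disabled.
        row = self._rows.get(name)
        return bool(row and row.state)


definitions = [FlagRecord("FEATURE_EXAMPLE_ENABLED", False, "Hypothetical example flag")]
component = InMemoryFlagStore()
component.install(definitions)
before = component.is_enabled("FEATURE_EXAMPLE_ENABLED")
component.apply_sync("FEATURE_EXAMPLE_ENABLED", True)  # value pushed by the provider
after = component.is_enabled("FEATURE_EXAMPLE_ENABLED")
```

The point of the sketch is only the ordering: definitions are installed locally first, and the provider's state is applied on top afterwards.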

Before this can be merged, the following should be done:

Confirm each feature flag's metadata
1. Only Dispatcher and EDA_Analytics feature flags need their metadata confirmed now.

Type of Change

New feature (non-breaking change which adds functionality)
Documentation update
Test update

Self-Review Checklist

I have performed a self-review of my code
I have added relevant comments to complex code sections
I have updated documentation where needed
I have considered the security impact of these changes
I have considered performance implications
I have thought about error handling and edge cases
I have tested the changes in my local environment

Testing Instructions

Prerequisites

Steps to Test

Recommend testing this through AAP Dev, but it can be tested directly via the test app as well.

Clone https://github.com/ansible/aap-dev/
Enter aap-dev directory and run make configure-sources
Use this PR for the DAB source
Use this PR for the Gateway source - https://github.com/ansible-automation-platform/aap-gateway/pull/1056

Expected Results

AAP deploys as expected with database feature flags.

Additional Context

Required Actions

Requires downstream repository changes

ansible_base/lib/dynamic_config/dynamic_urls.py

AlanCoding · 2025-10-27T14:28:23Z

ansible_base/resource_registry/tasks/sync.py

+        if system_user:
+            queryset = queryset.exclude(object_id=system_user.id)
+
+    return queryset


This seems extremely out-of-place for this patch. On the surface it seems reasonable, but I want to ask that we treat it separately so we can have a paper trail (new Jira, etc.). In what case is it known that we had a user sync problem with the _system user?

I want that for basic accountability surrounding the change. I believe there is risk in the change, and I don't want that risk to be attributed to the feature flags feature, which it seems to have nothing to do with.

I was having issues with test_app/tests/resource_registry/test_resource_sync.py::test_resource_sync; it seems to be flaky.

AlanCoding · 2025-10-27T14:33:16Z

ansible_base/resource_registry/shared_types.py

Going to ask for @chrismeyersfsu review, as he has some patches to code here, philosophically non-conflicting. The sync changes are something I want to advertise more widely.

AlanCoding · 2025-10-27T14:34:59Z

Only Dispatcher and EDA_Analytics feature flags need their metadata confirmed now.

I'm not sure if we're going to keep the dispatcher feature flag, as it was for a transition period. So it's likely better to never port it to the updated flags system.

AlanCoding · 2025-10-27T14:38:43Z

ansible_base/feature_flags/views.py

+    """
+
+    queryset = AAPFlag.objects.order_by('id')
+    permission_classes = try_add_oauth2_scope_permission([IsSuperuserOrAuditor])


@john-westcott-iv remind me, is the rule for using try_add_oauth2_scope_permission that we need it for any view that is surfaced by aap-gateway? That would explain why it is added here, I just want to have it written down.

This is correct. Without this, any views that don't have the permission class will get un-scoped permissions when using PATs (read tokens effectively become read-write tokens).
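
To make the failure mode concrete, here is a rough sketch in plain Python. It is not how try_add_oauth2_scope_permission is implemented; `token_allows`, `is_superuser`, and `view_allows` are hypothetical helpers, and the read/write scope rule is an assumption made for illustration.

```python
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


def token_allows(method, token_scope):
    # Assumed rule: a "read" token covers safe methods only, "write" covers everything.
    if token_scope == "write":
        return True
    if token_scope == "read":
        return method in SAFE_METHODS
    return False


def is_superuser(method, token_scope):
    # Hypothetical role check; assume the token owner is a superuser.
    return True


def view_allows(method, token_scope, permission_checks):
    # A request passes only if every check attached to the view passes.
    return all(check(method, token_scope) for check in permission_checks)


# View without a scope check: a read token can still PATCH.
unscoped = view_allows("PATCH", "read", [is_superuser])
# View with a scope check: the same request is rejected.
scoped = view_allows("PATCH", "read", [is_superuser, token_allows])
```

With only the role check attached, nothing ever looks at the token's scope, which is how a read token ends up behaving like a read-write one.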

AlanCoding · 2025-10-27T14:43:50Z

ansible_base/feature_flags/views.py

+class FeatureFlagsStatesView(AnsibleBaseDjangoAppApiView, ModelViewSet):
+    """
+    A view class for displaying feature flags states.
+    To add/update/remove a feature flag, see the instructions in


Here come the big questions. We have a new endpoint being added, so let's follow that.

Under what conditions will this be added? What specific actions are needed downstream of this change? Trying to fill in those blanks:

Each service (AWX, eda-server, etc.) already has this added to INSTALLED_APPS, but the new migration means that installers need to be sure that migration is run.

The UI will need to make use of these new endpoints.

But even from the two points above, I have a point of confusion. All services will expose these new endpoints. So then what endpoints should the UI make use of? Which endpoints will work?

If a flag is modified via the Gateway API, will that be actively synchronized to the services, or passively?

If the UI is only going to use the endpoints from the Gateway API, should we disable the new API in the other services or make them read-only?

All services will expose these new endpoints. So then what endpoints should the UI make use of? Which endpoints will work?

Answered:

There is an additional viewset included in aap-gateway that adds partial_update, which is not present here. So the viewset here is exposed everywhere as read-only. The read-only GET view is still present in gateway (the resource server); the partial_update view is not in the other components, which are fully read-only.

If a flag is modified via the Gateway API, will that be actively synchronized to the services, or passively?

Actively.

The partial_update method definition does synchronize to components, actively.

If the UI is only going to use the endpoints from the Gateway API, should we disable the new API in the other services or make them read-only?

Yes, already done. 👍
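
For the record, a rough sketch of the split described in this thread, in plain Python rather than DRF. `ReadOnlyFlagsView`, `GatewayFlagsView`, and the flag name are hypothetical; the real sync goes through resource sync, which is reduced here to a direct write for illustration.

```python
class ReadOnlyFlagsView:
    """Hypothetical component-side view: reads only, no update method."""

    def __init__(self, flags):
        self.flags = flags  # dict of flag name -> bool

    def retrieve(self, name):
        return {"name": name, "state": self.flags[name]}


class GatewayFlagsView(ReadOnlyFlagsView):
    """Hypothetical gateway-side view that adds partial_update."""

    def __init__(self, flags, components):
        super().__init__(flags)
        self.components = components

    def partial_update(self, name, state):
        if name not in self.flags:
            raise KeyError(name)
        self.flags[name] = state
        # Active sync: push the new state to every component immediately,
        # instead of waiting for components to poll for it.
        for component in self.components:
            component.flags[name] = state
        return self.retrieve(name)


awx = ReadOnlyFlagsView({"FEATURE_EXAMPLE_ENABLED": False})
gateway = GatewayFlagsView({"FEATURE_EXAMPLE_ENABLED": False}, [awx])
updated = gateway.partial_update("FEATURE_EXAMPLE_ENABLED", True)
```

The component view has no write path at all, so the only way its state changes is through the gateway's push.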

AlanCoding · 2025-10-29T17:54:44Z

ansible_base/feature_flags/serializers.py

+        fields = ["name", "state"]
+
+    def to_representation(self, instance=None) -> dict:
+        instance.state = True


This line is surprising to me. What is going on here?

I don't get it either. @zkayyali812, could you please take a look?

I don't exactly recall why this is here, but I think this is a great callout. I'm guessing this line is a remnant and could likely be removed.
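
A small illustration of why the line is surprising, in plain Python. The two functions are hypothetical stand-ins for a serializer's to_representation, not the code in this PR, and `SimpleNamespace` stands in for a model instance.

```python
from types import SimpleNamespace


def to_representation_forced(instance):
    # Mirrors the questioned line: the state is overwritten before serializing,
    # so the output no longer reflects what was stored.
    instance.state = True
    return {"name": instance.name, "state": instance.state}


def to_representation_plain(instance):
    # Serializes whatever state the instance already has, with no side effects.
    return {"name": instance.name, "state": instance.state}


flag = SimpleNamespace(name="FEATURE_EXAMPLE_ENABLED", state=False)
plain = to_representation_plain(flag)
forced = to_representation_forced(flag)
```

After the forced variant runs, a disabled flag is reported as enabled and the in-memory instance itself has been changed, which is the behavior being questioned.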

AlanCoding · 2025-10-30T19:38:36Z