Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(flags): Add OpenFeature integration #1209

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Conversation

jvz
Copy link
Contributor

@jvz jvz commented Nov 7, 2024

This adds an OpenFeature-based feature flags module along with a built-in provider based on flagd.

As this implements DynamicConfigService, all existing dynamic config usage is supported out of the box here. This is primarily to get agreement with the rest of the project that we'll be using that API (along with the OpenFeature API) for feature flags. This should help a lot with experimental features.

In order to use a managed dependency in the `protoc` configuration settings, dependencyResolutionManagement must be used instead of enforcePlatform.
This adds an OpenFeature-based feature flags module along with a built-in provider based on flagd.
import spock.lang.Specification
import spock.lang.Subject

class OpenFeatureDynamicConfigServiceSpec extends Specification {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the chances of doing this in java? We're trying desperately to keep from adding more groovy code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can do.

@Import({OpenFeatureConfigConfiguration.class, TransientConfigConfiguration.class})
public class BootstrapComponents {
@Bean
@ConditionalOnMissingBean(VersionResolver.class)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs say to use this only on auto-config classes. This isn't one of those, is it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a class imported by an auto-config class in one case and one named as an auto-config class in the other case. However, this has made me realize that there's a bit of a bean mess in the various Spinnaker services because PluginsAutoConfiguration imports PlatformComponents (which is an auto-config but is being directly imported as a regular config in this context), but PluginsAutoConfiguration is not itself an auto-config, and it is directly imported by different configuration classes in different services in such a way that the beans from there aren't directly visible by IntelliJ consistently through different modules. I want to look more closely at this.

And to be specific, this BootstrapComponents class is a de-duplication of the two beans defined exactly this way in both PluginsAutoConfiguration and PlatformComponents I found while doing something else at one point that got slightly mixed in here due to affected code (declarations of a default DynamicConfigService bean).

This whole area may need some deeper digging. My main purpose is to provide the beans for accessing OpenFeatureAPI directly, but I added the DynamicConfigService integration for a simpler way to get started using flagd (or whatever OpenFeature provider you want to use in theory) which can control all the existing features of Spinnaker that use DynamicConfigService. However, now this raises further questions about where to place the beans.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. This is no worse than it was before in PluginsAutoConfiguration and PlatformComponents.

@dbyron-sf
Copy link
Contributor

Could you add some high-level motivation for this? What are the benefits of OpenFeature?

@jvz
Copy link
Contributor Author

jvz commented Nov 7, 2024

What I find to be a killer benefit is being able to use targeting for feature flag enablement. While DynamicConfigService offers basic support for a few orthogonal targeting criteria, OpenFeature defines a more general feature flag API.

Then, I combined that with flagd by using the Java provider as a reference implementation. What I like about flagd (besides being OSS and already well-integrated with OpenFeature) are the existing ways of defining feature flags which uses some JsonLogic thing for defining more complex feature flag targeting rules.

Benefits-wise, there are a few different targets. The impetus behind integrating with this is to help enable more continuous delivery of Spinnaker itself by flagging experimental and new stuff behind flags that can be selectively rolled out in different ways. For example, I've been researching how to go about extending Fiat to support external policy engines, but due to the existing mess in Fiat, I've found that very hard to do without affecting a lot of existing code, and I want to be able to selectively roll out changes in authorization based on context like the current user, application, resource type, and so on, and not just in a boolean fashion.

Another advantage is to develop things in a sort of default-on way rather than using conditional beans and other easy to break configuration methods for feature flags. While features are still in alpha, they're not active, but once they're stable, the feature flags can be removed.

@dbyron-sf
Copy link
Contributor

Thanks for that @jvz . I need to digest it a bit. One thing we have noticed is that, at least with the spring implementation of DynamicConfigService is that it becomes super slow/expensive when used in combination with lots of (clouddriver) accounts if they're defined in yml files. More generally it slows down with lots of config properties, but lots of accounts is a way to get there. Basically, any access of one property causes a copy (and perhaps parsing...can't remember) of all properties. We've since moved accounts to the account API, but for folks who haven't, this could be a drag. The tl;dr was basically, don't use the DynamicConfigService. Does this implementation still use enough spring under the covers to suffer this way?

@jvz
Copy link
Contributor Author

jvz commented Nov 7, 2024

... Does this implementation still use enough spring under the covers to suffer this way?

It allows it as a backup, though perhaps I could make that configurable as well. Right now, it configures a DynamicConfigService to fallback to a Spring-based one for undefined feature flags from OpenFeature.

@dbyron-sf
Copy link
Contributor

I think I see where you're coming from...or at least the ability to enable features per-user (or some other attribute) makes sense. I'm not quite sure I see the part about conditional beans. Can you give an example of how this would play out as a feature moves from alpha to stable?

@jvz
Copy link
Contributor Author

jvz commented Nov 11, 2024

The conditional beans aspect would be where various beans are annotated like @ConditionalOnProperty("some.feature.flag"). The more of these you have, the larger number of combinations of possible beans available explodes which makes it much easier to write beans in such a way that various configurations can cause startup failures due to missing beans and such. It also makes it harder to write error handling code to alert to misconfiguration (like via a FailureAnalyzer as those are based on the exception type, and missing beans always throw a NoSuchBeanDefinitionException which is hard to generically analyze). Similarly, when using @ConditionalOnMissingBean or other conditions, these become very hard to use for Spinnaker features without configuration errors that very few people would ever understand how to debug.

As for a move from alpha to stable, the idea would be that relevant beans configured for an alpha feature wouldn't necessarily be conditionally created; instead, the beans would use feature flags to enable or disable the code within the beans to actually do anything (comparable to how fiat-api works today). When a feature is alpha, it can be selectively enabled by admins using whatever feature flag conditions they want (e.g., maybe you have a group of users who want to try out new features early on like Firefox Nightly versus other users who only want to use stable features), and as a feature graduates to stable, the feature flags can change their default variant to enabled, and then the feature flag can be removed (or converted into a more basic property if it makes sense for the feature to remain optional but doesn't require dynamic feature flags).

If you think of this from an implementation point of view, the difference between feature flags and configuration properties would deal with how dynamic the input settings are expected to be. When dealing with Spring properties, for example, these are usually written with the assumption that the property is defined at startup and never changes (and only the limited @RefreshScope beans we have in Spinnaker assume otherwise, but that has performance issues as you mentioned when the number of Spring properties increases as property lookup is O(n) instead of anything sublinear due to the numerous extension points in Spring for handling properties). When dealing with a feature flag, the flag should be checked nearly every time (with potential caching involved).

@dbyron-sf
Copy link
Contributor

OK, agreed. Seems like this is a bit of a separate thing from OpenFeature integration (i.e. always enable all the beans and check flags inside them). As in, I don't think this PR actually moves in this direction. And yes, as we get more flags, this will be more important.

@dbyron-sf
Copy link
Contributor

There is something nice about beans not loading at all as far as reasoning about changes in behavior. It's worth the tradeoff though.

@jvz
Copy link
Contributor Author

jvz commented Nov 11, 2024

I think the bean loading bit should be limited to optional features rather than unstable features. Otherwise, the combinatorial explosion of possible configurations to consider is too large for us humans to handle (once you have, say, 10 optional features, you now have 1024 combinations of flags possible, and it's likely that ~1000 of them are invalid combinations).

@dbyron-sf
Copy link
Contributor

You're saying conditional bean loading is for optional features only, or for unstable features only? I think you're saying unstable.

@jvz
Copy link
Contributor Author

jvz commented Nov 11, 2024

Conditional bean loading should be for optional features only, right. Unstable features that are meant to be part of the default system shouldn't bother creating conditional beans (but they should annotate their stability guarantees at least).

*/
@Component
@EnableOpenFeature
public class OpenFeatureConfigConfiguration {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DO we need to conditionally enable this configuration if "openfeature.eanbled" true kinda thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly. I was originally thinking of making this conditional on a FeatureProvider bean being available, but auto-configuration doesn't work all that well in this scenario. I'm thinking there may be some properties related to enabling use of OpenFeature for DynamicConfigService versus Spring properties versus both versus nothing.


import com.netflix.spinnaker.kork.flags.flagd.SyncFlagsEvent
import org.apache.logging.log4j.kotlin.Logging
import org.eclipse.jgit.api.Git
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To note, we've seen... rather nasty impacts using JGit on this. Particularly at LARGE scale. For FF usage it MAY be ok, but clouddriver in particular bypassed that intentionally to do native git clones. Examples of past issues:

  • SYMLINK failure handling on tmp repo locations
  • Git shallow clone failures when odd files were hit
    JUST a minor ocncern at this point.

.setBranch(properties.branch)
.setDepth(1)
.setDirectory(directory.toFile())
.setCredentialsProvider(UsernamePasswordCredentialsProvider("token", properties.apiToken))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tied to the above jgit concern.. API token vs. Oauth2 creds?

@@ -178,6 +183,7 @@ dependencies {
// some of the modules would still use <1.70 as they can't be upgraded without upgrading spring boot
api("org.bouncycastle:bcpkix-jdk18on:${versions.bouncycastle}")
api("org.bouncycastle:bcprov-jdk18on:${versions.bouncycastle}")
api("org.eclipse.jgit:org.eclipse.jgit:${versions.jgit}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/spinnaker/clouddriver/blob/02a601956d8cabd83b8db1516ab8a97aaa531a1a/clouddriver-cloudrun/clouddriver-cloudrun.gradle#L27 IF we choose to really support jgit (vs. say... referencing an internal artifact load via clouddriver APis to provide artifacts which has chicken/egg headaches)... probably should consider cleaning these other areas up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would probably make sense to use artifact support here, though how would you use feature flags in the artifact system?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants