Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(artifacts): Add ArtifactStore to Kork #1069

Merged
merged 5 commits into from
Jul 5, 2023

Conversation

xibz
Copy link
Contributor

@xibz xibz commented Jun 29, 2023

This commit adds a new feature called ArtifactStore. This allows for storage of embedded/base64 types to some storage, eg S3. So instead of having duplication of some artifact in a context, it'll now be referenced by a artifact URI.

Also, there's a nice README that explains various settings as well as how to setup a local S3 for testing artifact storage

README link

The related PRs
spinnaker/clouddriver#5976
spinnaker/orca#4481
spinnaker/rosco#998
spinnaker/gate#1674

Further to enable this feature can be enabled via feature flag

artifactStore:
  enabled: true
  applicationsRegex: "only-these-applications-*" # optional regex to enable certain regex application names to have this enabled
  s3:
    enabled: true
    bucket: s3-artifact-bucket

We've also allowed for certain applications to have it enabled with shortens the blast radius of testing.

Also for some metrics, we saw large pipelines be reduced by 70%. It really comes down to how often an artifact is referenced in each stage. Simple smaller pipelines were reduced by 30%.

Screen Shot 2023-06-29 at 12 56 15 AM

Without any formatting, the JSON file is 26,960,606 bytes in length, out of which over 20M are for the reference fields of embedded artifacts (almost exactly 75%).


TODO

  • Create PR for Gate
  • Create PR for Deck

Comment on lines +47 to +49
* <p>Note: It is very important that the S3 bucket has object lock on it to prevent multiple writes
* {@see https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lock-overview.html}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, what about deletion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, also works for deletion. You can think of the object lock as an SQL row lock

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, what is the plan for deleting old artifacts in the bucket?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah,so S3 has built in cleanup of old objects. So whatever you use for infrastructure, eg pulumi, terraform, etc, you'd just need to set that when creating the bucket

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps OldPipelineCleanupPollingNotificationAgent could have some additional logic to clean up artifacts when cleaning up old pipeline executions since the logic is more complex than "older than"? I think it's OK for that to happen in a followup PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, it's just out of scope for S3 artifact stores since they have their own way of handling clean up. However, I do think if someone adds some storage that doesn't implement that, then some logic will be needed to handle that. I haven't thought too much on the design of that since that was outside of the scope, but am totally okay with some future PR to address this.

Comment on lines 86 to 94
ArtifactStore storage = Mockito.mock(S3ArtifactStore.class);
when(storage.store(Mockito.any())).thenReturn(testCase.mockArtifact);

EmbeddedArtifactSerializer serializer =
new EmbeddedArtifactSerializer(new ObjectMapper(), storage);
ObjectMapper objectMapper = new ObjectMapper();
SimpleModule module = new SimpleModule();
module.addSerializer(Artifact.class, serializer);
objectMapper.registerModule(module);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can move these lines outside of the for

Copy link
Contributor Author

@xibz xibz Jun 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I don't think so. The for loop is iterating for each test case. From that test case we create a storage which then creates a serializer.

That serializer then is used to register itself to the newly created module which is then used to register itself to the newly created object mapper.

* hasAuthorization will ensure that the user has proper permissions for retrieving the stored
* artifact
*/
private void hasAuthorization(String id, String userId) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we move this method to the abstract class and enforce to check authorization? for future ArtifactStore implementations?

Copy link
Contributor Author

@xibz xibz Jun 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, potentially? I'll need to think about this some, cause it would require some movement of some stuff or may not change at all. Just requires some thinking here.

@xibz
Copy link
Contributor Author

xibz commented Jun 30, 2023

Also added gate PR, spinnaker/gate#1674

private String applicationsRegex = null;
/** Configuration for an s3 client which will utilize credentials in the AWS credentials file. */
@Data
public static class S3ClientConfig {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pondering how helpful it would be to match the knobs clouddriver provides in S3ArtifactProviderProperties and S3ArtifactAccount.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a bad idea. I think originally I was trying to keep v2 of the AWS SDK as separate as possible, but went a little overboard with the config :p

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can refactor and instead look up the account based on some account name that is part of the artifact store properties. However, new fields would need to be added to the current S3ArtifactAccount class and a new account-name field to the artifact store configuration. We'd also need to see what is all compatible in configuration between V1 and V2 of the client configuration. I imagine this shouldn't be an issue.

This does require a little bit more work. Mostly restructuring things to be moved out of clouddriver into kork, and then adding the appropriate fields.

I feel like this can be a separate PR however. I would like to ensure that we implement it internally first so we don't deviate too far from upstream, if that works for you guys. Let me know your thoughts

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good, and yeah, totally agree with not holding up this PR for it.

Benevolent Benjamin Powell and others added 4 commits July 5, 2023 17:35
This commit adds a new feature called ArtifactStore. This allows for
storage of embedded/base64 types to some storage, eg S3. So instead of
having duplication of some artifact in a context, it'll now be
referenced by a artifact URI.

Also, there's a nice README that explains various settings as well as
how to setup a local S3 for testing artifact storage

[README link](kork-artifacts/src/main/java/com/netflix/spinnaker/kork/artifacts/README.md )

Signed-off-by: benjamin-j-powell <[email protected]>
…cts/README.md

Co-authored-by: Nemesis Osorio <[email protected]>

Update kork-artifacts/src/main/java/com/netflix/spinnaker/kork/artifacts/README.md

Co-authored-by: Nemesis Osorio <[email protected]>

Update kork-artifacts/src/main/java/com/netflix/spinnaker/kork/artifacts/artifactstore/ArtifactDeserializer.java

Co-authored-by: Nemesis Osorio <[email protected]>

Update kork-artifacts/src/main/java/com/netflix/spinnaker/kork/artifacts/README.md

Co-authored-by: David Byron <[email protected]>

Update kork-artifacts/src/main/java/com/netflix/spinnaker/kork/artifacts/README.md

Co-authored-by: David Byron <[email protected]>

Update kork-artifacts/src/main/java/com/netflix/spinnaker/kork/artifacts/README.md

Co-authored-by: David Byron <[email protected]>

Update kork-artifacts/src/main/java/com/netflix/spinnaker/kork/artifacts/artifactstore/EmbeddedArtifactSerializer.java

Co-authored-by: Sergio Quintero <[email protected]>

Update kork-artifacts/src/main/java/com/netflix/spinnaker/kork/artifacts/README.md

Co-authored-by: Sergio Quintero <[email protected]>

Update kork-artifacts/src/main/java/com/netflix/spinnaker/kork/artifacts/README.md

Co-authored-by: Sergio Quintero <[email protected]>

Update kork-artifacts/src/main/java/com/netflix/spinnaker/kork/artifacts/README.md

Co-authored-by: Sergio Quintero <[email protected]>
…cts/artifactstore/ArtifactStoreURIBuilder.java

Co-authored-by: David Byron <[email protected]>
…cts/README.md

Co-authored-by: David Byron <[email protected]>

Update kork-artifacts/src/main/java/com/netflix/spinnaker/kork/artifacts/README.md

Co-authored-by: David Byron <[email protected]>
implementation "com.fasterxml.jackson.core:jackson-databind"
implementation platform('software.amazon.awssdk:bom')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't able to build clouddriver locally using a local kork snapshot, but it built fine after removing this line. I think this is unnecessary?

Copy link
Contributor Author

@xibz xibz Jul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I built it just fine. I had to clear gradle cache however. I did use korkComposite=true in gradle.properties

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like software.amazon.awssdk:s3 is duplicated

Copy link
Member

@nemesisOsorio nemesisOsorio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm excited to test this out

@dbyron-sf dbyron-sf added the ready to merge Approved and ready for merge label Jul 5, 2023
@mergify mergify bot added the auto merged label Jul 5, 2023
@mergify mergify bot merged commit 18d1c6e into spinnaker:master Jul 5, 2023
5 checks passed
xibz pushed a commit to xibz/spinnaker.io that referenced this pull request Jul 10, 2023
This commit updates the next release notes to contain the artifact store
feature.

spinnaker/kork#1069

Signed-off-by: benjamin-j-powell <[email protected]>
xibz pushed a commit to xibz/spinnaker.io that referenced this pull request Jul 10, 2023
This commit updates the next release notes to contain the artifact store
feature.

spinnaker/kork#1069

Signed-off-by: benjamin-j-powell <[email protected]>
xibz pushed a commit to xibz/spinnaker.io that referenced this pull request Jul 10, 2023
This commit updates the next release notes to contain the artifact store
feature.

spinnaker/kork#1069

Signed-off-by: benjamin-j-powell <[email protected]>
dbyron-sf added a commit to spinnaker/spinnaker.io that referenced this pull request Jul 10, 2023
* docs(kork): add release notes for artifact store change

This commit updates the next release notes to contain the artifact store
feature.

spinnaker/kork#1069

Signed-off-by: benjamin-j-powell <[email protected]>

* Update content/en/docs/releases/next-release-preview/index.md

Co-authored-by: David Byron <[email protected]>

* Update content/en/docs/releases/next-release-preview/index.md

Co-authored-by: David Byron <[email protected]>

* docs(kork): add release notes for artifact store change

This commit updates the next release notes to contain the artifact store
feature.

spinnaker/kork#1069

Signed-off-by: benjamin-j-powell <[email protected]>

---------

Signed-off-by: benjamin-j-powell <[email protected]>
Co-authored-by: benjamin-j-powell <[email protected]>
Co-authored-by: David Byron <[email protected]>
dbyron-sf added a commit to dbyron-sf/kork that referenced this pull request Aug 10, 2023
and make dependencies on log4j-api explicit.

spinnaker#1069 introduced the dependency on org.apache.logging.log4j.log4j-core in kork-artifacts, but it's not necessary.  At least, after removing it, and verifying the dependency is gone from

$ ./gradle kork-artifacts:dependencies

the code still builds fine.  As well

$ git grep "import org.apache.logging.log4j"
kork-secrets/src/main/java/com/netflix/spinnaker/kork/secrets/user/UserSecretTypeProvider.java:import org.apache.logging.log4j.LogManager;
kork-security/src/main/java/com/netflix/spinnaker/security/AuthenticatedRequestDecorator.java:import org.apache.logging.log4j.ThreadContext;

turns up classes present in log4j-api, but nothing in log4j-core.

As well uses of the @log4j2 annotation depend on log4j-api (see https://projectlombok.org/api/lombok/extern/log4j/Log4j2)

$ git grep @log4j2
kork-artifacts/src/main/java/com/netflix/spinnaker/kork/artifacts/artifactstore/ArtifactStoreConfiguration.java:@log4j2
kork-artifacts/src/main/java/com/netflix/spinnaker/kork/artifacts/artifactstore/s3/S3ArtifactStore.java:@log4j2
kork-core/src/main/java/com/netflix/spinnaker/kork/jackson/JsonTypeNameParser.java:@log4j2
kork-credentials/src/main/java/com/netflix/spinnaker/credentials/jackson/SensitiveSerializer.java:@log4j2
kork-secrets/src/main/java/com/netflix/spinnaker/kork/secrets/EncryptedSecret.java:@log4j2

This all makes the comment in spinnaker-dependencies.gradle true again:

Per analysis, log4j-core is not included in dependencies
mergify bot pushed a commit that referenced this pull request Aug 10, 2023
* chore(dependencies): remove unnecessary dependency on log4j-core

and make dependencies on log4j-api explicit.

#1069 introduced the dependency on org.apache.logging.log4j.log4j-core in kork-artifacts, but it's not necessary.  At least, after removing it, and verifying the dependency is gone from

$ ./gradle kork-artifacts:dependencies

the code still builds fine.  As well

$ git grep "import org.apache.logging.log4j"
kork-secrets/src/main/java/com/netflix/spinnaker/kork/secrets/user/UserSecretTypeProvider.java:import org.apache.logging.log4j.LogManager;
kork-security/src/main/java/com/netflix/spinnaker/security/AuthenticatedRequestDecorator.java:import org.apache.logging.log4j.ThreadContext;

turns up classes present in log4j-api, but nothing in log4j-core.

As well uses of the @log4j2 annotation depend on log4j-api (see https://projectlombok.org/api/lombok/extern/log4j/Log4j2)

$ git grep @log4j2
kork-artifacts/src/main/java/com/netflix/spinnaker/kork/artifacts/artifactstore/ArtifactStoreConfiguration.java:@log4j2
kork-artifacts/src/main/java/com/netflix/spinnaker/kork/artifacts/artifactstore/s3/S3ArtifactStore.java:@log4j2
kork-core/src/main/java/com/netflix/spinnaker/kork/jackson/JsonTypeNameParser.java:@log4j2
kork-credentials/src/main/java/com/netflix/spinnaker/credentials/jackson/SensitiveSerializer.java:@log4j2
kork-secrets/src/main/java/com/netflix/spinnaker/kork/secrets/EncryptedSecret.java:@log4j2

This all makes the comment in spinnaker-dependencies.gradle true again:

Per analysis, log4j-core is not included in dependencies

* chore(dependencies): use version 2.17.1 of log4j-bom

to address CVE-2021-45105 and CVE-2021-44832.

* chore(dependencies): use version 2.20.0 of log4j-bom

to stay up to date
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants