Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow 403 for overwrite prevention #5134

Merged
merged 10 commits into from
Nov 29, 2023

Conversation

emcake
Copy link
Contributor

@emcake emcake commented Nov 28, 2023

Which issue does this PR close?

Closes #5132.

What changes are included in this PR?

This extends the match case and also guards the match on if !overwrite.

Are there any user-facing changes?

  • Enables receiving an AlreadyExists error for 403 when overwrite = false.

  • Previously if you had overwrite=false and for some reason you received PRECONDITION_FAILED, you would have received an AlreadyExists error.

@github-actions github-actions bot added the object-store Object Store Interface label Nov 28, 2023
source: Box::new(source),
path: to.to_string(),
},
Some(StatusCode::PRECONDITION_FAILED) | Some(StatusCode::FORBIDDEN)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this, it will convert authorization errors into NotFound which whilst I can see the appeal for your specific workload, would be a regression/bug for others

Copy link
Contributor Author

@emcake emcake Nov 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about extending S3CopyIfNotExists to allow specifying the return code instead?

Copy link
Contributor

@tustvold tustvold Nov 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially, although I think it would need to be a new variant to avoid being a breaking change.

I would strongly encourage you to file this as a bug on whatever store is returning 403 though, it should not be returning 403. Perhaps it might be worth seeing if it can be fixed, as opposed to working around it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be done non-breaking by allowing specification via a new environment variable (COPY_IF_NOT_EXISTS_RETURN_CODE) and make the default value 412?

I would strongly encourage you to file this as a bug on whatever store is returning 403 though, it should not be returning 403. Perhaps it might be worth seeing if it can be fixed, as opposed to working around it here?

That’s possible but it’s a commercial solution so the turn-around time makes this hard.

On your point about 403 being ‘wrong’ - I’d argue we are in non-standard territory and everything is fair game. I’m making use of some overwrite prevention logic and in that context 403 makes sense - you’re not allowed to overwrite the file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d argue we are in non-standard territory and everything is fair game

The HTTP specification is fairly unambiguous as to when a 403 is a valid return code - https://datatracker.ietf.org/doc/html/rfc7231#section-6.5.3

I think this could be done non-breaking by allowing specification via a new environment variable (COPY_IF_NOT_EXISTS_RETURN_CODE) and make the default value 412?

Yeah... It's kind of gross but AWS_COPY_IF_NOT_EXISTS_RETURN_CODE_OVERRIDE is unlikely to accidentally come up elsewhere, and might be a way to move forward should a solution from the vendor not be forthcoming

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The system I’m taking advantage of emulates object lock for non-versioned buckets, which returns 403 when you perform a destructive action:

https://docs.aws.amazon.com/AmazonS3/latest/userguide/troubleshoot-403-errors.html#troubleshoot-403-object-lock

so from its perspective, it’s implementing the API correctly. It’s not pretty but it works.

If this seems like an acceptable solution I’ll create the new environment variable check and modify the match.

Copy link
Contributor

@tustvold tustvold Nov 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough - I also did a pass through https://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html#ErrorCodeList and there doesn't appear to be an error code for legal hold that we could detect and handle.

I think I would prefer adding a new HeaderWithStatus variant, as I think this will be more discoverable and consistent

@emcake
Copy link
Contributor Author

emcake commented Nov 28, 2023

Please ignore, I've just re-read your comment. Will re-implement with HeaderWithStatus

}

impl std::fmt::Display for S3CopyIfNotExists {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
Self::Header(k, v) => write!(f, "header: {}: {}", k, v),
Self::HeaderWithStatus(k, v, code) => {
write!(f, "header-with-status: {}: {k}: {v}", code.as_u16())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The status code comes first because as far as I can tell, all common separators one might use to separate the return code from the header are also valid to have inside the header.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The HTTP protocol uses : as a separator itself, and so is forbidden in header names, and is probably a terrible idea in header values

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll swap the parse order and add some tests validating it.

object_store/src/aws/precondition.rs Outdated Show resolved Hide resolved
}

impl std::fmt::Display for S3CopyIfNotExists {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
Self::Header(k, v) => write!(f, "header: {}: {}", k, v),
Self::HeaderWithStatus(k, v, code) => {
write!(f, "header-with-status: {}: {k}: {v}", code.as_u16())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
write!(f, "header-with-status: {}: {k}: {v}", code.as_u16())
write!(f, "header-with-status: {k}: {v}: {}", code.as_u16())

I think the order should match the variant

object_store/src/aws/precondition.rs Outdated Show resolved Hide resolved
object_store/src/aws/precondition.rs Outdated Show resolved Hide resolved
object_store/src/aws/client.rs Outdated Show resolved Hide resolved
@tustvold
Copy link
Contributor

Thank you for sticking with this

@tustvold tustvold merged commit 8867a1f into apache:master Nov 29, 2023
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
object-store Object Store Interface
Projects
None yet
Development

Successfully merging this pull request may close these issues.

object_store: Allow 403 Forbidden for copy_if_not_exists S3 status code
2 participants