feat: implement conditional delete feature for s3 objects #13330
base: main
Conversation
All contributors have signed the CLA ✍️ ✅
localstack-bot left a comment
Welcome to LocalStack! Thanks for raising your first Pull Request and landing your first contribution. Our team will reach out shortly with any reviews or feedback. We recommend joining our Slack Community and sharing your PR on the #community channel. Please make sure you are following our contributing guidelines and our Code of Conduct.
I have read the CLA Document and I hereby sign the CLA
bentsku left a comment
Hello @mfurqaan31 and thanks for your contribution!
We have contribution guidelines that detail what a PR should look like and what it should contain. It needs to have tests that are validated against AWS to show how the feature behaves.
In its current state, this PR will fail the pipeline because we were testing the previous behavior. See the tests in tests.aws.services.s3.test_s3_api.TestS3DeletePrecondition.
Currently, the implemented behavior is not correct: only Directory Buckets support x-amz-if-match-last-modified-time and x-amz-if-match-size, so the behavior you changed around them is not correct (LocalStack does not support Directory Buckets yet). The link you provided only talks about AWS S3 supporting If-Match headers.
Conditional writes and deletes come from a concurrency problem (to avoid overwriting files, etc), and this part is not addressed in your PR. We have systems in place in S3 with locks (as in PutObject) to make sure concurrent writes cannot happen.
Let me know if you want to push further in this PR, or if you'll need any help. Thanks again for the contribution!
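The concurrency concern raised above can be illustrated with a minimal in-memory sketch (the class and method names here are hypothetical, not LocalStack internals): the ETag comparison and the delete must happen inside one critical section, otherwise a concurrent write can land between the check and the delete.

```python
import threading


class TinyStore:
    """Toy object store illustrating an atomic If-Match delete."""

    def __init__(self):
        self._objects: dict[str, str] = {}  # key -> etag
        self._lock = threading.Lock()

    def put(self, key: str, etag: str) -> None:
        with self._lock:
            self._objects[key] = etag

    def delete_if_match(self, key: str, expected_etag: str) -> bool:
        # The check and the delete form a single critical section: doing
        # the ETag comparison outside the lock would allow a concurrent
        # put() to slip in before the delete.
        with self._lock:
            current = self._objects.get(key)
            if current is None or current.strip('"') != expected_etag.strip('"'):
                return False  # precondition failed
            del self._objects[key]
            return True
```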
Thanks for the feedback, I just want to confirm my next steps before updating the PR:
1. Add tests to validate the conditional delete behavior against AWS.
2. Update the implementation so that only the If-Match conditional delete is supported for non-directory buckets, while the other preconditions (If-Match-Last-Modified-Time and If-Match-Size) remain unsupported (since directory buckets are not implemented yet).
3. Review and handle any potential concurrency concerns, similar to how PutObject uses locking in S3.
Please confirm if this plan aligns with what you expect for this PR.
Yes, those steps look good! This might be a lot of work and a complex feature, so feel free to ping me if there's anything I can help you with.
@bentsku can you please review this.
@mfurqaan31 seems like there are linting issues, could you please run
Force-pushed from 953ab46 to 84cdc7e
bentsku left a comment
Sorry for the delay in re-reviewing the PR!
Thanks a lot for providing the requested changes! When checking out the PR and looking at what would be required to provide full concurrency support, the changes would be quite substantial and would need updates to other operations (PutObject, CopyObject and DeleteObjects as well, to ensure that no other "write" operation on S3 could target the same object).
So I'd suggest that we remove the locking part and add it in a follow-up PR, in order to reduce the scope and complexity of the work in this one.
As I mentioned earlier in this PR, this is a complex feature with wide-ranging implications, sorry for being a bit risk-averse here 😅
On the other hand, I'd like to have more test coverage, especially on versioned buckets if possible. Once this is addressed and the code is free of locks, I believe we should be good to go 👍
```python
if if_match:
    # If If-Match is specified and object doesn't exist, it's a precondition failure
    raise PreconditionFailed("Object does not exist", Condition="If-Match")
```
question: where is this validated? I cannot see this exception being raised in the tests
This exception is not validated in the tests; it was also being raised in the post_object function of the S3Provider class without validation, so I did not validate it separately either.
```python
# Normalize ETags by removing quotes for comparison
object_etag = s3_object.etag.strip('"')
expected_etag = if_match.strip('"')
if object_etag != expected_etag:
    raise PreconditionFailed(
        "At least one of the pre-conditions you specified did not hold",
        Condition="If-Match",
    )
```
we have this code 3 times in the delete_object function, might be worth pulling it into a small utility
```python
self._notify(context, s3_bucket=s3_bucket, s3_object=found_object)
store.TAGS.tags.pop(get_unique_key_id(bucket, key, version_id), None)
# Acquire lock before any operations to ensure atomicity
with self._preconditions_locks[bucket][key]:
```
note: I think this is a bit problematic for a few reasons:
Right now, we do not create a precondition lock for non-versioned buckets, which means that we won't properly lock concurrent delete_object calls against put_object and copy_object. This is somewhat out of scope for this PR and we could tackle it in a follow-up, but it should follow the same logic as put_object more closely: acquire the internal lock of the S3 object and do minimal logic inside the lock block.
Sorry for the back and forth on this, but I would suggest we remove the locking entirely for now; it can be added in a follow-up PR, as it probably requires changes in PutObject, DeleteObjects and CopyObject if we want it to work properly.
So let's remove the locking, and focus on making sure IfMatch works properly 👍
```python
class TestS3DeletePrecondition:
    @markers.aws.validated
    @markers.snapshot.skip_snapshot_verify(paths=["$..Error.RequestID", "$..RequestId"])
    def test_delete_object_if_match_non_express(self, s3_bucket, aws_client, snapshot):
```
question: why are we skipping those fields? I don't see them in the snapshots
In your code, you've updated the logic related to Buckets with versioning enabled, however I don't see any tests validating those changes. It would be great to have coverage for it if possible 👍
I have removed the tests test_delete_object_if_match_non_express and test_delete_object_if_match_all_non_express and replaced them with test_delete_object_if_match_success and test_delete_object_if_match_failed.
I see, thanks! But it would be great to also write tests that verify that this works properly in all cases. Right now the behavior with versioned buckets is not validated, so we cannot be sure that the code in the provider is in line with what AWS does.
This is related to my other comments about the PreconditionFailed("Object does not exist") exception you are raising: we do not validate that it is the exception AWS actually raises.
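A toy model of the versioned-bucket behavior the requested tests would need to pin down. Everything here is an assumption from my reading of the S3 conditional-delete docs (compare If-Match against the current version's ETag; on success a versioned bucket gets a delete marker rather than a physical delete), and exactly the kind of semantics that should be snapshot-validated against AWS:

```python
class VersionedBucketModel:
    """Toy model of DeleteObject with If-Match on a versioned bucket.

    Assumed semantics, to be validated against AWS, not LocalStack code.
    """

    def __init__(self):
        # list of (version_id, etag); etag=None represents a delete marker
        self.versions: list[tuple[str, str | None]] = []

    def put(self, version_id: str, etag: str) -> None:
        self.versions.append((version_id, etag))

    def delete_if_match(self, if_match: str) -> None:
        # If-Match is compared against the *current* version's ETag.
        # What AWS returns when the current version is a delete marker is
        # one of the open questions the tests should answer; we raise
        # ValueError as a PreconditionFailed stand-in for both cases here.
        current_etag = self.versions[-1][1] if self.versions else None
        if current_etag is None or current_etag.strip('"') != if_match.strip('"'):
            raise ValueError("PreconditionFailed")
        # On success, a versioned bucket records a delete marker instead of
        # physically removing the version.
        self.versions.append(("marker", None))
```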
Sure, will do it.
Motivation
This PR addresses issue #13323, which requests support for conditional delete operations in S3. These conditions allow object deletions to occur only when specific criteria are met.
Changes
- Implemented the --if-match condition for S3 delete-object operations.
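A sketch of how the flag could be exercised against a running LocalStack instance (bucket and key names are examples; --if-match on delete-object requires a reasonably recent AWS CLI, and awslocal is LocalStack's CLI wrapper):

```shell
# Create an object and capture its current ETag
awslocal s3api put-object --bucket demo-bucket --key report.txt --body report.txt
ETAG=$(awslocal s3api head-object --bucket demo-bucket --key report.txt \
    --query ETag --output text)

# Delete succeeds only if the stored ETag still matches; a stale ETag
# should fail with PreconditionFailed
awslocal s3api delete-object --bucket demo-bucket --key report.txt \
    --if-match "$ETAG"
```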