fix(defrag): close temp file in case of error #18851

Merged: 1 commit, Nov 7, 2024

Contributor

@ghouscht ghouscht commented Nov 7, 2024

See issue #18841 and the discussion #18822 (comment) for details.

As described in the issue, the temp file is closed when defragmentation succeeds. The only case where it is left open is when the bolt.Open call fails. I think it is enough to add the close there; in my opinion there is no need to complicate this further.

Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.

@k8s-ci-robot

Hi @ghouscht. Thanks for your PR.

I'm waiting for an etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ghouscht
Contributor Author

ghouscht commented Nov 7, 2024

/assign @ahrtr

Can you have a look at this, please? I'm assigning you because you started the discussion in the other PR. As written in the description, this single-line change is enough to ensure the temp file is always closed. This error branch was the only one that left the file open.

@codecov-commenter

codecov-commenter commented Nov 7, 2024

Codecov Report

Attention: Patch coverage is 14.28571% with 12 lines in your changes missing coverage. Please review.

Project coverage is 68.87%. Comparing base (4ad9261) to head (060aa3b).
Report is 2 commits behind head on main.

Current head 060aa3b differs from pull request most recent head a19d408

Please upload reports for the commit a19d408 to get more accurate results.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| server/storage/backend/backend.go | 0.00% | 7 Missing ⚠️ |
| server/embed/serve.go | 0.00% | 1 Missing and 1 partial ⚠️ |
| server/etcdserver/api/rafthttp/http.go | 66.66% | 0 Missing and 1 partial ⚠️ |
| server/etcdserver/apply_v2.go | 0.00% | 0 Missing and 1 partial ⚠️ |
| server/lease/leasehttp/http.go | 0.00% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files
| Files with missing lines | Coverage Δ |
|---|---|
| server/etcdserver/api/rafthttp/http.go | 77.34% <66.66%> (ø) |
| server/etcdserver/apply_v2.go | 69.23% <0.00%> (ø) |
| server/lease/leasehttp/http.go | 64.18% <0.00%> (ø) |
| server/embed/serve.go | 57.45% <0.00%> (ø) |
| server/storage/backend/backend.go | 81.08% <0.00%> (-1.75%) ⬇️ |

... and 28 files with indirect coverage changes

@@            Coverage Diff             @@
##             main   #18851      +/-   ##
==========================================
+ Coverage   68.76%   68.87%   +0.11%     
==========================================
  Files         420      420              
  Lines       35524    35531       +7     
==========================================
+ Hits        24427    24471      +44     
+ Misses       9671     9636      -35     
+ Partials     1426     1424       -2     

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4ad9261...a19d408.

@@ -499,6 +499,7 @@ func (b *backend) defrag() error {
 	tdbp := temp.Name()
 	tmpdb, err := bolt.Open(tdbp, 0600, &options)
 	if err != nil {
+		temp.Close()
Member

We also need to remove the temp file; refer to the line below:

if rmErr := os.RemoveAll(tmpdb.Path()); rmErr != nil {

Contributor Author

Is this really needed? etcd removes the db.tmp.* files at startup anyway.

Member

@ahrtr ahrtr Nov 7, 2024

It's the tmp file, not db.tmp.*

EDIT: the tmp file is located under b.db.Path(). It would be better to clean it up sooner.

Contributor Author

So should I simply add an os.Remove(temp.Name()) after the call to Close?

Member

Yes, I think so. Please also log an error message if the removal fails.

Contributor Author

I added something; let me know what you think. Does this also need a unit test?

In my opinion the error log doesn't really provide any value to an etcd user so personally I would omit it. As a user, I do not really care if an old temp file is lying around (it will be deleted on the next etcd restart anyway), which is why I think the log is not needed. However, I'm fine with adding it if you think it is needed 🙂

Member

Is it possible to use defer here for cleanup? I saw that line 525 does the same thing. If there are multiple places to handle, maybe we can consider using defer here.

Member

> Is it possible to use defer here for cleanup?

We don't need to close & remove it in normal case. The tmp will be renamed to the bbolt db file.

> In my opinion the error log doesn't really provide any value to an etcd user so personally I would omit it.

It's for debugging purposes. It should be very rare. Overall it's not a big problem and doesn't deserve much debate.

Member

> We don't need to close & remove it in normal case. The tmp will be renamed to the bbolt db file.

I mean,

defer func() {
    if returnErr != nil {
        // do something, e.g. close and remove the temp file
    }
}()

It's just an option.

@ahrtr
Member

ahrtr commented Nov 7, 2024

/ok-to-test

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahrtr, fuweid, ghouscht

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ahrtr ahrtr merged commit 694ace2 into etcd-io:main Nov 7, 2024
16 checks passed
@ahrtr
Member

ahrtr commented Nov 7, 2024

Could you backport this PR to 3.5 and 3.4?
