Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CP-35551: Remove dependency on kernel blktap2 driver #364

Draft
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

MarkSymsCtx
Copy link
Contributor

No description provided.

Copy link
Contributor

@rosslagerwall rosslagerwall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me except for some minor complaints about error handling.

control/tap-ctl-allocate.c Outdated Show resolved Hide resolved
control/tap-ctl-allocate.c Outdated Show resolved Hide resolved
return errno;
}

flock(f, LOCK_EX);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check the return value?

Copy link
Contributor Author

@MarkSymsCtx MarkSymsCtx Dec 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Essentially the only way this can fail is with an EINTR or ENOLCK but it's probably worth at least commenting that, similarly the other LOCK_EX and no check.

Only option would be to fail hard which is maybe better than continuing blindly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, EINTR can and should be retried.

ENOLCK is probably quite unlikely, but if it happens and you continue you'd end up in a situation where two processes could be in the critical section at the same time.

/* err = ioctl(fd, BLKTAP2_IOCTL_FREE_TAP, minor); */
/* err = (err == -1) ? -errno : 0; */
/* close(fd); */
flock(fd, LOCK_EX);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check the return value?

control/tap-ctl-free.c Outdated Show resolved Hide resolved
control/tap-ctl-free.c Outdated Show resolved Hide resolved
drivers/tapdisk-vbd.c Outdated Show resolved Hide resolved
drivers/tapdisk-vbd.c Outdated Show resolved Hide resolved
drivers/tapdisk-image.c Outdated Show resolved Hide resolved
err = glob(pattern, 0, NULL, &glbuf);
switch (err) {
case GLOB_NOMATCH:
EPRINTF("%s, failed to find parent NBD path", pattern);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the return value of this function? Above it returns -1 and sets errno, here it returns a value from the GLOB_* namespace, and below it returns -errno.

@jandryuk
Copy link
Contributor

Hi. Nice work on removing the blktap kernel module. Thank you!

I'm trying to modify libxl so it writes xenstores entries to backend/vbd3 so tapback can pick them up. /etc/xen/scripts/block-tap uses tap-ctl to create and remove the tapdisks.

One issue is that libxl waits for XenbusStateInitWait before launching the hotplug script. tapback doesn't move to InitWait without hotplug-status, but the hotplug scripts write hotplug-status. libxl then times out waiting for InitWait. I am using 96ffa61 to move tapback to InitWait and things seem to work with it. It seems like other code in tapback is waiting for hotplug-status connected, so moving to InitWait earlier should be okay for XAPI's use as well as libxl.

With this branch merged into master, I get a "No such file or directory" error when running tap-ctl destroy:

# tap-ctl list 
  286214    0    0        vhd /home/user/test.vhd
# tap-ctl destroy -p 286214 -m 0
No such file or directory
# tap-ctl list 
       -    0    -          - -
# tap-ctl free -m 0
# tap-ctl list 

It seems that the tapdisk process exits when tap_ctl_close() runs. The subsequent tap_ctl_detach() then cannot find the pid. I haven't seen this before. Is this expected?

With tapback, there is some trickiness with physical-device. It expects to lookup devices by major:minor. I modified the block-tap script to write_dev with the nbd path - that is needed since libxl will need that path to pass to QEMU. The hotplug script returns 0:0 for the device major:minor which won't be correct. I guess tapback sees physical-device first, and then physical-device-path second. The second one is correct and maybe takes precedence? I have only tested a single tapdisk at a time, so minor 0 is always correct.

It seems like physical-device code should be removed and only physical-device-path should be used by tapdisk?

I have more testing to do. There were a few instances where tapdisk prints repeats of:

ERROR: errno -5 at vhd_complete: /home/user/fedora1.vhd: op: 3, lsec: 26763264, secs: 1, blk: 6534, blk_offset: 30170439
ERROR: errno -5 at __tapdisk_vbd_complete_td_request: req xenvbd-20-2048.1: read 0x0008 secs @ 0x019866b8 - Input/output error
ERROR: errno -5 at vhd_complete: /home/user/fedora1.vhd: op: 3, lsec: 26763264, secs: 1, blk: 6534, blk_offset: 30170439
ERROR: errno -5 at vhd_complete: /home/user/fedora1.vhd: op: 3, lsec: 26763264, secs: 1, blk: 6534, blk_offset: 30170439
...
ERROR: errno -5 at __tapdisk_vbd_request_timeout: req xenvbd-20-2048.1 timed out, retried 120 times
ERROR: errno -5 at __tapdisk_vbd_request_timeout: req xenvbd-20-2048.1 timed out, retried 120 times

This may have been when I had stray tapdisks hanging around. #377 is help address that issue since block-tap could find the wrong minor.

@MarkSymsCtx
Copy link
Contributor Author

Hi. Nice work on removing the blktap kernel module. Thank you!

I'm trying to modify libxl so it writes xenstores entries to backend/vbd3 so tapback can pick them up. /etc/xen/scripts/block-tap uses tap-ctl to create and remove the tapdisks.

One issue is that libxl waits for XenbusStateInitWait before launching the hotplug script. tapback doesn't move to InitWait without hotplug-status, but the hotplug scripts write hotplug-status. libxl then times out waiting for InitWait. I am using 96ffa61 to move tapback to InitWait and things seem to work with it. It seems like other code in tapback is waiting for hotplug-status connected, so moving to InitWait earlier should be okay for XAPI's use as well as libxl.

With this branch merged into master, I get a "No such file or directory" error when running tap-ctl destroy:

# tap-ctl list 
  286214    0    0        vhd /home/user/test.vhd
# tap-ctl destroy -p 286214 -m 0
No such file or directory
# tap-ctl list 
       -    0    -          - -
# tap-ctl free -m 0
# tap-ctl list 

It seems that the tapdisk process exits when tap_ctl_close() runs. The subsequent tap_ctl_detach() then cannot find the pid. I haven't seen this before. Is this expected?

With tapback, there is some trickiness with physical-device. It expects to lookup devices by major:minor. I modified the block-tap script to write_dev with the nbd path - that is needed since libxl will need that path to pass to QEMU. The hotplug script returns 0:0 for the device major:minor which won't be correct. I guess tapback sees physical-device first, and then physical-device-path second. The second one is correct and maybe takes precedence? I have only tested a single tapdisk at a time, so minor 0 is always correct.

It seems like physical-device code should be removed and only physical-device-path should be used by tapdisk?

I have more testing to do. There were a few instances where tapdisk prints repeats of:

ERROR: errno -5 at vhd_complete: /home/user/fedora1.vhd: op: 3, lsec: 26763264, secs: 1, blk: 6534, blk_offset: 30170439
ERROR: errno -5 at __tapdisk_vbd_complete_td_request: req xenvbd-20-2048.1: read 0x0008 secs @ 0x019866b8 - Input/output error
ERROR: errno -5 at vhd_complete: /home/user/fedora1.vhd: op: 3, lsec: 26763264, secs: 1, blk: 6534, blk_offset: 30170439
ERROR: errno -5 at vhd_complete: /home/user/fedora1.vhd: op: 3, lsec: 26763264, secs: 1, blk: 6534, blk_offset: 30170439
...
ERROR: errno -5 at __tapdisk_vbd_request_timeout: req xenvbd-20-2048.1 timed out, retried 120 times
ERROR: errno -5 at __tapdisk_vbd_request_timeout: req xenvbd-20-2048.1 timed out, retried 120 times

This may have been when I had stray tapdisks hanging around. #377 is help address that issue since block-tap could find the wrong minor.

Yeah, this is very much a work in progress and not ready for merging yet. It looks like there is some unanticipated change to refcounting on one layer or another that causes the VBD to determine it's all done and the server can exit.

@MarkSymsCtx MarkSymsCtx marked this pull request as draft April 25, 2023 15:49
@jandryuk
Copy link
Contributor

My libxl patches are here if anyone wants to look: https://github.com/jandryuk/xen/commits/libxl-blktap

@MarkSymsCtx
Copy link
Contributor Author

@jandryuk if you try this updated PR it should work better. I've now got it working and integrated with xapi and the storage manager code in XenServer without the kernel driver being present on the system. There are still some integration rough edges with the toolstack which need to be resolved but they all should be that side and not in blktap.

@jandryuk
Copy link
Contributor

Thanks, @MarkSymsCtx . tap-ctl destroy is working properly again.

It's been wroking along with a couple of tapback changes I'll try to get posted soon.

Above, I mention two VHDs showing these errors:

ERROR: errno -5 at vhd_complete: /home/user/fedora1.vhd: op: 3, lsec: 26763264, secs: 1, blk: 6534, blk_offset: 30170439
...
ERROR: errno -5 at __tapdisk_vbd_request_timeout: req xenvbd-20-2048.1 timed out, retried 120 times

Both check as invalid:

# vhd-util check -n ~user/test.vhd  
block 1 (offset 0x2027) clobbers footer
/home/user/test.vhd appears invalid; dumping metadata
# vhd-util check -n ~user/fedora1.vhd  
primary footer invalid: invalid cookie
/home/user/fedora1.vhd appears invalid; dumping metadata

I'm not sure how that happened. The fedora one is still booting for my testing, though it ran fsck one at least one or two boots. There is a decent chance tapdisks were left lingering on the VHD files between runs - in other words there could have been two tapdisks pointing at the same VHD file.

@jandryuk
Copy link
Contributor

In block-tap, I had to do this:

    dev=$(tap-ctl create -a $target)
    # returned /run/blktap-control/tapdisk/tapdisk-0
    # get pid and minor
    find_device
    # Create nbd unix path
    dev=$( printf "/run/blktap-control/nbd%ld.%d" "$pid" "$minor" )
    write_dev $dev

Would it make sense for tap-ctl create to return the NBD path /run/blktap-control/nbd$PID.$MINOR instead of /run/blktap-control/tapdisk/tapdisk-$MINOR. tapdisk-$MINOR seems useful for tapdisk-internal operations, but not useful outside of tapdisk.

@jandryuk
Copy link
Contributor

jandryuk commented May 1, 2023

I'm using this branch with 4 commits on top of this PR for libxl support: https://github.com/jandryuk/blktap/tree/pr-364-libxl

it includes #377 because that fixes an issue with block-tap remove. The other commits are all documented for why they exist. I don't know how XAPI works, so they can use a review from that angle.

@jandryuk
Copy link
Contributor

jandryuk commented May 1, 2023

I was also thinking of changing to the newer (~2012!) uri syntax: jandryuk/xen@40e2f59

@v-thakkar
Copy link

Hi 👋 I work at Vates and came across this PR. Love that we're getting closer to removing the dependency on blktap kernel patch to use NBD instead of it. Especially after this change in XAPI: xapi-project/xen-api@6433c79 . I was hoping to test this PR along with #377. But I was wondering @MarkSymsCtx if you also have some branch in SM with the related changes?

We were running into this error and were wondering if that is something we can help with. But since you mentioned above that you have some storage manager code, thought to ask you first. Thanks :)

@MarkSymsCtx
Copy link
Contributor Author

Hi 👋 I work at Vates and came across this PR. Love that we're getting closer to removing the dependency on blktap kernel patch to use NBD instead of it. Especially after this change in XAPI: xapi-project/xen-api@6433c79 . I was hoping to test this PR along with #377. But I was wondering @MarkSymsCtx if you also have some branch in SM with the related changes?

We were running into this error and were wondering if that is something we can help with. But since you mentioned above that you have some storage manager code, thought to ask you first. Thanks :)

As I said above #364 (comment), this is still very much a work in progress. We do have some of the SM changes but we've had to defer all of this work to the next major platform release to avoid breaking customer workloads. Taking the opportunity of the platform release to make these breaking changes.

@jandryuk
Copy link
Contributor

I had been hoping to revisit this $soon to see if we could move it forward. Breaking customer workloads is 😞 though. @MarkSymsCtx do you have any rough idea of your schedule for the next major release? Are you talking months or years? I was hoping to be able to drop the blktap kmod before the next time I have to uprev Linux versions.

I'm working with libxl and not xapi, and that required some changes as seen in https://github.com/jandryuk/blktap/commits/pr-364-libxl. @v-thakkar if you could test jandryuk@dea81cf and jandryuk@680512a with XAPI, that would really help me out. jandryuk@e443937 would also be good to tested, but that might require the nbd changes in this PR.

@v-thakkar
Copy link

I agree with @jandryuk. Will appreciate the details about your schedule for the next major release @MarkSymsCtx , so that we all can work collaboratively to remove the dependency on blktap patch and don't have to keep backporting it to newer kernel versions. :)
@jandryuk Thanks for the link to your commits. I'll try to test it with XAPI next week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants