Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error probing device: Error sending ATA command IDENTIFY PACKET DEVICE to '/dev/sr0 #732

Closed
cmurf opened this issue Feb 20, 2020 · 36 comments · Fixed by #1312
Closed

Error probing device: Error sending ATA command IDENTIFY PACKET DEVICE to '/dev/sr0 #732

cmurf opened this issue Feb 20, 2020 · 36 comments · Fixed by #1312

Comments

@cmurf
Copy link

cmurf commented Feb 20, 2020

udisks2-2.8.4-4.fc32.x86_64
systemd-udev-245~rc1-2.fc32.x86_64
both:
kernel-5.6.0-0.rc2.git0.1.fc32.x86_64
kernel-5.5.3-200.fc31.x86_64

[ 6.313699] vm.local udisksd[768]: Error probing device: Error sending ATA command IDENTIFY PACKET DEVICE to '/dev/sr0': Unexpected sense data returned:
0000: 70 00 05 00 00 00 00 0a 00 58 00 01 21 04 00 00 p........X..!...
0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
(g-io-error-quark, 0)

This is in a qemu-kvm guest.

@tbzatek
Copy link
Member

tbzatek commented Feb 20, 2020

Yes, this is known, it's a QEMU issue in the way the CD-ROM device emulation is done. We can't do much here, other than making the warning silent.

@paulmenzel
Copy link

Probably unrelated, but for the record, there is a regression in Linux 5.6-rc1 still not fixed in Linus’ master (5.6-rc3). The patch fixes this.

@bsdice
Copy link

bsdice commented Aug 3, 2024

Yo, hit this today upgrading to 6.6.44 kernel, running udisks2 2.10.1-4 on Arch.

udisksd[5518]: Error probing device: Error sending ATA command IDENTIFY PACKET DEVICE to '/dev/sr0': Unexpected sense data returned:
0000: f0 00 01 00 50 00 02 0a 00 00 00 02 00 1d 00 00 ....P...........
0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
(g-io-error-quark, 0)

Possibly related to this, unverified: https://lore.kernel.org/stable/[email protected]/

@sushi2503
Copy link

Same error here since kernel upgrade to 6.10.2.arch1-2 on Arch.

@paulmenzel
Copy link

@bsdice, @sushi2503, sounds like a Linux kernel regression. Please bisect.

@sushi2503
Copy link

@paulmenzel : For me to. Sorry I haven't the knowledge to bisect, hope bsdice has it...

@bsdice
Copy link

bsdice commented Aug 6, 2024

I compiled 6.6.44 with the following three patches reverted (old dog making an educated guess) and that appears to have fixed the error:

ata-libata-scsi-do-not-overwrite-valid-sense-data-when-ck_cond-1.txt
ata-libata-scsi-fix-offsets-for-the-fixed-format-sense-data.txt
ata-libata-scsi-honor-the-d_sense-bit-for-ck_cond-1-and-no-error.txt

@paulmenzel Enough info to work with? I am not sure if there are interdependencies with the patches so I reverted all three.

@paulmenzel
Copy link

Nice. But as you can see this is the UDisks’ project. It’d be great if you could report this to the commit authors and maintainers.

@bsdice
Copy link

bsdice commented Aug 6, 2024

@paulmenzel Thank you so much for reply, I will just cc the author.

Can you say if this is UDisks mis-interpreting kernel's changed sense data, or if kernel is wrongly sending malformed sense data? Because the patch comments claim to have rectified previously wrong data: "Correct the ATA PASS-THROUGH fixed format sense data offsets to conform to SPC-6 and SAT-5 specifications."

@paulmenzel
Copy link

Can you say if this is UDisks mis-interpreting kernel's changed sense data, or if kernel is wrongly sending malformed sense data?

Sorry I haven’t looked closely, and do not know. (I am not a UDisk person.) I just know, that it wouldn’t matter as the Linux kernel has a no regression policy.

[…]

@istenburg
Copy link

Arch linux Kernel 6.10.3-arch1-2 udisks2 2.10.1-4, also receive this message
Aug 09 10:26:46 I-NET udisksd[1038]: Error probing device: Error sending ATA command IDENTIFY DEVICE to '/dev/sda': Unexpected sense data returned: 0000: f0 00 01 00 50 00 01 0a 00 00 00 00 00 1d 00 00 ....P........... 0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ (g-io-error-quark, 0) Aug 09 10:26:46 I-NET udisksd[1038]: Error probing device: Error sending ATA command IDENTIFY DEVICE to '/dev/sdb': Unexpected sense data returned: 0000: f0 00 01 00 50 00 00 0a 00 00 00 00 00 1d 00 00 ....P........... 0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ (g-io-error-quark, 0)
Is this a hardware error?

@bsdice
Copy link

bsdice commented Aug 9, 2024

Most likely not a hardware error, but a kernel or UDisks regression. I reverted three libata patches and bug went away. I wrote to patch author and ata subsystem maintainer at kernel.org but no response so far. Might be due to vacation time on northern hemisphere. Possibility 1 is that the kernel patches are wrong, 2 that kernel now emits correct sense data which were previously slightly wrong or format garbled, but UDisks cannot process it now correctly. Don't have time to follow this rabbit hole down, maintainers have to do it.

@sushi2503
Copy link

@bsdice : Maybe it would be better to open a kernel bug report ?

@bsdice
Copy link

bsdice commented Aug 9, 2024

I don't want to do it myself. From what I have seen you sub to -stable and reply to GregKH for the specific patch mail that there is a regression. Patches will be dropped if plausible. This puts pressure on patch writer or subsystem maintainers to clarify or fix it.

@istenburg
Copy link

So you can probably rule out a kernel bug.
I tested the lts-kernel, same problem...

@sushi2503
Copy link

Most importantly is that this get fixed.

@paulmenzel
Copy link

So you can probably rule out a kernel bug.

Why? @bsdice wrote that reverting some patches fixed the issue for them.

I tested the lts-kernel, same problem...

Patches are backported to LTS series too. What Linux kernel do you mean exactly? The 6.10.3 from your earlier comment?

@bsdice
Copy link

bsdice commented Aug 9, 2024

@istenburg No you can't rule that out, try a kernel from 3 months ago like 6.6.40. The stable kernel branches all got backported the three patches. That is why you see it now on Arch which is rolling linux package with latest kernel + patches and linux-lts package which is on 6.6 branch + even more stable patches.

@bsdice
Copy link

bsdice commented Aug 9, 2024

https://lore.kernel.org/stable/[email protected]/T/#mc60ff08e7c20d45faf9588e42048b5f01acaa0a3

Also broken: hddtemp, hdparm, probably more. This will be fun to watch, because the kernel is now acting properly, but applications are now misbehaving. UDisks will need a code update...

Edit: After thinking about it for a bit, best way forward probably is to revert the kernel changes, notify software maintainers of code using that API and give them 3 or 6 or 12 months to fix the use of the API. Then re-introduce the kernel change. hdparm will probably be fixed first, so people just have to copy or emulate the changes.

@sushi2503
Copy link

Yep, I have the same error with my external backup /dev/sda disk but began with my sr0.

Fun, as you say. Some codes are made with the elbows!

Let see the machinery...

@christian-heusel
Copy link

After @bsdice has commented on the Arch Linux Forum thread I also found this issue and forwarded it to the LKML thread aswell 😊

This is the message: https://lore.kernel.org/all/[email protected]

@tbzatek
Copy link
Member

tbzatek commented Aug 12, 2024

Sorry I haven’t looked closely, and do not know. (I am not a UDisk person.) I just know, that it wouldn’t matter as the Linux kernel has a no regression policy.

I wouldn't necessarily play on the no-regression-policy note in this case, it's not the same scenario like breaking syscall API. I.e. the following statement from upstream

TL;DR: it is very hard to say that we have introduced a regression, because
this crap has basically been broken in one way or another since it was
introduced...

may be actually correct.

@christian-heusel
Copy link

Well yes and no, especially since it breaks a ton of other tools aswell 😅

@tbzatek
Copy link
Member

tbzatek commented Aug 12, 2024

This is in a qemu-kvm guest.

qemu is known to have incomplete or incompatible implementation of ATA emulation, both for disk drives and optical drives.

The issue discussed recently is a separate one. Suggesting to continue in #1305 (not sure if github allows to split existing discussions).

@tbzatek
Copy link
Member

tbzatek commented Aug 12, 2024

Well yes and no, especially since it breaks a ton of other tools aswell 😅

Any reasonable CI would catch this. I agree that this kind of change is better to coordinate with userspace prior to pushing to stable kernel branches, but it happened in the past a couple of times.

UDisks needs to be fixed either way, this particular code has been sitting there for over a decade with no recent developer having actual hands-on experience with ATA commands.

@bsdice
Copy link

bsdice commented Aug 12, 2024

Niklas came up with a patch to bridge kernel correctness and a little ignorant user space software: https://lore.kernel.org/linux-ide/[email protected]/

Nine days from report to tentative fix, underappreciated agility.

@yan12125
Copy link

From the commit message in https://lore.kernel.org/linux-ide/[email protected]/, it seems something in udisks should be changed?

A lot of user space programs incorrectly assume that the sense data is in descriptor format, without checking the RESPONSE CODE field of the returned sense data (to see which format the sense data is in).

@paulmenzel
Copy link

Sorry I haven’t looked closely, and do not know. (I am not a UDisk person.) I just know, that it wouldn’t matter as the Linux kernel has a no regression policy.

I wouldn't necessarily play on the no-regression-policy note in this case, it's not the same scenario like breaking syscall API. I.e. the following statement from upstream

TL;DR: it is very hard to say that we have introduced a regression, because
this crap has basically been broken in one way or another since it was
introduced...

may be actually correct.

That does not matter. Linux never breaks userspace, so people can be sure they can always update the Linux kernel without problems.

@sushi2503
Copy link

6.10.4, 6.10.5, still the same.

@christian-heusel
Copy link

Yes, the reverts have just been added to the stable series :) https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/queue-6.10/revert-ata-libata-scsi-honor-the-d_sense-bit-for-ck_cond-1-and-no-error.patch

@sushi2503
Copy link

Nice, soon fixed.

@sushi2503
Copy link

Since kernel 6.10.6 I don't see the errors anymore, sounds like fixed.

@tri-llionaire
Copy link

6.10.6 works for me, on 6.10.7 it's broken again. I had to boot back into 6.10.6 to mount the drive.

@sushi2503
Copy link

Here on 6.10.7 and 6.10.8 it's working fine.

@tbzatek
Copy link
Member

tbzatek commented Sep 9, 2024

So I have the affected kernel 6.10.4 here and can reproduce the problem. As a source of inspiration I took a look at udev's id_ata and to my surprise found out the code was originally written by the same author. Only that udev has been maintained properly over the years, while nobody touched udisksata.c since 2012.

Important fixes from udev are systemd/systemd#13654 and systemd/systemd#24923.

Anecdotally this concern more or less applies to udisks as well: systemd/systemd#2362 (comment) and systemd/systemd#2363 (comment)

Posted #1311 with a quick fix, please test. We don't have variety of hardware to test this on.

@tbzatek
Copy link
Member

tbzatek commented Sep 9, 2024

Experimenting with my SATA drives and qemu ATA device, it turned out a SG v3 code path is taken. This udev change has effectively taken sense code checking out of the v3 path while udisks still maintains it and fails on qemu devices: systemd/systemd@66026fc

I don't think that change is right.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants