-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[APU2] Kernel error: gpio-keys-polled gpio-keys-polled: unable to claim gpio 0, err=-517 #204
Comments
HI, we have just recently noticed the issue. I doubt it is a module ordering issue, because the ACPI GPIOs definitions used by gpio-keys-polled come from the GPIO controller which binds to gpio_amd_fch, so one shouldn't work without the other. THe real problem lies within the Linux kernel itself... There is a separate driver added to the kernel for apus: https://github.com/torvalds/linux/blob/master/drivers/platform/x86/pcengines-apuv2.c |
Okay, thanks. I think I understand now... So, we have the modules pcengines_apuv2 and leds_gpio (or gpio_keys_polled?) fighting over the gpio control of these leds. That means as a workaround, we could simply blacklist pcengines_apuv2 until a better fix lands either in the kernel or firmware. I actually just tried that (blacklisting pcengins_apuv2) and the error messages are gone now, while the leds are still operational (and the first led is now on again, once the APU2 is booted). So, that seems to work. Is there any crucial functionality missing if I blacklist this module? I'm guessing no, since the module wasn't available on my older Debian 10 kernel anyway and looking at the code you linked, the only other thing that seems interesting to me is the front button, but I have never used it before either, so I guess I could sacrifice it for a temporary workaround. |
There is no difference between ACPI definitions of the buttons and leds and the kernel driver. Bot support the leds and front button in the same manner. However:
There is no good-for-all solution in this case. If you use the latest firmware, then you are probably better blacklisting the kernel driver, because ACPI will handle it anyway. |
Just to clarify: By kernel driver you mean the module pcengines_apuv2? |
Indeed |
Use ACPI/GPIO to handle LED config instead. pcengines/apu2-documentation#204
@miczyg1 Is the newly released firmware v4.14.0.5 supposed to fix this issue? Going by the release notes, I assumed so, but apparently, it does not. I installed v4.14.0.5 today. After rebooting, the LEDs stay turned off and there are no sysfs entries for them anymore – which is to be expected, because I had blacklisted the pcengines_apuv2 module as a workaround before. If I undo this blacklisting configuration, the LEDs are functional again, but the kernel error messages from gpio-key-polled show up again. So maybe the ACPI GPIO entries are not the root cause for these error messages? |
@silentcreek we will be testing the |
Hey there, leaving this here for posterity and people finding it via search. Just received one of the last APU 4D4 right before the Great Shortage. Flashed last available apu4_v4.14.0.5.rom to it. Since I prefer stability and need some patches, I compiled long-term supported kernel 5.4.156 on Arch. With this version LED control is unavailable because of the ACPI GPIO change in apu4_v4.14.0.5.rom vs apu4_v4.14.0.4.rom. Took me a fair bit of digging to figure this out. I recommend on page https://pcengines.github.io/index.html in known issues a new item:
Support for 5.4 will end only in December 2025 according to https://en.wikipedia.org/wiki/Linux_kernel_version_history so a fair bit of life is left in this one. LEDs will show up in /sys/class/leds/ as per usual. I added ledtrig-heartbeat and ledtrig-netdev to /etc/modules-load.d/apu.conf and removed all blacklisted modules in /etc/modprobe.d, they are not needed. Example:
Happy blinking. ;-) |
@bsdice Have you tried loading the module pcengines_apuv2 or is your custom kernel not compiled with CONFIG_PCENGINES_APU2? |
@silentcreek I have CONFIG_PCENGINES_APU2=m but it is of no use because if you look at https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/platform/x86/pcengines-apuv2.c?h=linux-5.4.y there is no support for "apu4" in 5.4. There is an even older driver, https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/leds/leds-apu.c?h=linux-5.4.y which supports only APU1. This is also why on kernel 5.4 I get no conflicts when using a APU 4D4 board, simply not supported by the specialized drivers. For newer kernels indeed it would appear that on BIOS 4.14.0.4 and below one could blacklist pcengines-apuv2 module and rely on ACPI support of leds-gpio. Whereas on BIOS 4.14.0.5, ACPI is no longer supported and you need a more recent kernel >= 5.5 and the updated pcengines-apuv2 module with APU4 support, see torvalds/linux@3d00da1#diff-c49c6019d36475a9c027f716107d819089d60cb74455e851cad9bf62f4d1e643 S1 seems available through /proc/interrupts on my system and raw GPIO access to switch over SIM slots etc. is also there by using /sys/class/gpio/gpiochip320/ngpio et al. |
@bsdice Ah, sorry, I completly disregarded the fact that you have a different model, namely the APU4. Then I guess that change between v4.14.0.4 and v4.14.0.5 really did more harm than good and should be reverted after all... |
I have done some testing with the the This issue seems to be fixed in the This, combined with the fact that the driver is only present in recent kernel versions, makes me think that the ACPI bindings should be restored. @miczyg1 what do you think? |
@mkopec in such case we need to revert the changes which removed the ACPI bindings and explain in the documentation that the kernel driver is buggy and shall be blacklisted. Later we may try to update the status of these commits and let people know if these 3 commits work or not. |
The documentation was updated here: https://pcengines.github.io/apu2-documentation/linux_issues/ |
Just leaving this here for folks finding this through a search engine... People running Arch may want to check out my Linux kernel 5.4 build script here: https://seitics.de/files/linux-5.4/ Features:
Contact info is at the top of the domain. I've uploaded a test kernel+include package for Arch so it is easy to try. No warranties. I use it on Arch on APU4D4. |
Hi, speaking as the official kernel maintainer of the apu board family driver. Sorry for replying so late, but i just found this ticket only by accident - obviously nobody here ever considered
Due lack of exact information on your Kernel configuration, I can only guess: a) you still got the old (meanwhile removed) apu1 driver running, and these two are conflicting b) you catched the intermediate BROKEN firmware versions, which added some arbitrary, unthoughtful and
Works as designed. The kernel should never mess w/ gpios or leds on its, unless there's a special need for that.
Yes, there was a problem that leds_gpio or gpio_amd_fch might not have been loaded, when pcengines-apv2 tries to initialize them. Only happens if these are compiled as modules. You can test this by loading them in the correct order
And you should blacklist (or not compile) amd-pinctrl - this is the wrong driver, which might be triggered by broken ACPI tables.
Do some gpio or led devices appear w/o loading the apu board driver ? In this case, you're one of the unlucky ones who got a broken firmware (see above). --mtx |
The APU board driver doesn't use APCI at all - and there shouldn't be any gpio ACPI bindings at all, since APCI just can't describe the whole setup in a sane way. And certainly: DO NOT USE AMD0030 (!!!) - it's the WRONG device, and risk hardware damage.
The APCI bindings (which were introduced much later than the pcengines board driver - before you that, you folks actually suggest playing dangerous games with raw register access from userland!) have been broken from the beginning and should have never been introduced w/o at least speaking to the kernel maintainer. You broke it, not us. |
You have the (stable) apu board driver fighting with broken newer firmware. Downgrade the firmware.
Depending on firmware version: several gpios (eg. reset lines), different and now suddenly model specific naming of leds, different keycode, ... or maybe loosing it all. And by the way: this is isn't the correct place for discussing kernel issues - just for reporting broken firmware. --mtx |
@metux Thanks for the insight. I perceive an aggressive tone though. Why that? I realize this is sometimes needed on LKML and Torvalds has been doing it for decades, but this isn't LKML... ;-) I think you just made people delete their posts because I see comments in my inbox which are no longer here. Anyhow. APU2 seems EOL because of chip shortages. There hasn't been a Coreboot update since half a year, so I presume a contract with Pascal ran out. Which probably means this ACPI issue will never be fixed in Coreboot PCengines branch. I am inclined to patch it locally and compile one last final ROM for APU4D4 myself. Because userland obviously should never be able to damage hardware. I also checked kernel 6.2 and somewhere in last months CONFIG_PINCTRL_AMD became bool, so can't make it a module anymore and blacklist it. Would need to disable it entirely. I realize in the field out there basically no-one will be running 6.2. |
There's a lot. If you had talked to me (or maybe asked Pascal + friends - I had explained it detail), before making such wild hacks, you would have known, that this statement is wrong: The apu board driver adds only those gpios that are known to be safe, the LED names are not model specific.
As said, it names the LEDs in HW specific ways, thus userland needs to change again for every single board model.
Support is under way. But since hw/firmware vendor is so uncooperative, not even attempts to talk to the official kernel maintainer, I didn't have this on high priority.
No, that's a very bad solution. As said: existing userland - in the field (or even on high seas) for many years - now needs to be changed. And it's using the WRONG driver. And no: ACPI just cannot describe the full setup in a sane way. |
Whohoh! You're binding two drivers to the same register space ? Good luck that. As the mess w/ broken firmware and uncorpative vendor continues, I'm really considering APCI blacklisting.
This breaks stable distros and make industrial customers unhappy. Why should anybody still spend so much money for such an unreliable platform instead of some cheap china stuff from Wish or Aliexpress ? --mtx |
Your "fix" made it worse. Loosing existing functionality and breaking existing userland. Why didn't you just talk to me first ? Such things don't work well by a one man show. We have good reasons for our extensive open discussions and review process. |
Actually, I'm quite dissappointed and pissed. You probably don't know the long story behind this, so I give you a short overview: Back when writing the apu board driver, I already was in contact w/ PCengines (they knew me and my role). At that time there was no intent to change anything at PCengies side. The APU board driver (that doesn't use any acpi stuff) went into mainline and was shipped by Distros. So people now got mainline and Distro support for LEDs, front keys, sim switch and pcie/baseband reset. (assuming their hardware was correctly wired as documented in the schematics). Some time later (too lazy to scan the git log again, when exactly that was), they suddenly added some pieces of ACPI bindings, in a quick shot (w/o even telling us first), changing the LED naming (now model specific) and key code and dropping all the other stuff. And the worst of all: highjacking an ACPI ID for some very different devices, that just happens to share some common ancestor with the actual one (creating potential for HW damage by uncautious users) Creating a situation where - depending on firmware version (!) - where users loose functionality and changing userland ABI - thus breaking existing userland. And who's gonna have to deal with pissed users (or lots of broken installations in the field) ? Guess who ! I'm really fed up w/ cleaning up the mess of the firmware vendor again and again. I hoped that PCengines would be different, being the one x86 vendor doing some decent work. But I'm DEEPLY DISAPPOINTED. The whole mess could have been prevented several years ago, if somebody there had spent a few mins to pick up the phone and call me (or at least drop a mail). With a bit of cooperation we could have full mainline support for the whole APU line, all models, including the fresh ones.
Please do so, and submit a pull request. (sadly I'm too busy for also taking care of FW) I actually thought about the same, but just have limited range of board models right now, so testing will be tricky.
Indeed. This was exactly the main reason for writing a new gpio driver.
Yes, that's because it's just meant for some very low level board/soc specific stuff - it doesn't make any sense to load it later (eg. after PCI setup). Sorry for being noisy, but I have to repeat it again: this is about an entirely different device. Pcengines highjacked the device id of a different device -- thus broke ACPI rules.
For the time being, I don't see much more practical chance except recompiling w/ CONFIG_PINCTRL_AMD. Maybe (if I'm bored enought), I could add specific blacklisting in APCI core. But that will be last resort, we already have too much of such nasty quirks in the kernel. And at certain point it's really questionable whether we kernel folks should clean up the mess that HW vendors produce, just for free. At certain point we have to say: "sorry folks, not our problem, get some decent HW instead". |
@metux If I may ask, how did you stumble on this issue? Did you wreck a board? Wreck how much? Or did some commercial user wreck one? How is that your problem even? Comparing an APU2..4 to one of those Made in China boards is too strong for me. Basically nobody does thermal management or signal integrity right. Pascal does, knows his stuff. And no-one has ECC RAM which works properly in an embedded AMD64 platform. Everything made in Asia will just crash instead. I just built a release 4.19.0.1 for my APU4 using pce-fw-builder on coffee break. Now trying to figure out what I want to get rid of. pcengines/coreboot@bca61d7 and maybe something else. |
The file gpio.asl i.e. AMD0030 as mentioned by you was last touched in summer of 2019 (!) https://github.com/pcengines/coreboot/commits/release/src/mainboard/pcengines/apu2/acpi/gpio.asl Are you certain that leds.asl and buttons.asl aren't just a minor nuissance which prevent pcengines_apuv2 from loading? So delete gpio.asl and also from dsdt.asl. Quite the lobotomization. |
After accidentially writing to some regs in the suppsed gpio bank, the board behaved erratic. What I do recall is the words "undefined" and "reserved" in some AMD FCH specs. Now we know that "reserved" really means "never ever touch it, or you break things".
Yes, Asian boards are usually a great deal of fun. If I wasn't under NDA, I could tell funny stories from various industrial places.
Look for the gpio ASL files. |
Are we looking at the same branch ?
"minor" depends on whether you actually need this stuff and have existing applications written for the mainline kernel.
Yep, just drop that ASL and rebuild, should do the trick. |
I'm using the default Debian kernel configuration which enables the APU1 and APU2 drivers as modules:
No. The APU1 module is not loaded by default on my system.
Which versions do you mean? Looking at the history of src/mainboard/pcengines/apu2/acpi/, the GPIO bindings were added in 2019, then they were removed in October 2021 and then the removal was reverted in November 2021. Do you mean all releases between July 2019 and today, except for then one between October and November 2021 are broken, intermediate releases? In any case, yes, I was/am using those firmware releases.
Doing this leads to the original kernel error message reported here. But that is, per your description, to due the APCI definitions in the firmware.
The module pinctrl_amd is not loaded on my system by default. So, it doesn't seem to be triggered by these ACPI tables (and I haven't blacklisted it). Edit: Oh, Debian seems to compile-in the pinctrl_amd driver, rather than building it as a module (
Yes, since I blacklisted the APU2 driver after this discussion here, the leds appear in sysfs. I didn't check other GPIO devices, because I was only interested in controlling the LEDs and that works just fine.
I don't know the proper way to look for ACPI devices, but Edit: I did find something under /sys/class/gpio: |
From a user perspective, I think this is a bit shortsighted. The error manifested itself after a kernel upgrade (4.19 to 5.10) which brought a new driver. The firmware or hardware haven't changed and at that point, so it isn't too far-fetched to think it's a kernel regression - again, just as a mere user. Now, I'm not trying to suggest that you're wrong (especially since I can't make any technical judgements here). I'm just trying to say, I wouldn't paint this as black and white, especially considering, that the first mainline release including your driver and the first firmware release with these ACPI bindings are just 3 months apart.
True, indeed. My bad. But to be fair, I think 3mdeb was in general really helpful with all sorts of issues, and unfortunately, I can't say I've always gotten replies to bug reports sent to LKML or Kernel Bugzilla. |
So, I built a firmware with these three asl files removed, and their respective include statements removed from dsdt.asl as well. Now, when I boot my system (Debian 11 with Linux 5.10), I get the original kernel error messages again! I checked that the APU1 driver is not loaded and I can also see that the gpiochip with the alias "acpi:AMD0030:AMD0030:" is gone (there is a different gpiochip now with the label gpio_amd_fch). In addition, the leds show up with different names now in sysfs and there are no duplicate entries, leading me to believe that removing the ACPI bindings worked. So, I checked to see if the modules are loaded in the correct order, but that doesn't make a difference either.
Even if I remove the modules and load them manually in this order, the error message shows up. Any ideas how to resolve this? |
@metux The more I think back to the history of this issue, the more I believe we're missing something here.
This is actually not causing the kernel errors I reported. It does not matter which firmware you use. The error still happens. After reading through this issue here again, I noticed that I actually did try a firmware release where 3mdeb removed the ACPI GPIO definitions (v4.14.0.5) back in October 2021. I still got the kernel errors. So, it's no surprise that my newly custom built firmware without the APCI bindings doesn't solve the problem either. We're still missing the root cause for the kernel errors, it seems. |
I tried to fix the error as well (Linux kernel 6.2.1, latest firmware lobotomized) but failed and ran out of time. Switched back to v4.16.0.1 03/07/2022 on APU4D4, which I think is one of the most stable for me. Updated to kernel 5.4.233 with 1004-gpio_keys_polled.patch and 1004-gpio_keys_polled.patch patches as mentioned, also with option CONFIG_PINCTRL_AMD=y to eliminate the driver race. Nothing blacklisted. Not using any GPIOs and I also don't care if LED names are platform-agnostic. ;-) |
Possibly late to the party but I fixed this the "easy" way by simply loading a modified DSDT at boot-time. Debian and many other distro kernel configs enable The process is pretty straight-forward:
On Debian (using
Confirmation appears at the end of the verbose (
Only when NOT using patched Debian
If all goes well on reboot the kernel should report something similar to:
And now the generic naming of the items ("apu" not "apu2") is in use so any scripts that manipulate them will need correcting:
|
Thanks. Let me discuss with the team the correct approach here. Then, we can extend the validation suite for Dasharo (coreboot+SeaBIOS) to check if this method works on every release. It will likely not happen in the upcoming one, but it will happen in the next one. |
Use ACPI/GPIO to handle LED config instead. pcengines/apu2-documentation#204
Hi,
after upgrading my APU2 from Debian 10 (Kernel 4.19) to Debian 11 (Kernel 5.10) I'm getting a couple of these error messages at boot time:
kernel: gpio-keys-polled gpio-keys-polled: unable to claim gpio 0, err=-517
I'm also noticing that the first LED stays off which was not the case before.
I found a forum discussion [1] that suggests that this issue should be fixed in Linux 5.3 and that this is likely a module ordering issue. The former can't be true since my kernel is newer than 5.3, obviously. And as for the latter: I tried explicitely loading the mentioned modules leds-gpio and gpio_amd_fch in different orders by adding a cofiguration file in /etc/modules-load.d/ but that didn't work.
I also tried upgrading to the latest coreboot version v4.14.0.4 (was on some 4.12.x release before), but that didn't help either.
Does anyone know how to fix this? Thanks!
Timo
[1] http://pcengines.spruz.com/forums/?page=post&id=F1D75D99-D369-4310-B098-2F222A3CDBAC&fid=1A77794F-FF7D-44CA-AF64-CAA2588102ED
The text was updated successfully, but these errors were encountered: