Classic HID host: unreliable pairing to some Android devices #607

Closed
Slion opened this issue Jun 14, 2024 · 55 comments

Comments

@Slion

Slion commented Jun 14, 2024

Pico W SDK 1.5.1

Various Android devices are pairing successfully and auto reconnect just fine through Interface, no issue for instance on Huawei P30 Pro (Android 10), HONOR Magic V2 (Android 14) or F(x)tec Pro¹ (Android 11).

However neither Samsung Galaxy Tab S6 (Android 12) nor Tab S8 Ultra (Android 14) could be paired through Interface. Both tablets exhibited slightly different behaviour. The S6 would not show a pairing confirmation prompt but it would still connect without pairing. The S8 would show a pairing confirmation and would connect successfully even though pairing actually failed. Both tablets having failed to pair with the Pico W are not able to auto reconnect once the connection is lost.

If I go through Android system Bluetooth settings to initiate the pairing it eventually succeeds. On both tablets, the first pairing attempt fails but a second attempt, shortly after the first one, hangs for a suspiciously long time and then succeeds. However at least the S8 still manages to lose the pairing after all and eventually keeps prompting to confirm pairing. Trying to pair from Honor Magic V2 Android Bluetooth settings also fails so I guess you are not supposed to do that, you need BluetoothHidDevice to be up and running.


Can the HID host somehow request a different kind of pairing method? That way I could try different ones and see if some work better than others.

I'm getting those logs as usual:

16:52:24:087 -> SSP User Confirmation Request with numeric value '879295'
16:52:24:092 -> SSP User Confirmation Auto accept

When the host initiates the connection it fails straight away with L2CAP_CONNECTION_RESPONSE_RESULT_REFUSED_SECURITY.
That seems to be the case for all Android devices running BluetoothHidDevice or not.

Came across raspberrypi/pico-sdk#1457 so I reset the flash but it did not help.

@Slion Slion changed the title HID host: some devices connect but fail to pair Classic HID host: some devices connect but fail to pair Jun 14, 2024
@mringwal
Member

Resetting the flash should help if the Pico W has a stored bonding, but the remote device has lost it.

You'll need to check the logs for details. You can test hid_host on a desktop, which directly provides HCI log files, see port/libusb for using a USB Bluetooth dongle on Mac/Linux.

You can configure your IO Capabilities via gap_ssp_* in src/gap.h

@Slion
Author

Slion commented Jun 14, 2024

You can configure your IO Capabilities via gap_ssp_* in src/gap.h

Thanks, tried the following but still no joy.

    gap_ssp_set_io_capability(SSP_IO_CAPABILITY_DISPLAY_YES_NO);
    gap_ssp_set_auto_accept(true);

Same with SSP_IO_CAPABILITY_NO_INPUT_NO_OUTPUT.

@Slion
Author

Slion commented Jun 15, 2024

Resetting the flash should help if the Pico W has a stored bonding, but the remote device has lost it.

Mentioned that in edits above, it did not help.

@Slion Slion changed the title Classic HID host: some devices connect but fail to pair Classic HID host: some Android devices connect but fail to pair Jun 15, 2024
@Slion Slion changed the title Classic HID host: some Android devices connect but fail to pair Classic HID host: unreliable pairing to some Android devices Jun 15, 2024
@Slion
Author

Slion commented Jun 15, 2024

I've updated the first post with my latest findings. The situation is not as bad as I initially thought since I eventually found a workaround to pair with those problematic devices. Still I'm planning to provide full logs at some point to enable a thorough investigation and hopefully a fix.

@Slion
Author

Slion commented Jun 16, 2024

Logs from a failed pairing with Samsung Galaxy Tab S8 initiated from Interface - connection succeeds but pairing fails:
tab-s8-pairing-fail.zip

@Slion
Author

Slion commented Jun 16, 2024

Logs from a successful pairing with Honor Magic V2 initiated from Interface:
magic-v2-proper-pairing.zip

@Slion
Author

Slion commented Jun 16, 2024

You'll need to check the logs for details. You can test hid_host on a desktop, which directly provides HCI log files, see port/libusb for using a USB Bluetooth dongle on Mac/Linux.

Not sure what you mean by this. I usually develop on Windows but I have a Linux workstation I use mostly for building Android.

@Slion
Author

Slion commented Jun 16, 2024

  • When I initiate pairing from my Windows PC to my Tab S8 I get the pairing prompt with numeric comparison on both the tablet and the PC and all works fine.
  • When I initiate pairing from Pico W I get L2CAP_CONNECTION_RESPONSE_RESULT_REFUSED_SECURITY.
  • When I initiate pairing from Tab S8 to Pico W I get the pairing confirmation prompt on the tablet without number and pairing fails.

How can I set up the Pico W so that it will send a pairing request with numeric comparison? My hope is that this will behave more like the PC and should work.

@Slion
Author

Slion commented Jun 16, 2024

How can I set up the Pico W so that it will send a pairing request with numeric comparison?

So using SSP_IO_CAPABILITY_NO_INPUT_NO_OUTPUT I would get the simple pair confirmation.
Now using SSP_IO_CAPABILITY_DISPLAY_YES_NO I get the numeric confirmation but otherwise it behaves all the same, no improvements. Pairing from Tab S8 triggers numeric confirmation but eventually fails though connection works. Pairing from Pico W fails early with L2CAP_CONNECTION_RESPONSE_RESULT_REFUSED_SECURITY, no pairing prompt, no connection.

    gap_ssp_set_io_capability(SSP_IO_CAPABILITY_DISPLAY_YES_NO);
    gap_ssp_set_auto_accept(true);
    gap_secure_connections_enable(true);

@Slion
Author

Slion commented Jun 18, 2024

I'm not sure what the pklg files bring over the text files. Looking at the differences between the text files from the successful Magic V2 pairing and the failed Tab S8 one, without knowing much about the protocols involved, they look very similar. One thing stands out on the Tab S8 side though, it says: sm.c.4787: Unexpected PDU 1 in state 82.

No idea what that is though or if it could be relevant somehow😁

@Slion
Author

Slion commented Jun 18, 2024

Yeah ok that's our issue here:

            log_info("Unexpected PDU %u in state %u", packet[0], sm_conn->sm_engine_state);
            sm_pdu_received_in_wrong_state(sm_conn);

And looking at that function implementation we indeed recognize a pairing error:

static inline void sm_pdu_received_in_wrong_state(sm_connection_t * sm_conn){
    sm_pairing_error(sm_conn, SM_REASON_UNSPECIFIED_REASON);
}

State 82 is SM_BR_EDR_W4_ENCRYPTION_COMPLETE. Not sure what PDU 1 is though; it's basically packet[0], which is the sm_pdu_code.

Here is the last packet we received in the logs before that error:
ACL <= 0B 20 0B 00 07 00 07 00 01 00 00 20 10 07 07
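
For reference, the bytes above can be split by hand, assuming this is a single unfragmented HCI ACL packet carrying one L2CAP frame. The sketch below is standalone C, not BTstack code; acl_summary_t and summarize_acl are names made up for this illustration. On this packet it yields connection handle 0x00B, L2CAP CID 0x0007 and a first payload byte of 0x01. In the Bluetooth Core specification, CID 0x0007 is the fixed channel of the Security Manager over BR/EDR and SM code 0x01 is Pairing Request, so "PDU 1" appears to be a Pairing Request sent by the tablet.

```c
#include <stddef.h>
#include <stdint.h>

/* Fields of interest in one unfragmented HCI ACL packet carrying L2CAP.
 * Hypothetical type for illustration only, not a BTstack structure. */
typedef struct {
    uint16_t handle;     /* connection handle: lower 12 bits of the first word */
    uint16_t acl_len;    /* HCI ACL payload length */
    uint16_t l2cap_len;  /* L2CAP payload length */
    uint16_t cid;        /* L2CAP channel identifier */
    uint8_t  first_byte; /* first L2CAP payload byte (SM PDU code on CID 0x0007) */
} acl_summary_t;

/* Read a 16-bit little-endian value. */
static uint16_t read_le16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 0 on success, -1 if the buffer is too short or the lengths disagree. */
int summarize_acl(const uint8_t *buf, size_t len, acl_summary_t *out) {
    /* 4 bytes HCI ACL header + 4 bytes L2CAP header + at least 1 payload byte */
    if (len < 9) return -1;
    out->handle     = read_le16(buf) & 0x0FFF;
    out->acl_len    = read_le16(buf + 2);
    out->l2cap_len  = read_le16(buf + 4);
    out->cid        = read_le16(buf + 6);
    out->first_byte = buf[8];
    if ((size_t)out->acl_len + 4 != len) return -1;
    if ((size_t)out->l2cap_len + 4 != (size_t)out->acl_len) return -1;
    return 0;
}
```

If that reading is right, the remaining bytes 00 00 20 10 07 07 would be the Pairing Request parameters: IO capability, OOB flag, AuthReq, maximum key size (0x10 = 16) and the two key distribution masks.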

@Slion
Author

Slion commented Jun 18, 2024

Turns out handling of SM_BR_EDR_W4_ENCRYPTION_COMPLETE comes only when config defines ENABLE_CROSS_TRANSPORT_KEY_DERIVATION.

So I added the following to my config:

#define ENABLE_CROSS_TRANSPORT_KEY_DERIVATION
#define ENABLE_LE_SECURE_CONNECTIONS

Sadly that only took me one step further to sm.c.4787: Unexpected PDU 1 in state 83, which is SM_BR_EDR_INITIATOR_W4_FIXED_CHANNEL_MASK, which is never handled in that switch.

@Slion
Author

Slion commented Jun 19, 2024

To be confirmed over time but it looks like I found a workaround.
If I turn off secure connections the pairing is successful it seems.
gap_secure_connections_enable(false);

Though if the connection is not secure it's a major concern when we are talking about keyboard input, so I would need to understand how insecure that is exactly.

@mringwal
Member

Hi. Could you try the current version of BTstack on the develop branch? We've fixed something that sounds similar.
If that doesn't fix your issue, please capture the full HCI trace incl. debug output and convert it into a .pklg file with tool/create_packet_log.py and post it here.


The HCI log is the ground truth, while the debug output can help to understand the internal state. It's best to have both, but if I had to choose, I would go with only the HCI trace, as the stack behaviour can be reconstructed from that while the debug output only shows that something might have gone wrong.


With LE, you've got two security levels: LE Secure Connections with MITM protection, by entering the passphrase on the keyboard or by numeric comparison on both systems - that's actually secure.
Any other option is also secure if there's no attacker present during initial pairing. If there's one present, it can either set up a man-in-the-middle (LE Secure Connections without MITM protection), or crack the link key directly (LE Legacy Pairing).

@Slion
Author

Slion commented Jun 19, 2024

Could you try the current version of BTstack on the develop branch?

Can I just replace the one from the Pico SDK with this one?

Shall I enable the following configuration options:

#define ENABLE_CROSS_TRANSPORT_KEY_DERIVATION
#define ENABLE_LE_SECURE_CONNECTIONS

@Slion
Author

Slion commented Jun 19, 2024

please capture the full HCI trace incl. debug output and convert it into a .pklg

I thought that's what I did above. Was the debug output missing? Not sure how to enable it.

@mringwal
Member

Can I just replace the one from the Pico SDK with this one?

Yes. You might need to fix the CMake list of build files however.

You can also test on a desktop system with a USB Bluetooth dongle instead, as the stack behaves more or less identically on all platforms.

Shall I enable the following configuration options:

Yes, please enable these.

@mringwal
Member

I thought that's what I did above. Was the debug output missing? Not sure how to enable it.

Almost. The textual log looks like this:
08:02:37:153 -> [00:00:29.695] CMD => 35 0C 05 01 0B 00 01 00

Could you remove the first timestamp? The conversion tool expects the line to start with [xx:xx:xx.xxx] ...

In any case, I'm interested in the log when running the current version of develop for analysis (if the issue is still present)

@Slion
Author

Slion commented Jun 19, 2024

With LE, you've got two security levels: LE Secure Connections with MITM protection, by entering the passphrase on the keyboard or by numeric comparison on both systems - that's actually secure.
Any other option is also secure if there's no attacker present during initial pairing. If there's one present, it can either set up a man-in-the-middle (LE Secure Connections without MITM protection), or crack the link key directly (LE Legacy Pairing).

This is Classic Bluetooth, though those Samsung devices seem to involve BLE somehow.
Something to do with that cross transport key feature maybe.
May or may not be related to this: bluez/bluez#810

@mringwal
Member

Oh. Missed that. gap_secure_connections_enable(false) disables BR/EDR Secure Connections, which by itself isn't less secure (at least I haven't read otherwise). However, without BR/EDR Secure Connections, Cross Transport Key Derivation is not possible, which most likely avoids the bug you've run into. It should have been fixed in newer versions of the stack.

@Slion
Author

Slion commented Jun 19, 2024

Could you remove the first timestamp?

I'll double check that next time around. It's just noise from the disconnect at the start of the session I think.

Can you confirm that all that's needed for your logs is the following, or do I need to enable the debug log somehow on top of the HCI logs?

target_compile_definitions(picow_bt_example_common INTERFACE
    #WANT_HCI_DUMP=1 # This enables btstack debug
    )

@mringwal
Member

Your log looks good besides the unexpected additional timestamp in each line. The line above should look like this:
[00:00:29.695] CMD => 35 0C 05 01 0B 00 01 00

I don't work on the Pico W currently and don't know the details.

@Slion
Author

Slion commented Jun 19, 2024

Oh. Missed that.

Sorry for spamming you with info on that issue but I had to take a deep dive in your code to get to the bottom of it 🪠
Thankfully it looks like it's paying off 🥳

@Slion
Author

Slion commented Jun 19, 2024

Your log looks good besides the unexpected additional timestamp in each line.

Oh, silly me, that's the timestamp from the VS Code serial monitor 😏
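
Since the extra prefix comes from the serial monitor, it can be stripped before conversion. Here is a minimal sketch in standalone C, assuming the monitor prefix always has the shape "HH:MM:SS:mmm -> " as in the line quoted earlier; strip_monitor_prefix is a hypothetical helper, not part of BTstack or its tools.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns a pointer just past a leading "HH:MM:SS:mmm -> " prefix when the
 * line starts with one, otherwise returns the line unchanged.
 * Hypothetical helper for illustration only. */
const char *strip_monitor_prefix(const char *line) {
    /* 'd' stands for any decimal digit; every other character must match exactly. */
    static const char pattern[] = "dd:dd:dd:ddd -> ";
    size_t i;
    for (i = 0; pattern[i] != '\0'; i++) {
        char c = line[i];
        if (pattern[i] == 'd') {
            if (!isdigit((unsigned char)c)) return line;
        } else if (c != pattern[i]) {
            return line;
        }
    }
    return line + i;
}
```

Applied to each line of a captured log (for instance in a small filter that reads with fgets and writes with fputs), this would leave lines that already start with [xx:xx:xx.xxx] untouched.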

@Slion
Author

Slion commented Jun 19, 2024

I can confirm ENABLE_CROSS_TRANSPORT_KEY_DERIVATION and ENABLE_LE_SECURE_CONNECTIONS are not needed when gap_secure_connections_enable(false);, pairing from the Tab S8 just works simply by disabling those secure connections which isn't less secure apparently.

@Slion
Author

Slion commented Jun 27, 2024

I've had another go at pairing from another Tab S8 Ultra to another Pico W with gap_secure_connections_enable(false); and without ENABLE_CROSS_TRANSPORT_KEY_DERIVATION and ENABLE_LE_SECURE_CONNECTIONS. The connection worked but pairing failed. So it looks like this still needs further investigation. Not sure when I'll get around to it.

@mringwal
Member

Please always upload HCI logs. It should work with gap_secure_connections_enable(true); and ENABLE_CROSS_TRANSPORT_KEY_DERIVATION and ENABLE_LE_SECURE_CONNECTIONS.

@Slion
Author

Slion commented Jun 28, 2024

Please always upload HCI logs. It should work with gap_secure_connections_enable(true); and ENABLE_CROSS_TRANSPORT_KEY_DERIVATION and ENABLE_LE_SECURE_CONNECTIONS.

I would need to try the development branch too. I'm still using the btstack version from the Pico W SDK.

@peterharperuk
Contributor

The pico-sdk dev branch recently updated btstack if that's any help

@Slion
Author

Slion commented Jun 29, 2024

The pico-sdk dev branch recently updated btstack if that's any help

Thanks for the heads up. Here is the commit. Looks like a dependency on some Bluedroid codec was added too. Might be easier for me to just wait for the next Pico SDK release.

@Slion
Author

Slion commented Jun 29, 2024

I tried btstack v1.6.1 both with and without secure connections, ENABLE_CROSS_TRANSPORT_KEY_DERIVATION and ENABLE_LE_SECURE_CONNECTIONS, and I have the same issues. I'll see if I can test the develop branch.

@Slion
Author

Slion commented Jun 29, 2024

The develop branch has the same pairing issues and it also crashes the board somehow soon after connecting.

@Slion
Author

Slion commented Jun 29, 2024

Here are the logs from v1.6.1 failing to pair, I nuked the Pico W flash before that recording too:
v1.6.1-tab-s8-pairing-fail.zip

I still get that Unexpected PDU 1 in state 83 same as mentioned above.

Also the Bluetooth stack on the tablet is somehow messed up after that failed pairing, discovery ain't working until I turn Bluetooth off and back on. One of my Pico W did manage to pair with that tablet at some point. Back then I thought it was because I turned off secure connections but I can't reproduce it now.

@Slion
Author

Slion commented Jun 29, 2024

I tried disabling BLE to test if that improved our pairing somehow but it did not change anything.

@Slion
Author

Slion commented Jun 29, 2024

The develop branch has the same pairing issues and it also crashes the board somehow soon after connecting.

I know why it crashes. I need to adjust for #602.

@mringwal
Member

Please share a HCI log that shows the "Unexpected PDU 1 in state xx" when using the develop branch.

@Slion
Author

Slion commented Jun 30, 2024

There it is:
664b08a-tab-s8-pairing-fail.zip

@mringwal
Member

mringwal commented Jul 1, 2024

Thanks. First impression: the error indicates that the SM is in state SM_BR_EDR_INITIATOR_W4_FIXED_CHANNEL_MASK, which is wrong, as it is not "Initiator" in your log, it should be Responder. (And yes, receiving a pairing request as initiator would be wrong). Stay tuned...

@mringwal
Member

mringwal commented Jul 1, 2024

Ah... there's a Classic Role change before the connection is fully opened and BTstack stores the current role when it receives the Connection Complete event. That explains why BTstack assumes Initiator role. The question now is, what's correct here? Which side is the initiator for the SM pairing, if it is triggered after a Classic role change....

@mringwal
Member

mringwal commented Jul 1, 2024

Do you configure Android to become Peripheral in your Keyboard simulator? If yes, could you try once to stay Central and see if the pairing works as expected?

@Slion
Author

Slion commented Jul 1, 2024

I'm using that BluetoothHidDevice API which is fairly high level. I don't think I have control over the role, Peripheral or Central. I'll take a closer look though. It's possibly worth noting that connection and pairing initiated from the Pico W simply never work with any Android device. That's a topic for another issue though, see #612.

I'm thinking the role change is to be expected in this rather unusual scenario where the device initiates the connection with the host.

@mringwal
Member

mringwal commented Jul 1, 2024

I'm not able to get an answer from the Bluetooth Core v5.4 spec about which side should send the SM Pairing Request after a Role Change. A quick test indicates that iOS16 expects the device in Central role to send it while Android 14 expects the device that initiated the connection to do so.

Could you test the 'develop-sm-role-change' branch and post the HCI log?
dc3e249

In this commit, BTstack tries to switch into Peripheral Role when it receives a Pairing Request although it was in Central Role when the pairing has started.

I'm not happy with this, as there's a clear race condition between both sides sending an SM Pairing Request.
In your log, BTstack is still waiting for the result from the L2CAP Information Request, which happens after Android sent the SM Pairing Request, but there's no guarantee for that and the question would be how a collision (both sides send an SM Pairing Request at the same time) should/could be handled.

@Slion
Author

Slion commented Jul 1, 2024

Looks like it worked:
dc3e24965-tab-s8-pairing-fail.zip

So far I did two tries, one without nuking the flash or recording the logs and it did not work.
A second one with HCI logs after nuking the flash and, to my surprise, pairing was completed successfully.

@Slion
Author

Slion commented Jul 1, 2024

Trying to reproduce that surprising success failed. I tried pairing again after unpairing from the Android tablet and that failed. I nuked the flash and tried again and it failed. Here are the logs from that failed attempt from clean flash:

dc3e24965-tab-s8-pairing-real-fail.zip

I'm not happy with this, as there's a clear race condition between both sides sending an SM Pairing Request.

Maybe that's indeed what's happening here and that's why sometimes it works but mostly it does not.
Surely there is a way to sort it out though. Pairing with Windows PC is reliable for instance. Is there a way to capture logs from pairing with a PC? Would that help? I had in mind to implement an HCI Bluetooth USB dongle with a Pico W that could easily be used to capture such logs. In theory it's fairly easy to do with TinyUSB, not sure if I can easily forward the HCI commands either through BTstack or to the driver directly.

@mringwal
Member

mringwal commented Jul 1, 2024

Thanks for the logs. Both logs get past the initial issue, so that's good. One shows a complete pairing process while the other doesn't change state and reports pairing as failed. I'll retry with my Android 14 phone tomorrow to see if I can reproduce it.

Your Windows PC most likely does not support BR/EDR Secure Connections, as only a few newer USB Bluetooth dongles support it (but it would be visible in the HCI log). This might help to get HCI traces on Windows: https://learn.microsoft.com/en-us/windows-hardware/drivers/bluetooth/testing-btp-tools-btvs
(Again, only if you're curious, I hope that your issue reproduces against my Pixel 7a).

@Slion
Author

Slion commented Jul 1, 2024

I'll retry with my Android 14 phone tomorrow to see if I can reproduce it.

So far I could only reproduce the issue with Samsung tablets as mentioned in the first post. Even Lineage OS worked fine so I'm guessing it could be Samsung specific. I could test a Samsung phone as I have not done it yet. You may not be able to reproduce it with your Pixel.
You should be able to download that Interface app from the Play Store and use it for testing. It will disconnect after 5 minutes I believe unless you buy a subscription. If you want I could also add you to the testing group so you can "buy" free fake subscriptions. I ought to release a new version though as I have made a few useful changes to the Bluetooth menu.

However the issue might be reproducible without the app simply by trying to pair from the Bluetooth settings.

@Slion
Author

Slion commented Jul 2, 2024

One shows a complete pairing process while the other doesn't change state and reports pairing as failed.

Yes, this is consistent with what happened. The first one was a success even though the file is named "-fail". I did not realize it was a success until I was posting it. Apologies about the confusing naming.

The second one, named "-real-fail" was indeed a failed pairing.

@Slion
Author

Slion commented Jul 2, 2024

I just tried with a Samsung Galaxy A22 phone and same issue. Connection is working but pairing failed. That's indeed very much a Samsung specific issue so far.

That brings me to my next point about proper Bluetooth vocabulary. I found these definitions for pairing and bonding.

Pairing: The process of generating, distributing, and authenticating keys for encryption purposes.
Bonding: The process of pairing followed by distribution of keys used to encrypt the link in future reconnections.

Assuming those definitions are valid for both Classic and Low Energy, I believe this issue is in fact about bonding failing but pairing still working. Is that correct?

@Slion
Author

Slion commented Jul 2, 2024

I just tried with a Samsung Galaxy A22 phone and same issue. Connection is working but pairing failed. That's indeed very much a Samsung specific issue so far.

I tried that again and bonding worked this time. Also with the tablet I could get successful bonding again. So far with your patch I have between 25% and 50% chance to get a successful bonding.

@mringwal
Member

mringwal commented Jul 2, 2024

Your logs show that the initial step, Android sending a Pairing Request when we wait to send it ourselves a bit later, is working now. But there's a different issue/bug that causes the failures.
I can observe a similar behaviour on the Pixel 7 / Android 14, but with the patch, it worked 3 times out of 3.

Could you send me 3 more logs where pairing fails for you?

Terminology: I'm usually happy if people don't mix up connecting and pairing/bonding. In this case, the issue is with the last step of the LE pairing/bonding part, where the devices exchange their real Bluetooth addresses. The LE Long Term Keys are generated as part of the Cross-Transport Key Derivation, where both sides calculate the LE key from the Link Key generated in the Classic (BR/EDR) pairing. As this happens locally on each side, the LE key isn't actually transferred over the air...

@Slion
Author

Slion commented Jul 3, 2024

Could you send me 3 more logs where pairing fails for you?

I'm afraid not, it looks like they all succeed today 🥳 I modified the Android app Bluetooth menu to make it clear when the bonding actually worked. Maybe I was seeing false negatives yesterday because the Android Bluetooth settings don't always show newly bonded devices immediately.
3-tries-100%-success.zip

@mringwal
Member

mringwal commented Jul 4, 2024

I was able to identify a bug in the key distribution, which was responsible for the incorrect behaviour in the "S8 dc3e249" case.

While comparing the behaviour of iOS and Android, I've realized that for both OSes, the device in Central Role at the time of the Encryption Change event initiates the CTKD. That's great as BTstack didn't use the current role at this point and it's easy to do so - without implementing any workarounds. Please try the version on develop.
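
To make the rule concrete, here is a rough standalone sketch. It is not the actual BTstack change, and all type and function names are invented for the illustration. It keeps the role seen at Connection Complete separate from the role after later role changes, and decides who starts CTKD from the current role at the time of the Encryption Change event.

```c
#include <stdbool.h>

/* Hypothetical types for illustration; BTstack's real structures differ. */
typedef enum { ROLE_CENTRAL, ROLE_PERIPHERAL } link_role_t;

typedef struct {
    link_role_t role_at_connection_complete; /* snapshot, never updated afterwards */
    link_role_t current_role;                /* follows every role change */
} link_state_t;

/* Record the role reported when the connection is established. */
void on_connection_complete(link_state_t *link, link_role_t role) {
    link->role_at_connection_complete = role;
    link->current_role = role;
}

/* Track a Classic role change on an existing connection. */
void on_role_change(link_state_t *link, link_role_t new_role) {
    link->current_role = new_role;
}

/* Rule described above: the side that is Central when the Encryption Change
 * event arrives starts CTKD. */
bool should_initiate_ctkd(const link_state_t *link) {
    return link->current_role == ROLE_CENTRAL;
}

/* For comparison: deciding from the stale snapshot gives the wrong answer
 * once a role change has happened. */
bool should_initiate_ctkd_stale(const link_state_t *link) {
    return link->role_at_connection_complete == ROLE_CENTRAL;
}
```

With a connection that starts as Central and then switches to Peripheral, the first function says "do not initiate" while the snapshot-based one still says "initiate", which matches the wrong-state error seen earlier in this thread.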

I'm closing this issue as CTKD works even if there are role changes for iOS and Android. Feel free to open a new issue if needed.

@mringwal mringwal closed this as completed Jul 4, 2024
@Slion
Author

Slion commented Jul 4, 2024

Please try the version on develop.

One quick try and it works great so far.

I was able to identify a bug in the key distribution

Awesome work, thank you.

Also the Bluetooth stack on the tablet is somehow messed up after that failed pairing, discovery ain't working until I turn Bluetooth off and back on.

That's still an issue apparently but out of the scope of this ticket and possibly not worth investigating anyway. Also somehow it fixes itself after a few minutes.

@Slion
Author

Slion commented Jul 4, 2024

Shall I keep using ENABLE_CROSS_TRANSPORT_KEY_DERIVATION and ENABLE_LE_SECURE_CONNECTIONS knowing that I still want to support both BLE and Classic?
I only defined them when I started investigating that issue. They were not defined in the HID example.

@mringwal
Member

mringwal commented Jul 5, 2024

Yes. If the other side supports CTKD it's tricky (= I haven't figured it out yet) how to properly avoid it. If the peer starts it and we don't support it, the pairing fails.
