ESP32S3: crash when subscribing with phone but not laptop #106
Comments
Update: the problem seems to occur somewhere in ble_gap_conn_find in ble_gap.c. At first, the connection works fine for a second or so. During this time, every call to ble_gap_conn_find looks for handle 1, which is the first one it finds. Eventually, though, it gets a call to find handle 65535 (0xffff) in a list of handles from 1 to 7 (I have 7 connections active), which fails and returns null, causing the error. It looks as if the handle has been corrupted or deleted. This happens in a call to ble_att_svr_read_handle, which can replace the conn_handle with BLE_HS_CONN_HANDLE_NONE = 0xffff under certain circumstances. Continuing to poke around.

Update 2: Although several notifications succeed and several packets of data are transmitted before something goes wrong, the error happens on the first GAP event after subscription, which is a read event (BLE_GATT_ACCESS_OP_READ_CHR). It seems this event doesn't fire for the first few packets at all. I don't really know how GAP events work, so I can't shed much light there.
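To make the failing lookup concrete, here is a minimal standalone sketch of it. This is a simplified model, not NimBLE's actual code: `Conn`, `find_conn`, `seven_connections`, and `kConnHandleNone` are hypothetical names, and the only values taken from the report are the handles 1 to 7 and the 0xffff marker.

```cpp
#include <cstdint>
#include <vector>

// Simplified stand-in for the host's connection list; NOT NimBLE's real code.
constexpr uint16_t kConnHandleNone = 0xffff;  // mirrors BLE_HS_CONN_HANDLE_NONE

struct Conn {  // hypothetical connection record
    uint16_t handle;
};

// Returns a pointer to the matching connection, or nullptr when no entry
// in the list carries the requested handle.
const Conn* find_conn(const std::vector<Conn>& conns, uint16_t handle) {
    for (const Conn& c : conns) {
        if (c.handle == handle) return &c;
    }
    return nullptr;
}

// Builds the situation from the report: seven connections with handles 1..7.
std::vector<Conn> seven_connections() {
    std::vector<Conn> conns;
    for (uint16_t h = 1; h <= 7; ++h) conns.push_back(Conn{h});
    return conns;
}
```

In this model, looking up handle 1 succeeds, while looking up `kConnHandleNone` can never succeed, because the marker is by definition not a handle of any real connection. Any code that treats a failed lookup as fatal will therefore abort as soon as the marker reaches it.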
Update 3: Here's the full stack trace immediately before the call to ble_gap_conn_find that fails (triggering the assert):
`Backtrace: [...]
0x42037114: ble_gatts_val_access at [idf path]/esp-idf/components/bt/host/nimble/nimble/nimble/host/src/ble_gatts.c:375
0x42037196: ble_gatts_chr_val_access at [idf path]/esp-idf/components/bt/host/nimble/nimble/nimble/host/src/ble_gatts.c:421
0x4203a85a: ble_att_svr_read at [idf path]/esp-idf/components/bt/host/nimble/nimble/nimble/host/src/ble_att_svr.c:398
0x4203b34a: ble_att_svr_read_handle at [idf path]/esp-idf/components/bt/host/nimble/nimble/nimble/host/src/ble_att_svr.c:473
0x42032173: ble_gattc_notify_custom at [idf path]/esp-idf/components/bt/host/nimble/nimble/nimble/host/src/ble_gattc.c:4169
0x42025234: NimBLECharacteristic::notify(unsigned char const*, unsigned int, bool) at [project]/components/esp-nimble-cpp-1.4.1/src/NimBLECharacteristic.cpp:512
0x4202524d: NimBLECharacteristic::notify(bool) at [project]/components/esp-nimble-cpp-1.4.1/src/NimBLECharacteristic.cpp:419
[my function calls]
0x42006a11: loopTask(void*) at [idf path]/esp-idf/components/arduino/cores/esp32/main.cpp:50
0x40382c6d: vPortTaskWrapper at [idf path]/esp-idf/components/freertos/port/xtensa/port.c:131`
Update 4: In ble_gattc_notify_custom, argument txom is non-null until the very last call (the one that causes an error), where it's null. It never seems to be null when I use the MacBook. That's potentially the problem! I'll see if I can figure out what's making it become null eventually.

Update 5: ble_hs_mbuf_from_flat returns NULL when os_mbuf_copyinto returns error code 1 from os_mbuf_append, which is error OS_ENOMEM. It's running out of memory, but why would this happen only with a certain client device?

Update 6: Tried increasing the MSYS_1 block count in menuconfig from 12 to 120; that didn't fix anything. Neither did increasing the HCI buffer counts or the ACL buffer count.
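The failure chain in Update 5 (append fails with an out-of-memory code, so the buffer builder returns NULL) can be sketched in isolation. This is a toy model under stated assumptions: `Pool`, `Buf`, `append`, and `from_flat` are hypothetical stand-ins, not the NimBLE functions of similar names, and the only detail taken from the report is that the error code observed for OS_ENOMEM was 1.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

constexpr int kOk = 0;
constexpr int kENoMem = 1;  // same numeric value the report observed for OS_ENOMEM

// Hypothetical byte budget standing in for the stack's buffer pool.
struct Pool {
    std::size_t free_bytes;
};

// Hypothetical buffer standing in for an mbuf chain.
struct Buf {
    std::vector<uint8_t> data;
};

// Stand-in for the append step: fails with kENoMem when the pool cannot
// cover the requested length, otherwise copies the bytes in.
int append(Pool& pool, Buf& buf, const uint8_t* src, std::size_t len) {
    if (len > pool.free_bytes) return kENoMem;
    pool.free_bytes -= len;
    buf.data.insert(buf.data.end(), src, src + len);
    return kOk;
}

// Stand-in for "build a buffer from flat data": any append failure is
// reported to the caller as a null pointer, as described in Update 5.
std::unique_ptr<Buf> from_flat(Pool& pool, const uint8_t* src, std::size_t len) {
    std::unique_ptr<Buf> buf(new Buf());
    if (append(pool, *buf, src, len) != kOk) return nullptr;
    return buf;
}
```

With a 4-byte pool, a first 3-byte payload fits and a second one does not, so the second call yields a null buffer. The sketch only shows how an allocation failure turns into a null buffer for the caller; it does not explain why the pool would run dry for one client and not another.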
Finally found something that makes it work: in ble_gattc.c, line 4169, replacing "BLE_HS_CONN_HANDLE_NONE" with "conn_handle" makes it work (essentially, instead of using the "no conn handle" marker, it just uses the conn handle that it already has). I have no idea why that fixes things, and I would really like to find a solution that doesn't require each separate computer we use for development to make this edit (to a very obscure file) independently. If anyone knows how to fix this, please let me know!
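One way to picture why the edit helps is the standalone model below. It is a sketch under assumptions, not the real stack: it assumes that when notify has no payload buffer, the value is read locally through the characteristic's read callback, and that the callback looks up the connection for whatever handle it receives. All names (`strict_read_cb`, `tolerant_read_cb`, `notify_without_payload`) are hypothetical.

```cpp
#include <cstdint>
#include <set>

constexpr uint16_t kConnHandleNone = 0xffff;  // mirrors BLE_HS_CONN_HANDLE_NONE

using ReadCb = int (*)(const std::set<uint16_t>&, uint16_t);

// Hypothetical read callback: returns 0 only if the handle belongs to a
// known connection, like the lookup that precedes the failing assert.
int strict_read_cb(const std::set<uint16_t>& conns, uint16_t handle) {
    return conns.count(handle) ? 0 : -1;
}

// Hypothetical tolerant variant: treats the "no connection" marker as a
// local read and skips the lookup instead of failing.
int tolerant_read_cb(const std::set<uint16_t>& conns, uint16_t handle) {
    if (handle == kConnHandleNone) return 0;
    return strict_read_cb(conns, handle);
}

// Models the notify path when no payload was supplied. `patched` selects
// the local edit described above: pass the real handle, not the marker.
int notify_without_payload(const std::set<uint16_t>& conns,
                           uint16_t conn_handle, bool patched, ReadCb cb) {
    const uint16_t read_handle = patched ? conn_handle : kConnHandleNone;
    return cb(conns, read_handle);
}
```

In this model the unpatched path fails with the strict callback, while either passing the real handle or using the tolerant callback succeeds. The tolerant callback is only an illustration of a fix that would live in the library or application instead of in a patched copy of ble_gattc.c; whether it is appropriate depends on what the real callback needs the connection information for.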
Hello, good work digging into this. It seems to be an odd issue for sure. What version of IDF are you using?
Thank you! I'm on ESP-IDF 4.4.3 (it says "v4.4.3-dirty" when I run idf.py --version, not sure if that matters). The version of this repo is 1.4.1 and Arduino is ESP Arduino component version 2.0.5.
@h2zero Any update on this?
Sorry I haven't had time to dig into this much, but as you show in your investigation above there seems to be an issue in the BLE stack somewhere, not in this library. One thing that should probably be done is remove the assert in the
Okay, I've created an issue with Espressif, hopefully they can fix the issue. Thank you!
I have esp-nimble-cpp running on an ESP32S3 from ESP-IDF with Arduino component. I have a characteristic that should be updating and notifying about 3 times per second. I have a MacBook and iOS phone available for testing (waiting for a coworker to test this on PC/Android). When I subscribe to the characteristic from the MacBook, it works fine. When I do so from the iPhone, the ESP32S3 crashes, with the following stack trace:
`assert failed: static int NimBLECharacteristic::handleGapEvent(uint16_t, uint16_t, ble_gatt_access_ctxt*, void*) NimBLECharacteristic.cpp:277 (rc == 0)
Backtrace: [...]
0x40375cde: panic_abort at [idf path]/esp-idf/components/esp_system/panic.c:402
0x4037f9ed: esp_system_abort at [idf path]/esp-idf/components/esp_system/esp_system.c:128
0x40386359: __assert_func at [idf path]/esp-idf/components/newlib/assert.c:85
0x42025357: NimBLECharacteristic::handleGapEvent(unsigned short, unsigned short, ble_gatt_access_ctxt*, void*) at [project]/components/esp-nimble-cpp-1.4.1/src/NimBLECharacteristic.cpp:277 (discriminator 1)
0x42036f54: ble_gatts_val_access at [idf path]/esp-idf/components/bt/host/nimble/nimble/nimble/host/src/ble_gatts.c:375
0x42036fd6: ble_gatts_chr_val_access at [idf path]/esp-idf/components/bt/host/nimble/nimble/nimble/host/src/ble_gatts.c:421
0x4203a69a: ble_att_svr_read at [idf path]/esp-idf/components/bt/host/nimble/nimble/nimble/host/src/ble_att_svr.c:398
0x4203b18a: ble_att_svr_read_handle at [idf path]/esp-idf/components/bt/host/nimble/nimble/nimble/host/src/ble_att_svr.c:473
0x42031fb3: ble_gattc_notify_custom at [idf path]/esp-idf/components/bt/host/nimble/nimble/nimble/host/src/ble_gattc.c:4169
0x42025224: NimBLECharacteristic::notify(unsigned char const*, unsigned int, bool) at [project]/components/esp-nimble-cpp-1.4.1/src/NimBLECharacteristic.cpp:513
0x4202523d: NimBLECharacteristic::notify(bool) at [project]/components/esp-nimble-cpp-1.4.1/src/NimBLECharacteristic.cpp:420
[my function calls inside loop()]
0x420069f9: loopTask(void*) at [idf path]/esp-idf/components/arduino/cores/esp32/main.cpp:50
0x40382c6d: vPortTaskWrapper at [idf path]/esp-idf/components/freertos/port/xtensa/port.c:131`
The crash happens with 2 different apps on the phone (LightBlue and the app I'm developing) so it seems to be something on the ESP side. Does anyone know how to fix this, or at least what's causing it? (A bit of poking around makes it seem like the ESP can't find the connection info for the phone, but I have no idea what would cause that.)