Startup fails due to GPS module communication #340

Open
maehw opened this issue May 5, 2023 · 6 comments

@maehw

maehw commented May 5, 2023

Hi all!

As I've taken my notes in English (and everything is becoming quite technical), I am adding this as an issue on GitHub, not to the OBS forum:

We are currently trying to build some OBS HW v00.03.12. The flashed firmware version is OBS/v0.18.849.

As we've previously seen GPS modules fail, we tested the modules in advance: we hooked them up to a Windows PC using a USB/serial converter cable (via a small test jig) and ran the u-blox u-center GNSS evaluation software. Everything worked fine: with their antennas attached, all the GPS modules got a fix quickly, with at least 4 satellites in sight!

Once soldered onto the OBS PCB, however, the OBS does not boot up correctly and "freezes", showing "0 sats SN:0" on the display.

I've made two approaches to narrow the issue down:

  1. Use a Saleae logic analyzer to sniff and analyze (using a UBX HLA) the traffic between the OBS and the GPS module; no ESP32 UART connection to the PC; the OBS is powered by operating its switch
  2. Look at the ESP32 UART logging data and also sniff the traffic; the OBS power switch doesn't play any role here

Approach 1

Initialization: agreement on a common baud rate

  • Around 55 ms after powerup, the GPS module sends a lot of NMEA data at 9600 baud (multiple $TDINF messages, $GNRMC, $GNGGA, 2x $GNGSA, 3x $GPGSV, $BDGSV, $GNTXT including the text ANTENNA OK)
  • After another approx. 540 ms, the GPS module sends some more NMEA messages
  • After this burst, another approx. 670 ms pass until the OBS starts to send data itself: it queries CFG-RINV at 115'200 baud twice, each time without getting an answer within 200 ms
  • OBS then switches to 9600 baud and sends CFG-PRT for reconfiguration to 115'200 baud, directly switches itself back to 115'200 baud and re-sends CFG-RINV at the higher baud rate while the GPS still sends using the lower 9600 baud rate
  • GPS reconfigures to using 115'200 baud and sends an ACK-ACK for the CFG-PRT message at that higher rate
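
The requests in this exchange are UBX frames. As a minimal sketch of what goes over the wire, the following builds such a frame with the two-byte 8-bit Fletcher checksum used by UBX. The function names `buildUbxFrame` and `buildCfgRinvPoll` are made up for this illustration and are not taken from the OBS firmware; the sketch assumes a CFG-RINV poll is class 0x06, ID 0x34 with an empty payload (which matches the `0x3406` seen in the firmware log).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Build a UBX frame: sync chars, class, ID, little-endian payload length,
// payload, then the 8-bit Fletcher checksum computed over class..payload.
// (Hypothetical helper for illustration, not the firmware's implementation.)
std::vector<uint8_t> buildUbxFrame(uint8_t msgClass, uint8_t msgId,
                                   const std::vector<uint8_t>& payload = {}) {
    std::vector<uint8_t> frame = {
        0xB5, 0x62, msgClass, msgId,
        static_cast<uint8_t>(payload.size() & 0xFF),
        static_cast<uint8_t>((payload.size() >> 8) & 0xFF)};
    frame.insert(frame.end(), payload.begin(), payload.end());

    uint8_t ckA = 0;
    uint8_t ckB = 0;
    for (std::size_t i = 2; i < frame.size(); ++i) {  // skip the two sync chars
        ckA = static_cast<uint8_t>(ckA + frame[i]);
        ckB = static_cast<uint8_t>(ckB + ckA);
    }
    frame.push_back(ckA);
    frame.push_back(ckB);
    return frame;
}

// Poll request for CFG-RINV (assumed: class 0x06, ID 0x34, empty payload).
std::vector<uint8_t> buildCfgRinvPoll() {
    return buildUbxFrame(0x06, 0x34);
}
```

Under these assumptions, `buildCfgRinvPoll()` yields the bytes `B5 62 06 34 00 00 3A B4`, which is the pattern to look for in a logic analyzer capture.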

Continued failed queries of the remote inventory

  • After around 200 ms the OBS resends the CFG-RINV query as it has not gotten a response in the meantime
  • After another approx. 200 ms the OBS again resends the CFG-RINV query as it has not gotten a response in the meantime
  • The GPS doesn't seem to care and sends some $GNRMC, $GNGGA, $GNGSA, $GPGSV, $BDGSV, $GNTXT NMEA messages
  • After another approx. 200 ms the OBS again resends the CFG-RINV query as it has not gotten a response in the meantime, and then does so two more times!
  • OBS finally issues a CFG-RST request
  • GPS acknowledges the request by responding with an ACK-ACK for CFG-RST

Failed message configurations after reset and still no remote inventory

  • After some 85 milliseconds delay, again NMEA data from the GPS module
  • OBS sends a CFG-MSG request for message 0x0B 0x32, i.e. AID-ALPSRV
  • GPS replies with ACK-NACK
  • OBS requests AID-ALP, MON-VER, MON-HW, NAV-STATUS and CFG-NAV5
  • GPS replies with a 37-byte MON-VER (which does not seem to conform to the specification!)
  • GPS replies with 36 byte long CFG-NAV5
  • After another approx. 200 ms the OBS sends another CFG-NAV5 request
  • GPS replies with CFG-NAV5
  • After a short delay, the GPS again sends some $GNRMC, $GNGGA, $GNGSA, $GPGSV, $BDGSV, $GNTXT NMEA messages
  • OBS once again requests CFG-RINV
  • No reply from the GPS module within approx. 200 ms so OBS once again requests CFG-RINV
  • Still no reply from the GPS module within about 240 ms
  • OBS requests CFG-MSG for 0x01 0x20, i.e. NAV-TIMEGPS with a rate of 1 on the relevant port
  • ACK-NACK from the GPS module
  • OBS requests CFG-MSG for 0x01 0x03, i.e. NAV-STATUS with a rate of 1 on the relevant port
  • ACK-NACK from the GPS module
  • OBS requests CFG-MSG for 0x0A 0x09, i.e. MON-HW with a rate of 1 on the relevant port
  • ACK-NACK from the GPS module
  • Afterwards, the cyclic NMEA message bursts from the GPS module continue - no UBX messages, no more data from the OBS
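
To tell the ACK-ACK and ACK-NACK replies above apart when reading a capture, a small classifier can help. This is only a sketch under the assumption that acknowledgements are UBX class 0x05 frames (ID 0x01 for ACK-ACK, ID 0x00 for ACK-NAK) whose 2-byte payload echoes the class and ID of the request. `AckResult` and `classifyAck` are names invented here, not firmware identifiers.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum class AckResult { None, Ack, Nak };

// Classify one complete received frame as the ACK-ACK / ACK-NAK answer to the
// request identified by (reqClass, reqId). Anything else yields None.
AckResult classifyAck(const std::vector<uint8_t>& frame,
                      uint8_t reqClass, uint8_t reqId) {
    // Expected layout: sync(2) class(1) id(1) length(2) payload(2) checksum(2)
    if (frame.size() != 10 || frame[0] != 0xB5 || frame[1] != 0x62) {
        return AckResult::None;
    }
    if (frame[2] != 0x05 || frame[4] != 0x02 || frame[5] != 0x00) {
        return AckResult::None;
    }
    // Verify the 8-bit Fletcher checksum over class..payload.
    uint8_t ckA = 0;
    uint8_t ckB = 0;
    for (std::size_t i = 2; i < 8; ++i) {
        ckA = static_cast<uint8_t>(ckA + frame[i]);
        ckB = static_cast<uint8_t>(ckB + ckA);
    }
    if (ckA != frame[8] || ckB != frame[9]) {
        return AckResult::None;
    }
    // The payload must name the request we are waiting for.
    if (frame[6] != reqClass || frame[7] != reqId) {
        return AckResult::None;
    }
    if (frame[3] == 0x01) return AckResult::Ack;
    if (frame[3] == 0x00) return AckResult::Nak;
    return AckResult::None;
}
```

With this layout, an ACK-NAK for a CFG-MSG request (class 0x06, ID 0x01) would appear as `B5 62 05 00 02 00 06 01 0E 33` on the wire.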

Findings

  • The MON-VER response doesn't look as if it conforms to the specification (I'd expect a 40 byte long reply with 30 bytes software version and 10 bytes hardware version); the value starts with T3,RomFw,1.1(48), and I'm not sure if this is what one would expect from an original u-blox module!
  • Why is there no CFG-RINV reply at all? Neither an empty dump nor even an ACK-NACK (not sure if we expect any ACK-* though)?
  • Why the ACK-NACK to the CFG-MSG requests?
  • As far as I can tell, either the OBS never requests that the NMEA messages be muted, or the request has no effect.
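
The MON-VER length check from the first finding can be sketched as follows. This assumes the layout expected above (30 bytes software version followed by 10 bytes hardware version, both NUL-padded); longer payloads with extra fields are accepted and the remainder is ignored. `MonVer`, `fixedField` and `parseMonVer` are hypothetical names for this illustration only.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct MonVer {
    std::string swVersion;  // first 30 bytes of the payload (assumed layout)
    std::string hwVersion;  // following 10 bytes (assumed layout)
};

// Read a fixed-width, NUL-padded text field. The caller guarantees that
// off + len does not exceed the payload size.
static std::string fixedField(const std::vector<uint8_t>& p,
                              std::size_t off, std::size_t len) {
    std::string s;
    for (std::size_t i = off; i < off + len && p[i] != 0; ++i) {
        s.push_back(static_cast<char>(p[i]));
    }
    return s;
}

// Returns false when the payload is shorter than the assumed 40-byte minimum,
// as was the case for the 37-byte reply observed in the capture.
bool parseMonVer(const std::vector<uint8_t>& payload, MonVer& out) {
    if (payload.size() < 40) {
        return false;
    }
    out.swVersion = fixedField(payload, 0, 30);
    out.hwVersion = fixedField(payload, 30, 10);
    return true;
}
```

A check like this would flag the 37-byte reply as non-conformant instead of reading past the end of the payload.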

Approach 2

E (933) esp_core_dump_flash: No core dump partition found!
[    36][I][OpenBikeSensorFirmware.cpp:199] setup(): openbikesensor.org - OBS/v0.18.849
[    38][I][esp32-hal-i2c.c:75] i2cInit(): Initialising I2C Master: sda=21 scl=22 freq=100000
[    43][W][Wire.cpp:301] begin(): Bus already started in Master Mode.
[    89][I][VoltageMeter.cpp:40] VoltageMeter(): Initializing VoltageMeter.
[    90][I][VoltageMeter.cpp:54] VoltageMeter(): Characterized using eFuse Vref
[    92][I][VoltageMeter.cpp:62] VoltageMeter(): eFuse Two Point: NOT supported
[    99][I][VoltageMeter.cpp:66] VoltageMeter(): eFuse Vref: Supported
[   109][I][VoltageMeter.cpp:75] VoltageMeter(): VoltageMeter initialized got 0.21V.
[   155][I][OpenBikeSensorFirmware.cpp:617] loadConfig(): Load cfg
[  1058][W][gps.cpp:356] sendAndWaitForAck(): Retry to send 0x3406 after 200ms.
[  1259][W][gps.cpp:356] sendAndWaitForAck(): Retry to send 0x3406 after 200ms.
[  1259][E][gps.cpp:361] sendAndWaitForAck(): Failed to send cfg. 0x3406 NAK: 0 after 401ms
[  1370][W][gps.cpp:1156] parseNmeaMessage(): Unparsed NMEA GNRMC: $GNRMC<snip-to-end-of-line>
[  1371][W][gps.cpp:1156] parseNmeaMessage(): Unparsed NMEA GNGGA: $GNGGA<snip-to-end-of-line>
[  1380][W][gps.cpp:1156] parseNmeaMessage(): Unparsed NMEA GNGSA: $GNGSA,A,1,,,,,,,,,,,,,99.9,99.9,99.9,1*0A
[  1388][W][gps.cpp:1156] parseNmeaMessage(): Unparsed NMEA GNGSA: $GNGSA,A,1,,,,,,,,,,,,,99.9,99.9,99.9,4*0F
[  1398][W][gps.cpp:1156] parseNmeaMessage(): Unparsed NMEA GPGSV: $GPGSV,<snip-to-end-of-line>
[  1409][W][gps.cpp:1156] parseNmeaMessage(): Unparsed NMEA GPGSV: $GPGSV,<snip-to-end-of-line>
[  1420][W][gps.cpp:1156] parseNmeaMessage(): Unparsed NMEA GPGSV: $GPGSV,<snip-to-end-of-line>
[  1431][W][gps.cpp:1156] parseNmeaMessage(): Unparsed NMEA GPGSV: $GPGSV,<snip-to-end-of-line>
[  1440][W][gps.cpp:1156] parseNmeaMessage(): Unparsed NMEA BDGSV: $BDGSV,<snip-to-end-of-line>
[  1447][W][gps.cpp:1156] parseNmeaMessage(): Unparsed NMEA GNTXT: $GNTXT,1,1,01,ANTENNA OK*2B
[  1492][W][gps.cpp:356] sendAndWaitForAck(): Retry to send 0x3406 after 200ms.
[  1692][W][gps.cpp:356] sendAndWaitForAck(): Retry to send 0x3406 after 200ms.
[  1692][E][gps.cpp:361] sendAndWaitForAck(): Failed to send cfg. 0x3406 NAK: 0 after 400ms
[  1696][E][gps.cpp:305] setBaud(): Switch to 115200 was not possible, back to 9600.
[  1903][W][gps.cpp:356] sendAndWaitForAck(): Retry to send 0x3406 after 200ms.
[  2103][W][gps.cpp:356] sendAndWaitForAck(): Retry to send 0x3406 after 200ms.
[  2103][E][gps.cpp:361] sendAndWaitForAck(): Failed to send cfg. 0x3406 NAK: 0 after 400ms
[  2107][E][gps.cpp:310] setBaud(): NO GPS????
[  7112][I][gps.cpp:450] addStatisticsMessage(): New: readGPSData(clear: 190 bytes in buffer, lastCall 5009ms ago, at 1970-01-01T00:00:07)
[  7315][W][gps.cpp:356] sendAndWaitForAck(): Retry to send 0x3406 after 200ms.
[  7398][W][gps.cpp:663] encode(): Unexpected GPS char in state null: c7 Ç
...
[  7631][W][gps.cpp:663] encode(): Unexpected GPS char in state null: fe þ
[  7639][W][gps.cpp:356] sendAndWaitForAck(): Retry to send 0x3406 after 324ms.
[  7645][E][gps.cpp:361] sendAndWaitForAck(): Failed to send cfg. 0x3406 NAK: 0 after 534ms
[  7853][W][gps.cpp:356] sendAndWaitForAck(): Retry to send 0x3406 after 200ms.
[  8053][W][gps.cpp:356] sendAndWaitForAck(): Retry to send 0x3406 after 200ms.
[  8053][E][gps.cpp:361] sendAndWaitForAck(): Failed to send cfg. 0x3406 NAK: 0 after 400ms
[  8057][I][gps.cpp:180] softResetGps(): Soft-RESET GPS!
[  8232][W][gps.cpp:663] encode(): Unexpected GPS char in state null: c7 Ç
...
[  8247][W][gps.cpp:663] encode(): Unexpected GPS char in state null: 88 �
...

I haven't put the same amount of effort into analysing this situation. But after start-up it looks like the GPS is sending at 115'200 baud while the OBS has fallen back to 9600 baud, so the cyclic NMEA messages may not even be interpretable as ASCII/NMEA. If all UART input is garbage, does that make everything fail?!
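
One way to check whether received bytes are valid NMEA at the current baud rate (rather than garbage from a baud rate mismatch) is to validate the sentence checksum, which is the XOR of all characters between `$` and `*`. The following is a sketch, not firmware code; `nmeaChecksumOk` is a name made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Returns true if `sentence` looks like "$...*HH" and HH (two hex digits)
// equals the XOR of all characters between '$' and '*'.
bool nmeaChecksumOk(const std::string& sentence) {
    if (sentence.size() < 4 || sentence[0] != '$') {
        return false;
    }
    const std::size_t star = sentence.find('*');
    if (star == std::string::npos || star + 3 > sentence.size()) {
        return false;
    }
    uint8_t sum = 0;
    for (std::size_t i = 1; i < star; ++i) {
        sum = static_cast<uint8_t>(sum ^ static_cast<uint8_t>(sentence[i]));
    }
    // Convert one hex digit to its value, or -1 if it is not a hex digit.
    auto hexVal = [](char c) -> int {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        return -1;
    };
    const int hi = hexVal(sentence[star + 1]);
    const int lo = hexVal(sentence[star + 2]);
    if (hi < 0 || lo < 0) {
        return false;
    }
    return sum == static_cast<uint8_t>(hi * 16 + lo);
}
```

The `$GNTXT,1,1,01,ANTENNA OK*2B` line from the log above passes this check, whereas byte sequences such as `c7 ... fe` read at the wrong baud rate would not even start with `$`.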

Summary

I am not sure what's going on but I can provide further debugging details when I get some guidance.

Random thoughts:

  • Is it possible that u-blox u-center activated NMEA messages that the OBS firmware cannot cope with as they are unexpected?
  • Some corner case where baud rates are changed without being required?
  • It looks to me as if the OBS waits for neither the reset request nor the baud rate reconfiguration to be acknowledged by the GPS module before continuing communication.
  • Counterfeit GPS module that is not specification-conformant - works with the u-blox u-center but not the OBS firmware?
  • Is missing almanac data problematic? I didn't provide any so far. If that's the case, I suggest the firmware handle this situation differently if it can be recognized.
  • Is the remote inventory or the access to it broken and causing all those issues?

I haven't dived too much into the firmware.

Your help is highly appreciated.


Edit:

I played around with the u-blox u-center and found out about the NMEA messages:

  • GxGGA seems to stand for Global Positioning System Fix Data
  • GxGSA GNSS DOP and Active Satellites
  • GxGSV GNSS Satellites in View
  • GxRMC Recommended Minimum Specific GNSS Data
  • GxTXT Text Transmission
  • BDGSV still seems like random unknown magic

Maybe they cannot be deactivated and are even required by the OBS? Or did u-blox u-center just activate them because of its opened views and store those settings on the module?
E.g. GxGGA "SVs Used" or GxGSA "SVs Used": "Number of SVs used for Navigation"

Could OBS display the min./max. C/N_0 of the 4 satellites with the best C/N_0 instead of some magic (absolute?) noisePerMs noise level from MON-HW? Maybe the latter correlates with N_0 but doesn't give any info about signal reception quality?
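
The suggested display value could be computed roughly like this. It is a sketch that assumes the per-satellite C/N_0 values (in dB-Hz) have already been collected, e.g. from GSV sentences; `bestFourCn0Range` is a hypothetical helper, not part of the firmware.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Returns {min, max} C/N_0 among the (up to) four strongest satellites.
// Returns {0, 0} when no values are available.
std::pair<int, int> bestFourCn0Range(std::vector<int> cn0DbHz) {
    if (cn0DbHz.empty()) {
        return {0, 0};
    }
    const std::size_t n = std::min<std::size_t>(4, cn0DbHz.size());
    // Move the n largest values to the front, in descending order.
    std::partial_sort(cn0DbHz.begin(),
                      cn0DbHz.begin() + static_cast<std::ptrdiff_t>(n),
                      cn0DbHz.end(), std::greater<int>());
    return {cn0DbHz[n - 1], cn0DbHz[0]};
}
```

For the values 12, 40, 33, 25, 38 and 19 the four strongest are 40, 38, 33 and 25, so the function returns a minimum of 25 and a maximum of 40.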


Edit2:

Okay, the displayed values do not appear to be related to NMEA at all:

  • Gps::parseUbxMessage() calls mIncomingGpsRecord.setInfo(mGpsBuffer.navSol.numSv, ...) when it receives UBX NAV-SOL messages.
  • GpsRecord::setInfo(uint8_t satellitesInUse, GPS_FIX gpsFix, uint8_t flags) stores the argument in member mSatellitesUsed
  • Gps::showWaitStatus() prints String(mCurrentGpsRecord.mSatellitesUsed) + "sats SN:" + String(mLastNoiseLevel); on the display

From a static code analysis (not runtime debugging) view, we're stuck in the following endless loop:

while (!gps.hasFix(obsDisplay)) {
    currentTimeMillis = millis();
    gps.handle();
    // ...
    gps.showWaitStatus(obsDisplay);
    // The only other way out of the loop is the button.
    if (button.read() == HIGH) {
        log_d("Skipped get GPS...");
        obsDisplay->showTextOnGrid(2, obsDisplay->currentLine(), "...skipped");
        break;
    }
}

@amandel
Member

amandel commented May 7, 2023

Wow, thanks for all this detail and for investing the time. We can drop the mSatellitesUsed != 0 check; it dates from former times when it was possible to configure how many satellites are needed for a stable fix. We have learned that the module knows this best and removed that option. Dropping the check will not change the situation, though, since mFixStatus is also only updated with NAV-SOL.
I changed the timeouts a bit, especially for communication error situations, in one of the last firmware updates. Maybe the firmware needs to wait longer before it gives up, especially in the case of a baud rate change?
In general the firmware ignores unexpected messages, so they should do no harm, but we switch them off in the same way other UBX messages are switched on. If this call fails, other messages likely do not arrive either. NAV-SOL is requested for every second update here

setMessageInterval(UBX_MSG::NAV_SOL, 1);
As a test, can you try to enable this message with the u-blox u-center?

Can you confirm that the module supports only a limited set of UBX messages? Can you try the "important" messages for the OBS: NAV-POSLLH, NAV-VELNED, NAV-SOL? Depending on which is missing, we can find a possible substitute.

@maehw
Author

maehw commented May 8, 2023

Hi @amandel ! Thanks for your quick response.

Can you give me a hint how to enable this message with the u-blox u-center? The GUI is not very intuitive.

IIRC I've tried to use UBX CFG-MSG in the past but did not see any effect (probably also received ACK-NACKs).

I've added some module test firmware to investigate further: https://forum.openbikesensor.org/t/modultests/1685

See also: https://github.com/maehw/ObsHwModuleTests/blob/main/ObsGpsTest/README.md

@amandel
Member

amandel commented May 8, 2023

The useful window in u-center is the Messages View window (opens on F9 in my version). There you can go down the tree, with UBX-NAV-POSLLH as an example. If "Autopoll" is enabled or you press "Poll" in the status bar of the window, you should see the current data (or not, if it does not work). Under UBX-CFG-MSG you can also create and send the cfg messages. What does it show for UBX-MON-VER?

@maehw
Author

maehw commented May 9, 2023

Okay, thank you for those hints.

I had played around with the Messages View before.

The current module under monitoring only shows NMEA messages. UBX is grayed out.

I can double click the menu entry (e.g. NAV-POSLLH) or choose Poll Message or Enable Message or Disable Message. The GUI stays grayed out.

However, the binary console shows some entries with µB sync bytes (0xB5 0x62), i.e. UBX binary messages. But I am not sure if they are sent to the GPS module or received from it. I'd have to attach the logic analyzer again.

IIRC, UBX output can be restricted at low baud rates (such as 9600) because of the limited bandwidth.

I don't quite understand the persistency concept yet - is the "port" (we use the NEO-6M's "serial port") configuration stored and loaded during power-up? Or only when controlled from OBS?

Are you using u-blox u-center with a baudrate of 115'200 or 9600 baud or should it not make a difference?

I think I need a noob-friendly step-by-step manual. Too many things that can go wrong.

I am adding an UBX/NMEA message reception indicator in the test firmware.


Edits:

  • My problem is also that I can only use one spare GPS module with u-blox u-center. The other ones are already soldered into OBS setups. Unless you'd still connect a USB/serial converter "between" the GPS module and the OBS' ESP32 uC.

  • I was able to reconfigure to 115'200. Still, polling UBX messages does not work - simply no reply. Power cycling the NEO-6M board reset the baud rate to 9600. (No persistence here.)

  • This particular module is shown as "u-blox M8/8" even though it should be a 6M. I was unable to change settings/filters.


Edit: Michael gave me the hint that I could simply remove the ESP32s from their sockets and connect the GPS modules soldered to OBS PCBs directly via USB/serial converters (without VCC), without interference from the MCU.

@maehw
Author

maehw commented May 10, 2023

Tiny update as I am wiser and older now.

I've got three different GPS module types on my desk. They behave differently on the serial line after power is applied. They all report version identifiers in their first three lines (abbreviated):

  • Techtotop¹ Multi-GNSS Recv., T3,RomFW,1.1(48),Aug 12 '16
  • u-blox AG, HW UBX-G60xx, ROM Core 7.03 (45969) Mar 17 '11
  • u-blox AG, HW UBX-G60xx, ROM Core 6.02 (36023) Oct 15 '09

Not sure if those that say they're u-blox really are...

¹ On the topic: u-blox v Techtotop: China’s Developing Jurisprudence on Evidentiary Burden Shifting

Those claiming to be u-blox communicate at 115'200 baud and seem to be configurable. They start with some messages but are then silent (probably until re-configured properly).

Those claiming to be Techtotop communicate at 9600 baud and babble NMEA... they are probably not configurable at all.

We could add detection of Techtotop ones and I think one even could add support for NMEA-only modules - if they provide all relevant info and the OBS project is willing to support them.
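
Detection could start from the version banner the modules print after power-up. The following is only a sketch based on the (abbreviated) banner lines quoted above; the real strings may differ, a banner claiming "u-blox" does not prove the module is genuine, and `GpsModuleKind` and `classifyBanner` are names invented for this example.

```cpp
#include <string>

enum class GpsModuleKind { Unknown, UBlox, Techtotop };

// Classify a module by a substring of its start-up banner.
// Assumption: the vendor name appears verbatim in the banner text.
GpsModuleKind classifyBanner(const std::string& banner) {
    if (banner.find("Techtotop") != std::string::npos) {
        return GpsModuleKind::Techtotop;
    }
    if (banner.find("u-blox") != std::string::npos) {
        return GpsModuleKind::UBlox;
    }
    return GpsModuleKind::Unknown;
}
```

The firmware could then show a clear "unsupported GPS module" message for the Techtotop case instead of waiting for UBX replies that never arrive.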

@gluap
Contributor

gluap commented May 13, 2023

We had a discussion about supporting NMEA-only modules a few years back (also because someone had received them despite ordering Neo6m). AFAIR, back in the day it turned out that the modules in question were so shitty (bad position values, bad fix) that implementing NMEA support for them would have been a waste of time: the hardware was bad enough that they would have had to be exchanged even if the firmware had supported NMEA. Unfortunately I can't find the thread, and it may have been discussed during a call.
