Fix esp32 select race conditions. #780

Merged: 26 commits, Feb 5, 2024

balazsracz
Collaborator

  • Fixes race condition that caused ESP32's select to not wake up even though an executable was posted.
  • Fixes another race condition where variables were not locked when accessed from an ISR (relevant on ESP32 only).
  • Optimizes an unnecessary round of select wakeup.

Note: the review is in #774, which was irreversibly closed due to an incorrect sequence of operations I did on GitHub.

===

  • Fixes race condition in ESP32 select-wakeup implementation.

A correct implementation of a selectable fd driver has to check in vfs_select_start()
whether the fd is readable. This check was missing from the prior implementation.
We had an application-level check that the queue was non-empty, but there is a
window of time between the application doing this check and the esp32's select()
implementation getting around to calling vfs_select_start(). By the design of
esp's select mechanism, the wakeup semaphore only becomes available in
vfs_select_start(), so the wakeup was effective only if it came after that call.

Since the esp32's select() is very slow, this was actually a pretty big gap.

The OpenMRN Device::select implementation does not suffer from this race
condition, because the event group bits can be set at any time, even if
Device::select is still in the setup phase. Added a comment to this effect.
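
The ordering problem described above can be modeled off-target. The following is a minimal sketch, not the OpenMRN or ESP-IDF code: `FakeSelectSem`, `SelectWakeupSketch` and `wakeup_before_start_is_seen()` are hypothetical names, and a `std::mutex` stands in for whatever lock the real driver uses. It shows a start hook that checks for an already-posted wakeup, so a wakeup arriving before the semaphore is known is not lost.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the semaphore that select() hands to the driver.
struct FakeSelectSem
{
    bool triggered = false;
};

// Hypothetical model of a selectable wakeup fd; not the real OSSelectWakeup.
class SelectWakeupSketch
{
public:
    // Application side: an executable was posted to the queue.
    void wakeup()
    {
        std::lock_guard<std::mutex> l(lock_);
        pending_ = true;
        if (sem_)
        {
            // select() is already waiting on this semaphore.
            sem_->triggered = true;
        }
    }

    // Models vfs_select_start(): the semaphore only becomes known here.
    void start_select(FakeSelectSem *sem)
    {
        std::lock_guard<std::mutex> l(lock_);
        sem_ = sem;
        if (pending_)
        {
            // The fd is already "readable": report it right away instead
            // of waiting for a wakeup that has already happened.
            sem_->triggered = true;
        }
    }

    // Models the end of select(): forget the semaphore, consume the wakeup.
    void end_select()
    {
        std::lock_guard<std::mutex> l(lock_);
        sem_ = nullptr;
        pending_ = false;
    }

private:
    std::mutex lock_;
    bool pending_ = false;
    FakeSelectSem *sem_ = nullptr;
};

// Returns true if a wakeup posted before start_select() is still observed.
bool wakeup_before_start_is_seen()
{
    SelectWakeupSketch w;
    FakeSelectSem sem;
    w.wakeup();           // arrives inside the race window
    w.start_select(&sem); // the check in the start hook catches it
    bool seen = sem.triggered;
    w.end_select();
    return seen;
}
```

Without the `if (pending_)` check in `start_select()`, the early wakeup in `wakeup_before_start_is_seen()` would go unnoticed and the modeled select would keep waiting.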

  • Fixes another (smaller) race condition in ESP32's select wakeup.

The wakeup_from_isr routine consults the pendingWakeup_ and inSelect_ variables.
These variables need to be locked, because multi-core ESP32s could run an ISR
on one core and other code on a different core.

Moves the atomic lock from esp_wakeup_from_isr into OSSelectWakeup::wakeup_from_isr.
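
To illustrate why both variables sit under one lock, here is a hedged, host-side sketch. `SpinLock`, `IsrWakeupSketch` and the exact signalling rule are assumptions made for illustration, not the OpenMRN code. On the target, the lock would presumably be an ISR-safe critical section that also masks interrupts; the plain `std::atomic_flag` spinlock used here only models the cross-core exclusion. The point shown is that reading `inSelect_` and updating `pendingWakeup_` happen as one indivisible step.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical spinlock; stands in for an ISR-safe critical section.
class SpinLock
{
public:
    void lock()
    {
        while (flag_.test_and_set(std::memory_order_acquire))
        {
        }
    }

    void unlock()
    {
        flag_.clear(std::memory_order_release);
    }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Hypothetical model of the two flags consulted by wakeup_from_isr.
class IsrWakeupSketch
{
public:
    // Task side, called just before blocking in select().
    void enter_select()
    {
        lock_.lock();
        inSelect_ = true;
        lock_.unlock();
    }

    // Task side, called after select() returns.
    void exit_select()
    {
        lock_.lock();
        inSelect_ = false;
        pendingWakeup_ = false;
        lock_.unlock();
    }

    // ISR side. Returns true if the caller has to signal the select.
    bool wakeup_from_isr()
    {
        lock_.lock();
        // Signal only on the first wakeup while a select is in progress.
        bool need_signal = inSelect_ && !pendingWakeup_;
        pendingWakeup_ = true;
        lock_.unlock();
        return need_signal;
    }

private:
    SpinLock lock_;
    bool pendingWakeup_ = false;
    bool inSelect_ = false;
};
```

If the two flags were read and written without the lock, a task on the other core could change `inSelect_` between the read and the write, and the decision in `wakeup_from_isr()` would be based on stale state.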

  • Optimizes unnecessary select iterations.

When the application already knows about the executables in the queue, we don't
need select to terminate with EINTR. We either ran the executable and the queue
is empty (in which case we want select to sleep), or we know the queue is not
empty and thus will run select with a timeout of 0.
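
The resulting decision is small enough to sketch. This is an assumed shape, not the executor's actual code: `select_timeout_nsec` and its parameters are hypothetical names. It shows that the queue state alone picks between polling and sleeping, so no EINTR round trip is needed.

```cpp
#include <cassert>

// Hypothetical helper: picks the timeout for the next select() call.
// queue_empty: whether the executor's queue has no pending executables.
// sleep_nsec: how long select() may block when there is nothing to run.
long long select_timeout_nsec(bool queue_empty, long long sleep_nsec)
{
    if (!queue_empty)
    {
        // Work is known to be waiting: only poll the fds, do not block.
        return 0;
    }
    // Nothing to run: let select() sleep until an fd or a wakeup fires.
    return sleep_nsec;
}
```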

- All instances of ifdef ESP32 replaced with ESP_PLATFORM
- Removed conditional code for supporting ESP-IDF v3.x and v4.x
  Only ESP-IDF v5 and up is now supported
- new feature for OPENMRN_HAVE_SOCKET_FSTAT
- fixed compile errors around spiram
- refactored definition of ADC pin.
- fixes around printf PRIu32.
- do not print error messages on CAN overflow
- fix to not start the hub twice in the wifi manager
* master:
  Fixes file comment.
  Adds an application (hub_test) for testing the throughput of a hub or a CAN-bus (#762)
  Adds a helper function to decode a 14-bit 9.2.1.1 address into address type and raw address. (#766)
  Fix broken test.
  Accessory packet refactoring and POM support (#764)
  Adds support for Offset(n) attribute in the CDI  (#765)
  Adds local loopback to TractionThrottle. (#756)
  Fix test flakiness.
  Fix compile error on FdUtils under freertos.
This will be usable for printing a summary of the stats for a developer printout like a log statement.
* master:
  Upintegrate changes from the OpenMRNIDF repository (#771)
  Adds support for DCC extended accessories  (#769)
  Fix incorrect consumer identified message being emitted by dcc accy producer. (#768)
  Avoids rendering hidden segments. (#767)
  Adds trailing zero to the cdi XML file written to the filesystem. (#777)
  Fix target subdirectory name (#775)
  Fixes file comment.
  Adds an application (hub_test) for testing the throughput of a hub or a CAN-bus (#762)
  Adds a helper function to decode a 14-bit 9.2.1.1 address into address type and raw address. (#766)
  Fix broken test.
  Accessory packet refactoring and POM support (#764)
  Adds support for Offset(n) attribute in the CDI  (#765)
  Adds local loopback to TractionThrottle. (#756)
  Fix test flakiness.
  Fix compile error on FdUtils under freertos.
* bracz-idf-downintegrate:
  Upintegrate changes from the OpenMRNIDF repository (#771)
  Fix comments.
  Adds support for DCC extended accessories  (#769)
  Fix incorrect consumer identified message being emitted by dcc accy producer. (#768)
  Avoids rendering hidden segments. (#767)
  Adds trailing zero to the cdi XML file written to the filesystem. (#777)
  Fix target subdirectory name (#775)
  Fixes file comment.
  Adds an application (hub_test) for testing the throughput of a hub or a CAN-bus (#762)
  Adds a helper function to decode a 14-bit 9.2.1.1 address into address type and raw address. (#766)
  Fix broken test.
  Accessory packet refactoring and POM support (#764)
  Adds support for Offset(n) attribute in the CDI  (#765)
  Adds local loopback to TractionThrottle. (#756)
  Fix test flakiness.
  Fix compile error on FdUtils under freertos.
…ompiler-warnings

* 'master' of github.com:bakerstu/openmrn:
  Fix compiler warnings in openmrn when using new GCC's. (#772)
…-merge

* bracz-esp-compiler-warnings:
  Fix compiler warnings in openmrn when using new GCC's. (#772)
  Fix comment.
  Upintegrate changes from the OpenMRNIDF repository (#771)
  Fix comments.
  Adds support for DCC extended accessories  (#769)
  Fix incorrect consumer identified message being emitted by dcc accy producer. (#768)
  Avoids rendering hidden segments. (#767)
  Adds trailing zero to the cdi XML file written to the filesystem. (#777)
  Fix target subdirectory name (#775)
* bracz-tmp-compile-fix-merge:
  Fix compiler warnings in openmrn when using new GCC's. (#772)
  Fix comment.
  Upintegrate changes from the OpenMRNIDF repository (#771)
  Fix comments.
  Adds support for DCC extended accessories  (#769)
  Fix incorrect consumer identified message being emitted by dcc accy producer. (#768)
  Avoids rendering hidden segments. (#767)
  Adds trailing zero to the cdi XML file written to the filesystem. (#777)
  Fix target subdirectory name (#775)
* master:
  Latency test with maximum stats and custom process evaluation (#773)
* bracz-stat-max:
  Latency test with maximum stats and custom process evaluation (#773)
  Fixes data type of max.
  Fix comment.
  Adds documentation comment to latency test consumer.
  Fix compiler warnings in openmrn when using new GCC's. (#772)
  Fix comment.
  Upintegrate changes from the OpenMRNIDF repository (#771)
  Fix comments.
  Adds support for DCC extended accessories  (#769)
  Fix incorrect consumer identified message being emitted by dcc accy producer. (#768)
  Avoids rendering hidden segments. (#767)
  Adds trailing zero to the cdi XML file written to the filesystem. (#777)
  Fix target subdirectory name (#775)
@balazsracz balazsracz merged commit 157139e into master Feb 5, 2024
8 checks passed
@balazsracz balazsracz deleted the bracz-select-race-condition branch February 5, 2024 03:24
balazsracz added a commit that referenced this pull request Feb 5, 2024
* master:
  Fix esp32 select race conditions. (#780)
  Latency test with maximum stats and custom process evaluation (#773)
  Fix compiler warnings in openmrn when using new GCC's. (#772)
  Upintegrate changes from the OpenMRNIDF repository (#771)
  Adds support for DCC extended accessories  (#769)
  Fix incorrect consumer identified message being emitted by dcc accy producer. (#768)
  Avoids rendering hidden segments. (#767)
  Adds trailing zero to the cdi XML file written to the filesystem. (#777)
  Fix target subdirectory name (#775)
  Fixes file comment.
  Adds an application (hub_test) for testing the throughput of a hub or a CAN-bus (#762)
  Adds a helper function to decode a 14-bit 9.2.1.1 address into address type and raw address. (#766)
  Fix broken test.
  Accessory packet refactoring and POM support (#764)
balazsracz added a commit that referenced this pull request Jun 21, 2024
…t-hub-router

# By Balazs Racz (26) and others
* 'master' of github.com:bakerstu/openmrn: (28 commits)
  Fix build of esp8266 train implementation.
  Removes unnecessary includes that might not exist on an embedded compiler.
  Fix compilation of TempFile under esp8266.
  Add libatomic to esp8266 nonos target.
  Fix compile errors in time_client app.
  Fixes in file memory space: (#786)
  Change startup state to stopped. (#784)
  Fixes write code for spiflash. (#782)
  Handles bus passive in TivaCan. (#781)
  Update libify to support IDF export with symlinks (#770)
  Fix esp32 select race conditions. (#780)
  Latency test with maximum stats and custom process evaluation (#773)
  Fix compiler warnings in openmrn when using new GCC's. (#772)
  Upintegrate changes from the OpenMRNIDF repository (#771)
  Adds support for DCC extended accessories  (#769)
  Fix incorrect consumer identified message being emitted by dcc accy producer. (#768)
  Avoids rendering hidden segments. (#767)
  Adds trailing zero to the cdi XML file written to the filesystem. (#777)
  Fix target subdirectory name (#775)
  Fixes file comment.
  ...

# Conflicts:
#	src/utils/constants.cxx
#	src/utils/sources
balazsracz added a commit that referenced this pull request Jun 21, 2024
* bracz-direct-hub-router: (35 commits)
  High-performance hub component for dealing with many sockets and high throughput (#760)
  Fix test build.
  Fixed comment and adds a todo.
  Remove unnecessary log.
  Fix comments and reduce unnecessary log level.
  FIx comments.
  Fix build of esp8266 train implementation.
  Removes unnecessary includes that might not exist on an embedded compiler.
  Fix compilation of TempFile under esp8266.
  Add libatomic to esp8266 nonos target.
  Fix compile errors in time_client app.
  Fixes in file memory space: (#786)
  Change startup state to stopped. (#784)
  Fixes write code for spiflash. (#782)
  Handles bus passive in TivaCan. (#781)
  Update libify to support IDF export with symlinks (#770)
  Fix esp32 select race conditions. (#780)
  Latency test with maximum stats and custom process evaluation (#773)
  Fix compiler warnings in openmrn when using new GCC's. (#772)
  Upintegrate changes from the OpenMRNIDF repository (#771)
  ...