Fix esp32 select race conditions. #774
Closed. balazsracz wants to merge 14 commits into bracz-tmp-compile-fix-merge from bracz-select-race-condition
A correct implementation of a selectable fd driver has to check in vfs_select_start() whether the fd is already readable. This check was missing from the prior implementation. We had an application-level check that the queue was non-empty, but there is a window of time between the application doing this check and the ESP32's select() implementation getting around to calling vfs_select_start(). By the design of ESP's select mechanism, a wakeup was only effective if it came after vfs_select_start(), since the wakeup semaphore is only handed over in vfs_select_start(). Since the ESP32's select() is very slow, this was actually a fairly large gap. The OpenMRN Device::select implementation does not suffer from this race condition, because the event group bits can be set at any time, even while Device::select is still in the setup phase. Added a comment to this effect.
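The lost wakeup described above can be illustrated with a small host-side model. This is only a sketch under stated assumptions, not the OpenMRN or ESP-IDF code: `FakeSelectSem`, `SelectWakeupSketch`, `start_select()`, `end_select()` and the `checkOnStart` flag are hypothetical names, and a `std::mutex` stands in for whatever locking the real driver uses. Only `pendingWakeup_` is borrowed from the description.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the semaphore that select() hands to a
// driver when it calls the driver's select-start hook.
struct FakeSelectSem {
    bool signaled = false;
    void give() { signaled = true; }
};

// Hypothetical model of a select wakeup helper (not OSSelectWakeup).
class SelectWakeupSketch {
public:
    // checkOnStart == false models the prior behaviour,
    // checkOnStart == true models the fix.
    explicit SelectWakeupSketch(bool checkOnStart)
        : checkOnStart_(checkOnStart) {}

    // Called when an executable is posted to the queue.
    void wakeup() {
        std::lock_guard<std::mutex> guard(lock_);
        pendingWakeup_ = true;
        if (sem_ != nullptr) {
            sem_->give();
        }
        // Before start_select() there is no semaphore to signal, so the
        // request survives only in pendingWakeup_.
    }

    // Models the driver's select-start hook.
    void start_select(FakeSelectSem *sem) {
        std::lock_guard<std::mutex> guard(lock_);
        sem_ = sem;
        // The fix: report readiness right away if a wakeup arrived
        // before the semaphore was available.
        if (checkOnStart_ && pendingWakeup_) {
            sem_->give();
        }
    }

    // Models the driver's select-end hook.
    void end_select() {
        std::lock_guard<std::mutex> guard(lock_);
        sem_ = nullptr;
        pendingWakeup_ = false;
    }

private:
    std::mutex lock_;
    FakeSelectSem *sem_ = nullptr;
    bool pendingWakeup_ = false;
    const bool checkOnStart_;
};
```

In this model, calling `wakeup()` before `start_select()` leaves the semaphore unsignaled when `checkOnStart` is false, so the modeled select would sleep despite the posted work. With `checkOnStart` set to true, the same call order signals the semaphore inside `start_select()`.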
The wakeup_from_isr routine consults the pendingWakeup_ and inSelect_ variables. These variables need to be locked, because multi-core ESP32s could run an ISR on one core and other code on a different core. Moves the atomic lock from esp_wakeup_from_isr into OSSelectWakeup::wakeup_from_isr.
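A minimal sketch of why both flags are read and written under one lock follows. It is not the OSSelectWakeup implementation: `SpinLockSketch`, `WakeupStateSketch`, `enter_select()` and `exit_select()` are hypothetical, the return-value semantics are an assumption made for illustration, and a `std::atomic_flag` spinlock stands in for the platform's atomic lock.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical spinlock standing in for the platform's atomic lock.
class SpinLockSketch {
public:
    void lock() {
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // Busy-wait; the critical sections below are a few stores.
        }
    }
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Hypothetical model of the state shared between a task and an ISR.
class WakeupStateSketch {
public:
    // Task context: returns true if a wakeup was already pending, in
    // which case the caller should not block.
    bool enter_select() {
        std::lock_guard<SpinLockSketch> guard(lock_);
        inSelect_ = true;
        return pendingWakeup_;
    }

    // Task context: select has returned.
    void exit_select() {
        std::lock_guard<SpinLockSketch> guard(lock_);
        inSelect_ = false;
        pendingWakeup_ = false;
    }

    // ISR context, possibly on another core: returns true if the
    // caller has to signal the sleeping select.
    bool wakeup_from_isr() {
        std::lock_guard<SpinLockSketch> guard(lock_);
        // Both flags are inspected and updated as one unit, so a task
        // on the other core cannot slip in between the read and write.
        bool needSignal = inSelect_ && !pendingWakeup_;
        pendingWakeup_ = true;
        return needSignal;
    }

private:
    SpinLockSketch lock_;
    bool pendingWakeup_ = false;
    bool inSelect_ = false;
};
```

Without the lock, the read of `inSelect_` and the write of `pendingWakeup_` could interleave with `enter_select()` running on the other core, and the wakeup could be neither recorded in time nor signaled.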
When the application already knows about the executables in the queue, we don't need select to terminate with EINTR. Either we ran the executable and the queue is empty (in which case we want select to sleep), or we know the queue is not empty and will therefore run select with a timeout of 0.
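The reasoning above can be sketched as a single-threaded model of one loop iteration. Everything here is hypothetical (`ExecutorLoopSketch`, `SelectResult`, the `clearKnownWakeup` flag, and the convention that a timeout of -1 means "block"); a real executor would also need the locking discussed earlier.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical result of one modeled select() call.
struct SelectResult {
    int timeoutMsec;   // 0 = poll and return, -1 = block until woken
    bool interrupted;  // true models select() ending early with EINTR
};

// Hypothetical single-threaded model of an executor loop.
class ExecutorLoopSketch {
public:
    // clearKnownWakeup == true models the optimization.
    explicit ExecutorLoopSketch(bool clearKnownWakeup)
        : clearKnownWakeup_(clearKnownWakeup) {}

    // Posting work also requests a select wakeup.
    void post(std::function<void()> fn) {
        queue_.push_back(std::move(fn));
        pendingWakeup_ = true;
    }

    // Runs at most one executable, then models one select() call.
    SelectResult run_iteration() {
        if (!queue_.empty()) {
            std::function<void()> fn = std::move(queue_.front());
            queue_.pop_front();
            fn();
        }
        // The queue has just been inspected, so any wakeup recorded up
        // to this point carries no new information.
        if (clearKnownWakeup_) {
            pendingWakeup_ = false;
        }
        // Non-empty queue: poll. Empty queue: sleep.
        int timeoutMsec = queue_.empty() ? -1 : 0;
        bool interrupted = pendingWakeup_;
        pendingWakeup_ = false;
        return SelectResult{timeoutMsec, interrupted};
    }

private:
    std::deque<std::function<void()>> queue_;
    bool pendingWakeup_ = false;
    const bool clearKnownWakeup_;
};
```

With the flag off, a single posted executable is run and the modeled select is then interrupted once more for work that has already been handled. With the flag on, the same iteration goes straight to sleep, and if work remains in the queue the timeout of 0 makes the stale wakeup unnecessary.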
atanisoft approved these changes on Feb 3, 2024
* bracz-tmp-compile-fix-merge:
  Fix compiler warnings in openmrn when using new GCC's. (#772)
  Fix comment.
  Upintegrate changes from the OpenMRNIDF repository (#771)
  Fix comments.
  Adds support for DCC extended accessories (#769)
  Fix incorrect consumer identified message being emitted by dcc accy producer. (#768)
  Avoids rendering hidden segments. (#767)
  Adds trailing zero to the cdi XML file written to the filesystem. (#777)
  Fix target subdirectory name (#775)
Base automatically changed from bracz-stat-max to bracz-tmp-compile-fix-merge on February 5, 2024 03:12
Updates the latency test of hub_test:
- refactors the stats printing code into the Stats class
- adds max statistic to the class
- adds the latency consumer object that can be included in a product to verify the application level latency.
===
* Adds a debug print function to the stats object. This will be usable for printing a summary of the stats for a developer printout like a log statement.
* Adds maximum to the statistics object.
* Adds a consumer object for event based testing.
* Adds hook to the latency test consumer. This allows testing latency of arbitrary internal node processing.
* Adds documentation comment to latency test consumer.
* Fix comment.
* Fixes data type of max.
* master: Latency test with maximum stats and custom process evaluation (#773)
* bracz-stat-max:
  Latency test with maximum stats and custom process evaluation (#773)
  Fixes data type of max.
  Fix comment.
  Adds documentation comment to latency test consumer.
  Fix compiler warnings in openmrn when using new GCC's. (#772)
  Fix comment.
  Upintegrate changes from the OpenMRNIDF repository (#771)
  Fix comments.
  Adds support for DCC extended accessories (#769)
  Fix incorrect consumer identified message being emitted by dcc accy producer. (#768)
  Avoids rendering hidden segments. (#767)
  Adds trailing zero to the cdi XML file written to the filesystem. (#777)
  Fix target subdirectory name (#775)
balazsracz added a commit that referenced this pull request on Feb 5, 2024
- Fixes a race condition that caused the ESP32's select to not wake up even though an executable was posted.
- Fixes another race condition where variables were not locked when accessed from an ISR (relevant on ESP32 only).
- Optimizes away an unnecessary round of select wakeup.
Note: the review is in #774, which was irreversibly closed due to an incorrect sequence of operations I did on GitHub.
===
* Fixes race condition in ESP32 select-wakeup implementation.
  A correct implementation of a selectable fd driver has to check in vfs_select_start() whether the fd is already readable. This check was missing from the prior implementation. We had an application-level check that the queue was non-empty, but there is a window of time between the application doing this check and the ESP32's select() implementation getting around to calling vfs_select_start(). By the design of ESP's select mechanism, a wakeup was only effective if it came after vfs_select_start(), since the wakeup semaphore is only handed over in vfs_select_start(). Since the ESP32's select() is very slow, this was actually a fairly large gap. The OpenMRN Device::select implementation does not suffer from this race condition, because the event group bits can be set at any time, even while Device::select is still in the setup phase. Added a comment to this effect.
* Fixes another (smaller) race condition in ESP32's select wakeup.
  The wakeup_from_isr routine consults the pendingWakeup_ and inSelect_ variables. These variables need to be locked, because multi-core ESP32s could run an ISR on one core and other code on a different core. Moves the atomic lock from esp_wakeup_from_isr into OSSelectWakeup::wakeup_from_isr.
* Optimizes unnecessary select iterations.
  When the application already knows about the executables in the queue, we don't need select to terminate with EINTR. Either we ran the executable and the queue is empty (in which case we want select to sleep), or we know the queue is not empty and will therefore run select with a timeout of 0.