Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

embassy: enable usage of esp-wifi with riot-rs-threads #300

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

elenaf9
Copy link
Collaborator

@elenaf9 elenaf9 commented May 23, 2024

This implements an example that uses esp-wifi together with riot-rs-threads.

I don't intend for this to be merged, but rather to get feedback on the idea, and to start the discussion on how we can solve it directly inside riot-rs-embassy.

Problem when using esp-wifi and RIOT-rs threading together

As documented in #289, using the two components together is non trivial because esp-wifi internally runs a preemptive scheduler for wif-/ bluetooth related tasks. This scheduler uses the periodic systemtimer alarm0 to switch between its tasks, in addition to cooperative yielding using CPU interrupt 3.
The preemptive systemtimer-triggered context switch conflicts with our riot-rs-threads scheduler because it would interleave both schedulers, resulting in undefined behavior.

Proposed Solution

The idea in this PR is to solve the problem as follows:

  • We create one high priority thread (esp_wifi_thread) that drives the esp-wifi tasks in a loop. A loop iteration essentially consists of the following two steps:
    • yield the execution cooperatively to the esp-wifi scheduler, so that it can run all of its tasks once
    • go to sleep
  • As mentioned above, esp-wifi configures an period alarm0. In the original implementation, the interrupt from this alarm would cause a context switch, so that all esp-wifi tasks get a chance to run. In my implementation, I remapped the interrupt from this alarm0 to a new handler that instead wakes up the above esp_wifi_thread.
    • This maintains the functionality that every <alarm0-frequency> the esp-wifi tasks get a chance to run, but realized through our threading.
  • The application logic can then run in another thread.
  • The riots-rs-rt usually initializes embassy. With the threading feature enabled however, this doesn't happen anymore because threading is started before the call to riot_rs_embassy_init. Thus the esp-wifi-thread application thread has to explicitly call this initialization function. This will also spawn the executor that runs the tcp_echo task.

Limitations when using embassy-time

Not relevant for this PR anymore with the change described in #300 (comment). Nevertheless, I still think it's something to consider when we talk about spawning multiple embassy executors in different threads.

Independently of the conflict between the esp-wifi scheduler and our riot-rs-threads scheduler, there is an issue with timers when using embassy together with our threads on esp.
If embassy-time is required (which is the case in riot_rs_embassy::wifi::esp_wifi), the embassy executor tries to allocate a timer upon creation.

The concrete timer driver implementation for this is configured through a feature flag on esp-hal.
Right now, we are using the Timer Group Timer 0 (TIMG0) through the embassy-time-timg0 feature. The problem is that this gives exactly one timer. So as soon as we start a second embassy executor in a second riot-rs-thread Thread, the timer allocation panics.

I now switched to using the system timer. However, this also only gives us 3 timers in total. One is already used as described above by esp-wifi for alarm0. So only 2 remain, which means that we can only ever have two embassy executors if the time feature is enabled.
Thoughts on this @kaspar030 @ROMemories?

Fixes #289.

@kaspar030
Copy link
Collaborator

kaspar030 commented May 24, 2024

"usually" means, usually riot_rs_embassy::init() is called through an INIT_FUNCS entry, before threading is initialized. There, we set up an interrupt executor. That didn't work for esp (IIRC we assumed interrupts starved each other), so we settled with the esp provided thread mode executor. If we actually get threading to work with esp-wifi, we'll need to adapt riot-rs-embassy with support for starting a "reagular" Executor in a riot-rs-threading thread I guess. (which is what this PR does albeit in a hacky way).

@elenaf9
Copy link
Collaborator Author

elenaf9 commented May 24, 2024

@kaspar030 noted in an out-of-band discussion that the embassy initalization doesn't have to happen in the esp_wifi_thread, but instead could be done by the application thread.
With this change, there is no need to even have an embassy executor in esp_wifi_thread anymore, and we can directly run the loop.

This has multiple advantages:

  • We only have one executor, so the whole issue with the timers that is explained in the PR description isn't a problem
  • We don't put the thread to sleep from inside an embassy task, which was bad practice and violates the assumptions for embassy task
  • embassy: enable usage of esp-wifi with riot-rs-threads #300 (comment) isn't a problem anymore because the network stack is initialized and used in the same thread
  • The implementation doesn't require any changes inside riot-rs-embassy anymore

See commit fb9f68e.

@elenaf9
Copy link
Collaborator Author

elenaf9 commented May 27, 2024

I have now moved the logic for spawning the esp-wifi-thread into riot-rs-embassy: 69c69ad.
The advantage of this is that now users can use esp-wifi together with threading without additional effort.

The downside of it is that a thread running in the background is probably unexpected for the user and could cause some confusing (e.g. it implicitly reduces the number of threads a user can still spawn; it it high priority, ...).
Still, I think esp-wifi's behavior is unexpected anyway, and encapsulating their scheduler into its own thread seems like an acceptable solution to me. Wdyt?

@elenaf9 elenaf9 marked this pull request as ready for review May 27, 2024 09:24
@elenaf9 elenaf9 changed the title Test usage of esp-wifi with riot-rs-threads embassy: enable usage of esp-wifi with riot-rs-threads May 27, 2024
@kaspar030
Copy link
Collaborator

#304 is in, please rebase!

@kaspar030
Copy link
Collaborator

I'm testing this branch, with these changes: https://gist.github.com/kaspar030/833f672c398d20b18695210b08e265f0
(basically, I'm adding a busy-looping thread and a task).

I'm trying to figure out the behavior w.r.t. priorities of the executor vs. the thread.

Something seems off, this is the output:

...
I (197) boot: Disabling RNG early entropy source...                                                                                         12:46:50 [35/90393]
riot_rs_rt::startup()
expressif-esp32-c6-devkitc-1::init()
riot-rs-embassy::init(): using thread executor                             
riot-rs-embassy::arch::esp::init(): wifi                                                                                                                       
riot-rs-embassy::init_task()                                                   
riot-rs-embassy::init_task() done                                                                                                                              
start connection task                                                          
Device capabilities: Ok(EnumSet(Client))
Starting Wi-Fi                   
Wi-Fi started!                                                                 
About to connect...
tick                          
Listening on TCP:1234...                                                       
.                                                                              
.                                   
.                             
.                              
.
.               
.                                                                              
.                  
.                        
.                       

...

I'd expect the dots and the "tick" to be interleaved, also, the WiFi doesn't connect. I think the executor is not scheduled anymore.

@elenaf9
Copy link
Collaborator Author

elenaf9 commented Jun 20, 2024

Thank you for testing. I am currently debugging this, and it's super weird. What I observed is:

  • This issue only happens if the connection to the WIFI works. I tried with wrong SSID and PASSWORD and the issue doesn't appear in that case.
  • What is happening is that at some point when we yield_to_esp_wifi_scheduler, the esp scheduler switches the contexts of our threads. I observed this because instead of returning from the call, it suddenly continues running inside the thread_executor. It there then calls thread_flags::wait_any, but has the context from the esp_wifi thread. Of course this thread was never setup with an executor or pender, so it then never gets woken up again. Thus the wifi stack stops working. And your printer most likely stops working because something with its context was messed up as well.

I am trying to debug this further (already tried increasing stack sizes, didn't help)...

@kaspar030
Copy link
Collaborator

  • What is happening is that at some point when we yield_to_esp_wifi_scheduler, the esp scheduler switches the contexts of our threads.

ok, that is weird. Could have a lot of reasons ...

I'd ensure first that the task switching is triggered only when we actually expect it. E.g., that FROM_CPU_INTR3 only fires when inside the esp_wifi_thread loop after calling yield_to_esp_wifi_scheduler.

btw, where is the esp_wifi_thread representation on the esp-wifi side? how does the esp-wifi current_task() work with the riot thread?

This resolves the conflict between the esp_wifi internal scheduler for
the wifi background task, and the riot-rs-threads scheduling.
The esp_wifi thread is running at highest prio and frequently woken up
to run its tasks.
Disable CPU Interrupt 3 when not in `esp_wifi_thread`. This is necessary
so that the riot-rs-threads scheduler and the esp-wifi scheduler won't
interleave.
The `esp_wifi_thread` is frequently woken up by the systimer and will
then do the necessary yielding.
@elenaf9
Copy link
Collaborator Author

elenaf9 commented Aug 7, 2024

I'd ensure first that the task switching is triggered only when we actually expect it. E.g., that FROM_CPU_INTR3 only fires when inside the esp_wifi_thread loop after calling yield_to_esp_wifi_scheduler.

Yep, that was the issue. esp-wifi triggered the CPU Interrupt 3 from inside a ISR, and this interleaved with our own scheduler and caused the described behavior.

Disabling the CPU Interrupt outside of our esp-wifi-thread solves this issue. But as a result, esp-wifi can't directly yield to its tasks from inside an ISR anymore. Instead it only yields once the systimer fires again. Given that the timer fires very frequently, and from what I have tested, this restriction isn't necessarily a problem.

I have explored an alternative implementation, where we don't disable CPU interrupt 3 outside of the esp-wifi-thread. Instead we intercept it and wake up our esp-wifi-thread like we already do it on a systimer interrupt. But this requires some other changes as well, and results in a much more complex code: elenaf9/RIOT-rs@53ffe3b...ddc0975. I also noticed that this alternative implementation doesn't work as reliably, but couldn't fully debug why.

I am leaning towards the first implementation, which is the one currently pushed here. We can reiterate on this if an example comes up where it doesn't work anymore. Wdyt?


An orthogonal issue that I just want to mention as well:
If the execution is stuck in sched (because no thread is currently running) esp-wifi doesn't work properly. Most likely because its interrupts are at Priority1 and can't preempt our scheduler.
In the current implementation this is not an issue, because we frequently wake up the esp-wifi-thread, and therefore leave sched and give the esp-wifi interrupt a chance to fire.

Comment on lines +80 to +90

use super::*;
#[cfg(context = "esp32c6")]
use esp_hal::peripherals::INTPRI as SystemPeripheral;
#[cfg(context = "esp32c3")]
use esp_hal::peripherals::SYSTEM as SystemPeripheral;
use esp_hal::{
interrupt,
peripherals::{Interrupt, SYSTIMER},
};

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
use super::*;
#[cfg(context = "esp32c6")]
use esp_hal::peripherals::INTPRI as SystemPeripheral;
#[cfg(context = "esp32c3")]
use esp_hal::peripherals::SYSTEM as SystemPeripheral;
use esp_hal::{
interrupt,
peripherals::{Interrupt, SYSTIMER},
};
use esp_hal::{
interrupt,
peripherals::{Interrupt, SYSTIMER},
};
#[cfg(context = "esp32c6")]
use esp_hal::peripherals::INTPRI as SystemPeripheral;
#[cfg(context = "esp32c3")]
use esp_hal::peripherals::SYSTEM as SystemPeripheral;
use super::*;

Reordered as per the coding conventions

@@ -15,5 +15,8 @@ embassy-net = { workspace = true, features = ["tcp"] }
embassy-time = { workspace = true, default-features = false }
embedded-io-async = "0.6.1"
heapless = { workspace = true }
riot-rs = { path = "../../src/riot-rs", features = ["override-network-config"] }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me why static address configuration (what the network configuration does) needs to be disabled by default

.set(riot_rs_threads::current_pid().unwrap())
.unwrap();

// Wait until `embassy` was initialized.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Wait until `embassy` was initialized.
// Wait until `embassy` is initialized.

riot_rs_threads::thread_flags::wait_one(0b1);

// Bind the periodic systimer that is configured in esp-wifi to our own handler.
unsafe {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get a SAFETY comment here (I suppose that for instance the interrupt must only be bound to one thing at a time)?


// Handle the systimer alarm 0 interrupt, configured in esp-wifi.
extern "C" fn systimer_target0_() {
unsafe {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a SAFETY comment, at least to justify that SYSTIMER is not stolen multiple times.


// CPU Interrupt 3 triggers the scheduler in `esp-wifi`.
fn yield_to_esp_wifi_scheduler() {
unsafe {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get also a SAFETY comment here (maybe move the comment outside the function inside)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

riot-rs-threads: resolve threading ^ wifi on esp32
3 participants