
Simple block_on that doesn't just spin #1476

Closed
Dominaezzz opened this issue Apr 19, 2024 · 16 comments

Comments

@Dominaezzz
Collaborator

Would esp-hal be interested in having a simple block_on that uses waiti/wfi to wait for futures?

I've been using embassy-futures::block_on for simple projects and tests, but I don't think I should have to pull in another dependency (which spins at 100% CPU, since it's hardware agnostic) just to run some async code. I also don't want to opt in to the "whole" embassy framework just to wait on a single future (#1035). The HAL should have a simple way to efficiently wait on a single future on its hardware.

I want to contribute this, but I'd like to make sure it would be welcome, or find out whether there are different plans, or whether embassy is simply the way to do async with esp-hal.

Side note: I was wondering if it'd make sense for the blocking drivers to use some version of this. They shouldn't be spinning when waiti/wfi exists. Now that interrupt handlers can be registered at runtime, perhaps the blocking drivers could register one temporarily just to block efficiently; just an idea.

Another side note: with the right reactors set up, it may be possible to have a Rust program automatically go into proper light sleep if the set of all waiting wakers is only waiting on light-sleep-compatible timers and/or GPIO pins. (Though this is mostly out of scope for this issue.)

@Dominaezzz
Collaborator Author

Xtensa implementation.

use core::future::Future;
use core::pin::pin;
use core::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use core::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// `Cpu` and `get_core` come from esp-hal.

pub fn block_on<F: Future>(fut: F) -> F::Output {
    let state = TaskState::new();

    let raw_waker = state.raw_waker();
    let waker = unsafe { Waker::from_raw(raw_waker) };

    let res = {
        let mut cx = Context::from_waker(&waker);
        let mut fut = pin!(fut);
        loop {
            if let Poll::Ready(res) = fut.as_mut().poll(&mut cx) {
                break res;
            } else {
                state.wait_until_woken();
            }
        }
    };

    let clone_count = state.clone_count.load(Ordering::SeqCst);
    // This function must not return while clones of the waker still exist,
    // because TaskState lives in this stack frame.
    assert_eq!(
        clone_count, 0,
        "Waker was leaked in {} places!",
        clone_count
    );

    res
}

struct TaskState {
    work_available: AtomicBool,
    clone_count: AtomicUsize,
    core: Cpu,
}

impl TaskState {
    fn new() -> Self {
        Self {
            work_available: AtomicBool::new(false),
            clone_count: AtomicUsize::new(0),
            core: get_core(),
        }
    }

    fn clone(&self) {
        self.clone_count.fetch_add(1, Ordering::SeqCst);
    }

    fn wake(&self) {
        self.work_available.store(true, Ordering::SeqCst);
        if self.core != get_core() {
            // This could be implemented with software interrupts.
            // SoftwareInterrupt<..> methods would need to take & instead of &mut.
            unimplemented!()
        }
    }

    fn drop(&self) {
        self.clone_count.fetch_sub(1, Ordering::SeqCst);
    }

    fn wait_until_woken(&self) {
        wait_for_interrupts_until(|| self.work_available.swap(false, Ordering::SeqCst));
    }

    const VTABLE: RawWakerVTable = RawWakerVTable::new(
        Self::raw_clone,
        Self::raw_wake,
        Self::raw_wake_by_ref,
        Self::raw_drop,
    );

    fn raw_waker(&self) -> RawWaker {
        RawWaker::new(self as *const Self as _, &Self::VTABLE)
    }

    fn raw_clone(src: *const ()) -> RawWaker {
        (unsafe { &*(src as *const Self) }).clone();
        RawWaker::new(src, &Self::VTABLE)
    }

    fn raw_wake(src: *const ()) {
        let state = unsafe { &*(src as *const Self) };
        state.wake();
        state.drop();
    }

    fn raw_wake_by_ref(src: *const ()) {
        unsafe { &*(src as *const Self) }.wake();
    }

    fn raw_drop(src: *const ()) {
        unsafe { &*(src as *const Self) }.drop();
    }
}

fn wait_for_interrupts_until(mut condition: impl FnMut() -> bool) {
    use core::arch::asm;

    if condition() {
        // Fast path: if there is already work available, skip the PS.INTLEVEL
        // manipulation entirely.
        return;
    }

    loop {
        // Previous PS (Processor Status) Register.
        let prev_ps: u32;

        // RSIL - Read and Set Interrupt Level
        //
        // Read the current value of the PS register and set PS.INTLEVEL to 5 to prevent
        // interrupts (that we're about to wait for) from firing.
        // The highest level on the ESP32 is 5. Xtensa ISA supports up to 15.
        unsafe { asm!("rsil {0}, 5", out(reg) prev_ps) };

        if condition() {
            // WSR - Write Special Register
            //
            // Set PS.INTLEVEL back to what it was before.
            unsafe {
                asm!(
                    "wsr.ps {0}",
                    "rsync",
                    in(reg) prev_ps
                );
            }

            // We can stop waiting, there is work available.
            break;
        } else {
            // WAITI - Wait for Interrupt
            //
            // Set PS.INTLEVEL back to what it previously was and suspend this core until an
            // interrupt above the existing PS.INTLEVEL occurs.

            let int_level = prev_ps & 0x00000F;
            match int_level {
                0 => unsafe { asm!("waiti 0") },
                1 => unsafe { asm!("waiti 1") },
                2 => unsafe { asm!("waiti 2") },
                3 => unsafe { asm!("waiti 3") },
                4 => unsafe { asm!("waiti 4") },
                5 => unsafe { asm!("waiti 5") },
                _ => unreachable!(),
            }
        }
    }
}
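
For the RISC-V chips, a rough counterpart of wait_for_interrupts_until might look like this (a sketch only; it assumes single core, machine mode, and that interrupts are enabled on entry, instead of restoring the exact previous interrupt state like the Xtensa version above):

fn wait_for_interrupts_until(mut condition: impl FnMut() -> bool) {
    use core::arch::asm;

    loop {
        // Clear mstatus.MIE (bit 3) so no interrupt can slip in between the
        // condition check and the wfi below.
        unsafe { asm!("csrci mstatus, 8") };

        if condition() {
            // Work is available; re-enable interrupts and stop waiting.
            unsafe { asm!("csrsi mstatus, 8") };
            break;
        }

        // With MIE cleared, wfi still wakes the hart on a pending interrupt;
        // the handler only runs once interrupts are re-enabled below.
        unsafe { asm!("wfi") };
        unsafe { asm!("csrsi mstatus, 8") };
    }
}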

@Dominaezzz
Collaborator Author

When I come back to wanting this, I'll just make a PR for it.

@bugadani
Contributor

bugadani commented Nov 18, 2024

Xtensa implementation.

Isn't this implementation just broken? If I block_on(async { select(timeout, useful_work) }); will it not always panic?

@Dominaezzz
Collaborator Author

So long as timeout and useful_work do the right thing and release the waker when they're finished/cancelled, it won't panic.
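
Concretely, "releasing the waker" means a future drops any stored clone when it completes or is cancelled. A rough sketch of such a future (illustrative only; the types and the use of the critical-section crate are assumptions, not existing esp-hal code):

use core::cell::RefCell;
use core::future::Future;
use core::pin::Pin;
use core::task::{Context, Poll, Waker};
use critical_section::Mutex;

// State shared between the future and e.g. an interrupt handler. The handler
// would set `done` and wake the stored waker (if any) when the event fires.
struct Shared {
    done: Mutex<RefCell<bool>>,
    waker: Mutex<RefCell<Option<Waker>>>,
}

struct NiceFuture<'a> {
    shared: &'a Shared,
}

impl Future for NiceFuture<'_> {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        critical_section::with(|cs| {
            if *self.shared.done.borrow(cs).borrow() {
                Poll::Ready(())
            } else {
                // Store a clone of the waker; block_on's clone count goes up.
                *self.shared.waker.borrow(cs).borrow_mut() = Some(cx.waker().clone());
                Poll::Pending
            }
        })
    }
}

impl Drop for NiceFuture<'_> {
    fn drop(&mut self) {
        // Release the stored clone even if the future never completed (e.g.
        // it lost a select() race), so block_on's leak assertion doesn't fire.
        critical_section::with(|cs| {
            self.shared.waker.borrow(cs).borrow_mut().take();
        });
    }
}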

@bugadani
Contributor

But nothing requires them to do so, as Future isn't unsafe and implementations can do whatever they want.

@Dominaezzz
Collaborator Author

Sure, that's why the panic is there.

No reason esp_hal can't play nice and release the waker.

@bugadani
Contributor

What about embassy-sync, and all the others not under our control? If I create a signal before block_on and wait for it together with a timeout, I'll get a panic.
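
For example (a sketch assuming the block_on above plus embassy-sync and embassy-futures), something like this trips the assertion:

use core::future::ready;
use embassy_futures::select::select;
use embassy_sync::blocking_mutex::raw::CriticalSectionRawMutex;
use embassy_sync::signal::Signal;

static SIGNAL: Signal<CriticalSectionRawMutex, ()> = Signal::new();

fn demo() {
    // select() polls SIGNAL.wait() first, which stores a clone of the waker
    // inside the Signal, and then completes via the ready() branch. Nothing
    // takes that clone back out, so clone_count != 0 when block_on returns
    // and the "Waker was leaked" assertion panics.
    block_on(async {
        select(SIGNAL.wait(), ready(())).await;
    });
}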

@Dominaezzz
Collaborator Author

Those can get PRed to change them.

That still doesn't stop esp_hal from playing nice and not making the problem worse.

@bugadani
Contributor

bugadani commented Nov 18, 2024

Those can get PRed to change them.

All right, that means we can come back to this once everything has been fixed.

Alternatively, the implementation could take a &mut &'static mut (i.e. caller-provided static state) instead of creating the task state on the stack, and side-step the whole problem without burning CPU cycles resetting state that doesn't need resetting, IMO.

@liebman
Contributor

liebman commented Nov 18, 2024

How would this work if the block_on were inside a critical section (with interrupts disabled)?

@bugadani
Contributor

bugadani commented Nov 18, 2024

I think there are certain assumptions these executors have to make, one of them being "it is not run inside a critical section". After all, they are scheduled by interrupts. That reminds me, we're still missing that check in esp-hal-embassy. The best thing these can do is detect and panic. While the waiti instruction can break out of a critical section (at least on single-core), that's most likely not what the user would want.
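
The "detect and panic" part could look roughly like this on Xtensa (a sketch; it assumes a critical section shows up as a raised PS.INTLEVEL, and the function name is made up):

fn assert_not_in_critical_section() {
    use core::arch::asm;

    // RSR - Read Special Register: read the current PS value.
    let ps: u32;
    unsafe { asm!("rsr.ps {0}", out(reg) ps) };

    // PS.INTLEVEL lives in the low 4 bits; a non-zero level means interrupts
    // at or below that level are masked, so wakeups may never arrive.
    assert_eq!(ps & 0xF, 0, "block_on must not be called with interrupts masked");
}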

@liebman
Contributor

liebman commented Nov 18, 2024

If one is using sequential_storage, it has an async interface, and I think one needs to do flash operations in a critical section (at least on multicore). So how would one block on a future from that?

@bugadani
Contributor

I would not recommend running flash operations asynchronously in the first place; the CPU can't read code from flash while the operations are in progress. It only works as long as the code that is running is in cache or RAM.

@liebman
Contributor

liebman commented Nov 18, 2024

This has been around for a while: https://crates.io/crates/sequential-storage

@bugadani
Contributor

I'll rephrase: I wouldn't recommend doing this on the ESP32. You'll set yourself up for a world of pain. What I did in my stuff was a fake async that ran the operations synchronously and then yielded once. Anything else resulted in random errors.

I'm sure the async interface works with internal flash, or when running from RAM; I have nothing against the concept. All I'm saying is: don't expect it to work on the ESP32.

@liebman
Contributor

liebman commented Nov 18, 2024

Actually, never mind - the esp-storage crate has a critical-section feature that I failed to enable!
