Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Builtin "print" in kernel should be an async rpc top avoid RTIO underflows #2680

Open
systemofapwne opened this issue Feb 13, 2025 · 5 comments

Comments

@systemofapwne
Copy link

Bug Report

Builtin print in kernel should be an async rpc to avoid RTIO underflows

Short summary.

When calling print within a kernel, e.g. for debugging a certain part of kernel code, it can cause an RTIO underflow. Especially, when print is called close to timing critical calls to hardware.
The nature of the cause is, that print is not an async rpc so the kernel waits for the function to return. The amount of time for this to happen varries and mostly depends on network delays.

Steps to Reproduce

Run this code with the print statement. It will throw an RTIO underflow.
Run the the code instead with my custom kernel_print method, that is an async RPC, it will work as expected

from artiq.language import *
from artiq.experiment import *

@rpc(flags={"async"})
def kernel_print(*args, **kwargs): print(*args,**kwargs)

class RTIOUnderflow(EnvExperiment):
    def build(self):
        self.setattr_device("core")
        self.setattr_device("ttl6")    

    def prepare(self):
        pass

    @kernel
    def run(self) -> TNone:
        self.core.break_realtime()

        self.ttl6.off()
        print("Hello there")            # This builtin function causes an underflow
        kernel_print("Hello there")     # This custom function works
        self.ttl6.on()

Expected Behavior

No RTIO underflow

Actual (undesired) Behavior

RTIO underflow thrown with a negative slack of roughly the network delay (e.g. a few ms)

Your System (omit irrelevant parts)

  • Operating System: Arch Linux
  • ARTIQ version: ARTIQ v9.9013+38e0d6b.beta
  • Version of the gateware and runtime loaded in the core device: Same as ARTIQ environment above
  • Hardware involved: Kasli with master & satellite crate
@Spaqin
Copy link
Collaborator

Spaqin commented Feb 13, 2025

Async RPC still takes time to be dispatched, so it's unlikely that it would help.

In your experiment, if you care for timing, you shouldn't rely on prints on anything. If you want to debug your experiment, you can use RTIO analyzer, if you need to carry data, there's datasets, and it would be best to wait until the end anyway.

If you really need prints in your code, you can break realtime or introduce a delay.

Can you tell us more what are you trying to achieve?

@thatschatt
Copy link
Contributor

Sure, but it wouldn't hurt to make print async would it? From what I remember the last time I checked, the difference (on a Kasli 2.0) async RPC takes hundreds of μs, versus several ms for synchronous. So in a lot of cases that would be fast enough to slip in a quick print() for debug or diagnostic purposes.

As far as I can tell, print is already just an RPC call to the standard python function, so this should just be a matter of putting an async rpc decorator on print() in the global namespace, right?

@systemofapwne
Copy link
Author

systemofapwne commented Feb 13, 2025

Async RPC still takes time to be dispatched, so it's unlikely that it would help.

Good note, that async RPCs still take some time.
However, print unecessarily eats precious time when used synchroneously instead of simply async (it will never return anything anyway). At least with our cases and especially with the example above, we never have seen any RTIO underflows by print anymore after declaring our own async version of it.

In your experiment, if you care for timing, you shouldn't rely on prints on anything. If you want to debug your experiment, you can use RTIO analyzer, if you need to carry data, there's datasets, and it would be best to wait until the end anyway.

Very good note. We will take a look at that. But also see [1]

If you really need prints in your code, you can break realtime or introduce a delay.

This is what we did in order to fight the issue here and there, before we figured, that synchroneously print was causing trouble and async solved it.

Can you tell us more what are you trying to achieve?

In short: An atoms optics experiment with rather slow control (sequences of several seconds with individual steps in the range of many ms to hundreds of ms). Prints in the kernel are rather rare.
[1] But since our lab is quite dynamic and flexible in experiments, staff and students it is not uncommon, that 'the new guy' (or someone, who is not an expert of ArtiQ, python and sync/async etc) slaps a print on some lines of code to debug things. This randomly causes sideeffects, that are hard to track (almost nobody expects print to cause it)

@systemofapwne
Copy link
Author

Sure, but it wouldn't hurt to make print async would it? From what I remember the last time I checked, the difference (on a Kasli 2.0) async RPC takes hundreds of μs, versus several ms for synchronous. So in a lot of cases that would be fast enough to slip in a quick print() for debug or diagnostic purposes.

This is also what my opinion is, no matter if using print for debugging is good practice or not.

As far as I can tell, print is already just an RPC call to the standard python function, so this should just be a matter of putting an async rpc decorator on print() in the global namespace, right?

Yes, it absolutely is. We already tried to simply overwrite print with our own definition but the compiler explicitly replaces the symbol with an rcp call to the inbuild version:

"print": builtins.fn_print(),

@sbourdeauducq
Copy link
Member

Async RPC still takes time to be dispatched, so it's unlikely that it would help.

It may help and there is no reason to make it a slower synchronous RPC, since it returns no data.

In your experiment, if you care for timing, you shouldn't rely on prints on anything.

There is also a core_log function that just writes to the core device log and should be faster than print. And rtio_log which adds user-defined data to the RTIO analyzer. Neither function does any host communication at the time of call.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants