Design dilemma: Whether to force UIA method calls onto the UI main thread #68
Replies: 1 comment
After discussing this with a colleague, I'm inclined to stick with my original direction and not require UIA methods to run on the UI thread. The downside about remote operations not being guaranteed a consistent view of the tree isn't major yet; so far, only Narrator uses the remote operations API, and that API is only available on Windows 11. And even there, Narrator should be prepared to deal with changes to the UIA tree in the middle of a remote operation, because WPF doesn't use COM threading, so remote operations in a WPF application won't run on the UI thread. As for the hit-testing issue, I'll need to work through that, but the advantages of being free from the UI thread are too big to discard IMO.
Currently, UIA Core is free to call our UIA provider methods on a thread other than the UI thread. And in most cases, UIA calls do happen away from the UI thread. This flexibility would be intractable in most UIA implementations, but it's straightforward in AccessKit, because AccessKit is working with accessibility tree snapshots that are pushed to it by the application, rather than live objects that are being mutated in-place by the application as with most accessibility implementations. This is good, because it frees up the UI thread for other things, and it means that if the UI thread is hung or busy, most UIA requests will continue to be handled promptly. I think this design is a good fit when integrating AccessKit into designs that don't have the typical GUI message loop, e.g. game engines. It also means that AccessKit behaves consistently regardless of how (or if) COM was initialized on the UI thread.
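To make the snapshot idea concrete, here is a minimal sketch of why pushed snapshots make off-thread reads easy. It is not AccessKit's actual code: `TreeSnapshot`, `SharedTree`, `push`, `read`, and `demo` are hypothetical names invented for this illustration, and a real tree holds far more than a map of names.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical, simplified stand-in for an accessibility tree snapshot.
#[derive(Debug)]
struct TreeSnapshot {
    version: u64,
    names: HashMap<u64, String>, // node id -> accessible name
}

// Shared slot holding the most recently pushed snapshot.
#[derive(Clone)]
struct SharedTree {
    current: Arc<Mutex<Arc<TreeSnapshot>>>,
}

impl SharedTree {
    fn new(initial: TreeSnapshot) -> Self {
        Self { current: Arc::new(Mutex::new(Arc::new(initial))) }
    }

    // Called by the application (typically on the UI thread):
    // replace the whole snapshot instead of mutating nodes in place.
    fn push(&self, snapshot: TreeSnapshot) {
        *self.current.lock().unwrap() = Arc::new(snapshot);
    }

    // Called from any thread. The lock is held only long enough to
    // clone the Arc, so a busy or hung UI thread cannot stall readers.
    fn read(&self) -> Arc<TreeSnapshot> {
        let guard = self.current.lock().unwrap();
        Arc::clone(&*guard)
    }
}

fn snapshot(version: u64, name: &str) -> TreeSnapshot {
    let mut names = HashMap::new();
    names.insert(1, name.to_string());
    TreeSnapshot { version, names }
}

// Returns (version held by an in-flight reader, name it sees, latest version).
fn demo() -> (u64, String, u64) {
    let tree = SharedTree::new(snapshot(1, "OK"));
    // A provider call on some non-UI thread grabs the current snapshot...
    let held = tree.read();
    // ...and the application pushes an update while that call is in flight.
    tree.push(snapshot(2, "Cancel"));
    // The in-flight call still sees its original snapshot; a new call,
    // made here from another thread, sees the update.
    let reader = tree.clone();
    let latest = thread::spawn(move || reader.read().version).join().unwrap();
    (held.version, held.names[&1u64].clone(), latest)
}
```

Note that a reader only gets a consistent view for as long as it holds on to one `Arc`. Several separate `read` calls may observe different snapshots, which is the consistency concern that comes up with remote operations.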
But, I'm now thinking about reversing this decision. First, there's at least one UIA operation, performing a hit test via IRawElementProviderFragmentRoot::ElementProviderFromPoint, that is probably best implemented by running an application-provided callback synchronously. (We could do an asynchronous request and do our best guess in the meantime, like Chromium does, but I've heard that this solution isn't entirely reliable.) Implementing a synchronous request to the UI thread, either in AccessKit or in the application itself, is doable. But it's tricky, because in some cases (depending on the assistive technology being used), our UIA provider methods do run on the UI thread. So the code that does the synchronous request to the UI thread has to be prepared for that case, meaning it would probably have to have two different code paths depending on whether the UIA method is running on the UI thread. So it would be simpler if the UIA method were guaranteed to be called on the UI thread in the first place.
The other reason I'm thinking about forcing UIA methods to run on the UI thread is more esoteric. Windows 11 introduces a new UIA feature called Remote Operations, which allows UIA clients to make more complex requests in a single IPC round-trip. I worked on this feature when I was at Microsoft, and it's UIA's answer to a problem that has dogged Windows AT developers for decades -- how to have a clean accessibility API that doesn't require the AT to inject code in-process. A UIA "remote operation" is basically a bytecode program that can make multiple UIA requests, with conditionals, loops, and so on. And, most relevant for this discussion, if the UIA provider is configured to use COM threading, and the UI thread is using a COM single-threaded apartment (STA), then each remote operation will run, uninterrupted, on the UI thread. That means that each remote operation is guaranteed to be working with a single consistent snapshot of the accessibility tree, over multiple UIA requests. Such a guarantee wouldn't hold if the remote operation is running on a thread other than the UI thread while the UI thread is updating the UI.
On the other hand, this does mean that the remote operation would tie up the UI thread while it's running, but UIA Core imposes a limit on how many iterations of the bytecode interpreter are allowed in a single operation. (I forget what that limit is.)
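Going back to the hit-testing point, here is a rough sketch of the two code paths a synchronous request to the UI thread would need. Everything in it is an assumption made for illustration: `UiThreadHandle`, `HitTestRequest`, `hit_test`, and `spawn_ui_thread` are hypothetical names, a plain channel stands in for the real message loop, and no actual UIA or COM calls are involved.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread::{self, ThreadId};

// Hypothetical node identifier; not an AccessKit or UIA type.
type NodeId = u64;

// A hit-test request posted to the UI thread, with a channel for the answer.
struct HitTestRequest {
    x: f64,
    y: f64,
    reply: Sender<Option<NodeId>>,
}

struct UiThreadHandle {
    ui_thread: ThreadId,
    requests: Sender<HitTestRequest>,
}

// Stand-in for the application-provided callback: one 100x50 node at the origin.
fn hit_test(x: f64, y: f64) -> Option<NodeId> {
    if (0.0..100.0).contains(&x) && (0.0..50.0).contains(&y) {
        Some(1)
    } else {
        None
    }
}

impl UiThreadHandle {
    fn element_from_point(&self, x: f64, y: f64) -> Option<NodeId> {
        if thread::current().id() == self.ui_thread {
            // Path 1: we are already on the UI thread. Posting a request and
            // blocking here would deadlock, so run the callback directly.
            return hit_test(x, y);
        }
        // Path 2: we are on some other thread. Post the request and block
        // until the UI thread answers (or has gone away).
        let (reply, response) = channel();
        self.requests.send(HitTestRequest { x, y, reply }).ok()?;
        response.recv().ok()?
    }
}

// Spawns a stand-in "UI thread" whose message loop only serves hit tests.
fn spawn_ui_thread() -> (UiThreadHandle, thread::JoinHandle<()>) {
    let (requests, inbox): (Sender<HitTestRequest>, Receiver<HitTestRequest>) = channel();
    let join = thread::spawn(move || {
        for req in inbox {
            // Ignore the error if the requester stopped waiting.
            let _ = req.reply.send(hit_test(req.x, req.y));
        }
    });
    let handle = UiThreadHandle { ui_thread: join.thread().id(), requests };
    (handle, join)
}
```

Path 2 is also where the responsiveness cost shows up: if the UI thread is hung or slow to pump messages, the caller blocks in `recv` for just as long.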
To reiterate the downsides of forcing UIA method calls onto the UI thread: First, it requires that COM be initialized in STA mode on the UI thread, imposing another requirement on AccessKit users and leading to nasty reentrancy scenarios like the ones I worked around in #65 and #66. More obviously, but perhaps more importantly, it means that the responsiveness of the accessibility implementation is dependent on the responsiveness of the UI thread. If the UI thread hangs, then a blind person loses access to information that a sighted person could still see by looking at the last frame. And if the UI thread is busy or simply slow to pump messages (as in some game engines), responses to accessibility requests will be sluggish. The latter problem isn't theoretical and isn't limited to game engines or other non-standard GUI toolkits; when I was on the Windows accessibility team at Microsoft, we identified UI thread contention as a major cause of sluggishness in a popular application. To advance the state of the art in Windows accessibility, I would like to break free of that limitation in AccessKit.
So I'm faced with a dilemma. Both options have potential advantages for robustness in certain cases. But I don't know which set of tradeoffs is better in practice.