Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XAP (née QMK API) discovery and scoping #11567

Open
4 tasks done
zvecr opened this issue Jan 15, 2021 · 41 comments
Open
4 tasks done

XAP (née QMK API) discovery and scoping #11567

zvecr opened this issue Jan 15, 2021 · 41 comments

Comments

@zvecr
Copy link
Member

zvecr commented Jan 15, 2021

QMK API, aka VIA+OpenRGB+other rawhid all bundled into one

Feature Request Type

  • Core functionality
  • Add-on hardware support (eg. audio, RGB, OLED screen, etc.)
  • Alteration (enhancement/optimization) of existing feature(s)
  • New behavior

Description

@qmk qmk locked and limited conversation to collaborators Jan 15, 2021
@tzarc
Copy link
Member

tzarc commented Jan 18, 2021

QMK Live Config

Prior art:

  • VIA
    • App
    • via.c
    • "VIA 2.0" is coming, zero visibility of it so far
  • vial:
  • remap.keys
  • OpenRGB:
  • Glorious Core?
  • RAW HID:
    • Anyone can write a raw HID handler, new API requires that user-mode code can coexist with the API without requiring another endpoint

Thoughts:

  • Decouple "qmk kb api" and functionality, so that other clients can be designed

    • QMK API needs to be a superset of current via capabilities
    • e.g. via can upgrade to use the "approved" interface, decommission the bespoke one
  • Can we leverage the ZSA work for WebUSB?

    • Needs LUFA PR merge first
      • Or, extract/decouple the descriptors from LUFA as per previous fauxpark discussions
    • Can we prevent non-QMK infra from writing to devices? Do we need to? Critical areas?
    • Are we adding an attack vector for people creating malicious apps to wipe boards?
      • putting board into flashing mode for some operations would mitigate
      • unlock mode as per vial?
  • QMK API needs to work with Raw HID if selected instead

    • Decouple actual API from the underlying transport -- connect up to RAW HID, WebHID/WebUSB, CDC?
  • Reuse Configurator UI components, hook this stuff into normal QMK Configurator UI?

  • Should we embed info.json in some way?

    • Data-driven QMK works well with this
    • Compress the info.json and embed a blob?
      • Space issues for AVR
      • Seem to max out at about 1.5kB compressed with gzip
    • Should we embed just the git hash, derive the info.json URL on github? (what about forks? temp builds?)
    • Would allow querying of capabilities of the connected device, rather than relying on a git repo that may change
      • Selectable layouts, etc. etc.
      • Should add LED locations/names etc. and generate during build as per data-driven targets
    • Should we transparently invoke configurator to allow recompilation of firmware if we need to enable different subsystems, enabling rgblight or something
  • WebDFU for firmware updates?

    • Perhaps add reset procedure docs into info.json, so that we can pop it up in the UI?
  • Configuration API needs to be able to be multiplexed on raw hid / webusb, so that other protocols can be used by users without being mutually exclusive

    • Can we factor in OpenRGB? (see PR#10961)
    • QMK = 0x00 prefix, VIA = 0x01 prefix, etc. etc.
      • 1 byte? 2 bytes? The WebUSB PR has a 64-byte WEBUSB_EPSIZE
    • payload follows first byte "namespace"

@drashna
Copy link
Member

drashna commented Jan 22, 2021

Re: WebUSB, and the associated PR. It's probably not completely in line with what we'd want, but two big things in it: the only command that works without user intervention is the firmware version command. Otherwise, everything else REQUIRES that the keyboard be sent a pairing request, and for a button to be pushed on the keyboard to complete pairing.

Also, no matter what, I think that we absolutely need to have a "prefix" for commands over raw HID, so that multiple functions can use it, in parallel. Or some such solution, to properly share the endpoint. This is a MUST, IMO.

@tzarc
Copy link
Member

tzarc commented Jan 22, 2021

Thinking about this further, I think I'd like some sort of formal API specification that is able to be used as a "master list" -- produce docs, compile-time generation of kb-side parser code, as well as potentially generating multi-language clients so that people can grab an SDK and hook in directly.

Clearly a long-term initiative, but direction-wise that's where I'd like to head.

@tzarc
Copy link
Member

tzarc commented Feb 2, 2021

Another one:

https://github.com/Drugantibus/qmk-hid-rgb

@qmk qmk unlocked this conversation Feb 13, 2021
@tzarc
Copy link
Member

tzarc commented Feb 13, 2021

Opening this issue for comment... RFC from @Wilba6582 @xyzz @Kasper24 @Drugantibus -- thoughts would be great.

@tzarc tzarc self-assigned this Feb 13, 2021
@Drugantibus
Copy link

Drugantibus commented Feb 14, 2021

I agree with the proposed solution for multiplexing different functionalities on raw HID by using a "prefix" to dispatch incoming HID packets to the appropriate function, similarly to how I do it in my repo. IMO it's easy to understand, implement, and scale for future requirements. Using different usage pages/IDs should be possible (and probably "more correct" from a HID point of view), but adds unecessary complexity, clunkiness and potential compatibility issues. I for one wouldn't want a dozen hidraw devices when connecting my keyboard :)
The downside to this approach is that you can't know which features are enabled, unless maybe there is a "meta" endpoint?
Regarding this "header" size, I think 1 byte should be enough but it's probably better to go with 2 from the start for future proofing. Maybe the header can be stripped after dispatching to decouple the implementation from the multiplexing.

@CalcProgrammer1
Copy link

I like the idea of a prefix to send the packet to the correct functionality. Perhaps have byte 0 of any incoming raw packet be parsed as the function code.

I agree that there should be a meta function that always exists. This function should be able to provide the essentials - keyboard manufacturer, name, QMK version, serial number (if exists), description, and then a list of supported functions (VIA, OpenRGB, etc). Possibly also common commands like reboot into bootloader.

As for RGB control, I think it would make sense to have one function for RGB matrix. This endpoint should provide the matrix size (width and height) along with the LED-to-keycode mapping and a mapping table to map LEDs into a grid. It should provide a list of enabled RGB animations and their parameters. It should be able to set the selected animation/parameters. It should also provide a means to directly control the LEDs from PC-side software. This is already implemented in @Kasper24's OpenRGB protocol. It should also provide the ability to have more than one matrix, as some keyboards could have an underglow zone with multiple rows.

RGB lights not part of a matrix would be another function. It would define number of LEDs and possibly some form of zone mapping. Same as the matrix implementation, it would provide direct PC-side control as well as control over built-in animations.

Non-RGB functions should also be considered. Maybe a way to upload keymaps without re-flashing the firmware? A macro editor? Polling rate?

@sigprof
Copy link
Contributor

sigprof commented Feb 22, 2021

One possible reason to create separate HID interfaces is that various USB APIs may prevent concurrent usage of the same HID interface by different apps, therefore running an app which performs RGB control might prevent other apps (e.g., keymap configuration like VIA) from communicating with the device if the same HID interface is used by both apps. However, this can be hindered by the endpoint shortage issues (e.g., the OTG controller in F401/F411 can support at most 3 HID interfaces, and normally 2 of them are already occupied by the boot keyboard and shared, therefore adding anything more than a single rawhid interface is not possible without requiring KEYBOARD_SHARED_EP).

It could be possible to create a completely custom USB interface without any endpoints and communicate using just control transfers, but this introduces Windows driver issues (although it could be possible to implement Microsoft OS descriptors to install the WinUSB driver automatically; these descriptors would be needed for WebUSB too).

Also, in theory the USB HID Usage Table 1.21 defines the Lighting And Illumination Page (0x59), which looks like an attempt to make some universal mechanism for RGB LED control (although the InputBinding part does not really map well to QMK, because it refers to HID usages directly without taking the possibility of remapping into account). However, this page does not really provide any way to control RGB animations played by the keyboard itself (it specifies only a mechanism for the host to take complete control over all LEDs), so it is not as useful as it could be.

@tzarc
Copy link
Member

tzarc commented Mar 3, 2021

Adding https://remap-keys.app/ from @yoichiro
Source https://github.com/remap-keys/remap

@jkutianski
Copy link

https://github.com/BlankSourceCode/qmk-hid-display by @BlankSourceCode

@tzarc
Copy link
Member

tzarc commented Mar 6, 2021

Gripes raised during today's discussion:

  • Need to make sure we don't reorder or renumber keycodes in order to prevent breakages
  • If we do need to do so, then it's an API version break, and needs to be reflected in some sort of versioning number associated with that API
  • Need to also take into account other IDs, such as RGB effect IDs and the like
  • Need to be able to query the capabilities of the target device, to determine if certain subsystems or RGB effects are disabled, for example

@jkutianski
Copy link

@tzarc

  • Need to make sure we don't reorder or renumber keycodes in order to prevent breakages

I think is a good idea reflect the keycodes used on the firmware like shortname, longname and group.
Something like.
Keycode Shortname Logname Group
0x5700 "TD_BRKT" "TD_BRACKET" "Custom"
0x5CBB "BL_ON" "BACKLIGHT_ON" "Backlight"

  • Need to be able to query the capabilities of the target device, to determine if certain subsystems or RGB effects are disabled, for example

The same goes for some key codes. If you don't have a backlight, it doesn't make much sense, see a key code to turn on the backlight. Or MIDI

@tzarc
Copy link
Member

tzarc commented Mar 7, 2021

The same goes for some key codes. If you don't have a backlight, it doesn't make much sense, see a key code to turn on the backlight. Or MIDI

Yep, which is why the subsystem query commands must exist -- the "group" you mentioned allows mapping back to what should and shouldn't be shown. Very much a good idea.

I think I'd also prefer to see these definitions as part of a JSON document in QMK's repo, with build-time codegen much like what's happening with the data-driven builds. We should be able to get this sort of keycode definition in early, without requiring API work to be complete... and things like VIA/vial/remap.keys et.al. can leverage the definitions directly, instead of maintaining their own mapping.

@Drugantibus
Copy link

@sigprof I assume that most functions wouldn't keep using the interface continuously though? Or at least not frequently enough that this issue couldn't be resolved by some scheduling/collision avoidance algorithm... But tbh you seem much more knowledgeable than me so I would like to hear your thoughts on this :)

I guess that at this point my question is how we should approach modularity vs standardization. Is the plan to offer the community the possibility of implementing their own functions on the API or will it be a "core QMK thing" which is fully documented on the docs? Or somewhere in between?

@tzarc
Copy link
Member

tzarc commented Mar 7, 2021

Is the plan to offer the community the possibility of implementing their own functions on the API

The plan is to allow users to work within their own "namespace" without colliding with normal QMK functionality.

Ideally, if we've got the ability to mark certain APIs as "secure" (i.e. requiring some sort of handshake at runtime -- think entering bootloader or clearing EEPROM) then we could also provide the ability to user-mode APIs too.

@tzarc
Copy link
Member

tzarc commented Mar 7, 2021

For clarity, I mean that the interface for communicating with QMK will be common between the QMK functionality and the user-mode functionality -- "entering" user-mode will likely require a known prefix of bytes which then any payload of your choosing can be specified.

@infinityis
Copy link
Contributor

Regarding the lock vs no lock options, I can see the rationale for both for very good reasons.

Without the lock, I could make a malicious small USB host device made to do one job: randomize key layouts of QMK-compatible keyboards, made for pranking co-workers. I would only need brief physical access but it could be done. With the lock, it adds some steps, either resetting and reloading firmware or figuring out the unlock pattern, etc. I could also envision malicious software that just does something like that too. Why such things would exist, I don't know, but it is possible in theory. So the question becomes "is the risk worth it? What is the likelihood and severity of having no lock?"

Currently I would characterize it as low likelihood, with the potential for increased likelihood as QMK usage increases. Severity is moderate; it's annoying (best case) and a showstopper in a worst case scenario (a user that doesn't know how to fix, so keeb is dead until help arrives)

However, with the lock, it could be very frustrating to use and test keyboard PCBs during development, depending on which features require unlock. If my kepmap for a new keyboard has some problems I need to debug, the last thing I need is an unlocking key combo standing in my way, especially if I'm using tweezers to test.

Might I propose the following: Make the default to have the keyboard lock enabled, but allow a simple #define to disable the lock. Also, adopt a policy that any keyboard configuration submitted in a pull request must have the lock enabled.

If someone really wants to make and ship keyboards without the lock enabled, they can certainly do it, but the hex file generated by QMK source (and QMK configurator) will have locking capability. Thus, if a user knows what they're doing and doesn't want the lock enabled, more power to them.

@qmk qmk deleted a comment Mar 10, 2021
@infinityis
Copy link
Contributor

Also, for the locking feature, when and/or how to re-lock after an unlock should be considered. Should it re-lock automatically after a no-remapping-activity timeout (and if so, how long), or stay unlocked indefinitely unless the user re-locks it manually (through the same key combo? a different key combo? through the API?)?

@jkutianski
Copy link

I don't know how QMK will be deal with analog keys, joystick and potentiometers but I think is a good idea take this in care for the new API. Specially for MIDI, Mouse and Joystick keycode parameters.
I thinking on this
https://www.razer.com/gaming-keyboards/razer-huntsman-v2-analog/RZ03-03610200-R3U1
Or this
https://joytokey.net/en/overview

@palp
Copy link

palp commented Mar 19, 2021

One possible reason to create separate HID interfaces is that various USB APIs may prevent concurrent usage of the same HID interface by different apps, therefore running an app which performs RGB control might prevent other apps (e.g., keymap configuration like VIA) from communicating with the device if the same HID interface is used by both apps. However, this can be hindered by the endpoint shortage issues (e.g., the OTG controller in F401/F411 can support at most 3 HID interfaces, and normally 2 of them are already occupied by the boot keyboard and shared, therefore adding anything more than a single rawhid interface is not possible without requiring KEYBOARD_SHARED_EP).

It could be possible to create a completely custom USB interface without any endpoints and communicate using just control transfers, but this introduces Windows driver issues (although it could be possible to implement Microsoft OS descriptors to install the WinUSB driver automatically; these descriptors would be needed for WebUSB too).

Also, in theory the USB HID Usage Table 1.21 defines the Lighting And Illumination Page (0x59), which looks like an attempt to make some universal mechanism for RGB LED control (although the InputBinding part does not really map well to QMK, because it refers to HID usages directly without taking the possibility of remapping into account). However, this page does not really provide any way to control RGB animations played by the keyboard itself (it specifies only a mechanism for the host to take complete control over all LEDs), so it is not as useful as it could be.

I implemented this HID usage on my firmware, but unfortunately Windows API (UWP) "eats" the device and prevents normal USB HID enumeration. You can access it through the UWP APIs but I've not been able to get it to work unless the app has focus no matter what I try, and only being able to control lighting from the foreground is not very useful. I do have it working with a different usage code, so that Windows doesn't know what it is, and with a basic implementation I built for the host side of the HID spec, it works very well - I've been running it for months actively driving my LEDs.

Also, hilariously and painfully, if the device fails to confirm to the spec in any way - unexpected data, no data, etc - the entire windows UI freezes until it's killed. Not the app - the windows UI process. Had a load of fun debugging it. I filed a bug with Microsoft but I haven't gotten a response and don't expect one - UWP is all but abandoned in favor of cross-platform forms in .NET 6 so we'll probably have to wait for yet another new WinAPI for it to get fixed.

Since the work in #10961 was happening concurrently I left my POC alone and figured I'd regroup later - taking the commonalities between my interface and the OpenRGB PR would make a good starting point for a RGB API interface. Happy to see that's begun - I've laid out some useful notes and thoughts below to start but I'll be back with more.

I documented some of my journey in #692, It started as an extension of RAW HID, but I moved on and quickly came to the same conclusion that a mess was brewing and a consistent API was needed - and by the end of my POC I was sure. Unless you want to go read all the the gory details, my last comment summarizes my feelings on suitability of HID after finishing the POC . I won't bother repeating why we need an API; we know. For context, I was replying to a comment noting that virtsir was likely a better option than RAW_HID.

Virtser seems like the better choice for general communication over RAW HID, for sure. But for real-time control, which is primarily what I've been working with, bulk endpoints aren't going to be as reliable.

What's good/bad about the HID implementation I've got going is that it's using the Feature report types, as opposed to Input/Output, which I've learned means the requests come in on the control channel - there is no polling involved. In fact I'm not sure my interface needs endpoints since everything it does happens on the control channel. This seems ideal for situations where responsiveness is important - lighting may be a trivial one, but things like responding to keypresses or quickly lighting up status indicators. But some keyboards might not keep up, etc, so it wouldn't always be the best choice.

My description of how it works may not be fully accurate, this was my first foray into the protocol so I've picked it up as I went along. But I did note as you said that it seems possible to implement it without consuming an interface. Not knowing this at all at the onset anyway, I created a replacement for the RAW_HID interface that moves RAW_HID to its own usage page and gives it a report number, as others have mentioned is an option. I kept RAW_HID itself in place as they can both run on different interfaces if both are needed.

Regarding the InputBindings, I didn't really solve it, just worked around the gap by generating a keymap at boot to respond to the HID requests with the correct info (proper usage of the protocol dictates this is mandatory). However, my thought at solving for dynamic keymaps was to break protocol slightly by responding to update reports with the keycode map for all keys present in the report; this would ensure that at least the host-cached keymap get updated whenever that LED changes as it would be able to see the discrepancy and use the new code.

The physical characteristics in the reports is another gap, but they only matter if the host uses them. The WinAPI code is probably the only other host code in existence and as noted, if we support that nothing else can use our HID implementation on top of it not working well for many use cases. They can be optional/static to be filled in per-keyboard if desired/needed, and they don't have to be right.

I have a functional but incomplete/unpolished local PR palp#1. It's specific to drop alt, some of the code still lives in my keymap where it began - solving the abstraction and graceful degradation of features was one of the things that I felt an API was needed for.

I implemented the HID-based client inside the Aurora RGB lighting control app along with support for the layout, which let me test some pretty intense animations and quick reactivity, without doing my own mapping and framebuffers and all that noise. This is also in a PR that's even less polished, but it works well enough.
palp/Aurora#1

It's very much worth noting that my biggest finding from driving the LEDs themselves was that I needed to completely disengage the RGB matrix loop to be absolutely sure nothing was double-writing my PWM buffers for the LEDs - at least I think that was the problem, and doing so fixed a recurring crash. But I think that's a hardware-specific implementation to the drop keyboards. It should serve as a reminder, though, that safeguards need to be in place to protect against bad behavior from the USB host crashing the keyboard. For example, I ignore the HID LampArray command to flush the update; my code does so when it knows it should and never at the behest of the host. I had a bug in my client that was requesting updates far too often, and since prior to that change I would blindly obey, it was tripping over itself in some way. It's also entirely possible you can't access the PWM from an interrupt and that was another source of my problems - embedded systems is not my field so I'm guessing at the root cause. I know enough to understand that building for this protocol is different than most prior art in the codebase because you're processing data as it's received on the control channel, not polling and handling it on your own time.

@palp
Copy link

palp commented Mar 19, 2021

To follow up with and summarize some thoughts on future direction:

  • Aurora has integration code for various RGBLED SDKs/protocols that we'd probably all do well to study in depth - both for possible compatibility layers and as case studies. There's few if any projects with as wide a scope of hardware support for RGB peripherals.
  • As I mentioned, the union of features from my PR and Implement OpenRGB protctol for RGB Matrix keyboards #10961 is a good starting point for an RGB API. I tried to keep what I was doing somewhat in line so they shouldn't be far off.
  • We can have "real" HID LampArray support but most people wouldn't want it, since maybe not so surprisingly MS's own spec causes problems on their platform. But we can keep compatibility with it other than a different usage page/code; and we can add features as well, though we should probably put those in their own usage to avoid conflicts in the off chance the spec updates.
  • HID LampArray doesn't need to be "the" protocol (although I like the design of it overall), I imagined the API as an abstraction layer for both the hardware and the protocol, so multiple protocol implementations could be built against the internal QMK API and be enabled/disabled as needed.
    • This leaves the door open to things like emulating other RGB lighting systems or even entire keyboards
    • Even for a first party protocol, different methods may work better depending on use case and device - I expounded on this a bit in my note above and the linked thread, some examples:
      • Interrupts are great unless you're already strained and you want to the RGB stuff out of your control channel
      • If your USB packet size is too small you might have trouble even using the HID spec (I'm not sure how/if you can fragment?).
      • Realtime RGB control is one use case, but programming and controlling the "built in" RGB matrix is another, and though some concepts map, a lot don't
      • A case where you can't live without interrupts might be software defined macros or remaps - allowing the host software to react to a keypress in near-realtime instead of the keyboard sending the actual keypress on the "real" control channel. Not sure if anyone would ever want to build something like that, (the OS could do it normally, so it's sounding malicious now) but you'd certainly want it fast.
      • Data streams like the console could be remapped into the API and available via multiple channels.

@sigprof
Copy link
Contributor

sigprof commented Apr 3, 2021

I implemented this HID usage on my firmware, but unfortunately Windows API (UWP) "eats" the device and prevents normal USB HID enumeration. You can access it through the UWP APIs but I've not been able to get it to work unless the app has focus no matter what I try, and only being able to control lighting from the foreground is not very useful.

So Microsoft first pushed the Lighting and Illumination Page into the HID specification, and then made sure that it cannot be used in a useful way 😡

The fact that the API works only for the foreground app is actually documented:

An application can set lamp state at any time, but state will only be applied by the system while application is in focus.

As for virtser, one more disadvantage of it is that it requires two input endpoints (one bulk endpoint for received data, and one interrupt endpoint for status notifications), and the OTG USB controller in chips like STM32F4x1 can support only 3 input endpoints, therefore you can at most fit one HID interface and one virtser interface there, and then you are out of input endpoints. The HID spec also says that a HID interface must have an input endpoint, even when no input reports are defined. So at least on some devices an option to use a completely custom interface which uses only control transfers might be needed; this may work fine for controlling the keyboard state from the host, but sending asynchronous notifications from the keyboard to the host would be impossible.

Multiplexing the console stream over the API may look good, but would probably introduce the same issue that one process “eats” the device and makes it unavailable for other processes; then we would need some kind of “demux” process on the host (and some kind of protocol to connect to it, with access control on top…), so this should probably be considered as a really last resort solution.

It's also entirely possible you can't access the PWM from an interrupt

Yes, that could definitely be a problem; in general, most QMK code is not even thread safe, and the interrupt safety is even more questionable (and what happens inside the atsam code might be even worse). From the look at your LampArray code, seems that rgb_matrix_update_pwm_buffers() was called from some kind of USB driver callback, which the atsam code calls from the USB interrupt handler, therefore doing anything more than stuffing the received data into some buffers is probably not safe.

@stale stale bot removed the stale Issues or pull requests that have become inactive without resolution. label Sep 19, 2021
@ioquatix
Copy link

ioquatix commented Oct 6, 2021

I would like to play around with this, is this in a state where I can try it out?

@haata
Copy link

haata commented Oct 13, 2021

While it's not finished yet, I've been working on HID-IO for Keystone which is somewhat similar but is more of a full RPC mechanism: https://github.com/hid-io/hid-io-core

I do have C implementations but I've been moving everything to rust.

@drashna
Copy link
Member

drashna commented Oct 13, 2021

@ioquatix #13733

@sharpenedblade
Copy link

* [WebDFU](https://github.com/devanlai/webdfu) for firmware updates?
  
  * Perhaps add reset procedure docs into info.json, so that we can pop it up in the UI?

I dont think updating the firmware should be tied to using a browser with webusb, firefox does not support it, and electron doesn't work for cli apps either.

Also in general, we really shouldnt trust the OS to handle something. Microsoft probably wont care and third party stuff is usually buggy, Apple will probably actively avoid it, and it will probably end up as a resource hogging daemon on linux, it is probably better to keep things in qmk. Also, the host should not be trusted, because WebUSB lets random websites talk to the keyboard, and users are not very smart.

@ghost
Copy link

ghost commented May 20, 2023

How far is XAP from being usable? Should I just use rawhid for now? and has development for this stopped?

@romanr

This comment was marked as off-topic.

@ahmafi

This comment was marked as spam.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests