Homekit transient pairing #336

Draft: 4 commits to be merged into base: master

@connorh315 commented Aug 30, 2024

It's an absolute mess, but I go back to work on Monday and wanted you all to be aware of this in case I don't get to finish it.

Key notes:

  • Supports homekit transient pairing (X-Apple-HKP: 4)
  • In the process, I destroyed the legacy pairing by altering a lot of the existing SRP functions
  • Very weak TLV implementation that fails if the PKs that are sent are less than the required 384 bytes
  • Lots of random prints that I've left lying around
  • No support for unique pin codes
  • Video stream doesn't start; an error currently stops it from reaching that point

I'll try to fix up as much of this as possible, but that's where it currently stands. Hopefully this will pave the way for things like H.265 in the future. Regardless, it should make the transport stream more secure, since the streamConnectionID can't be sniffed from the plaintext handshake.
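
A sketch of a TLV8 decoder that avoids the fixed-size assumption mentioned in the key notes: in TLV8 as used by HomeKit pairing, a value longer than 255 bytes is split into consecutive items of the same type, so a decoder can concatenate fragments and accept whatever total length arrives. This is an illustration in Python, not the code in this PR. The function names are hypothetical, and the type values 0x03 (public key) and 0x06 (state) in the usage lines are assumed from the HomeKit pairing convention and should be checked against pair_ap.

```python
from typing import Dict


def decode_tlv8(data: bytes) -> Dict[int, bytes]:
    """Decode a TLV8 buffer into {type: value}, merging fragmented values."""
    items: Dict[int, bytes] = {}
    pos = 0
    prev_type = None
    while pos < len(data):
        if pos + 2 > len(data):
            raise ValueError("truncated TLV header")
        item_type, length = data[pos], data[pos + 1]
        pos += 2
        if pos + length > len(data):
            raise ValueError("truncated TLV value")
        chunk = data[pos:pos + length]
        pos += length
        if item_type == prev_type:
            # A repeated type directly after itself continues the same value.
            items[item_type] += chunk
        else:
            items[item_type] = chunk
        prev_type = item_type
    return items


def encode_tlv8(item_type: int, value: bytes) -> bytes:
    """Encode one value, splitting it into fragments of at most 255 bytes."""
    if not value:
        return bytes([item_type, 0])
    out = bytearray()
    for start in range(0, len(value), 255):
        chunk = value[start:start + 255]
        out += bytes([item_type, len(chunk)]) + chunk
    return bytes(out)


if __name__ == "__main__":
    # Assumed type values: 0x06 = state, 0x03 = public key.
    public_key = bytes(380)  # deliberately shorter than 384 bytes
    message = encode_tlv8(0x06, b"\x01") + encode_tlv8(0x03, public_key)
    decoded = decode_tlv8(message)
    print(len(decoded[0x03]), decoded[0x06].hex())
```

Because the decoder never compares the value length against an expected size, a public key that arrives with fewer than 384 bytes is returned as-is and any length check can be made by the caller.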

@connorh315
Author

Seems triggering transient pairing leads to the eiv and ekey values not being sent to the server from the sender. How significant is this?

@connorh315
Author

Okay, pushing past the missing eiv and ekey problem, the stream sets up and the sender sends video packets, but the packets fail to decrypt. I think there is a new key derivation when using HomeKit.

@fduncanh
Collaborator

fduncanh commented Aug 30, 2024

just saw this: great! This is an important missing item.

what is your information source? (pair_ap?) pair_ap implemented both server and client homekit, but only client legacy pairing (which it calls "fruit")

legacy should also do encryption, but I missed that the last server pair-verify response needs to send some data to start the client using encryption. did you find out what this should be (must be buried in pair_ap, at least in homekit server)?
https://github.com/FDH2/UxPlay/wiki/crypto

ps should use OpenSSL v3 for encryption, not libsodium. If needed I can do the conversion.

@fduncanh
Collaborator

the chachapoly stuff should be in crypto.c not srp.c

@connorh315
Author

> what is your information source? (pair_ap?) pair_ap implemented both server and client homekit, but only client legacy pairing (which it calls "fruit")

Source is pair_ap + pyatv; a lot of the code I used is a direct pull from pair_ap.

> did you find out what this should be (must be buried in pair_ap, at least in homekit server)?

Pretty sure M5 is where the client sends the server a payload that the server can attempt to decrypt. The payload contains some unique information about the device if I remember correctly. In return (M6) the server should send a similar encrypted payload of information about the device. That's my rough understanding of the pair-verify stage immediately after the pair-setup.

> the chachapoly stuff should be in crypto.c not srp.c

Will eventually move it, was just being lazy.

@connorh315
Author

Okay from what I can understand the data stream is encrypted using chacha as well. Keys can be derived using DataStream-Salt + streamConnectionId as the salt where DataStream-Output-Encryption-Key and DataStream-Input-Encryption-Key are the info inputs. Still wrapping my head around how to decrypt the data using chacha but I think it's to be separated into 64-byte chunks???
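
A minimal sketch of the derivation described above, written with HKDF from RFC 5869. Several details are assumptions rather than facts from this thread: SHA-512 as the hash (as used elsewhere in HomeKit pairing), the pairing shared secret as the input key material, the streamConnectionID appended to the salt as a decimal ASCII string, and 32-byte output keys. The function names are hypothetical.

```python
import hashlib
import hmac


def hkdf(ikm: bytes, salt: bytes, info: bytes, length: int = 32,
         hash_name: str = "sha512") -> bytes:
    """HKDF extract-then-expand (RFC 5869)."""
    hash_len = hashlib.new(hash_name).digest_size
    if not salt:
        salt = bytes(hash_len)
    # Extract: compress the input key material into a pseudorandom key.
    prk = hmac.new(salt, ikm, hash_name).digest()
    # Expand: chain HMAC blocks until enough output has been produced.
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hash_name).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_datastream_keys(shared_secret: bytes, stream_connection_id: int):
    """Return (output_key, input_key) for one data stream."""
    # Assumption: the ID is appended to the salt as a decimal ASCII string.
    salt = b"DataStream-Salt" + str(stream_connection_id).encode("ascii")
    output_key = hkdf(shared_secret, salt, b"DataStream-Output-Encryption-Key")
    input_key = hkdf(shared_secret, salt, b"DataStream-Input-Encryption-Key")
    return output_key, input_key


if __name__ == "__main__":
    out_key, in_key = derive_datastream_keys(bytes(64), 1234567890)
    print(out_key.hex())
    print(in_key.hex())
```

Which of the two keys a receiver should use to decrypt is not settled by this sketch; it only shows how the salt and info strings combine.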

@fduncanh
Collaborator

fduncanh commented Aug 30, 2024

what are your sources of information? EDIT: ok I see you are using pair_ap

@connorh315
Author

> what are your sources of information? EDIT: ok I see you are using pair_ap

It was pair_ap for the HomeKit pairing. Now it's this CarPlay source code that I've found (will link tomorrow as I'm not at my PC). It seems to use AES encryption if it uses the old pairing, or chacha for any HomeKit-style pairing, from what I can tell.

@fduncanh
Collaborator

> It seems to use AES encryption if it uses the old pairing,

That makes a lot of sense. The "legacy" pairing sets up and tests everything for an AES CTR 128 encrypted conversation!

All is now in place for a switch to encrypted server-client communication using AES CTR 128, but it is not yet known what the server must do to trigger the client to start using it: perhaps the final Server RTSP response (above) should be sent encrypted? The "CSeq" numbers in the message headers appear to indicate the AES CTR counter number when encryption is being used.

@connorh315
Author

> That makes a lot of sense. The "legacy" pairing sets up and tests everything for an AES CTR 128 encrypted conversation!

Granted, the encryption that I'm referencing here is with regard to the data channel.

I'm not sure where you'll have seen it, but I have an Apple TV 2 and Apple TV 3 that both use Legacy Pairing and neither of them ends up encrypting the control channel, only the data channel. Have you seen an implementation that ends up encrypting the control channel by using a pair-verify like you suggest?

@connorh315
Author

Okay, will try and have a crack at this later today, but it seems that the nonce used for the data channel is a standard 64-bit integer padded with 4 zero bytes at the start (to bring the nonce length up to 96 bits), as seen in the control channel. The AAD appears to be the 128-byte header sent for each frame. The actual payload that gets decrypted is (I think) the entire body minus 16 bytes (the 16 I assume being the auth tag), meaning that the auth tag is the last 16 bytes.
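
The frame layout described above can be sketched as follows. This only splits a frame into the pieces a ChaCha20-Poly1305 decrypt call would need; it performs no decryption. The little-endian byte order of the counter is an assumption (the comment above does not state it), and the function names are hypothetical.

```python
import struct

HEADER_LEN = 128  # per-frame header, used as additional authenticated data
TAG_LEN = 16      # Poly1305 authentication tag at the end of the body


def build_nonce(counter: int) -> bytes:
    """96-bit nonce: 4 zero bytes followed by a 64-bit counter."""
    # Assumption: the counter is encoded little-endian.
    return b"\x00" * 4 + struct.pack("<Q", counter)


def split_frame(header: bytes, body: bytes, counter: int) -> dict:
    """Split one frame into nonce, AAD, ciphertext and tag."""
    if len(header) != HEADER_LEN:
        raise ValueError("expected a 128-byte frame header")
    if len(body) < TAG_LEN:
        raise ValueError("frame body is too short to contain an auth tag")
    return {
        "nonce": build_nonce(counter),
        "aad": header,
        "ciphertext": body[:-TAG_LEN],
        "tag": body[-TAG_LEN:],
    }


if __name__ == "__main__":
    parts = split_frame(bytes(HEADER_LEN), bytes(48), counter=0)
    print(len(parts["nonce"]), len(parts["ciphertext"]), len(parts["tag"]))
```

The four pieces map directly onto the inputs of an AEAD decrypt routine: key, nonce, additional data, ciphertext and expected tag.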

@fduncanh
Collaborator

in RAOP mirror/audio mode the RAOP "control channel" is a UDP connection used just for retransmission of (encrypted) audio packets that were lost. (The RAOP "data" channel is a UDP connection for the bulk of the encrypted audio, and there is a TCP "mirror data" channel for transmission of encrypted video data.)

The encryption we are trying to implement is of the main TCP RTSP "RAOP" channel, which is plain text in legacy mode. This sets up pairing, then handles POST /setup etc. (There is also a TCP HTTP channel for video streaming, which uses a reverse HTTP "event" channel (and more), but this is implemented in the experimental branch.)

Can you share a link to that CarPlay code? If I can inspect it I may have a better idea.

@fduncanh
Collaborator

Later Legacy mode goes to a lot of trouble to set up a shared secret that it tests for AES CTR encrypted communications.

I don't know if it ever got used, when encryption started. The CarPlay code may or may not be enlightening.

@fduncanh
Collaborator

The encryption of video and audio data in "legacy" is described here:
https://github.com/FDH2/UxPlay/wiki/AirPlay2

I'd be surprised if this got changed; it's got nothing to do with pairing.

@fduncanh
Collaborator

I have an AppleTv 3 (updated to latest firmware) that encrypts when paired with a current iPad.

@connorh315
Author

> I have an AppleTv 3 (updated to latest firmware) that encrypts when paired with a current iPad.

That's probably why. I bought this thing for £12 from a local tech resell store, so I doubt it's anywhere near the latest firmware.

@connorh315
Author

> Can you share a link to that CarPlay code? If I can inspect it I may have a better idea.

https://github.com/45clouds/WirelessCarPlay/tree/master/source

It's a very stripped down version of airplay, I expect it should still be useful though.

@connorh315
Author

Can't quite seem to wrap my head around the encryption used for the data channel. It's not ChaCha: I've set that all up and it fails to decrypt, and I'd also expect the messages to be an extra 16 bytes long, and they aren't. I think ChaCha is only used when there's proper HomeKit pairing (involving a pair-verify back-and-forth). That would mean it's encrypted using AES instead, but UxPlay rejects the packets as invalid. Perhaps it's an H.265 stream?

@connorh315
Author

Okay, it looks like I was right: chacha on the actual mirroring stream is only active when there is a HomeKit pairing that involved pair-verify. The decryption of the video stream looks fine to me in terms of the data it produces, but for some reason it's not good enough for UxPlay. I don't have any knowledge of how video streams work, so I think that might be a job for someone else. Also, just for reference: after decrypting, data[0] == 1, so UxPlay thinks the packet has been incorrectly decoded; that if statement needs to be removed or refined if you want to have a crack at this yourself.

@connorh315
Author

Nevermind. I'm just stupid. Please ignore everything I've said up until this point. I've just been writing the wrong code the whole time.

Data is now encrypted over the data channel using chacha encryption, using the original format that I suggested earlier today. I was using an incorrect pointer to the additional data.

I have tested decrypting a packet in python, I now just need to implement it all into UxPlay.

@thiccaxe

Take a look at my companiongames repo as well; it has some of the encryption stuff that was extracted from pyatv and other sources.

Also I believe I have some debugging of the apple tv3 in the YouTube video issue. I will check.

@connorh315
Author

> Take a look at my companiongames repo as well; it has some of the encryption stuff that was extracted from pyatv and other sources.
>
> Also I believe I have some debugging of the apple tv3 in the YouTube video issue. I will check.

Hadn't seen this repo, probably since I was mostly searching for AirPlay/Homekit in GitHub rather than other HK driven protocols.

Thank you!

@fduncanh
Collaborator

I finally have tcpdump intercepting ipad <=> appleTV 3. (next atv 4K) So a lot of questions can be answered.

No, there is no encryption of the RAOP (RTSP) conversation.

@connorh315 once there is a working homekit pairing prototype, I'll see how to best integrate it into uxplay.

@connorh315
Author

connorh315 commented Aug 31, 2024

> I finally have tcpdump intercepting ipad <=> appleTV 3. (next atv 4K) So a lot of questions can be answered.
>
> No, there is no encryption of the RAOP (RTSP) conversation.
>
> @connorh315 once there is a working homekit pairing prototype, I'll see how to best integrate it into uxplay.

When you say AppleTV 3, do you mean the 3rd Gen Apple TV (from like a decade ago) or the latest gen 4K Apple TV?

@fduncanh
Collaborator

> When you say AppleTV 3, do you mean the 3rd Gen Apple TV (from like a decade ago) or the latest gen 4K Apple TV?

ancient third gen, I have two. (plus a 4K first gen, I think)

@connorh315
Author

> > When you say AppleTV 3, do you mean the 3rd Gen Apple TV (from like a decade ago) or the latest gen 4K Apple TV?
>
> ancient third gen, I have two. (plus a 4K first gen, I think)

Ahh right, I've had a go at building a MITM proxy myself for the Apple TVs. I had success with the 2nd gen and the 3rd gen, they proxied beautifully. The HD 32GB model and the 1st gen 4K apple tv were a pain. When I tried to force legacy pairing by switching the features set in the proxy, the server wouldn't have a problem with it, but the client would refuse to start the video stream. I couldn't leave it on at the time as I didn't know how to decrypt the individual connection streams. Then when I found pyatv had a proxy that supported homekit pairing I used that and this time round the client would happily start the stream and send the video packets but the server would instead refuse to accept them (Now I understand that's likely because it couldn't decrypt the video stream as it would've been using the wrong derived session key).

@fduncanh
Collaborator

great work! My MITM is a router/wifi hotspot with dd-wrt that I can run tcpdump on; during the airplay session I save the output to a usb stick on the router and analyse it with wireshark on a regular desktop.

@fduncanh
Collaborator

fduncanh commented Aug 31, 2024

AppleTV 3 has features 0x5A7FFFF7, 0x1E

compared to current UxPlay this adds:

  • bit 0: AirPlayVideoSupport
  • bit 4: HLS support
  • bit 8: screen rotation supported
  • bit 33: Supports Air Play Video Play Queue
  • bit 34: Supports Air Play from cloud (requires that bit 6 is ON) (which it is)
  • bit 35: Supports TLS_PSK
  • bit 36: (unknown)

However, I don't see it doing HLS when YouTube videos run. I see a teardown of the RAOP channel after pair-verify and fairplay, but I don't see the AirPlay channel at all.
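
A small sketch for checking the bit list above, assuming the two reported values are the low and high 32-bit words of a single 64-bit feature mask (0x5A7FFFF7 low, 0x1E high). The helper names are hypothetical, and the bit descriptions are copied from the list rather than from any Apple documentation.

```python
FEATURES_LOW = 0x5A7FFFF7   # assumed to hold bits 0..31
FEATURES_HIGH = 0x1E        # assumed to hold bits 32..63

# Descriptions as given in the comment above.
NOTED_BITS = {
    0: "AirPlayVideoSupport",
    4: "HLS support",
    8: "screen rotation supported",
    33: "Supports Air Play Video Play Queue",
    34: "Supports Air Play from cloud",
    35: "Supports TLS_PSK",
    36: "(unknown)",
}


def combine_features(low: int, high: int) -> int:
    """Join the two 32-bit words into one 64-bit feature mask."""
    return (high << 32) | low


def set_bits(mask: int) -> list:
    """Return the indices of all bits that are set in the mask."""
    return [bit for bit in range(64) if (mask >> bit) & 1]


if __name__ == "__main__":
    mask = combine_features(FEATURES_LOW, FEATURES_HIGH)
    for bit in set_bits(mask):
        print(bit, NOTED_BITS.get(bit, ""))
```

Under that assumption, every bit named in the list is set in the combined mask, as is bit 6, which the entry for bit 34 depends on.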

@fduncanh
Collaborator

fduncanh commented Sep 1, 2024

@connorh315
can you explain the plist entry for "edid"?

I see it also in the Apple TV gen 3 wireshark data : it must have got added at some point in an update:

<key>edid</key>
<data>
AP///////wBKizK8AAAAAAkaAQOAIRJ4Pu6Ro1RMmSYPUFQBCADR
wAEBAQEBAQEBAQEBAQEBAjqAGHE4LUBYLCUATLoQAAAeAAAA/ABG
SEQgSERSIAogICAgAAAA/wAAAAAAAAAAAAAAAAAAAAAA/QAuS1RU
EwEKICAgICAgAdECAyXyQhAE4gDV4wXAACMJfweDAQAAZwMMABAA
ODzmBgUBYmIAGjaAoHA4H0AwICUATLoQAAAeAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAKQ==
</data>

@connorh315
Author

@fduncanh I added it when I was trying to fix the colour issue on the RPi. My understanding is that the EDID is what gets sent when a display and a device perform their initial handshake over HDMI. The display informs the graphics card of its "features" (i.e. HDR, I assume the different resolutions it supports, 3D etc.). It had no impact, but I thought I'd throw it in there anyway, and then (I think) I just commented it out as it didn't really seem to do anything.
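
To see what such a blob contains, the fixed part of the EDID base block can be decoded with a few lines of Python. The sketch below uses only the first base64 line of the `edid` entry quoted earlier in the thread (39 bytes, enough for the header fields); the field offsets follow the standard EDID 1.x base block layout, and the function name is hypothetical.

```python
import base64

# First line of the <data> blob from the plist excerpt in this thread.
EDID_FIRST_LINE = "AP///////wBKizK8AAAAAAkaAQOAIRJ4Pu6Ro1RMmSYPUFQBCADR"
EDID_MAGIC = bytes.fromhex("00ffffffffffff00")


def parse_edid_header(edid: bytes) -> dict:
    """Decode the identification fields at the start of an EDID block."""
    if len(edid) < 20 or edid[:8] != EDID_MAGIC:
        raise ValueError("not an EDID block")
    # Bytes 8-9: three 5-bit letters, big-endian, 1 = 'A'.
    raw_id = int.from_bytes(edid[8:10], "big")
    manufacturer = "".join(
        chr(((raw_id >> shift) & 0x1F) + ord("A") - 1) for shift in (10, 5, 0)
    )
    return {
        "manufacturer": manufacturer,
        "product_code": int.from_bytes(edid[10:12], "little"),
        "year": 1990 + edid[17],           # byte 17: years since 1990
        "version": (edid[18], edid[19]),   # EDID version and revision
    }


if __name__ == "__main__":
    header = parse_edid_header(base64.b64decode(EDID_FIRST_LINE))
    print(header)
```

Resolution and refresh-rate information lives further into the block (timing descriptors and extension blocks), which this sketch does not parse.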

@connorh315
Author

Okay, the commit that I've just pushed to GitHub supports video stream decoding now, so that's back to normal. Next up is audio stream decoding, but that should be the same idea of just using chacha rather than AES.

@fduncanh
Collaborator

fduncanh commented Sep 1, 2024

It seems that edid is specific to a video monitor, so we should NOT include it in the GET /info response.

presumably the edid I saw in wiresharking appleTV gen3 was obtained from the monitor directly attached to it.

but I don't think gstreamer interrogates the monitor.

@connorh315
Author

> It seems that edid is specific to a video monitor, so we should NOT include it in the GET /info response.
>
> presumably the edid I saw in wiresharking appleTV gen3 was obtained from the monitor directly attached to it.
>
> but I don't think gstreamer interrogates the monitor.

No you're right about GStreamer: I think the OS pulls and manages the information automatically, as I know on the Raspberry Pi you can get that information in the boot logs somewhere. I doubt it has any major impact.

@thiccaxe

thiccaxe commented Sep 1, 2024

> I see it also in the Apple TV gen 3 wireshark data : it must have got added at some point in an update:

It has been there since mid 2022. There is a database online of EDIDs, I think you can extract it on linux using ls /sys/class/drm/*/edid. I think the main thing it is used for is limiting fps. If the monitor only supports 30fps, apple tv will only output that much (no evidence to support this, but why else would they send it, analytics?).
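
A minimal sketch of reading those sysfs files from Python, assuming the `/sys/class/drm/*/edid` layout mentioned above. The function name is hypothetical; connectors with nothing attached typically expose an empty file, so empty reads are skipped.

```python
import glob


def read_edids(pattern: str = "/sys/class/drm/*/edid") -> dict:
    """Return {path: raw EDID bytes} for every connector with a non-empty EDID."""
    edids = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path, "rb") as handle:
                data = handle.read()
        except OSError:
            # Unreadable entries are ignored rather than treated as fatal.
            continue
        if data:
            edids[path] = data
    return edids


if __name__ == "__main__":
    for path, blob in read_edids().items():
        print(path, len(blob), "bytes")
```

On a machine without a DRM subsystem the glob matches nothing and the function returns an empty dictionary.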

@connorh315
Author

connorh315 commented Sep 1, 2024

Okay, from what I can tell, because the sender sees this as a more enhanced receiver now because of HKP, it's sending more enhanced audio to go with it. It appears to be sending "ct": 8 in the streams list. This is "AAC-ELD" compression according to this. I think for now I'll work on cleaning up a lot of the code and removing the bugs. I might leave this to someone else to solve instead but we'll see how I get on.

@connorh315
Author

> > I see it also in the Apple TV gen 3 wireshark data : it must have got added at some point in an update:
>
> It has been there since mid 2022. There is a database online of EDIDs, I think you can extract it on linux using ls /sys/class/drm/*/edid. I think the main thing it is used for is limiting fps. If the monitor only supports 30fps, apple tv will only output that much (no evidence to support this, but why else would they send it, analytics?).

I think so too, maybe to send HDR range? I'm not too sure what an EDID contains.

@fduncanh
Collaborator

fduncanh commented Sep 1, 2024

If there was any use in sending an EDID to the iOS client, there could be an option -edid that reads an edid from a file and adds it to the GET /info plist. It would seem that there are tools that can read an EDID from a monitor and write it to a file.
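
As an illustration of the idea only (UxPlay itself is C, and no such option exists in it), the following Python sketch shows how raw EDID bytes end up as a base64 `<data>` element when added to a plist under the `edid` key. The function name and the placeholder `name` entry are hypothetical, and XML output is used here just to mirror the excerpt quoted earlier; the actual wire format of the GET /info response is not assumed.

```python
import plistlib


def add_edid_to_info(info: dict, edid: bytes) -> bytes:
    """Return an XML plist of `info` with the EDID stored under the key "edid"."""
    merged = dict(info)      # leave the caller's dictionary untouched
    merged["edid"] = edid    # bytes values are serialized as <data> (base64)
    return plistlib.dumps(merged, fmt=plistlib.FMT_XML)


if __name__ == "__main__":
    # Placeholder EDID: only the 8-byte header followed by zero padding.
    fake_edid = bytes.fromhex("00ffffffffffff00") + bytes(120)
    xml = add_edid_to_info({"name": "example-receiver"}, fake_edid)
    print(xml.decode("utf-8"))
```

Reading the bytes from a file would be a plain `open(path, "rb").read()` before calling the function.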

@connorh315
Author

Something I need to look into (probably this weekend) is why the sender takes ~5 seconds to attempt a connection to the receiver. I remember reading in an issue thread about this and a discovery that was made to get around this. Unfortunately I do wonder if transient pairing just suffers from this delay and it has to be a pairing method that leads to a pair-verify (which doesn't happen with transient pairing)

@fduncanh
Collaborator

fduncanh commented Sep 11, 2024

The 5sec delay is when pair setup/pair verify is used. If you run current uxplay without the -pin option, it switches off the "supports legacy pairing" flag and no pair-setup/pair-verify takes place (so no 5sec delay before connecting). UxPlay moves directly to fpsetup after GET /info. This first appeared in UxPlay-1.65:

> 1.65 2023-06-03 Eliminate pair_setup part of connection protocol to allow faster connections with clients (thanks to @shuax #176 for this discovery); to revert, uncomment a line in lib/dnssdint.h. ......

Maybe Homekit transient pairing will behave like this, if it doesn't do pair setup/pair verify???

@connorh315
Author

> The 5sec delay is when pair setup/pair verify is used. If you run current uxplay without the -pin option, it switches off the "supports legacy pairing" flag and no pair-setup/pair-verify takes place (so no 5sec delay before connecting). UxPlay moves directly to fpsetup after GET /info. This first appeared in UxPlay-1.65:
>
> > 1.65 2023-06-03 Eliminate pair_setup part of connection protocol to allow faster connections with clients (thanks to @shuax #176 for this discovery); to revert, uncomment a line in lib/dnssdint.h. ......
>
> Maybe Homekit transient pairing will behave like this, if it doesn't do pair setup/pair verify???

I think that unfortunately there is a 5 second delay in place for transient pairing. It seems intentional to avoid attempts at overwhelming the device as I noticed there is a lockout in place after 10 pairing attempts in the space of 15 minutes (can't remember the exact figure but it's something like that). I think that a less-transient form of pairing is required to remove this 5 second delay. As such probably HKP type 6? Applying the relevant features bit for that appears to remove this delay and make the connection instant.

@fduncanh
Collaborator

fduncanh commented Sep 23, 2024

When non-transient pairing is used, the SRP5 pin protocol is used for the first pairing. From looking at pair_ap, it is similar to what is done in non-transient legacy pairing (with pin), but with a step up from a 2048-bit to a 3072-bit SRP "large prime". After the initial verified pairing, the client records the server's public key and will thereafter pair without the 5 sec delay.

SRP5 won't be used in transient pairing, I believe (?).

@shuax

shuax commented Sep 24, 2024

I also think that X-Apple-HKP is not helpful in eliminating the 5-second wait, because during those 5 seconds, the server does not receive any network messages at all. We still don't know what the Apple client is doing during those 5 seconds. Perhaps it’s really just to prevent attacks from excessive or rapid connections.

@connorh315
Author

> I also think that X-Apple-HKP is not helpful in eliminating the 5-second wait, because during those 5 seconds, the server does not receive any network messages at all. We still don't know what the Apple client is doing during those 5 seconds. Perhaps it’s really just to prevent attacks from excessive or rapid connections.

It does eliminate the 5 second wait. I do think that the whole 5 seconds is literally there just to mitigate the ability to spam the receiver.

@fduncanh fduncanh force-pushed the master branch 2 times, most recently from 0f7948d to 1a77a9f Compare October 4, 2024 05:02