Last active
April 12, 2024 03:31
Revisions
-
leptos-null revised this gist
Jan 9, 2019 . 1 changed file with 2 additions and 2 deletions.

@@ -26,7 +26,7 @@ The API key was passed as the sole query parameter in the URL requests. After ru

### Step 2 (Device-Auth)

The `X-Goog-Device-Auth` HTTP header field consisted of a comma-delimited dictionary with three constant keys: "device_id", "data", and "content". The values appeared to be encoded in some way. Using [Hopper](https://www.hopperapp.com/), I was able to find the cross references to the "X-Goog-Device-Auth" string. I found the `YTApiaryDeviceCrypto` class, and reverse engineered it. My implementation is [LMApiaryDeviceCrypto](https://gist.github.com/leptos-null/8792b9c50fddc00cf525ed5055a872dc#file-lmapiarydevicecrypto-h). The HTTP body and URL are encrypted using a "device key", which is mapped to the "device_id" on Google's servers. This is explained in more detail in the above article, and the code is attached as well (the header contains information on how to obtain device keys and IDs).

On November 27th, I [tweeted](https://twitter.com/leptos_null/status/1067648428648226816?s=21) that this field is not correctly validated on the server. To clarify, I can't say that it's not "correct", as there's no public specification to compare against. That said, I don't see a reason not to validate in this manner. The issue the tweets outline is that nil, or otherwise malformed, data passes validation tests when it should not.

@@ -48,6 +48,6 @@ I wrote a MobileSubstrate tweak to log `GPBMessage` encodes and decodes, as well

### Conclusion

Thanks! I hope this blog post was fun or helpful. I have a mostly working YouTube Music client, and hope to make that project public on GitHub too.

If you have any questions, please reach out to me on [Twitter @leptos_null](https://twitter.com/leptos_null), or `email = "leptos.%[email protected]", NULL`

This is my first time doing a write-up on a project like this. If you have any recommendations, please let me know!

-
leptos-null revised this gist
Jan 9, 2019 . 1 changed file with 6 additions and 6 deletions.

@@ -26,23 +26,23 @@ The API key was passed as the sole query parameter in the URL requests. After ru

### Step 2 (Device-Auth)

The `X-Goog-Device-Auth` HTTP header field consisted of a comma-delimited dictionary with three constant keys: “device_id”, “data”, and “content”. The values appeared to be encoded in some way. Using [Hopper](https://www.hopperapp.com/), I was able to find the cross references to the “X-Goog-Device-Auth” string. I found the `YTApiaryDeviceCrypto` class, and reverse engineered it. My implementation is [LMApiaryDeviceCrypto](https://gist.github.com/leptos-null/8792b9c50fddc00cf525ed5055a872dc#file-lmapiarydevicecrypto-h). The HTTP body and URL are encrypted using a “device key”, which is mapped to the “device_id” on Google’s servers. This is explained in more detail in the above article, and the code is attached as well (the header contains information on how to obtain device keys and IDs).

On November 27th, I [tweeted](https://twitter.com/leptos_null/status/1067648428648226816?s=21) that this field is not correctly validated on the server. To clarify, I can’t say that it’s not “correct”, as there’s no public specification to compare against. That said, I don’t see a reason not to validate in this manner. The issue the tweets outline is that nil, or otherwise malformed, data passes validation tests when it should not.

### Step 3 (Authorization)

I originally thought that the Authorization field was another hard-coded string. I didn’t see any reference in UserDefaults, and no network requests were going out when the app launched. I had doubts when I couldn’t find the string in the binary.

Google’s [public OAuth API documentation](https://developers.google.com/identity/protocols/OAuth2WebServer) was helpful in figuring this out. The process Google apps use is mostly public; however, portions are private. The Google Account login page, for example, is different than the public method. The private version allows access to private auth scopes, which are required to consume the InnerTube API. The refresh token provided by the login page is saved in Keychain. Using the refresh token, an access token for the OAuth scope can be requested. These are valid for 24 hours. Using an OAuth scoped access token, an access token for the InnerTube APIs can be requested. These are valid for 60 minutes at a time. Once either of these tokens expires, it has to be refreshed using the token it depends on. Refresh tokens are discussed in Google’s public API documentation, and do not expire. Standard (acquired via public API) refresh tokens can be revoked by the user. Refresh tokens retrieved via the private method are not visible to the user, and can only be revoked through the same private REST API.

### Step 4 (Body)

I think the most difficult part of this project was recreating the HTTP body contents. The HTTP `Content-Type` header was marked `application/x-protobuf`. I had never used [Protobuf](https://github.com/protocolbuffers/protobuf) before, so this was very intimidating. After some poking around, I found out `protoc` had a `--decode_raw` argument. This helped to find out the meaning of binary messages. I started working on a tool, [ProtoDump](https://github.com/leptos-null/ProtoDump), to get the original proto files of `GPBMessage` subclasses; however, at the time of writing this, it’s not fully working. I instead wrote a small tool that copies the descriptor data of each message. With an Objective-C header dumping tool, I was able to reconstruct all 6577 message classes YouTube Music had.

Originally I tried using only the classes needed for the requests I wanted; however, this didn’t work out because of the runtime class check. The entire class tree had to be available, otherwise the Protobuf [runtime library would raise](https://github.com/protocolbuffers/protobuf/blob/4c559316e0a623872172a6665367a7c6339ac223/objectivec/GPBDescriptor.m#L532) an exception. I wrote a MobileSubstrate tweak to log `GPBMessage` encodes and decodes, as well as requests signed by `YTApiaryDeviceCrypto`. Using these three log points, I was able to put together which messages were being sent to which endpoints, and then how to decode the response. Fortunately, Google made this fairly straightforward. The `browse` endpoint took a `BrowseRequest` and returned a `BrowseResponse`, etc. In this example the REST call looks like `POST https://youtubei.googleapis.com/youtubei/v1/browse?key=AIzaSyDK3iBpDP9nHVTk2qL73FLJICfOC3c51Og`.

-
leptos-null created this gist
Jan 9, 2019 .

@@ -0,0 +1,53 @@

## Writing an iOS YouTube Music client

I’ve been using YouTube Music as my main music streaming service for almost a year and a half. The iOS client is great; I’ve never had a single complaint. It’s potentially one of the most bug-free apps I’ve ever used, it has an extremely friendly and simple graphical interface, and the service itself is great. I was curious how the client worked in terms of networking, and while curiosity may treat cats poorly, it ~~lands researchers in black sites~~ can provide a lot of insight.

### Step 0

The first thing I do when reverse engineering a client is monitor HTTP requests while the application starts up, and while it performs the tasks I’m interested in. On a jailbroken iOS device, I use [FLEX](https://github.com/Flipboard/FLEX) by Flipboard. In retrospect, I should have looked up whether YouTube had a public API for getting videos and playlists, but I didn’t until later. It turns out they didn’t, so this research was still helpful.

The first thing I noticed while watching the HTTP requests was that there were any at all. There was a very real possibility that YouTube used TCP directly. In [Steps to Step 1](https://gist.github.com/leptos-null/8792b9c50fddc00cf525ed5055a872dc#file-readme-md), an article I posted a few days after I began research, I enumerated some of the authentication mechanisms I observed at this stage. The list below comprises the general components of a request.

- API key
- HTTP `Authorization` header field
- HTTP `X-Goog-Device-Auth` header field
- HTTP `X-Goog-Visitor-Id` header field
- HTTP body (`Content-Type: application/x-protobuf`)

At this point, I don’t believe the `X-Goog-Visitor-Id` HTTP header field is required, so it’s not covered in the steps below.

### Step 1 (API key)

The API key was passed as the sole query parameter in the URL requests. After running the app twice and seeing the same API key, I ran `strings` over the binary and saw the key, so I decided it was safe to hardcode this value.

### Step 2 (Device-Auth)

The `X-Goog-Device-Auth` HTTP header field consisted of a comma-delimited dictionary with three constant keys: “device_id”, “data”, and “content”. The values appeared to be encoded in some way. Using [Hopper](https://www.hopperapp.com/), I was able to find the cross references to the “X-Goog-Device-Auth” string. I found the `YTApiaryDeviceCrypto` class, and reverse engineered it. My implementation is [LMApiaryDeviceCrypto](https://gist.github.com/leptos-null/8792b9c50fddc00cf525ed5055a872dc#file-lmapiarydevicecrypto-h). The HTTP body and URL were being encrypted using a “device key”, which was mapped to the “device_id” on Google’s servers. This is explained in more detail in the above article, and the code is attached as well (the header contains information on device keys and IDs).

On November 27th, I [tweeted](https://twitter.com/leptos_null/status/1067648428648226816?s=21) that this field is not correctly validated on the server. To clarify, I can’t say that it’s not “correct”, as there’s no public specification to compare against. That said, I don’t see a reason not to validate in this manner. The issue the tweets outline is that nil, or otherwise malformed, data passes validation tests when it should not.

### Step 3 (Authorization)

I originally thought that the Authorization field was another hard-coded string.
I didn’t see any reference in UserDefaults, and no network requests were going out when the app launched. I had doubts when I couldn’t find the string in the binary.

Google’s [public OAuth API documentation](https://developers.google.com/identity/protocols/OAuth2WebServer) was helpful in figuring this out. The process Google apps use is mostly public; however, portions are private. The Google login page, for example, is different than the public method. I still don’t understand how it’s different, but whatever the difference is, it allows access to private auth scopes. These private auth scopes are required for the InnerTube API. The refresh token is saved in Keychain. Using the refresh token, an access token for the OAuth scope can be requested. These are valid for 24 hours. Using an OAuth scoped access token, an access token for the InnerTube APIs can be requested. These are valid for 60 minutes at a time. Once either of these tokens expires, it has to be refreshed using the token it depends on. Refresh tokens are discussed in Google’s public API documentation, and do not expire. Standard (acquired via public API) refresh tokens can be revoked by the user. Refresh tokens retrieved via the private method are not visible to the user, and can only be revoked through the same private REST API.

### Step 4 (Body)

I think the most difficult part of this project was getting the HTTP body contents. The HTTP `Content-Type` header was marked `application/x-protobuf`. I had never used [Protobuf](https://github.com/protocolbuffers/protobuf) before, so this was very intimidating. After some poking around, I found out `protoc` had a `--decode_raw` argument. This helped to find out what a message said. I started working on a tool, [ProtoDump](https://github.com/leptos-null/ProtoDump), to get the original proto files of `GPBMessage` subclasses; however, at the time of writing this, it’s not fully working.
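
To give a rough idea of what the raw decode mode of `protoc` mentioned above does, here is a minimal Python sketch that splits a serialized message into field numbers, wire types, and values without any `.proto` file. The function names `read_varint` and `decode_raw` are made up for illustration, the sample bytes are the well-known example from the public Protobuf encoding documentation (not captured YouTube Music traffic), and group wire types are not handled.

```python
from typing import List, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Read a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if not byte & 0x80:  # high bit clear marks the last byte
            return result, pos
        shift += 7


def decode_raw(buf: bytes) -> List[Tuple[int, int, object]]:
    """Split a serialized message into (field_number, wire_type, value)."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x7
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit, kept as raw bytes
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited: string, bytes, or submessage
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit, kept as raw bytes
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        if pos > len(buf):
            raise ValueError("truncated field")
        fields.append((field_number, wire_type, value))
    return fields


# Field 1 = varint 150, field 2 = the bytes "testing".
sample = bytes.fromhex("089601120774657374696e67")
print(decode_raw(sample))  # [(1, 0, 150), (2, 2, b'testing')]
```

Without the descriptors, a length-delimited value could be a string, raw bytes, or a nested message, which is why recovering the message classes still mattered for making sense of the request bodies.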
I instead wrote a small tool that copies the descriptor data, and with an Objective-C header dumping tool, I was able to reconstruct all 6577 message classes YouTube Music had. Originally I tried using only the classes needed for the requests I wanted; however, this didn’t work out because of the runtime class validation. The entire class tree had to be available. I wrote a MobileSubstrate tweak to log `GPBMessage` encodes and decodes, as well as requests signed by `YTApiaryDeviceCrypto`. Using these three log points, I was able to put together which messages were being sent to which endpoints, and then how to decode the response. Fortunately, Google made this fairly straightforward. The `browse` endpoint took a `BrowseRequest` and returned a `BrowseResponse`, etc. In this example the REST call looks like `POST https://youtubei.googleapis.com/youtubei/v1/browse?key=AIzaSyDK3iBpDP9nHVTk2qL73FLJICfOC3c51Og`.

### Conclusion

Thanks! I hope this blog post was fun or helpful. I have a mostly working YouTube Music client, and hope to make that project public on GitHub too.

If you have any questions, please reach out to me on [Twitter @leptos_null](https://twitter.com/leptos_null), or `email = “leptos.%[email protected]”, NULL`

This is my first time doing a write-up on a project like this. If you have any recommendations, please let me know!