[Question] Examples with PHP / Symfony #93

Open
ToshY opened this issue Dec 30, 2024 · 4 comments

Comments

@ToshY

ToshY commented Dec 30, 2024

I've been trying to set up curl-impersonate with Symfony's CurlHttpClient in Docker, but I can't get it to work. The binaries work fine on their own, so the problem seems related to the PHP curl extension or how it is used.

I tried installing it according to the original docs, but that didn't help. I also tried a suggestion from a comment in the original repo, which didn't work either.

My question: has any curl-impersonate user got this working in PHP/Symfony, so that the curl client uses curl-impersonate internally? Any bit of information would be appreciated.


Dockerfile (part for installing curl-impersonate)

RUN <<EOT bash
  set -ex
  curl -L --retry 3 --retry-connrefused --retry-delay 2 --fail-with-body -o /tmp/curl-impersonate.tar.gz https://github.com/lexiforest/curl-impersonate/releases/download/v0.8.2/curl-impersonate-v0.8.2.x86_64-linux-gnu.tar.gz
  tar -xzf /tmp/curl-impersonate.tar.gz -C /usr/local/bin --no-same-owner
  curl -L --retry 3 --retry-connrefused --retry-delay 2 --fail-with-body -o /tmp/libcurl-impersonate.tar.gz https://github.com/lexiforest/curl-impersonate/releases/download/v0.8.2/libcurl-impersonate-v0.8.2.x86_64-linux-gnu.tar.gz
  tar -xzf /tmp/libcurl-impersonate.tar.gz -C /tmp --no-same-owner
  mv /tmp/libcurl-impersonate-* /usr/lib/x86_64-linux-gnu/
  cp /usr/lib/x86_64-linux-gnu/libcurl-impersonate-chrome.so.4.8.0 /usr/lib/x86_64-linux-gnu/libcurl.so.4.8.0
  rm -f /tmp/*
  ldconfig
EOT

Test binary

curl_chrome131 -s -X POST https://kitsu.app/api/graphql -H "Content-Type: application/json" -d '{"query": "query { findAnimeById(id: 1) { id } }"}'

Response

{
   "data":{
      "findAnimeById":{
         "id":"1"
      }
   }
}

Test application
Symfony POST request.

$result = $this->httpClient->request(
    'POST',
    'https://kitsu.app/api/graphql',
    [
        'json' => ['query' => 'query { findAnimeById(id: 1) { id } }'],
    ]
);

Response
Returns 403 from Cloudflare (likely because of the client fingerprint?)

Standalone cURL

php snippet.php

<?php
$url = 'https://kitsu.app/api/graphql';
$query = '{"query": "query { findAnimeById(id: 1) { id } }"}';

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, [
    'Content-Type: application/json',
]);
curl_setopt($ch, CURLOPT_POSTFIELDS, $query);

$response = curl_exec($ch);
if ($response === false) {
    echo 'cURL Error: ' . curl_error($ch);
} else {
    echo "Response: " . $response;
}

curl_close($ch);

Response

Returns 403 again.

docker compose exec phpfpm php snippet.php
php: /lib/x86_64-linux-gnu/libcurl.so.4: no version information available (required by php)
Response: <!DOCTYPE html><html lang="en-US"><head><title>Just a moment...</title><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta http-equiv="X-UA-Compatible" content="IE=Edge"><meta name="robots" content="noindex,nofollow"><meta name="viewport" content="width=device-width,initial-scale=1"><style>*

php -r 'print_r(curl_version());' (in container)

docker compose exec phpfpm php -r 'print_r(curl_version());'
php: /lib/x86_64-linux-gnu/libcurl.so.4: no version information available (required by php)
Array
(
    [version_number] => 526080
    [age] => 10
    [features] => 1437139613
    [feature_list] => Array
        (
            [AsynchDNS] => 1
            [CharConv] =>
            [Debug] =>
            [GSS-Negotiate] =>
            [IDN] =>
            [IPv6] => 1
            [krb4] =>
            [Largefile] => 1
            [libz] => 1
            [NTLM] => 1
            [NTLMWB] =>
            [SPNEGO] =>
            [SSL] => 1
            [SSPI] =>
            [TLS-SRP] =>
            [HTTP2] => 1
            [GSSAPI] =>
            [KERBEROS5] =>
            [UNIX_SOCKETS] => 1
            [PSL] =>
            [HTTPS_PROXY] => 1
            [MULTI_SSL] =>
            [BROTLI] => 1
            [ALTSVC] => 1
            [HTTP3] =>
            [UNICODE] =>
            [ZSTD] => 1
            [HSTS] => 1
            [GSASL] =>
        )

    [ssl_version_number] => 0
    [version] => 8.7.0-DEV
    [host] => x86_64-pc-linux-gnu
    [ssl_version] => BoringSSL
    [libz_version] => 1.3
    [protocols] => Array
        (
            [0] => dict
            [1] => file
            [2] => ftp
            [3] => ftps
            [4] => gopher
            [5] => gophers
            [6] => http
            [7] => https
            [8] => imap
            [9] => imaps
            [10] => mqtt
            [11] => pop3
            [12] => pop3s
            [13] => rtsp
            [14] => smb
            [15] => smbs
            [16] => smtp
            [17] => smtps
            [18] => telnet
            [19] => tftp
            [20] => ws
            [21] => wss
        )

    [ares] =>
    [ares_num] => 0
    [libidn] =>
    [iconv_ver_num] => 0
    [libssh_version] =>
    [brotli_ver_num] => 16781312
    [brotli_version] => 1.1.0
)
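
A quick way to confirm which libcurl PHP actually loads is to look at the TLS backend reported by curl_version(). The dump above shows BoringSSL, which is what the Chrome flavour of curl-impersonate is built against, whereas distribution builds typically report OpenSSL or GnuTLS. The sketch below is only an illustration: the function names looks_like_impersonate_chrome and php_curl_ssl_backend are made up for this example, and the check is a heuristic rather than an official detection method.

```shell
# Hypothetical helper: decide from a TLS backend string whether the loaded
# libcurl looks like the curl-impersonate Chrome build (which uses BoringSSL).
looks_like_impersonate_chrome() {
  case "$1" in
    *BoringSSL*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: ask PHP which TLS backend its curl extension reports.
# stderr is discarded to hide the "no version information available" warning.
php_curl_ssl_backend() {
  php -r 'echo curl_version()["ssl_version"];' 2>/dev/null
}

# Only run the check when a php binary is available.
if command -v php >/dev/null 2>&1; then
  backend="$(php_curl_ssl_backend)"
  if looks_like_impersonate_chrome "$backend"; then
    echo "libcurl TLS backend: $backend (impersonate build appears to be loaded)"
  else
    echo "libcurl TLS backend: $backend (stock libcurl appears to be loaded)"
  fi
fi
```

Inside the container this could be run with docker compose exec phpfpm, in the same way as the commands above. Note that a matching backend only shows which library is loaded; as the rest of the thread shows, impersonation still has to be switched on separately.
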
@lexiforest
Owner

Did you set the CURL_IMPERSONATE environment variable?

@ToshY
Author

ToshY commented Jan 8, 2025

Did you set the CURL_IMPERSONATE environment variable?

Well ... no 🤦 Thanks for that.

After adding it to the Dockerfile (with ENV CURL_IMPERSONATE='chrome131'), the basic snippet example above works:

<?php
$url = 'https://kitsu.app/api/graphql';
$query = '{"query": "query { findAnimeById(id: 1) { id } }"}';

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, [
    'Content-Type: application/json',
]);
curl_setopt($ch, CURLOPT_POSTFIELDS, $query);

$response = curl_exec($ch);
if ($response === false) {
    echo 'cURL Error: ' . curl_error($ch);
} else {
    echo "Response: " . $response;
}

curl_close($ch);

Result

(not sure about the version warning here though)

php: /lib/x86_64-linux-gnu/libcurl.so.4: no version information available (required by php)
Response: {"data":{"findAnimeById":{"id":"1"}}}

But it does not work for existing Symfony HTTP client calls:

$result = $this->httpClient->request(
    'POST',
    'https://kitsu.app/api/graphql',
    [
        'json' => ['query' => 'query { findAnimeById(id: 1) { id } }'],
    ]
);

dump($result->getContent());

Result



The failure seems to come from AsyncResponse.

I'm not sure, but judging by what inflate_add does, it seems related to reading the (compressed) response data.


After trying several things, I found that adding the header Accept-Encoding: identity makes the server return an uncompressed response, and then everything works without problems.

Now adding that header is just a workaround and I can't even convince myself it's a good one at that.

@lexiforest If you have any idea why this occurs, or any other suggestions, please let me know 👍

@lexiforest
Owner

lexiforest commented Jan 9, 2025

By default, curl-impersonate adds the browser's default headers, which include Accept-Encoding: gzip.... I guess the client you are using conflicts with the implicitly added headers. You may want to disable that behavior with CURL_IMPERSONATE_HEADERS; see the docs in the README for details.
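
As a rough sketch of that suggestion, the wrapper below runs a single command with the default browser headers turned off while keeping the impersonation target. The function name run_without_default_headers is invented for this example, the value no is the one mentioned later in this thread, and chrome131 is used as a fallback only because it is the target used in this issue.

```shell
# Hypothetical wrapper: run one command with curl-impersonate's built-in
# browser headers disabled, keeping whatever impersonation target is set
# (falling back to chrome131, the target used in this thread).
run_without_default_headers() {
  CURL_IMPERSONATE="${CURL_IMPERSONATE:-chrome131}" \
  CURL_IMPERSONATE_HEADERS=no \
  "$@"
}
```

For example, run_without_default_headers php snippet.php would apply the setting to a single run. To apply it to every process in the container, an ENV CURL_IMPERSONATE_HEADERS=no line in the Dockerfile would presumably do the same job.
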

@ToshY
Author

ToshY commented Jan 10, 2025

@lexiforest Thanks for the response.

Let me start by saying that I finally have the given example working with the default HTTP client, without an additional Accept-Encoding header or the need to use CURL_IMPERSONATE_HEADERS=no.

This now works:

$result = $this->httpClient->request(
    'POST',
    'https://kitsu.app/api/graphql',
    [
        'json' => ['query' => 'query { findAnimeById(id: 1) { id } }'],
    ]
);

dump($result->getContent());

With the following in the Dockerfile:

ENV CURL_IMPERSONATE=chrome131

COPY --from=ghcr.io/tarampampam/curl:8.11.1 /bin/curl /bin/curl

RUN <<EOT bash
  set -ex
  curl -L --retry 3 --retry-connrefused --retry-delay 2 --fail-with-body -o /tmp/curl-impersonate.tar.gz https://github.com/lexiforest/curl-impersonate/releases/download/v0.8.2/curl-impersonate-v0.8.2.x86_64-linux-gnu.tar.gz
  tar -xzf /tmp/curl-impersonate.tar.gz -C /usr/local/bin --no-same-owner
  curl -L --retry 3 --retry-connrefused --retry-delay 2 --fail-with-body -o /tmp/libcurl-impersonate.tar.gz https://github.com/lexiforest/curl-impersonate/releases/download/v0.8.2/libcurl-impersonate-v0.8.2.x86_64-linux-gnu.tar.gz
  tar -xzf /tmp/libcurl-impersonate.tar.gz -C /tmp --no-same-owner
  mv /tmp/libcurl-impersonate-* /usr/lib/x86_64-linux-gnu/
  rm -f /tmp/*
  ldconfig
EOT

ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libcurl-impersonate-chrome.so.4.8.0

Once the above was working, I clicked through my application and found that calls made internally to other services, like Meilisearch, were now also failing with inflate_add(): data error (from the PSR stream).

I guess the client you are using conflicts with the implicitly added headers

After further debugging the CurlHttpClient and CurlResponse classes that Symfony uses, I agree that something is causing a conflict. It may well be related to the headers, but it could also be a bug in the logic that inflates the response chunks. I won't bother you further with these details; instead I'll file a bug report in the Symfony repo and report back here once I've figured it out over there.

Side note: in the meantime I've created a temporary patch for CurlHttpClient for my own use.

--- CurlHttpClient.php	2025-01-09 21:57:43.446365726 +0000
+++ CurlHttpClient.php	2025-01-09 21:57:20.102366512 +0000
@@ -211,6 +211,7 @@

         if (\extension_loaded('zlib') && !isset($options['normalized_headers']['accept-encoding'])) {
             $options['headers'][] = 'Accept-Encoding: gzip'; // Expose only one encoding, some servers mess up when more are provided
+            $options['normalized_headers']['accept-encoding'] = 'gzip';
         }
         $body = $options['body'];
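
One way to picture the inflate_add(): data error discussed above, offered as a guess rather than a confirmed diagnosis: if the impersonated libcurl already decodes the compressed body and the PHP client then tries to inflate it again, the second pass receives plain JSON and fails. The shell sketch below reproduces that shape with the gzip command-line tool. It does not involve Symfony or curl-impersonate, and the payload is simply the JSON response shown earlier in the thread.

```shell
# Sample body, taken from the successful response earlier in the thread.
payload='{"data":{"findAnimeById":{"id":"1"}}}'

# First pass: compress the body (as a server would) and decode it once
# (as the transport layer would), which yields the plain JSON again.
decoded_once="$(printf '%s' "$payload" | gzip -c | gzip -dc)"

# Second pass: a client that still believes the body is gzip-encoded
# tries to decode the already-plain data and fails.
if printf '%s' "$decoded_once" | gzip -dc >/dev/null 2>&1; then
  second_pass="ok"
else
  second_pass="data error"
fi

echo "first pass:  $decoded_once"
echo "second pass: $second_pass"
```

If this is what happens, it would also fit the observation that sending Accept-Encoding: identity avoids the problem, since nothing is compressed in the first place.
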
