handle apache DocumentRoot cyrillic encoding #17083

Closed
wants to merge 7 commits into from


@divinity76 divinity76 commented Dec 8, 2024

When Apache's DocumentRoot contains cyrillic characters like

DocumentRoot /home/hans/web/cyrillicрф.ratma.net/public_html

and PHP is invoked via SetHandler (PS: this does not apply to ProxyPassMatch; the problem occurs with SetHandler specifically), like

DocumentRoot /home/hans/web/cyrillicрф.ratma.net/public_html
<FilesMatch \.php$>
    SetHandler "proxy:unix:/run/php/php8.3-fpm-cyrillicрф.ratma.net.sock"
</FilesMatch>

then Apache URL-encodes the Cyrillic characters before sending the path to FPM, so env_script_filename will contain

/home/hans/web/cyrillic%D1%80%D1%84.ratma.net/public_html/index.php

and we need to url-decode it to

/home/hans/web/cyrillicрф.ratma.net/public_html/index.php

otherwise we hit that

zlog(ZLOG_DEBUG, "Primary script unknown");
SG(sapi_headers).http_response_code = 404;
PUTS("File not found.\n");

error code path.
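
To make the encoding concrete, here is a minimal standalone sketch of in-place percent-decoding applied to the path above. The helper names hex_value and raw_url_decode_inplace are hypothetical; the latter only stands in for PHP's php_raw_url_decode() and is not the code from this PR.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: numeric value of one hex digit (caller checked isxdigit). */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9') {
        return c - '0';
    }
    return tolower(c) - 'a' + 10;
}

/*
 * Hypothetical stand-in for php_raw_url_decode(): decodes %XX sequences in
 * place, NUL-terminates the result and returns the new length. A '%' that is
 * not followed by two hex digits is copied through unchanged. The write
 * pointer never overtakes the read pointer, so the result cannot grow.
 */
static size_t raw_url_decode_inplace(char *str, size_t len)
{
    char *dest = str;
    const char *src = str;

    assert(str != NULL);
    while (len > 0) {
        if (*src == '%' && len >= 3
            && isxdigit((unsigned char)src[1])
            && isxdigit((unsigned char)src[2])) {
            *dest = (char)(hex_value((unsigned char)src[1]) * 16
                           + hex_value((unsigned char)src[2]));
            src += 3;
            len -= 3;
        } else {
            *dest = *src;
            src++;
            len--;
        }
        dest++;
    }
    *dest = '\0';
    return (size_t)(dest - str);
}
```

Called on a writable buffer holding the encoded path, this yields the UTF-8 bytes D1 80 D1 84 in place of %D1%80%D1%84, which is the on-disk directory name.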


divinity76 commented Dec 8, 2024

hmm interesting, the patch seems to break 2 tests, both related to ProxyPass (which I know handles cyrillic differently than SetHandler 🤔 ):

FPM: FastCGI env var path info fix for Apache ProxyPass SCRIPT_NAME stripping with encoded path (bug #74129) [sapi/fpm/tests/fcgi-env-pif-apache-pp-sn-strip-encoded.phpt]
FPM: FastCGI env var path info fix for Apache ProxyPass SCRIPT_NAME encoded path and plush sign (GH-12996) [sapi/fpm/tests/fcgi-env-pif-apache-pp-sn-strip-encoded-plus.phpt]

improves compatibility with sapi/fpm/tests/fcgi-env-pif-apache-pp-sn-strip-encoded-plus.phpt

does not fix it entirely, but it does help 🤔

bukka commented Dec 8, 2024

I will check this later next week.

divinity76 commented

Found a way to avoid breaking

sapi/fpm/tests/fcgi-env-pif-apache-pp-sn-strip-encoded.phpt
sapi/fpm/tests/fcgi-env-pif-apache-pp-sn-strip-encoded-plus.phpt

All checks have passed
10 successful checks

Nice.

PS: if you're checking this issue on a Linux system and the output of locale | head -n1 does not mention utf8, you're going to have a hard time, and should probably run some variant of

sudo locale-gen C.utf8;
echo LANG="C.utf8" | sudo tee /etc/default/locale;
sudo update-locale;
export LANG=C.utf8;

before starting. (Even the Ubuntu 24.04 server installer does not default to a UTF-8 locale for some reason :'( )

divinity76 commented

ping @bukka got time?


bukka commented Dec 14, 2024

OK, so it took me a little while to realise that this is most likely the httpd bug that just got fixed: apache/httpd#470 (it's part of the 2.4.x branch, so it will be released in the next httpd version). It's kind of a different variant of #15246.

This is not really a PHP bug, and changing it is in theory a BC break (even though it is quite unlikely to hit anyone). Maybe we could consider adding this to master in case something similar happens in the future.

@bukka bukka left a comment

As I said, maybe it makes sense to add this as a feature and as protection against future failures. It also needs a test though; look at the other Apache tests in fpm/tests for inspiration.

Comment on lines 1129 to 1130
ptr = &pt[len]; // php_raw_url_decode() writes a trailing null byte, &pt[len] is that null byte.
goto apache_cyrillic_jump;

bukka commented

This is quite hacky and not sure if it's going to work for another loop part. You should maybe alloc new space and replace pt in this case.

divinity76 commented

This is quite hacky

yeah well, i couldn't really find a better way without jump🤔

for example we could do

bool skip_first = false;
if (apache_was_here && memchr(pt, '%', len)) {
    len = php_raw_url_decode(pt, len);
    ptr = &pt[len]; // php_raw_url_decode() writes a trailing null byte, &pt[len] is that null byte.
    skip_first = true;
}
while (skip_first || (ptr = strrchr(pt, '/')) || (ptr = strrchr(pt, '\\'))) {
    skip_first = false;
    ...
}

but now we check a flag on every iteration, and we clear a flag on every iteration, a flag we don't need at all if we use jmp

or we could do

if (apache_was_here && memchr(pt, '%', len)) {
    len = php_raw_url_decode(pt, len);
    ptr = &pt[len]; // php_raw_url_decode() writes a trailing null byte, &pt[len] is that null byte.
} else {
    (ptr = strrchr(pt, '/')) || (ptr = strrchr(pt, '\\'));
}
do { ... } while ((ptr = strrchr(pt, '/')) || (ptr = strrchr(pt, '\\')));

now we avoid the goto and the overhead-per-iteration and the flag but now we have to duplicate the (ptr = strrchr(pt, '/')) || (ptr = strrchr(pt, '\\')) logic

not sure if it's going to work for another loop part.

i don't understand what you meant, what may not work?

You should maybe alloc new space and replace pt in this case.

not needed, php_raw_url_decode will never lengthen the string. it may shrink the string, or leave it with the same length, but never grow it.
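
For comparison, here is one more possible shape, offered only as a sketch loosely modeled on the loop discussed in this thread, not the actual fpm_main.c code. It moves the separator search into a helper so that neither a goto, a per-iteration flag, nor a duplicated strrchr expression is needed. The names last_separator, exists_fn, is_demo_script and find_script are all hypothetical, and the predicate stands in for the real stat() check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: last '/' in s, else last '\\', else NULL. */
static char *last_separator(char *s)
{
    char *p = strrchr(s, '/');
    return p ? p : strrchr(s, '\\');
}

/* Hypothetical predicate type standing in for the stat() check. */
typedef bool (*exists_fn)(const char *path);

/* Demo predicate (assumption): pretend only this one script exists on disk. */
static bool is_demo_script(const char *path)
{
    return strcmp(path, "/srv/app/index.php") == 0;
}

/*
 * Shortens pt in place until exists(pt) is true. When start_at_end is set,
 * the first candidate is the whole string: ptr points at the terminating
 * NUL, so "*ptr = '\0'" is a no-op. Otherwise the first candidate already
 * has its last path component stripped. Returns true if a candidate matched.
 */
static bool find_script(char *pt, size_t len, bool start_at_end, exists_fn exists)
{
    char *ptr;

    assert(pt != NULL && exists != NULL);
    for (ptr = start_at_end ? pt + len : last_separator(pt);
         ptr != NULL;
         ptr = last_separator(pt)) {
        *ptr = '\0';
        if (exists(pt)) {
            return true;
        }
    }
    return false;
}
```

The cost is one extra function call per iteration, which a compiler may well inline; whether that is preferable to the jump is a matter of taste.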

bukka commented

sorry it's fine


bukka commented Dec 14, 2024

I just thought about it, and I think we should not change this even in master: httpd will no longer be encoding the path, so decoding it would make some paths containing % impossible to access. It would really make sense if httpd kept encoding the path, but this is going to change soon...
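
The concern can be illustrated with a small check, given here as a sketch under the assumption that httpd forwards the path unencoded: any on-disk name that happens to contain a % followed by two hex digits would be rewritten by an unconditional decode, so the real file could no longer be reached. The function name decode_would_alter is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if percent-decoding would change the given path, i.e. it
 * contains '%' followed by two hex digits. The short-circuit evaluation
 * ensures we never read past the terminating NUL.
 */
static bool decode_would_alter(const char *path)
{
    assert(path != NULL);
    for (const char *p = path; *p != '\0'; p++) {
        if (p[0] == '%'
            && isxdigit((unsigned char)p[1])
            && isxdigit((unsigned char)p[2])) {
            return true;
        }
    }
    return false;
}
```

A directory literally named 100%25off would be affected, while a name such as test%lol would not, because %lo is not a valid escape and decoders of this kind typically leave it alone.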


divinity76 commented Dec 14, 2024

it would make it impossible to have some paths with % accessible

I didn't know this before testing it, but this patch does not break DocumentRoots with % in them.
That is already broken.
This PR fixes it.
Apache URL-encodes % in DocumentRoot to %25, and php-fpm needs to URL-decode it back to %.
TIL this PR adds support for

DocumentRoot /home/test/web/test.ratma.net/public_html/test%lol

probably been broken for years, but nobody(*) encountered it

(*) nobody who bothered complaining on a bug tracker, when the simple workaround is to just remove the % from the path

It would really make sense if httpd kept encoding the path, but this is going to change soon...

hmm that is concerning 🤔

% is also affected. I suspect a great deal of other characters are affected as well

bukka commented Dec 14, 2024

So there was a bug in httpd that changed it from not encoding the path to encoding the path. This was also backported to stable versions of various distros, as I think it might have been a security fix. Are you able to test with the latest 2.4.x branch to see whether this is fixed there or whether you can still see the issue? Also, can you check out tag 2.4.59 and see whether the issue appears there?

If you haven't compiled httpd before, here are my build steps (I spent a bit of time figuring them out, so I'm sharing them to make it quicker; you need to change /path/to/apr of course):

./buildconf --with-apr=/path/to/apr/apr-1.6.x --with-apr-util=/path/to/apr/apr-util-1.6.x
./configure --enable-mpms-shared=all --enable-unixd --enable-log-debug --enable-mods-static='unixd log_config logio' --enable-http2 --enable-so --enable-debugger-mode --enable-ssl


divinity76 commented Dec 15, 2024

EDIT: nvm i had wrong apache config when writing this, i need to re-run tests, sorry

hmmm, tested with apache 2.4.59 and mod_fcgid-2.3.9, and this patch fixes both the % issue and the cyrillic issue without breaking anything on that specific apache + mod_fcgid combination. Here are the build steps I used (they will replace existing apache installs, so be careful):

#!/bin/bash
set -e
set -x
wget 'https://archive.apache.org/dist/httpd/httpd-2.4.59.tar.bz2'
tar xfv httpd-2.4.59.tar.bz2;
cd httpd-2.4.59/;
svn co http://svn.apache.org/repos/asf/apr/apr/trunk srclib/apr;
./buildconf;
./configure --prefix=/usr/local/apache2 --enable-so --enable-ssl --with-mpm=event --enable-mods-shared=all --with-included-apr;
make -j$(nproc);
sudo make install;
wget 'https://dlcdn.apache.org/httpd/mod_fcgid/mod_fcgid-2.3.9.tar.bz2'
tar xfv mod_fcgid-2.3.9.tar.bz2;
cd mod_fcgid-2.3.9/;
APXS=/usr/local/apache2/bin/apxs ./configure.apxs;
make -j$(nproc);
sudo make install;
service apache2 restart;

Here php-fpm fails with both cyrillic and % without this PR, but succeeds with both when this PR is applied.

I also tested applying the diff from apache/httpd#470; unfortunately it does not apply cleanly:

root@test:/hans/httpd-2.4.59# wget 'https://github.com/apache/httpd/pull/470.diff'
(...)
2024-12-15 00:35:21 (84.5 MB/s) - '470.diff' saved [3573]
root@test:/hans/httpd-2.4.59# git apply 470.diff
error: patch failed: modules/proxy/mod_proxy.c:1240
error: modules/proxy/mod_proxy.c: patch does not apply

so I had to partially apply 470 manually, which leaves room for error, here is my httpd-2.4.59.tar.bz2 + 470.diff
git diff:

$ git diff
diff --git a/modules/proxy/mod_proxy.c b/modules/proxy/mod_proxy.c
index c9cef7c..22be239 100644
--- a/modules/proxy/mod_proxy.c
+++ b/modules/proxy/mod_proxy.c
@@ -1320,6 +1320,7 @@ static int proxy_handler(request_rec *r)
             strncmp(r->filename, "proxy:", 6) != 0) {
             r->proxyreq = PROXYREQ_REVERSE;
             r->filename = apr_pstrcat(r->pool, r->handler, r->filename, NULL);
+            apr_table_setn(r->notes, "proxy-sethandler", "1");
         }
         else {
             return DECLINED;
diff --git a/modules/proxy/mod_proxy_fcgi.c b/modules/proxy/mod_proxy_fcgi.c
index d420df6..50f443e 100644
--- a/modules/proxy/mod_proxy_fcgi.c
+++ b/modules/proxy/mod_proxy_fcgi.c
@@ -63,6 +63,8 @@ static int proxy_fcgi_canon(request_rec *r, char *url)
     apr_port_t port, def_port;
     fcgi_req_config_t *rconf = NULL;
     const char *pathinfo_type = NULL;
+    fcgi_dirconf_t *dconf = ap_get_module_config(r->per_dir_config,
+                                                 &proxy_fcgi_module);
 
     if (ap_cstr_casecmpn(url, "fcgi:", 5) == 0) {
         url += 5;
@@ -92,9 +94,30 @@ static int proxy_fcgi_canon(request_rec *r, char *url)
         host = apr_pstrcat(r->pool, "[", host, "]", NULL);
     }
 
-    if (apr_table_get(r->notes, "proxy-nocanon")
+    if (apr_table_get(r->notes, "proxy-sethandler")
+        || apr_table_get(r->notes, "proxy-nocanon")
         || apr_table_get(r->notes, "proxy-noencode")) {
-        path = url;   /* this is the raw/encoded path */
+        char *c = url;
+
+        /* We do not call ap_proxy_canonenc_ex() on the path here, don't
+         * let control characters pass still, and for php-fpm no '?' either.
+         */
+        if (FCGI_MAY_BE_FPM(dconf)) {
+            while (!apr_iscntrl(*c) && *c != '?')
+                c++;
+        }
+        else {
+            while (!apr_iscntrl(*c))
+                c++;
+        }
+        if (*c) {
+            ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r, APLOGNO(10414)
+                          "To be forwarded path contains control characters%s (%s)",
+                          FCGI_MAY_BE_FPM(dconf) ? " or '?'" : "", url);
+            return HTTP_FORBIDDEN;
+        }
+
+        path = url;  /* this is the raw path */
     }
     else {
         core_dir_config *d = ap_get_core_module_config(r->per_dir_config);
@@ -106,16 +129,6 @@ static int proxy_fcgi_canon(request_rec *r, char *url)
             return HTTP_BAD_REQUEST;
         }
     }
-    /*
-     * If we have a raw control character or a ' ' in nocanon path,
-     * correct encoding was missed.
-     */
-    if (path == url && *ap_scan_vchar_obstext(path)) {
-        ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r, APLOGNO(10414)
-                      "To be forwarded path contains control "
-                      "characters or spaces");
-        return HTTP_FORBIDDEN;
-    }
 
     r->filename = apr_pstrcat(r->pool, "proxy:fcgi://", host, sport, "/",

and with that compiled in... everything works with php-master, even without this PR 😮 well, I'm surprised.
It seems apache/httpd#470 fixes everything, so this PR isn't needed once 470 lands.
EDIT: again, I accidentally had the wrong apache configs when testing this; I need to re-run these tests


divinity76 commented Dec 15, 2024

Edit: this whole answer seems wrong. done a bunch more tests, see next message <.<

okay, re-did the test with vanilla php-src and vanilla httpd2.4.59+mod_fcgid-2.3.9 using the

#!/bin/bash
set -e
set -x
wget 'https://archive.apache.org/dist/httpd/httpd-2.4.59.tar.bz2'
tar xfv httpd-2.4.59.tar.bz2;
cd httpd-2.4.59/;
svn co http://svn.apache.org/repos/asf/apr/apr/trunk srclib/apr;
./buildconf;
./configure --prefix=/usr/local/apache2 --enable-so --enable-ssl --with-mpm=event --enable-mods-shared=all --with-included-apr;
make -j$(nproc);
sudo make install;
wget 'https://dlcdn.apache.org/httpd/mod_fcgid/mod_fcgid-2.3.9.tar.bz2'
tar xfv mod_fcgid-2.3.9.tar.bz2;
cd mod_fcgid-2.3.9/;
APXS=/usr/local/apache2/bin/apxs ./configure.apxs;
make -j$(nproc);
sudo make install;
service apache2 restart;

script again, and neither cyrillic nor % worked.
Then I applied this PR, and everything worked. Then I reverted this PR.
Then I tried httpd 2.4.59 + mod_fcgid-2.3.9 + the 470 patch, which again I had to partially apply manually, leaving room for error on my part. Even the git diff in my previous message got corrupted, so this time I include a base64 version just to be sure:

$ git diff | gzip -9 | base64        
H4sIAAAAAAACA61Wf1PbOBD9P59i65mCHdv5TXopw7VQQo85CjSld9e5udEIWU48YyxXEg2dlu9+
K9lJnARCer3AJLK8elq9fburKIljCMNxooE2b0R0m3LVzKW4+2qeiB01GFw//q6WZBG/AzZgPH7B
Go1O55p3ugNot1r9Xq8WhuEm5Jrv+xvRX7+GsN3ttII++Pb3BeCU0lQnDJJMgzUkE5pFKZeu5J9v
udJEcgZ16dWg+lFaZuwmd2X4a5ykPKM3PADHArx0Auh78OwAWh58W16G5tYGseEALkcXf30aDd+T
0fCP4ejDcH/NeIaNxjSXJMdtGdVm11yINDAmpbsBLLly/vHszNuv+VU8g6DpdcqJ4jozIJnQXM38
DnG2BMMTOG3Hq/hzvxjyVPG1c3F9KzM4Hr45Oz0fHu/Xoi3EQGI2TjYpojQoZRH1Oq0o7jcae624
1+vyrWQxg9iojZmREUi/a+SB3788KA5jSRjNRLaijwDYhEqo38q0lIoNmJCa4Fr8CSDisZ0oabVQ
CEKYyOJkjGZ1aYYYaxO90gpnlC6xc6onSRYLor/mfG7mz8GixAIYpKhEojkZc02Kg5c7WflwaczL
mWBZKFt9diqEFPBGL4XTSQwu7sxQrkiW4pgpmYvMoKyMvcmQPQ8OVjMELcA/gL19G4lBJxiAP+gF
3dbPhsJ8JgJ5fCSNnL/RI2OAw3+cefbMlF8LF4eapRCyup5BmbD+OF7B5xYrKjnnLYLw/TtsvRP8
yCKeMRFxxzO0h7OFRlbIDNK1j0/NOqDKFOC/nnCQdNosVkWFYb258LPgmZWLa/7iDaL8ySESgB4A
o2lqlFjEzTqOkITfuR6IzG5joSdcYumKRLarK4KsQ8q1yQMtRWq3pExzqXCNUiiLJMUIIocQCwn5
JA/j/Aa3hd1Xu8ATBJeNKlrFfROfkzdvT8m7w0/kaEhOLt+5NnEsP0spMZ1gaQX3mWE4UQx9Sd06
82Bnx5wfaz3u5q1nEfP9ShG+XwzLIrrNHj8Ca05k3Pq2WvlJKsaYHlIK6R5enl28xTOPfg+gGA9H
owBa2ELKifMLt93qtXveprrgXAm45ob1KZVzeZg40SRTDwTsuQL3ufKcjdXm4XjAK3AAw4ssO/AS
HMxRk90rDa7sQr9dXV2Sk4vR0enx8fB8majFw5Lq10U/F3u1/a22PiYkr5RRLLuLkmvfPVF3vaLS
tVv9oG0uJZ0BNp//odatknF0eIx3jPcfhx+u1nv6fVEJmvWyItThNIYphwn9woFaLtZiaWJBYRf/
kgzKWmQZC+YgeH50UYMtHkk2hilVcJMoxaPGzKi5qKxFOGw8bFIhiwphyRezJxHXSvM7ba2Wi9d/
l3b4U7J2Hl1fqVDIksop48rcpcInZBrOus22l7/yumk7arM572CquG04OFN7oofbiM263b9pZmhN
wAsAAA==
$ git diff
diff --git a/modules/proxy/mod_proxy.c b/modules/proxy/mod_proxy.c
index c9cef7c..22be239 100644
--- a/modules/proxy/mod_proxy.c
+++ b/modules/proxy/mod_proxy.c
@@ -1320,6 +1320,7 @@ static int proxy_handler(request_rec *r)
             strncmp(r->filename, "proxy:", 6) != 0) {
             r->proxyreq = PROXYREQ_REVERSE;
             r->filename = apr_pstrcat(r->pool, r->handler, r->filename, NULL);
+            apr_table_setn(r->notes, "proxy-sethandler", "1");
         }
         else {
             return DECLINED;
diff --git a/modules/proxy/mod_proxy_fcgi.c b/modules/proxy/mod_proxy_fcgi.c
index d420df6..50f443e 100644
--- a/modules/proxy/mod_proxy_fcgi.c
+++ b/modules/proxy/mod_proxy_fcgi.c
@@ -63,6 +63,8 @@ static int proxy_fcgi_canon(request_rec *r, char *url)
     apr_port_t port, def_port;
     fcgi_req_config_t *rconf = NULL;
     const char *pathinfo_type = NULL;
+    fcgi_dirconf_t *dconf = ap_get_module_config(r->per_dir_config,
+                                                 &proxy_fcgi_module);
 
     if (ap_cstr_casecmpn(url, "fcgi:", 5) == 0) {
         url += 5;
@@ -92,9 +94,30 @@ static int proxy_fcgi_canon(request_rec *r, char *url)
         host = apr_pstrcat(r->pool, "[", host, "]", NULL);
     }
 
-    if (apr_table_get(r->notes, "proxy-nocanon")
+    if (apr_table_get(r->notes, "proxy-sethandler")
+        || apr_table_get(r->notes, "proxy-nocanon")
         || apr_table_get(r->notes, "proxy-noencode")) {
-        path = url;   /* this is the raw/encoded path */
+        char *c = url;
+
+        /* We do not call ap_proxy_canonenc_ex() on the path here, don't
+         * let control characters pass still, and for php-fpm no '?' either.
+         */
+        if (FCGI_MAY_BE_FPM(dconf)) {
+            while (!apr_iscntrl(*c) && *c != '?')
+                c++;
+        }
+        else {
+            while (!apr_iscntrl(*c))
+                c++;
+        }
+        if (*c) {
+            ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r, APLOGNO(10414)
+                          "To be forwarded path contains control characters%s (%s)",
+                          FCGI_MAY_BE_FPM(dconf) ? " or '?'" : "", url);
+            return HTTP_FORBIDDEN;
+        }
+
+        path = url;  /* this is the raw path */
     }
     else {
         core_dir_config *d = ap_get_core_module_config(r->per_dir_config);
@@ -106,16 +129,6 @@ static int proxy_fcgi_canon(request_rec *r, char *url)
             return HTTP_BAD_REQUEST;
         }
     }
-    /*
-     * If we have a raw control character or a ' ' in nocanon path,
-     * correct encoding was missed.
-     */
-    if (path == url && *ap_scan_vchar_obstext(path)) {
-        ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r, APLOGNO(10414)
-                      "To be forwarded path contains control "
-                      "characters or spaces");
-        return HTTP_FORBIDDEN;
-    }
 
     r->filename = apr_pstrcat(r->pool, "proxy:fcgi://", host, sport, "/",
                               path, NULL);

and... cyrillic and % were still broken with 470 compiled in.
Then I applied this PR again, and both cyrillic and % work 🤔

So it seems apache/httpd#470 does not affect this PR on apache 2.4.59 with mod_fcgid-2.3.9, or my testing was insufficient.

Anyway, testing this stuff manually was exhausting. Ideally I should write some automated tests that run against whatever apache versions and apache PRs seem relevant, but I'm not sure I'm up for that task.


divinity76 commented Dec 15, 2024

... did a bunch more testing, this time with a script to automate all of it, and it turns out that apache 2.4.59 does NOT URL-encode special characters, contrary to what my previous tests showed (wtf?),

and that this PR would break compatibility with apache 2.4.59. Here is the test script I came up with. The important part is that it has 6 tests, and all 6 of them pass (I expected only the first 2 to pass, but no, all 6 pass).
If this PR were merged, it would break at least 2 of the tests <.<

<?php

declare(strict_types=1);
error_reporting(E_ALL);
set_error_handler(function ($errno, $errstr, $errfile, $errline) {
    if (error_reporting() & $errno) {
        throw new ErrorException($errstr, 0, $errno, $errfile, $errline);
    }
});
function quote(string $str): string
{
    // escapeshellarg() does not handle unicode correctly: https://3v4l.org/Hkv7h
    if (str_contains($str, "\x00")) {
        throw new RuntimeException("unix shell arguments cannot contain null bytes");
    }
    return "'" . str_replace("'", "'\\''", $str) . "'";
}
function e(string $cmd)
{
    echo $cmd . PHP_EOL;
    passthru($cmd, $ret);
    if ($ret !== 0) {
        throw new RuntimeException("Command returned non-zero $ret: $cmd");
    }
}
function fsleep(float $seconds)
{
    $int = (int)$seconds;
    $frac = (int) (($seconds - $int) * 1e9);
    time_nanosleep($int, $frac);
}
function sh(string $cmd)
{
    e("/bin/bash -c " . quote($cmd));
}
if (DIRECTORY_SEPARATOR !== '/') {
    throw new RuntimeException("Must be run on Unix");
}
function compile_apache_string(): string
{
    $sh = <<<'SH'
#!/bin/bash
set -e
set -x
set -o pipefail

cd httpd-2.4.59/
mkdir -p install_dir
# CFLAGS='-DBIG_SECURITY_HOLE'
./configure --enable-so --enable-ssl --with-mpm=event --enable-mods-shared=all --with-included-apr --enable-proxy-fcgi --enable-proxy-http --prefix=$(pwd)/install_dir
make -j$(nproc)
make install
cd ..
SH;
    return $sh;
}
function compile_apache(): void
{
    sh(compile_apache_string());
}

function generatehttpdConf(string $relativeDocumentRoot): void
{
    $httpdConf = <<<'EOF'
ErrorLog "$(pwd)/apache_error.log"
ServerRoot "$(pwd)/httpd-2.4.59/install_dir"
LoadModule unixd_module modules/mod_unixd.so
LoadModule rewrite_module modules/mod_rewrite.so
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_fcgi_module modules/mod_proxy_fcgi.so
LoadModule proxy_http_module modules/mod_proxy_http.so
# authz host
LoadModule authz_core_module modules/mod_authz_core.so
LoadModule authz_host_module modules/mod_authz_host.so

# if unixd is loaded
<IfModule unixd_module>
#    User $(whoami)
#    Group $(whoami)
    User nobody
    Group nogroup
</IfModule>
<IfModule !mpm_netware_module>
    PidFile "$(pwd)/httpd.pid"
</IfModule>
# Funfact: apache does not support listening on unix sockets :(
# requested this in 2013: https://bz.apache.org/bugzilla/show_bug.cgi?id=55898
# 9123: random port hopefully not in use
Listen 127.0.0.1:9123
DocumentRoot "$(pwd)/htdocs/$relativeDocumentRoot$"
<Directory "$(pwd)/htdocs/$relativeDocumentRoot$">
    Options Indexes FollowSymLinks
    AllowOverride All
    Require all granted
</Directory>
# php-fpm
<FilesMatch \.php$>
    SetHandler "proxy:unix:$(pwd)/php-fpm.sock|fcgi://localhost/"
</FilesMatch>
EOF;
    $httpdConf = strtr($httpdConf, [
        '$(pwd)' => getcwd(),
        '$relativeDocumentRoot$' => $relativeDocumentRoot,
    ]);
    var_dump($httpdConf);
    file_put_contents("httpd.conf", $httpdConf, LOCK_EX);
    $relative = "htdocs/$relativeDocumentRoot/test.txt";
    touch($relative);
    $realpath = realpath($relative);
    file_put_contents($realpath, $realpath, LOCK_EX);
    file_put_contents("htdocs/$relativeDocumentRoot/test.php", "<?php echo(__FILE__);", LOCK_EX);
}
class httpd
{
    public $apacheDescriptorSpec = [
        0 => ['pipe', 'rb'],  // default is to inherit our stdin, we don't want that.
        //1 => ['pipe', 'wb'],
        //2 => ['pipe', 'wb'],
    ];
    public $apachePipes = [];
    public $apacheProc;

    private function start()
    {
        $this->apacheProc = proc_open('httpd-2.4.59/install_dir/bin/httpd -f ' . quote(getcwd() . "/httpd.conf") . ' -X', $this->apacheDescriptorSpec, $this->apachePipes);
        if ($this->apacheProc === false) {
            throw new RuntimeException("Failed to start apache");
        }
        fclose($this->apachePipes[0]);
        unset($this->apachePipes[0]);
        assert(empty($this->apachePipes));
        fsleep(1);
    }
    public function __construct()
    {
        $this->start();
    }
    public function __destruct()
    {
        $status = proc_get_status($this->apacheProc);
        if ($status['running']) {
            proc_terminate($this->apacheProc);
        }
        proc_close($this->apacheProc);
    }
    public function restart()
    {
        $status = proc_get_status($this->apacheProc);
        $pid1 = $status['pid'];
        $pid2 = (int)file_get_contents("httpd.pid");
        if ($status['running']) {
            proc_terminate($this->apacheProc, SIGTERM);
        }
        if(posix_kill($pid2, 0)) {
            posix_kill($pid2, SIGTERM);
        }
        echo "Waiting for apache pid1 ($pid1) to exit...";
        while (($status = proc_get_status($this->apacheProc))['running']) {
            fsleep(0.1);
            echo ".";
        }
        echo "Waiting for apache pid2 ($pid2) to exit...";
        while (posix_kill($pid2, 0)) {
            fsleep(0.1);
            echo ".";
        }
        proc_close($this->apacheProc);
        echo "Restarting apache...";
        $this->start();
    }
};
$sh = <<<'SH'
#!/bin/bash
set -e
set -x
set -o pipefail

if [ -d "php-src" ]; then
    echo "php-src directory already exists. Skipping cloning."
else
    git clone 'https://github.com/php/php-src.git' --depth 1
    cd php-src
    ./buildconf
    ./configure --disable-all --enable-fpm
    make -j$(nproc)
    cd ..
fi
# if php-fpm.pid exists, kill the process
if [ -f "php-fpm.pid" ]; then
    pid=$(cat php-fpm.pid)
    kill $pid || true
    # wait for the process to exit..
    while kill -0 $pid 2>/dev/null; do
        sleep 0.1
    done
fi
# make a basic php-fpm.conf
# warning: php-fpm.conf is hardcoded to relative directory /usr/local/var
# for some unknown reason, we need to specify the full paths (not even ./ relative works)
cat > php-fpm.conf <<EOF
[global]
pid = $(pwd)/php-fpm.pid
error_log = $(pwd)/php-fpm.log
daemonize = no
[www]
; user and group is "optional" according to docs, but it gives a warning if not set..
user = $(whoami)
group = $(whoami)
; unix socket
listen = $(pwd)/php-fpm.sock
listen.owner = $(whoami)
listen.group = $(whoami)
listen.mode = 0777
pm = static
pm.max_children = 2
EOF
# kill apache if it's running
if [ -f "httpd.pid" ]; then
    pid=$(cat httpd.pid)
    kill $pid || true
    # wait for the process to exit..
    while kill -0 $pid 2>/dev/null; do
        printf .
        sleep 0.1
    done
    # unlike php-fpm, apache does not clean up its pid file after exiting
    rm -f httpd.pid
fi

if [ -d "httpd-2.4.59" ]; then
    echo "httpd-2.4.59 directory already exists. Skipping cloning."
else
    wget 'https://archive.apache.org/dist/httpd/httpd-2.4.59.tar.bz2'
    tar xfv httpd-2.4.59.tar.bz2
    # stay in the parent directory: the compile snippet below does its own "cd httpd-2.4.59/"
    svn co http://svn.apache.org/repos/asf/apr/apr/trunk httpd-2.4.59/srclib/apr
    %compile_apache_string%
fi
# make a basic apache config
mkdir -p htdocs
SH;
$sh = strtr($sh, [
    '%compile_apache_string%' => compile_apache_string(),
]);
sh($sh);

$fpmDescriptorSpec = [
    0 => ['pipe', 'rb'],  // default is to inherit our stdin, we don't want that.
    //1 => ['pipe', 'wb'],
    //2 => ['pipe', 'wb'],
];
$fpmPipes = [];
$fpmProc = proc_open('php-src/sapi/fpm/php-fpm -y php-fpm.conf -c php-src/php.ini-development --allow-to-run-as-root --nodaemonize --force-stderr', $fpmDescriptorSpec, $fpmPipes);
if ($fpmProc === false) {
    throw new RuntimeException("Failed to start php-fpm");
}
fclose($fpmPipes[0]);
var_dump(proc_get_status($fpmProc));
$apacheDescriptorSpec = [
    0 => ['pipe', 'rb'],  // default is to inherit our stdin, we don't want that.
    //1 => ['pipe', 'wb'],
    //2 => ['pipe', 'wb'],
];
generatehttpdConf("");

register_shutdown_function(function () use ($fpmProc, &$httpd) {
    echo "Shutting down..." . PHP_EOL;
    $status = proc_get_status($fpmProc);
    if ($status['running']) {
        proc_terminate($fpmProc);
    }
    proc_close($fpmProc);
    unset($httpd); // should force __destruct...
    echo "Done." . PHP_EOL;
});
generatehttpdConf("");
$httpd = new httpd();
fsleep(0.1); // initialize sleep
function httpget(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $response = curl_exec($ch);
    if ($response === false) {
        throw new RuntimeException("curl failed: " . curl_error($ch));
    }
    curl_close($ch);
    return $response;
}
function return_var_dump(...$args)
{
    ob_start();
    var_dump(...$args);
    return ob_get_clean();
}
echo "first just a simple test to check that apache+php-fpm is working\n";
$tests = array(
    "http://localhost:9123/test.txt" => getcwd() . "/htdocs/test.txt",
    "http://localhost:9123/test.php" => getcwd() . "/htdocs/test.php",
);
foreach ($tests as $url => $expected) {
    $response = httpget($url);
    var_dump(["status" => ($response === $expected) ? "OK" : "FAIL", "url" => $url, "expected" => $expected, "response" => $response]);
}
echo "Testing cyrillic docroot: cyrillicрф\n";
@mkdir("htdocs/cyrillicрф", 0777, true);
generatehttpdConf("cyrillicрф/");
$httpd->restart();
fsleep(0.1); // initialize sleep
$tests = array(
    "http://localhost:9123/test.txt" => getcwd() . "/htdocs/cyrillicрф/test.txt",
    "http://localhost:9123/test.php" => getcwd() . "/htdocs/cyrillicрф/test.php",
);
foreach ($tests as $url => $expected) {
    $response = httpget($url);
    var_dump(["status" => ($response === $expected) ? "OK" : "FAIL", "url" => $url, "expected" => $expected, "response" => $response]);
}
echo "testing % docroot\n";
@mkdir("htdocs/%", 0777, true);
generatehttpdConf("%/");
$httpd->restart();
fsleep(0.1); // initialize sleep
$tests = array(
    "http://localhost:9123/test.txt" => getcwd() . "/htdocs/%/test.txt",
    "http://localhost:9123/test.php" => getcwd() . "/htdocs/%/test.php",
);
foreach ($tests as $url => $expected) {
    $response = httpget($url);
    var_dump(["status" => ($response === $expected) ? "OK" : "FAIL", "url" => $url, "expected" => $expected, "response" => $response]);
}

and the important output is

DocumentRoot "/hans/cyrillic/htdocs/"
first just a simple test to check that apache+php-fpm is working
array(4) {
  ["status"]=>
  string(2) "OK"
  ["url"]=>
  string(30) "http://localhost:9123/test.txt"
  ["expected"]=>
  string(30) "/hans/cyrillic/htdocs/test.txt"
  ["response"]=>
  string(30) "/hans/cyrillic/htdocs/test.txt"
}
array(4) {
  ["status"]=>
  string(2) "OK"
  ["url"]=>
  string(30) "http://localhost:9123/test.php"
  ["expected"]=>
  string(30) "/hans/cyrillic/htdocs/test.php"
  ["response"]=>
  string(30) "/hans/cyrillic/htdocs/test.php"
}
Testing cyrillic docroot: cyrillicрф
DocumentRoot "/hans/cyrillic/htdocs/cyrillicрф/"
array(4) {
  ["status"]=>
  string(2) "OK"
  ["url"]=>
  string(30) "http://localhost:9123/test.txt"
  ["expected"]=>
  string(43) "/hans/cyrillic/htdocs/cyrillicрф/test.txt"
  ["response"]=>
  string(43) "/hans/cyrillic/htdocs/cyrillicрф/test.txt"
}
array(4) {
  ["status"]=>
  string(2) "OK"
  ["url"]=>
  string(30) "http://localhost:9123/test.php"
  ["expected"]=>
  string(43) "/hans/cyrillic/htdocs/cyrillicрф/test.php"
  ["response"]=>
  string(43) "/hans/cyrillic/htdocs/cyrillicрф/test.php"
}
testing % docroot
DocumentRoot "/hans/cyrillic/htdocs/%/"
array(4) {
  ["status"]=>
  string(2) "OK"
  ["url"]=>
  string(30) "http://localhost:9123/test.txt"
  ["expected"]=>
  string(32) "/hans/cyrillic/htdocs/%/test.txt"
  ["response"]=>
  string(32) "/hans/cyrillic/htdocs/%/test.txt"
}
array(4) {
  ["status"]=>
  string(2) "OK"
  ["url"]=>
  string(30) "http://localhost:9123/test.php"
  ["expected"]=>
  string(32) "/hans/cyrillic/htdocs/%/test.php"
  ["response"]=>
  string(32) "/hans/cyrillic/htdocs/%/test.php"
}

divinity76 closed this Dec 15, 2024

bukka commented Dec 15, 2024

Thanks for testing. Just to clarify: that patch has already been applied to the httpd 2.4.x branch, which is the main thing to test to see whether it will work fine there and in future httpd versions. But in general you should see the same results as with 2.4.59, because the main logic should be the same. I have some automated tests in my testing tool: https://github.com/wstool/wst-php-fpm/blob/563e061dbb7aa6fbe43979e78869b9b107c1b1b2/spec/instances/httpd-proxy-fcgi-handler-uds-basic.yaml . I will extend them later with your scenarios so that they are covered too.

divinity76 added a commit to divinity76/hestiacp that referenced this pull request Dec 15, 2024
This reverts commit 5136018.

This makes us go back to using SetHandler instead of ProxyPassMatch

The SetHandler unicode problems appears to have been fixed
in upstream apache httpd 2.4.59 ,
so the solution to running domains with unicode/cyrillic characters should
just be to update to apache>=2.4.59

ref php/php-src#17083 (comment)

The ProxyPassMatch solution added new problems with OpenCart like https://forum.hestiacp.com/t/php-file-not-found-but-it-exists/16694/4

AH01136: Unescaped URL path matched ProxyPass; ignoring unsafe nocanon

 [proxy:error] [pid 204010:tid 204023] (111)Connection refused: AH00957: FCGI: attempt to connect to 127.0.0.1:8000 (localhost:8000) failed
divinity76 added a commit with the same message to hestiacp/hestiacp that referenced this pull request Dec 16, 2024