handle apache DocumentRoot cyrillic encoding #17083

Closed
wants to merge 7 commits into from
12 changes: 12 additions & 0 deletions sapi/fpm/fpm/fpm_main.c
@@ -1103,6 +1103,7 @@ static void init_request_info(void)
script_path_translated = __unixify(script_path_translated, 0, NULL, 1, 0);
#endif


/*
* if the file doesn't exist, try to extract PATH_INFO out
* of it by stat'ing back through the '/'
@@ -1118,8 +1119,19 @@ static void init_request_info(void)
char *ptr;

if (pt) {
// If DocumentRoot contains Cyrillic characters and PHP is invoked with SetHandler (not applicable to ProxyPassMatch),
// Apache URL-encodes those characters and we need to decode them. For example, with
// DocumentRoot /home/hans/web/cyrillicрф.ratma.net/public_html
// env_script_filename contains /home/hans/web/cyrillic%D1%80%D1%84.ratma.net/public_html/index.php,
// and we must decode it to /home/hans/web/cyrillicрф.ratma.net/public_html/index.php.
if(apache_was_here && memchr(pt, '%', len)) {
len = php_raw_url_decode(pt, len);
ptr = &pt[len]; // php_raw_url_decode() writes a trailing null byte, &pt[len] is that null byte.
goto apache_cyrillic_jump;
Member

This is quite hacky and not sure if it's going to work for another loop part. You should maybe alloc new space and replace pt in this case.

Contributor Author

> This is quite hacky

Yeah, well, I couldn't really find a better way without a jump 🤔

for example we could do

bool skip_first = false;
if (apache_was_here && memchr(pt, '%', len)) {
    len = php_raw_url_decode(pt, len);
    ptr = &pt[len]; // php_raw_url_decode() writes a trailing null byte, &pt[len] is that null byte.
    skip_first = true;
}
while (skip_first || (ptr = strrchr(pt, '/')) || (ptr = strrchr(pt, '\\'))) {
    skip_first = false;
    ...
}

Now we check a flag on every iteration and clear it on every iteration, a flag we don't need at all if we use a jump.

or we could do

if (apache_was_here && memchr(pt, '%', len)) {
    len = php_raw_url_decode(pt, len);
    ptr = &pt[len]; // php_raw_url_decode() writes a trailing null byte, &pt[len] is that null byte.
} else {
    (ptr = strrchr(pt, '/')) || (ptr = strrchr(pt, '\\'));
}
do {
    ...
} while ((ptr = strrchr(pt, '/')) || (ptr = strrchr(pt, '\\')));

Now we avoid the goto, the per-iteration overhead, and the flag, but we have to duplicate the (ptr = strrchr(pt, '/')) || (ptr = strrchr(pt, '\\')) logic.
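To make the trade-off concrete, here is a minimal, self-contained sketch of the jump-into-loop shape, written outside PHP. The names `find_script`, `exists_fn`, and `is_demo_script` are hypothetical and exist only for this illustration: the predicate stands in for the `stat()` plus `S_ISREG()` check, and the `check_whole_first` argument stands in for the "path was just URL-decoded" condition. It is not the code from the PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the stat() + S_ISREG() test in fpm_main.c. */
typedef bool (*exists_fn)(const char *path);

/* Hypothetical demo predicate: pretend exactly one script exists on disk. */
bool is_demo_script(const char *path)
{
    return strcmp(path, "/srv/www/index.php") == 0;
}

/*
 * Trim trailing path components from pt (in place) until exists() accepts it.
 * When check_whole_first is true, the untrimmed path is tested once before
 * anything is cut off, by jumping past the "*ptr = 0" at the top of the loop.
 * Returns true if some prefix was accepted; pt then holds that prefix.
 */
bool find_script(char *pt, bool check_whole_first, exists_fn exists)
{
    char *ptr = NULL;

    assert(pt != NULL && exists != NULL);

    if (check_whole_first) {
        goto check; /* enter the loop body without trimming first */
    }
    while ((ptr = strrchr(pt, '/')) || (ptr = strrchr(pt, '\\'))) {
        *ptr = 0; /* cut off the last component */
check:
        if (exists(pt)) {
            return true;
        }
    }
    return false;
}
```

Calling `find_script(path, true, is_demo_script)` tests the whole path once and then falls back to normal trimming, with no extra work on later iterations. A do-while rewrite would additionally need a guard for the case where neither `strrchr()` call finds a separator, assuming its body still begins with `*ptr = 0`.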

> not sure if it's going to work for another loop part.

I don't understand what you mean. What might not work?

> You should maybe alloc new space and replace pt in this case.

Not needed: php_raw_url_decode never lengthens the string. It may shrink the string or leave it the same length, but it never grows it.
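The "never grows" argument can be illustrated with a small in-place percent-decoder. This is a sketch under assumptions, not PHP's actual implementation: `raw_url_decode_sketch` and `hex_value` are hypothetical names, and the buffer is assumed to hold at least `len + 1` bytes so the trailing NUL fits. Each `%XX` triple consumes three input bytes and produces one, and every other byte is copied one-for-one, so the write pointer can never overtake the read pointer.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: numeric value of one hex digit.
 * The caller must already have checked isxdigit(c). */
int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9') {
        return c - '0';
    }
    return tolower(c) - 'a' + 10;
}

/*
 * Hypothetical stand-in for php_raw_url_decode(), for illustration only.
 * Decodes %XX sequences in place, writes a trailing NUL, and returns the
 * new length. A '%' not followed by two hex digits is copied unchanged.
 */
size_t raw_url_decode_sketch(char *str, size_t len)
{
    char *dest = str;            /* write position, always <= src */
    const char *src = str;       /* read position */
    const char *end = str + len;

    assert(str != NULL);

    while (src < end) {
        if (*src == '%' && end - src >= 3
            && isxdigit((unsigned char)src[1])
            && isxdigit((unsigned char)src[2])) {
            /* three input bytes become one output byte */
            *dest++ = (char)(hex_value((unsigned char)src[1]) * 16
                             + hex_value((unsigned char)src[2]));
            src += 3;
        } else {
            *dest++ = *src++;    /* one byte in, one byte out */
        }
    }
    *dest = '\0';                /* dest <= end, so this stays inside the buffer */
    return (size_t)(dest - str);
}
```

Because the returned length is at most `len`, `&str[returned_length]` always points inside the original allocation, which is what lets the decoded path reuse `pt` without a new buffer.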

Member

Sorry, it's fine.

}
while ((ptr = strrchr(pt, '/')) || (ptr = strrchr(pt, '\\'))) {
*ptr = 0;
apache_cyrillic_jump:
if (stat(pt, &st) == 0 && S_ISREG(st.st_mode)) {
/*
* okay, we found the base script!