Site not exporting properly - maybe not compatible with "bedrock" framework? #179

Closed
FredericoC opened this issue Nov 28, 2018 · 22 comments

Comments

@FredericoC

Hi Leon,

I have used your plugin before with no issues, but since I started using the "Bedrock" framework from roots.io the export doesn't work at all. The crawler doesn't discover links properly and the WordPress assets don't come across.

Bedrock - https://roots.io/bedrock/

Maybe something to do with the way they handle paths?

├── composer.json
├── config
│   ├── application.php
│   └── environments
│       ├── development.php
│       ├── staging.php
│       └── production.php
├── vendor
└── web
    ├── app
    │   ├── mu-plugins
    │   ├── plugins
    │   ├── themes
    │   └── uploads
    ├── wp-config.php
    ├── index.php
    └── wp

I would love to use your plugin with S3 as it is perfect for our website. Let me know if you have any insight into this issue.

@leonstafford
Contributor

Hi @FredericoC,

Thanks for reporting this. I haven't tested with Bedrock yet, but if there are any issues, I definitely want them resolved.

Could you please try this latest build zip here: https://github.com/leonstafford/wordpress-static-html-plugin/releases/tag/latest_snapshot

And do the usual of disabling all plugins/switching themes to isolate the cause. It could be a different environmental issue, too. Anything in the PHP error logs?

Cheers,

Leon

@FredericoC
Author

FredericoC commented Nov 28, 2018

edit: Redacted links
Hi Leon,

Thanks for the fast response.

The issue is the same with the new build;

  • Deactivated all plugins
  • Used a default theme

This was working before I started using the Bedrock boilerplate, so I'm assuming it is some incompatibility with the crawler...

Clicking "Preview initial crawl List" returns:

https://example.test/link1-est-software/
https://example.test/link2-est-software/
https://example.test/link3-est-software/
https://example.test/link4-est-software/
https://example.test/link5-est-software/
https://example.test/link6-est-software/
https://example.test/link7-est-software/
https://example.test/link8-est-software/
https://example.test/category/uncategorized/

Those links don't exist on my website, and I have about 60 more links which are not listed.

Nginx Log:


PHP message: STARTING EXPORT: PHP VERSION 7.2.12-1+ubuntu16.04.1+deb.sury.org+1

PHP message: STARTING EXPORT: PHP MAX EXECUTION TIME 120

PHP message: STARTING EXPORT: OS VERSION Linux groundplan 4.4.0-116-generic #140-Ubuntu SMP Mon Feb 12 21:23:04 UTC 2018 x86_64

PHP message: STARTING EXPORT: WP VERSION 4.9.8

PHP message: STARTING EXPORT: WP URL https://example.test

PHP message: STARTING EXPORT: WP SITEURL https://example.test/wp

PHP message: STARTING EXPORT: WP HOME https://example.test

PHP message: STARTING EXPORT: WP ADDRESS https://example.test/wp

PHP message: STARTING EXPORT: PLUGIN VERSION 5.9

PHP message: STARTING EXPORT: VIA CLI? 

PHP message: STARTING EXPORT: STATIC EXPORT URL https://example.test/wp/mystaticsite/" while reading upstream, client: 192.168.50.1, server: example.test, request: "POST /wp/wp-admin/admin-ajax.php HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm-wordpress.sock:", host: "example.test", referrer: "https://example.test/wp/wp-admin/admin.php?page=wp2static"
2018/11/28 05:01:06 [error] 12027#12027: *5111 FastCGI sent in stderr: "PHP message: BAD RESPONSE STATUS (404): https://example.test/" while reading response header from upstream, client: 192.168.50.1, server: example.test, request: "POST /wp/wp-admin/admin-ajax.php HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm-wordpress.sock:", host: "example.test", referrer: "https://example.test/wp/wp-admin/admin.php?page=wp2static"
2018/11/28 05:01:07 [error] 12027#12027: *5111 FastCGI sent in stderr: "PHP message: BAD RESPONSE STATUS (404): https://example.test/link1-est-software/" while reading response header from upstream, client: 192.168.50.1, server: example.test, request: "POST /wp/wp-admin/admin-ajax.php HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm-wordpress.sock:", host: "example.test", referrer: "https://example.test/wp/wp-admin/admin.php?page=wp2static"
2018/11/28 05:01:07 [error] 12027#12027: *5111 FastCGI sent in stderr: "PHP message: BAD RESPONSE STATUS (404): https://example.test/link2-est-software/" while reading response header from upstream, client: 192.168.50.1, server: example.test, request: "POST /wp/wp-admin/admin-ajax.php HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm-wordpress.sock:", host: "example.test", referrer: "https://example.test/wp/wp-admin/admin.php?page=wp2static"
2018/11/28 05:01:07 [error] 12027#12027: *5111 FastCGI sent in stderr: "PHP message: BAD RESPONSE STATUS (404): https://example.test/link3-est-software/" while reading response header from upstream, client: 192.168.50.1, server: example.test, request: "POST /wp/wp-admin/admin-ajax.php HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm-wordpress.sock:", host: "example.test", referrer: "https://example.test/wp/wp-admin/admin.php?page=wp2static"
2018/11/28 05:01:08 [error] 12027#12027: *5111 FastCGI sent in stderr: "PHP message: BAD RESPONSE STATUS (404): https://example.test/link4-est-software/" while reading response header from upstream, client: 192.168.50.1, server: example.test, request: "POST /wp/wp-admin/admin-ajax.php HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm-wordpress.sock:", host: "example.test", referrer: "https://example.test/wp/wp-admin/admin.php?page=wp2static"
2018/11/28 05:01:08 [error] 12027#12027: *5111 FastCGI sent in stderr: "PHP message: BAD RESPONSE STATUS (404): https://example.test/link5-est-software/" while reading response header from upstream, client: 192.168.50.1, server: example.test, request: "POST /wp/wp-admin/admin-ajax.php HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm-wordpress.sock:", host: "example.test", referrer: "https://example.test/wp/wp-admin/admin.php?page=wp2static"
2018/11/28 05:01:09 [error] 12027#12027: *5111 FastCGI sent in stderr: "PHP message: BAD RESPONSE STATUS (404): https://example.test/link6-est-software/" while reading response header from upstream, client: 192.168.50.1, server: example.test, request: "POST /wp/wp-admin/admin-ajax.php HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm-wordpress.sock:", host: "example.test", referrer: "https://example.test/wp/wp-admin/admin.php?page=wp2static"
2018/11/28 05:01:09 [error] 12027#12027: *5111 FastCGI sent in stderr: "PHP message: BAD RESPONSE STATUS (404): https://example.test/link7-est-software/" while reading response header from upstream, client: 192.168.50.1, server: example.test, request: "POST /wp/wp-admin/admin-ajax.php HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm-wordpress.sock:", host: "example.test", referrer: "https://example.test/wp/wp-admin/admin.php?page=wp2static"
2018/11/28 05:01:09 [error] 12027#12027: *5111 FastCGI sent in stderr: "PHP message: BAD RESPONSE STATUS (404): https://example.test/link8-est-software/" while reading response header from upstream, client: 192.168.50.1, server: example.test, request: "POST /wp/wp-admin/admin-ajax.php HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm-wordpress.sock:", host: "example.test", referrer: "https://example.test/wp/wp-admin/admin.php?page=wp2static"
2018/11/28 05:01:10 [error] 12027#12027: *5111 FastCGI sent in stderr: "PHP message: BAD RESPONSE STATUS (404): https://example.test/category/uncategorized/" while reading response header from upstream, client: 192.168.50.1, server: example.test, request: "POST /wp/wp-admin/admin-ajax.php HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm-wordpress.sock:", host: "example.test", referrer: "https://example.test/wp/wp-admin/admin.php?page=wp2static"

It is also strange that it returns a 404 for https://example.test/, which definitely works.

@leonstafford
Contributor

leonstafford commented Nov 28, 2018 via email

@FredericoC
Author

FredericoC commented Nov 28, 2018

I created a very simple example with Trellis (the roots.io provisioning and deployment system) and Bedrock:

https://github.com/3stack-software/bedrock-wpstatic-dev

Instructions are in the README; if you're using Linux it's very easy to get going. I just tested that example, and the latest WP2Static dev build is not working properly.

This is based on their own example, but I've made a couple of small changes to the Vagrant setup file so it plays well with encrypted filesystems.

Let me know if you need any assistance.

@FredericoC
Author

As for the folder structure, I found this:
https://roots.io/bedrock-vs-regular-wordpress-install/

I'll try to play around with symlinks...

@FredericoC
Author

FredericoC commented Dec 11, 2018

@leonstafford
I dug around the source code and found the possible root of the major issues when using the Bedrock framework. I'm far from a WordPress expert, so here are my findings for your review.

PROBLEM 1 - The list of page URLs is not generated properly

It seems that with Bedrock the site_url function returns a URL with wp appended, e.g. https://example.test/wp/, which is not the actual home URL.
https://github.com/leonstafford/wp2static/blob/master/library/StaticHtmlOutput/WPSite.php#L10

This caused WP2Static to POST a "subdirectory", which in turn ran the code below, removing links from the list in the process;
https://github.com/leonstafford/wp2static/blob/master/library/StaticHtmlOutput/FilesHelper.php#L401

POSSIBLE SOLUTION FOR 1

Use get_home_url in WPSite.php instead of site_url. I have no idea whether it has any side effects, but it worked for me.
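
To illustrate the difference, here is a minimal sketch in plain PHP (no WordPress loaded). The helper name `subdirectory_from_url` is hypothetical and not part of WP2Static; it only shows why a URL ending in /wp would be treated as a subdirectory install while the home URL would not. The two URLs are the WP SITEURL and WP HOME values from the log earlier in this thread.

```php
<?php
// Hypothetical helper, not part of WP2Static: returns the path component of a
// URL without a trailing slash, or '' when the site lives at the web root.
function subdirectory_from_url(string $url): string
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        // parse_url() yields null when the URL has no path at all.
        return '';
    }
    return rtrim($path, '/');
}

// Values taken from the export log: site URL vs. home URL under Bedrock.
$site_url = 'https://example.test/wp';
$home_url = 'https://example.test';

var_dump(subdirectory_from_url($site_url)); // string(3) "/wp"
var_dump(subdirectory_from_url($home_url)); // string(0) ""
```

With the home URL there is no subdirectory to strip, which would be consistent with the crawl list no longer losing links.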

PROBLEM 2 - The list of assets (stylesheets, scripts, uploads, images, etc.) is not generated

I was sure this was a path issue, so I had a look at the FilesHelper.php file. The ABSPATH constant returns the path to the WordPress installation, which is fine for vanilla WordPress, but Bedrock keeps all the assets in the "app" folder at the same level as the "wp" (WordPress installation) folder.
E.g.: the code returns /srv/www/example/current/web/wp/app/uploads, when in Bedrock it should return /srv/www/example/current/web/app/uploads (notice I removed wp).

POSSIBLE SOLUTION FOR 2

I didn't get far with a solution for this one, but I hope my findings will steer you in the right direction.
I know Bedrock sets the ABSPATH constant in its config/application.php file, but it explicitly appends wp to the path, so I didn't want to break something else by changing it.
So I tried to find some config that I could use to detect whether the site is using the Bedrock framework, and this is what I came up with:

        // The `Bedrock` framework (https://roots.io/bedrock/) has a different path structure.
        // I believe CONTENT_DIR is a custom constant from bedrock, so I'll use that to detect the actual absPath
        if (defined('CONTENT_DIR') && CONTENT_DIR === '/app') {
            // The `app` segment is already in the "URL", so remove it from the absolute path
            $absPath = str_replace(CONTENT_DIR, '/', WP_CONTENT_DIR);
        } else {
            $absPath = ABSPATH;
        }

Adding this code to FilesHelper.php (and replacing ABSPATH with $absPath) made all the assets discoverable by WP2Static and generated a proper "WP-STATIC-INITIAL-CRAWL-LIST.txt", but the actual export still wasn't correct (it exported the 404 page content). I'm assuming the ABSPATH constant is used everywhere, so that might be the issue. I would suggest creating an abstraction for ABSPATH for cases like Bedrock (if that's the issue).
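
One possible shape for such an abstraction, as a rough sketch only: the function name `resolve_asset_root` is hypothetical and not part of WP2Static. The three values are passed in as arguments so the sketch runs without WordPress; in a plugin they would presumably come from ABSPATH, WP_CONTENT_DIR and Bedrock's CONTENT_DIR. It strips the content dir only when it is the trailing segment, which avoids a blanket str_replace touching an earlier part of the path.

```php
<?php
// Hypothetical helper, not part of WP2Static: pick the directory that asset
// paths should be resolved against. Falls back to ABSPATH-like behaviour when
// no custom content dir is known.
function resolve_asset_root(string $abspath, string $wpContentDir, ?string $contentDir): string
{
    if ($contentDir !== null && $contentDir !== '') {
        $suffixLength = strlen($contentDir);
        // Only strip the content dir when it is the last segment of the path,
        // so a path such as /srv/apps/site/web/app keeps its "/apps" part.
        if (substr($wpContentDir, -$suffixLength) === $contentDir) {
            return rtrim(substr($wpContentDir, 0, -$suffixLength), '/') . '/';
        }
    }
    return rtrim($abspath, '/') . '/';
}

// Bedrock-style layout (paths from the example in this comment).
echo resolve_asset_root('/srv/www/example/current/web/wp/', '/srv/www/example/current/web/app', '/app'), "\n";
// Vanilla layout with no custom content dir.
echo resolve_asset_root('/var/www/html/', '/var/www/html/wp-content', null), "\n";
```

Every place that currently reads ABSPATH directly could then call one function, so a Bedrock-specific rule lives in a single spot.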

OTHER SUGGESTION

Just a suggestion: when using the Sage templating framework, there are other system folders/files that could be added to the default "exclusion list", such as:

  • node_modules
  • bower_components
  • vendor
  • .yarn
  • gulpfile.js
  • package*.json
  • bower.json
  • composer.json
  • *.map
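
As a rough illustration of how such a list could be applied, the sketch below tests each path segment against shell-style patterns using PHP's built-in fnmatch(). The function `is_excluded` and the sample theme paths are hypothetical; this is not WP2Static's actual exclusion logic.

```php
<?php
// Hypothetical sketch: return true when any segment of a relative path matches
// one of the exclusion patterns (wildcards as understood by fnmatch()).
function is_excluded(string $relativePath, array $patterns): bool
{
    foreach (explode('/', $relativePath) as $segment) {
        if ($segment === '') {
            continue;
        }
        foreach ($patterns as $pattern) {
            if (fnmatch($pattern, $segment)) {
                return true;
            }
        }
    }
    return false;
}

// The patterns suggested in the list above.
$patterns = ['node_modules', 'bower_components', 'vendor', '.yarn',
    'gulpfile.js', 'package*.json', 'bower.json', 'composer.json', '*.map'];

var_dump(is_excluded('app/themes/sage/node_modules/jquery/jquery.js', $patterns)); // bool(true)
var_dump(is_excluded('app/themes/sage/dist/main.css.map', $patterns));            // bool(true)
var_dump(is_excluded('app/themes/sage/dist/main.css', $patterns));                // bool(false)
```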

I hope this helps!

@leonstafford
Contributor

Hi @FredericoC,

FYI, I've got a Trellis + Bedrock install running now and will start debugging this tonight. It should solve it for Debian package-managed WP installs and anyone with custom paths.

Fingers crossed!

@leonstafford
Contributor

Debugging log:

WPSite injected into plugin's JS for admin screen:

var wp_site ={
   "uploads_url":"http://example.com/app/uploads",
   "site_url":"http://example.com/wp/",
   "site_path":"/srv/www/example.com/current/web/wp/",
   "plugins_path":"/srv/www/example.com/current/web/app/plugins",
   "wp_uploads_path":"/srv/www/example.com/current/web/app/uploads",
   "wp_includes_path":"/srv/www/example.com/current/web/wp/wp-includes",
   "wp_contents_path":"",
   "theme_root_path":"/srv/www/example.com/current/web/app/themes",
   "parent_theme_path":"/srv/www/example.com/current/web/wp/wp-content/themes/twentyseventeen",
   "child_theme_path":"/srv/www/example.com/current/web/wp/wp-content/themes/twentyseventeen",
   "child_theme_active":false,
   "permalink_structure":"/%postname%/",
   "wp_inc":"/wp-includes",
   "wp_content":"//srv/www/example.com/current/web/app",
   "wp_uploads":"/srv/www/example.com/current/web/app/uploads",
   "wp_plugins":"/srv/www/example.com/current/web/app/plugins",
   "wp_themes":"/srv/www/example.com/current/web/app/themes",
   "wp_active_theme":"/wp/wp-content/themes/twentyseventeen",
   "subdirectory":"/wp",
   "uploads_writable":true,
   "permalinks_set":12,
   "curl_enabled":true
}

Default export with TwentySeventeen theme results in the following in the export folder:

category -content -includes index.html

@leonstafford
Contributor

site_url is incorrectly shown as "site_url":"http://example.com/wp/" instead of example.com

needs to be grabbing the other one, as in the settings:

[screenshot: siteurlvshome]

@leonstafford
Contributor

A working site's reference:

var wp_site ={
   "uploads_url":"http://secperf.wp2static.com/wp-content/uploads",
   "site_url":"http://secperf.wp2static.com/",
   "site_path":"/var/www/html/",
   "plugins_path":"/var/www/html/wp-content/plugins",
   "wp_uploads_path":"/var/www/html/wp-content/uploads",
   "wp_includes_path":"/var/www/html/wp-includes",
   "wp_contents_path":"",
   "theme_root_path":"/var/www/html/wp-content/themes",
   "parent_theme_path":"/var/www/html/wp-content/themes/twentyseventeen",
   "child_theme_path":"/var/www/html/wp-content/themes/twentyseventeen",
   "child_theme_active":false,
   "permalink_structure":"/%year%/%monthnum%/%day%/%postname%/",
   "wp_inc":"/wp-includes",
   "wp_content":"//var/www/html/wp-content",
   "wp_uploads":"/wp-content/uploads",
   "wp_plugins":"/wp-content/plugins",
   "wp_themes":"/wp-content/themes",
   "wp_active_theme":"/wp-content/themes/twentyseventeen",
   "subdirectory":false,
   "uploads_writable":true,
   "permalinks_set":36,
   "curl_enabled":true
}

@leonstafford
Contributor

We can also see that "wp_content":"//srv/www/example.com/current/web/app" is completely wrong and should be /srv/www/example.com/current/web/wp/wp-content/

@leonstafford
Contributor

We also see that the theme root and the actual theme dirs differ when comparing the themes provided by the default WP install with any loaded by Bedrock:

"theme_root_path":"/srv/www/example.com/current/web/app/themes",
   "parent_theme_path":"/srv/www/example.com/current/web/wp/wp-content/themes/twentyseventeen",
   "child_theme_path":"/srv/www/example.com/current/web/wp/wp-content/themes/twentyseventeen",

@leonstafford
Contributor

As there is nothing else in Bedrock's /web/wp/ dir besides these, we may be OK to deduce the /wp/ dir based on the current theme's parent's parent's path, i.e.

from /srv/www/example.com/current/web/wp/wp-content/themes/twentyseventeen

we can go 2 levels up
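
For reference, walking up from the theme path can be done with PHP's dirname(), which accepts a levels argument since PHP 7.0. This is only a sketch using the path from the dump above: two levels up lands on wp-content, and one further level reaches the /wp/ dir itself.

```php
<?php
// Active theme path as reported in the wp_site dump.
$themePath = '/srv/www/example.com/current/web/wp/wp-content/themes/twentyseventeen';

// Two levels up: the wp-content directory.
echo dirname($themePath, 2), "\n"; // /srv/www/example.com/current/web/wp/wp-content

// Three levels up: the /wp/ directory.
echo dirname($themePath, 3), "\n"; // /srv/www/example.com/current/web/wp
```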

But at this point, do we even need the wp-content directory for anything?

If we are rewiring the full path to the plugins dir and the full paths to the themes, uploads, etc., what links would be left pointing to wp-content?

Also, get_home_path() yields /srv/www/example.com/current/web/ which is not really enough to guess where the user's wp-content may be...

@leonstafford
Contributor

proposed change is:

  • rm all references to wp-content, don't leave it as a rewrite option

  • rm all references to theme_root, don't leave it as a rewrite option

  • rewrite fields become full paths, no funky concatenation, ie wp-content/themes/twentyseventeen

  • add rewrite theme for child theme if active

  • these rewrite rules are just that, rewrite rules, allowing for users to add any other arbitrary

    • original => new
  • plugins and wp-includes rewrite options remain, but as full paths

  • rewrites just as a single text area, newlines and commas delimiting

  • path rewriting as an additional textarea, both loaded with defaults

Summary

  • correct paths otherwise required behind the scenes
  • remove fields specific to key WP rewrites
  • sane defaults given to user for flexible rewrites functions (source code and directories)
  • Restore default settings to get defaults back
  • execute rewrites by order in list (ie, to handle subdirectory creation and moving)

example rules for renaming directories:

In order to move wp-content/themes/twentyseventeen to contents/ui/mytheme, we'd need:

wp-content,contents
contents/themes,contents/ui
contents/ui/twentyseventeen,contents/ui/mytheme

At each iteration, it will be checking for existence of the source dir and throw an error if not found.
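
A minimal sketch of that ordered evaluation, working on a path string rather than real directories. The function `apply_rewrite_rules` is hypothetical and not the plugin's implementation; the "source not found" check is mirrored as a prefix check on the string.

```php
<?php
// Hypothetical sketch: apply "original,new" rules in list order. Each rule only
// sees the result of the rules before it, which is why ordering matters.
function apply_rewrite_rules(string $path, array $rules): string
{
    foreach ($rules as $rule) {
        [$from, $to] = array_map('trim', explode(',', $rule, 2));
        // The source must be the whole path or a leading directory of it.
        if ($path !== $from && strpos($path, $from . '/') !== 0) {
            throw new RuntimeException("Source not found for rule: $rule");
        }
        $path = $to . substr($path, strlen($from));
    }
    return $path;
}

$rules = [
    'wp-content,contents',
    'contents/themes,contents/ui',
    'contents/ui/twentyseventeen,contents/ui/mytheme',
];

echo apply_rewrite_rules('wp-content/themes/twentyseventeen', $rules), "\n";
// contents/ui/mytheme
```

Running the same rules in reverse order fails on the first rule, because contents/ui/twentyseventeen does not exist until the earlier renames have happened.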

@leonstafford
Contributor

leonstafford commented Dec 16, 2018

Update: went with the former option. Less code, less potential for errors.

noting another issue during implementation:

When the plugin is trying to just copy a file that doesn't need parsing, it assumes the standard WP directory structure. We can either make this optimization an option or put in some better detection to find where the files are... (latter preferred)

ERROR: trying to copy local file: /srv/www/example.com/current/web/wp//favicon.ico to: /srv/www/example.com/current/web/app/uploads/wp-static-html-output-1544952318/favicon.ico in archive dir: /srv/www/example.com/current/web/app/uploads/wp-static-html-output-1544952318/ (FILE NOT FOUND/UNREADABLE)
ERROR: trying to copy local file: /srv/www/example.com/current/web/wp//wp/wp-content/themes/twentyseventeen/assets/images/svg-icons.svg to: /srv/www/example.com/current/web/app/uploads/wp-static-html-output-1544952318/wp/wp-content/themes/twentyseventeen/assets/images/svg-icons.svg in archive dir: /srv/www/example.com/current/web/app/uploads/wp-static-html-output-1544952318/ (FILE NOT FOUND/UNREADABLE)
ERROR: trying to copy local file: /srv/www/example.com/current/web/wp//wp/wp-content/themes/twentyseventeen/assets/images/espresso.jpg to: /srv/www/example.com/current/web/app/uploads/wp-static-html-output-1544952318/wp/wp-content/themes/twentyseventeen/assets/images/espresso.jpg in archive dir: /srv/www/example.com/current/web/app/uploads/wp-static-html-output-1544952318/ (FILE NOT FOUND/UNREADABLE)
ERROR: trying to copy local file: /srv/www/example.com/current/web/wp//wp/wp-content/themes/twentyseventeen/assets/images/header.jpg to: /srv/www/example.com/current/web/app/uploads/wp-static-html-output-1544952318/wp/wp-content/themes/twentyseventeen/assets/images/header.jpg in archive dir: /srv/www/example.com/current/web/app/uploads/wp-static-html-output-1544952318/ (FILE NOT FOUND/UNREADABLE)
ERROR: trying to copy local file: /srv/www/example.com/current/web/wp//wp/wp-content/themes/twentyseventeen/assets/images/sandwich.jpg to: /srv/www/example.com/current/web/app/uploads/wp-static-html-output-1544952318/wp/wp-content/themes/twentyseventeen/assets/images/sandwich.jpg in archive dir: /srv/www/example.com/current/web/app/uploads/wp-static-html-output-1544952318/ (FILE NOT FOUND/UNREADABLE)
ERROR: trying to copy local file: /srv/www/example.com/current/web/wp//wp/wp-content/themes/twentyseventeen/assets/images/coffee.jpg to: /srv/www/example.com/current/web/app/uploads/wp-static-html-output-1544952318/wp/wp-content/themes/twentyseventeen/assets/images/coffee.jpg in archive dir: /srv/www/example.com/current/web/app/uploads/wp-static-html-output-1544952318/ (FILE NOT FOUND/UNREADABLE)
ERROR: trying to copy local file: /srv/www/example.com/current/web/wp//wp/wp-content/themes/twentyseventeen/screenshot.png to: /srv/www/example.com/current/web/app/uploads/wp-static-html-output-1544952318/wp/wp-content/themes/twentyseventeen/screenshot.png in archive dir: /srv/www/example.com/current/web/app/uploads/wp-static-html-output-1544952318/ (FILE NOT FOUND/UNREADABLE)
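
The doubled segments in these paths (wp//wp/...) suggest the URL path still contains the /wp subdirectory when it is appended to the site path. A minimal sketch of one way to avoid that is below; `local_path_for` is a hypothetical helper, not part of the plugin, and the caller would still need to check that the resulting file is readable.

```php
<?php
// Hypothetical helper: map a URL path to a local file path by removing the
// install subdirectory (e.g. "/wp") once, then joining with exactly one slash.
function local_path_for(string $urlPath, string $sitePath, string $subdirectory): string
{
    // Match "/wp/" rather than "/wp" so "/wp-content/..." is left alone.
    if ($subdirectory !== '' && strpos($urlPath, $subdirectory . '/') === 0) {
        $urlPath = substr($urlPath, strlen($subdirectory));
    }
    return rtrim($sitePath, '/') . '/' . ltrim($urlPath, '/');
}

$sitePath = '/srv/www/example.com/current/web/wp/';

echo local_path_for('/wp/wp-content/themes/twentyseventeen/screenshot.png', $sitePath, '/wp'), "\n";
// /srv/www/example.com/current/web/wp/wp-content/themes/twentyseventeen/screenshot.png
echo local_path_for('/favicon.ico', $sitePath, '/wp'), "\n";
// /srv/www/example.com/current/web/wp/favicon.ico
```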

@leonstafford
Contributor

@FredericoC this part is now working - w00t!

[screenshot: renamingbedrocksite]

Quite a bit of (needed) refactoring came out of this task, so still a bit more work/tidying up, but I'm really excited about this stuff coming out!

@leonstafford
Contributor

@FredericoC happy to close this now 😸

I've put an example guide here: https://forum.wp2static.com/-42/how-to-export-when-using-bedrock-by-rootsio

And this is in the latest build zips here: https://forum.wp2static.com/-30/how-to-install-the-latest-preview-of-version-59-of-wp2static

Please re-open with any issues you find

@FredericoC
Author

FredericoC commented Dec 17, 2018

Great work Leon!

Just a question: I haven't tried it yet, but it seems you were focusing on the web/wp/wp-content/* folder. With Bedrock, the idea is to put the themes, uploads and plugins folders under web/app/*, so the "custom code", uploads, etc. are separate from the WP installation. I assume that with the new rewrite config this is not an issue anyway, but I think it would be good to clarify this in the documentation.

I'll have a look anyway and get back to you.

Cheers!

@leonstafford
Contributor

Hi @FredericoC,

I think that's the case, my dir structure is here: https://forum.wp2static.com/-42/how-to-export-when-using-bedrock-by-rootsio

It's just whatever defaults they give out of the box. I wasn't completely sure where to put the other static stuff, like a static export test; it's under /wp in my example, but it could just as easily be in /app, with the plugins, uploads, etc.

I'll make another task for those default excludes, good thinking!

@alexandre-tobia

Hi @leonstafford, your link no longer exists, and the export still doesn't work with Bedrock. Do you plan to look into this?

@leonstafford
Contributor

Hi @FredericoC, thanks for your effort in reporting. Sorry that no progress has been made with the Bedrock setup here. As the issue is open, I mean to address it (or someone may send a patch in the meantime if we're lucky!).

I remember spending time to resolve Bedrock paths years ago, so you may find it works in this project vs WP2Static:

https://github.com/leonstafford/static-html-output

Otherwise, we can try Simply Static (on wp.org, getting some updates recently), or I can provide a few other alternatives which may play nicer with the roots path setup until I get this fixed here.

@alexandre-tobia

Hi @leonstafford, thanks for your answer, but static-html-output does not work either. I've opened an issue with more information at
#789

Do you have some time to look at it?

Thanks !
