Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check external links in the main loop #143

Open
wants to merge 17 commits into
base: master
Choose a base branch
from

Conversation

papandreou
Copy link
Collaborator

@papandreou papandreou commented Mar 31, 2018

TODO:

  • Warn for long redirect chains
  • Reinstate retrying logic (including the switch to GET when the first attempt fails)
  • Maybe reintroduce the "Crawling X outgoing urls" message that I've commented out in the tests. The problem is that we don't have an up front count because it doesn't happen in a separate phase now. Maybe we could just switch it to the past tense ("Crawled X outgoing urls") and output it at the end?
  • Make sure that the use of asset.keepUnpopulated does not block the discovery of outgoing HttpRedirect and FileRedirect relations
  • Avoid infinite loop in the redirect checking code when there's a redirect cycle

@papandreou papandreou added the WIP label Mar 31, 2018
lib/index.js Outdated
}
} else {
follow = true;
}

if (follow) {
if (follow || metadataOnly) {
if (assetTypesWithoutRelations.includes(relation.to.type)) {
// If we are handling local file-urls, follow but mark as end-of-line in processing
if (!recursive && relation.from.protocol === 'file:' && relation.to.protocol === 'file:') {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This branch can probably die if assetgraph handles the loading

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in fbaa61b

test/index.js Outdated
operator: 'content-type-missing',
name: 'content-type-missing https://example.com/hey.png',
expected: 'image/png',
operator: 'error',
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the generic error operator. Can't be properly filtered on. We should probably handle that error explicitly and trigger the right operator

@Munter
Copy link
Owner

Munter commented Mar 31, 2018

I don't think the separate phase and message about switching to outgoing urls is super important. If there's no good way to get it back I'm fine with dropping it

@papandreou papandreou force-pushed the feature/loadMetadataOnly branch 3 times, most recently from 742d302 to a94ed21 Compare April 1, 2018 22:18
@papandreou papandreou force-pushed the feature/loadMetadataOnly branch from a94ed21 to f489f19 Compare April 4, 2018 20:11
Instead of

  push(
    null,
    {
      ok: true,
      operator: 'external-redirect',
      name: 'external-redirect https://elsewhere.com/',
      at: 'https://example.com/ (1:39) <script src="https://elsewhere.com/">...</script>',
      expected: '302 https://elsewhere.com/ --> 200 http://elsewhere.com/redirectTarget'
    }
  ); at reportTest (lib/index.js:104:9)
  push(
    null,
    {
      ok: true,
      operator: 'external-redirect',
      name: 'external-redirect http://elsewhere.com/redirectTarget',
      at: 'https://elsewhere.com/',
      expected: '302 http://elsewhere.com/redirectTarget --> 200 https://elsewhere.com/redirectTarget'
    }
  ); at reportTest (lib/index.js:104:9)

we now get

  push(
    null,
    {
      ok: false,
      operator: 'external-redirect',
      name: 'external-redirect https://elsewhere.com/',
      at: 'https://example.com/ (1:39) <script src="https://elsewhere.com/">...</script>',
      expected: '302 https://elsewhere.com/ --> 200 https://elsewhere.com/redirectTarget',
      actual: '302 https://elsewhere.com/ --> 302 http://elsewhere.com/redirectTarget --> 200 https://elsewhere.com/redirectTarget'
    }
  ); at reportTest (lib/index.js:114:7)

which is correct
…stopProcessing)

This is now handled in assetgraph by having asset.load({metadataOnly: true})
issue an fs.stat call.

#143 (comment)
@papandreou
Copy link
Collaborator Author

@Munter, I got it working now and addressed the feedback :)

@papandreou papandreou removed the WIP label Apr 5, 2018
@papandreou
Copy link
Collaborator Author

@Munter, could you take another look at this?

…hough the asset would normally not be populated. Same as #147
@Munter
Copy link
Owner

Munter commented Jul 23, 2018

I tried adding the current functionality from master where an external asset with an incoming fragment link will be populated in order to verify the correctness of the fragment, but no other external relations from that asset will be populated.

This should put the branch on par with master and hopefully simplify the merge conflict between the two branches

@papandreou
Copy link
Collaborator Author

Great! The solution for that problem looks fine.

@papandreou
Copy link
Collaborator Author

@Munter, I'd like to get hyperlink upgraded to assetgraph 5+ and fix that bug we ran into while checking unexpected.js.org: https://gitter.im/unexpectedjs/unexpected?at=5c2e6fba5a0a8058be23948d

That weird thing where hyperlink complains about the response body of the HTTP redirect for an image:

not ok 44 content-type-mismatch https://coveralls.io/repos/unexpectedjs/unexpected/badge.svg
  ---
    operator: content-type-mismatch
    expected: "text/html"
    actual:   "Asset is used as both Html and Image"
    at: http://localhost:5000 (48:73) <img src="https://coveralls.io/repos/unexpectedjs/unexpected/badge.svg" alt="Coverage Status">
  ...
# Connecting to 0 hosts (checking <link rel="preconnect" href="...">
# Looking up 0 host names (checking <link rel="dns-prefetch" href="...">

First it would be nice to get this PR landed as it makes some quite fundamental changes. It's now conflicting with a new feature that landed on master. If you are otherwise OK with this PR I can fix that up?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants