root_domain fails on "unusual" URI-schemes #98

derflocki · 2014-07-31T23:51:52Z

Want to use $LAB inside a chrome-extension, however the regex that extracts the root_domain does not match the href chrome-extension:mmjeglkkcnglojkjoomifokhekiiiioe/www/index.html. According to RFC3986, Section 3.1:

Scheme names consist of a sequence of characters beginning with a
letter and followed by any combination of letters, digits, plus
("+"), period ("."), or hyphen ("-").

so the correct regex would be /^[a-z][\w\+-\/]+\:\/\/\/?[^\/]+/.

The text was updated successfully, but these errors were encountered:

getify · 2014-08-01T00:25:54Z

First, this is somewhat related to #93.

Also, your proposed regex doesn't actually work:

/^[a-z][\w\+-\/]+\:\/\/\/?[^\/]+/.test("chrome-extension:mmjeglkkcnglojkjoomifokhekiiiioe/www/index.html"); // false

We'd have to tweak the regex to make the // part optional, or something like that.

Moreover, the question is: could we actually assume to detect against URLs that don't look like normal web URL's and not break other important behavior (namely absolute/relative URL handling)?

In addition to the root_domain matching with /^\w+\:\/\/\/?[^\/]+/, loaded script URLs are canonicalized to the page by testing against absolute_regex: /^\w+\:\/\//. So, if we're going to make the change to one regex, we have to change both regexes.

But, imagine if we did that, and then someone had this URL: foo:bar/baz.js to load some script, while inside a chrome extension.

Any generic regex that matches your chrome-extension URL would have to also match that foo:bar.. URL. However, that URL might in fact be intended as a relative URL, where they want it to be canonicalized to the root chrome-extension URI, so that it would be loaded as chrome-extension:aosnsldk24rl2ksada/foo:bar/baz.js.

How could we know that this URL should have been taken to be as absolute instead: foo:bar/baz.js, rather than relative?

We couldn't. Right now, "absolute" requires something like http:// at the beginning, which is an easy assumption. But that assumption would totally go away with the proposed change.

Unless, and this is the part I'm dubious on, the regex only allowed a specific safe-list of scheme names, not just a w+ type regex match... like chrome-extension:, about:, etc.

How would we decide which scheme names should be explicitly listed? What happens if someone later comes along and says, "hey, my page is running under this wxyz:.. scheme, can you whitelist it?" And what happens if that accidentally collides with someone else's relative URL design?

Seems like a future maintenance nightmare, that ensures we have to keep updating LABjs rather than keeping it stable and unchanging as much as possible, and that we're asking for breakages.

Not sure what a good general solution to this would be.

My initial reaction is similar to what I said in #93, which is that this is a case where you should tweak LABjs for your own needs, since it's a pretty niche use-case, and the pain (of this one specifically) is disproportionately high.

getify · 2014-08-01T00:34:24Z

OTOH, perhaps the more appropriate logic is, if the current page is a scheme type, you couldn't do an absolute regex that was also a scheme (unless it was an exact match). You'd have to do a normal web URL like "http://", or otherwise the URL could be reasonably assumed to definitely be relative to the scheme URI.

Even if we could construct such special case logic, I'm not sure if the niche'ness of this deserves to be elevated to the core code.

Undecided for now.

derflocki · 2014-08-01T00:46:41Z

Thanks for the extremely long explanation! The URL is actually properly formatted chrome-extension://mmjeglkkcnglojkjoomifokhekiiiioe/www/index.html. The slashes got lost in copy and paste, sorry for wasting your time. So i guess the propsed fix should work...

getify · 2014-08-01T01:31:11Z

Interesting. So the only thing missing is the allowance for - in it?

derflocki · 2014-08-01T05:54:47Z

Correct. The scheme was the problem. Additionally, as mentioned im my
initial post, + and / would also be legal characters according to the
RFC.

And since I tend to take these things seriously, the RFC also states that
the first char is an ALPHA, thus the [a-z].

What I just now realize, is that the check should actually be case
insensitive.

getify · 2014-08-01T13:23:39Z

OK, this will be queued up to be fixed. But, just to set expectations, we're collecting several small niche things like this together before releasing again. As you can tell, I release quite rarely, as stability is one of the main features of this library.

I would recommend for now that you patch your own copy of LABjs. I'll leave this issue open until such a time as it's appropriate to fix for that next release (probably 2.1).

Thanks!

fixed root_domain regex, fixes getify#98

mttcr · 2016-01-29T14:45:29Z

Hello,
To add some interests to this issue, LAB does not correctly handle the blob: URL scheme used here: https://developer.mozilla.org/en-US/docs/Web/API/URL/createObjectURL
This issue corrects that behavior I guess.
While waiting for 3.0, I tweaked the regex like that : var absolute_regex = /^(?:\w+\:\/\/|blob\:)/

getify · 2016-01-29T17:09:51Z

@mttcr thanks! to clarify, you're talking about?

using LABjs to load a blob URL? --OR--
using LABjs inside a page that came from a blob URL?

mttcr · 2016-01-29T17:42:35Z

The first one. I have some raw source in a string that I convert into a blob, then I load it with LABjs.

let blob = new Blob( [plainSource], [{ type: "text/javascript" }] );
let objectURL = window.URL.createObjectURL( blob );
$LAB
    .script( objectURL )
    .wait(() =>
       {
            window.URL.revokeObjectURL( objectURL );
       });

getify · 2016-01-29T17:55:20Z

thanks!

Improved on LABjs by adding: -> conditional loading based on element availability -> test multiple elements with AND-like condition (eg: $O.test().test().js().wait()) -> test multiple elements with OR-like condition (eg: $O.test('this, that').js().wait()) -> custom javascript attributes -> leaner code (6.3kb minified, 2.9kb gzipped) -> replaces .no-js with .js on html element -> queue test via $O.queueTest() -> end() to enable explicit end of test chain without breaking the chain (unlike sandbox()) -> loading initial js (entry point js - like app.js) via data attribute with data-only -> adjusted absolute uri regex gotten from a LABjs issue (getify/LABjs#98) -> automatically load polyfill service (polyfill.io) via options -> inject script tags at end of head tag opposed to LABjs inject at top (getify/LABjs#115) -> conditional script execution of inline scripts via $O.test().script() -> add snippet in docs for lazy css since this library will not handle that -> support CommonJS (not tested)

derflocki added a commit to derflocki/LABjs that referenced this issue Aug 1, 2014

fixed root_domain regex, fixes getify#98

d987c1c

derflocki added a commit to derflocki/LABjs that referenced this issue Aug 1, 2014

fixed root_domain regex, fixes getify#98

5859a04

getify added the accepted label Aug 1, 2014

derflocki added a commit to derflocki/LABjs that referenced this issue Oct 31, 2014

Merge pull request #1 from derflocki/wp-scheme-fix

79675d6

fixed root_domain regex, fixes getify#98

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

root_domain fails on "unusual" URI-schemes #98

root_domain fails on "unusual" URI-schemes #98

derflocki commented Jul 31, 2014

getify commented Aug 1, 2014

getify commented Aug 1, 2014

derflocki commented Aug 1, 2014

getify commented Aug 1, 2014

derflocki commented Aug 1, 2014

getify commented Aug 1, 2014

mttcr commented Jan 29, 2016

getify commented Jan 29, 2016

mttcr commented Jan 29, 2016

getify commented Jan 29, 2016

root_domain fails on "unusual" URI-schemes #98

root_domain fails on "unusual" URI-schemes #98

Comments

derflocki commented Jul 31, 2014

getify commented Aug 1, 2014

getify commented Aug 1, 2014

derflocki commented Aug 1, 2014

getify commented Aug 1, 2014

derflocki commented Aug 1, 2014

getify commented Aug 1, 2014

mttcr commented Jan 29, 2016

getify commented Jan 29, 2016

mttcr commented Jan 29, 2016

getify commented Jan 29, 2016