Skip to content
This repository has been archived by the owner on Dec 31, 2022. It is now read-only.

root_domain fails on "unusual" URI-schemes #98

Open
derflocki opened this issue Jul 31, 2014 · 10 comments
Open

root_domain fails on "unusual" URI-schemes #98

derflocki opened this issue Jul 31, 2014 · 10 comments
Labels

Comments

@derflocki
Copy link

Want to use $LAB inside a chrome-extension, however the regex that extracts the root_domain does not match the href chrome-extension:mmjeglkkcnglojkjoomifokhekiiiioe/www/index.html. According to RFC3986, Section 3.1:

Scheme names consist of a sequence of characters beginning with a
letter and followed by any combination of letters, digits, plus
("+"), period ("."), or hyphen ("-").

so the correct regex would be /^[a-z][\w\+-\/]+\:\/\/\/?[^\/]+/.

derflocki added a commit to derflocki/LABjs that referenced this issue Aug 1, 2014
derflocki added a commit to derflocki/LABjs that referenced this issue Aug 1, 2014
@getify
Copy link
Owner

getify commented Aug 1, 2014

First, this is somewhat related to #93.

Also, your proposed regex doesn't actually work:

/^[a-z][\w\+-\/]+\:\/\/\/?[^\/]+/.test("chrome-extension:mmjeglkkcnglojkjoomifokhekiiiioe/www/index.html"); // false

We'd have to tweak the regex to make the // part optional, or something like that.

Moreover, the question is: could we actually assume to detect against URLs that don't look like normal web URL's and not break other important behavior (namely absolute/relative URL handling)?

In addition to the root_domain matching with /^\w+\:\/\/\/?[^\/]+/, loaded script URLs are canonicalized to the page by testing against absolute_regex: /^\w+\:\/\//. So, if we're going to make the change to one regex, we have to change both regexes.

But, imagine if we did that, and then someone had this URL: foo:bar/baz.js to load some script, while inside a chrome extension.

Any generic regex that matches your chrome-extension URL would have to also match that foo:bar.. URL. However, that URL might in fact be intended as a relative URL, where they want it to be canonicalized to the root chrome-extension URI, so that it would be loaded as chrome-extension:aosnsldk24rl2ksada/foo:bar/baz.js.

How could we know that this URL should have been taken to be as absolute instead: foo:bar/baz.js, rather than relative?

We couldn't. Right now, "absolute" requires something like http:// at the beginning, which is an easy assumption. But that assumption would totally go away with the proposed change.

Unless, and this is the part I'm dubious on, the regex only allowed a specific safe-list of scheme names, not just a w+ type regex match... like chrome-extension:, about:, etc.

How would we decide which scheme names should be explicitly listed? What happens if someone later comes along and says, "hey, my page is running under this wxyz:.. scheme, can you whitelist it?" And what happens if that accidentally collides with someone else's relative URL design?

Seems like a future maintenance nightmare, that ensures we have to keep updating LABjs rather than keeping it stable and unchanging as much as possible, and that we're asking for breakages.

Not sure what a good general solution to this would be.

My initial reaction is similar to what I said in #93, which is that this is a case where you should tweak LABjs for your own needs, since it's a pretty niche use-case, and the pain (of this one specifically) is disproportionately high.

@getify
Copy link
Owner

getify commented Aug 1, 2014

OTOH, perhaps the more appropriate logic is, if the current page is a scheme type, you couldn't do an absolute regex that was also a scheme (unless it was an exact match). You'd have to do a normal web URL like "http://", or otherwise the URL could be reasonably assumed to definitely be relative to the scheme URI.

Even if we could construct such special case logic, I'm not sure if the niche'ness of this deserves to be elevated to the core code.

Undecided for now.

@derflocki
Copy link
Author

Thanks for the extremely long explanation! The URL is actually properly formatted chrome-extension://mmjeglkkcnglojkjoomifokhekiiiioe/www/index.html. The slashes got lost in copy and paste, sorry for wasting your time. So i guess the propsed fix should work...

@getify
Copy link
Owner

getify commented Aug 1, 2014

Interesting. So the only thing missing is the allowance for - in it?

@derflocki
Copy link
Author

Correct. The scheme was the problem. Additionally, as mentioned im my
initial post, + and / would also be legal characters according to the
RFC.

And since I tend to take these things seriously, the RFC also states that
the first char is an ALPHA, thus the [a-z].

What I just now realize, is that the check should actually be case
insensitive.

@getify getify added the accepted label Aug 1, 2014
@getify
Copy link
Owner

getify commented Aug 1, 2014

OK, this will be queued up to be fixed. But, just to set expectations, we're collecting several small niche things like this together before releasing again. As you can tell, I release quite rarely, as stability is one of the main features of this library.

I would recommend for now that you patch your own copy of LABjs. I'll leave this issue open until such a time as it's appropriate to fix for that next release (probably 2.1).

Thanks!

derflocki added a commit to derflocki/LABjs that referenced this issue Oct 31, 2014
@mttcr
Copy link

mttcr commented Jan 29, 2016

Hello,
To add some interests to this issue, LAB does not correctly handle the blob: URL scheme used here: https://developer.mozilla.org/en-US/docs/Web/API/URL/createObjectURL
This issue corrects that behavior I guess.
While waiting for 3.0, I tweaked the regex like that : var absolute_regex = /^(?:\w+\:\/\/|blob\:)/

@getify
Copy link
Owner

getify commented Jan 29, 2016

@mttcr thanks! to clarify, you're talking about?

  1. using LABjs to load a blob URL? --OR--
  2. using LABjs inside a page that came from a blob URL?

@mttcr
Copy link

mttcr commented Jan 29, 2016

The first one. I have some raw source in a string that I convert into a blob, then I load it with LABjs.

let blob = new Blob( [plainSource], [{ type: "text/javascript" }] );
let objectURL = window.URL.createObjectURL( blob );
$LAB
    .script( objectURL )
    .wait(() =>
       {
            window.URL.revokeObjectURL( objectURL );
       });

@getify
Copy link
Owner

getify commented Jan 29, 2016

thanks!

Ema4rl added a commit to Ema4rl/Only.js that referenced this issue May 26, 2016
Improved on LABjs by adding:
-> conditional loading based on element availability
-> test multiple elements with AND-like condition (eg:
$O.test().test().js().wait())
-> test multiple elements with OR-like condition (eg: $O.test('this,
that').js().wait())
-> custom javascript attributes
-> leaner code (6.3kb minified, 2.9kb gzipped)
-> replaces .no-js with .js on html element
-> queue test via $O.queueTest()
-> end() to enable explicit end of test chain without breaking the chain
(unlike sandbox())
-> loading initial js (entry point js - like app.js) via data attribute
with data-only
-> adjusted absolute uri regex gotten from a LABjs issue
(getify/LABjs#98)
-> automatically load polyfill service (polyfill.io) via options
-> inject script tags at end of head tag opposed to LABjs inject at top
(getify/LABjs#115)
-> conditional script execution of inline scripts via $O.test().script()
-> add snippet in docs for lazy css since this library will not handle
that
-> support CommonJS (not tested)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

3 participants