Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JavaScript: change generated API to support circular/out-of-order imports #1074

Closed
generalmimon opened this issue Nov 12, 2023 · 3 comments · Fixed by kaitai-io/kaitai_struct_compiler#264

Comments

@generalmimon
Copy link
Member

generalmimon commented Nov 12, 2023

tl;dr Exporting a plain object instead a function (as it is now) from generated JS format modules enables circular imports and removes the need to load modules in a specific order within a browser global context (like in the Web IDE), but at the cost of breaking backward compatibility - everyone using KS-generated JS modules will have to adapt their code.

Currently, the JS code generated by KSC is wrapped in a UMD envelope borrowed from https://github.com/umdjs/umd/blob/36fd113/templates/returnExports.js#L17-L37 and overall looks like this:

(function (root, factory) {
  if (typeof define === 'function' && define.amd) {
    define(['kaitai-struct/KaitaiStream', './VlqBase128Le'], factory);
  } else if (typeof module === 'object' && module.exports) {
    module.exports = factory(require('kaitai-struct/KaitaiStream'), require('./VlqBase128Le'));
  } else {
    root.ImportsAbs = factory(root.KaitaiStream, root.VlqBase128Le);
  }
}(typeof self !== 'undefined' ? self : this, function (KaitaiStream, VlqBase128Le) {
var ImportsAbs = (function() {
  function ImportsAbs(_io, _parent, _root) {
    // ...
  }
  ImportsAbs.prototype._read = function() {
    this.len = new VlqBase128Le(this._io, this, null);
    // ...
  }

  return ImportsAbs;
})();
return ImportsAbs;
}));

When this code is executed, it first resolves the dependencies - the KS runtime library and the VlqBase128Le imported spec. For the "browser globals" else case, it fetches whatever is currently under the keys KaitaiStream and VlqBase128Le in the global object (falling back to undefined if they haven't been set), and immediately calls the factory function (the function (KaitaiStream, VlqBase128Le) { ... } one), which receives the resolved dependencies as arguments. The factory function returns what we want to export from the module - in our case, the root "class" constructor function function ImportsAbs(_io, _parent, _root). In other words, typeof root.ImportsAbs will be 'function'.

However, this means that when loading ImportsAbs, its dependency root.VlqBase128Le must be already set on the global object to the correct constructor function that creates a new instance of the VlqBase128Le type. If it's not, the result of this access is the value undefined, which means that once we try using ImportsAbs, it will fail on attempting to use VlqBase128Le as if it was correctly resolved.

  1. This implies that in the browser globals context, we must take extra care to load the specs in the correct order. This is something the Web IDE doesn't do, which results in (fortunately temporary in case of Web IDE) errors like TypeError: {ImportedSpec} is not a constructor, see Empty object tree and error on opening ide with zip example kaitai_struct_webide#159 and Problem with imports on local storage kaitai_struct_webide#59.

  2. Another limitation is that circular imports cannot work in any JavaScript environment.

    Node.js, for example, has support for circular require() calls, but the application must consider the implications of this to work correctly, see https://nodejs.org/docs/latest-v20.x/api/modules.html#cycles:

    When there are circular require() calls, a module might not have finished executing when it is returned.

    Careful planning is required to allow cyclic module dependencies to work correctly within an application.

    I understood from Compiler produces circular imports in Python #337 that we want circular imports to work in Kaitai Struct. Also, risk of circular import when using custom processing routines on python #691 can help us realize that the module participating in the dependency cycle doesn't have to be another KS-generated format parser, it can also be a custom processor or an opaque type.

    Although it seems circular dependencies are somewhat frowned upon in the programming community and it's mostly advised to avoid them, I think they should work nonetheless. If they don't, it forces you to refactor something just to work around this arbitrary limitation of the system and therefore makes the system harder to work with.

Both of these limitations are due to the fact that the generated format modules export the constructor function directly. You can't have an "unfinished" constructor function that could be finished later - once you "distribute" a function, you cannot mutate its body. In contrast, you can distribute an empty object and only set additional properties on it later - this is the crucial principle that enables circular imports in JavaScript in general.

The fact that the variant of the UMD envelope we're currently using (https://github.com/umdjs/umd/blob/36fd113/templates/returnExports.js#L17-L37) doesn't support circular imports is mentioned in its header comment - see returnExports.js:1-4:

// Uses Node, AMD or browser globals to create a module.

// If you want something that will work in other stricter CommonJS environments,
// or if you need to create a circular dependency, see commonJsStrict.js

Therefore, to remove these limitations of the current UMD envelope, I suggest adapting the commonJsStrict.js template instead:

(function (root, factory) {
    if (typeof define === 'function' && define.amd) {
        // AMD. Register as an anonymous module.
        define(['exports', 'b'], factory);
    } else if (typeof exports === 'object' && typeof exports.nodeName !== 'string') {
        // CommonJS
        factory(exports, require('b'));
    } else {
        // Browser globals
        factory((root.commonJsStrict = {}), root.b);
    }
}(typeof self !== 'undefined' ? self : this, function (exports, b) {
    // Use b in some fashion.

    // attach properties to the exports object to define
    // the exported module properties.
    exports.action = function () {};
}));

AMD and CommonJS cases will already work as expected (even in the case of circular imports), but unfortunately, the "Browser globals" apparently still won't work for circular dependencies or when the modules are loaded in an incorrect order. As in the returnExports.js we've been using so far, the root.b access requires that the b dependency is already available in the global scope at the time of module inclusion. However, it's quite easy to avoid this drawback:

     } else {
         // Browser globals
-        factory((root.commonJsStrict = {}), root.b);
+        factory(root.commonJsStrict || (root.commonJsStrict = {}), root.b || (root.b = {}));
     }

This ultimately solves both the circular import problem and out-of-order loading problem. Provided that the module b also follows this pattern, you can load the modules commonJsStrict and b in any order relative to each other, and there's no problem even if the b module depends circularly on commonJsStrict.


The only problem is that this is a breaking change that affects all code that uses KS-generated JavaScript parsers. I don't think there is a sane way around this - the existing approach of exporting directly the constructor function of the root type simply does not allow for circular dependencies (or out-of-order loading in browser globals context).

Compare the old and new usage:

Old

const HelloWorld = require('./HelloWorld');
// ...
const r = new HelloWorld(new KaitaiStream(buf));

New

const HelloWorld_ = require('./HelloWorld');
// ...
const r = new HelloWorld_.HelloWorld(new KaitaiStream(buf));
@generalmimon
Copy link
Member Author

@GreyCat Any comments? Do you think this BC break is acceptable, and/or do you have any ideas to make it less "painful" for the users?

@generalmimon
Copy link
Member Author

@GreyCat Thoughts?

generalmimon added a commit to kaitai-io/kaitai_struct_compiler that referenced this issue Feb 14, 2024
Resolves kaitai-io/kaitai_struct#1074

This change breaks backward compatibility with 0.10 and older, but
allows for circular imports and out-of-order module loading in a
"browser globals" context (the latter is relevant in the Web IDE, as
explained at kaitai-io/kaitai_struct#1074). In
short, it does this by switching from the UMD envelope
[returnExports.js](https://github.com/umdjs/umd/blob/36fd113/templates/returnExports.js#L17-L37)
to modified
[commonjsStrict.js](https://github.com/umdjs/umd/blob/36fd113/templates/commonjsStrict.js#L19-L36).

The BC break is that until now the generated modules exported the
constructor function directly, whereas now they export the object
containing the constructor function under the only object key that
matches the format module name. The same behavior is expected from
imported opaque types and custom processors as well.
generalmimon added a commit to kaitai-io/kaitai_struct_compiler that referenced this issue Feb 14, 2024
Resolves kaitai-io/kaitai_struct#1074

This change breaks backward compatibility with 0.10 and older, but
allows for circular imports (in all JavaScript module systems) and
out-of-order module loading in a "browser globals" context (the latter
is particularly relevant in the Web IDE, as explained at
kaitai-io/kaitai_struct#1074). In short, it
does this by switching from the UMD envelope
[returnExports.js](https://github.com/umdjs/umd/blob/36fd113/templates/returnExports.js#L17-L37)
to modified
[commonjsStrict.js](https://github.com/umdjs/umd/blob/36fd113/templates/commonjsStrict.js#L19-L36).

The BC break is that until now the generated modules exported the
constructor function directly, whereas now they export the object
containing the constructor function under the only object key that
matches the format module name. The same behavior is expected from
imported opaque types and custom processors as well.
generalmimon added a commit to kaitai-io/kaitai_struct_compiler that referenced this issue Feb 15, 2024
Resolves kaitai-io/kaitai_struct#1074

This change breaks backward compatibility with 0.10 and older, but
allows for circular imports (in all JavaScript module systems) and
out-of-order module loading in a "browser globals" context (the latter
is particularly relevant in the Web IDE, as explained at
kaitai-io/kaitai_struct#1074). In short, it
does this by switching from the UMD envelope
[returnExports.js](https://github.com/umdjs/umd/blob/36fd113/templates/returnExports.js#L17-L37)
to modified
[commonjsStrict.js](https://github.com/umdjs/umd/blob/36fd113/templates/commonjsStrict.js#L19-L36).

The BC break is that until now the generated modules exported the
constructor function directly, whereas now they export the object
containing the constructor function under the only object key that
matches the format module name. The same behavior is expected from
imported opaque types and custom processors as well.
@generalmimon
Copy link
Member Author

generalmimon commented Feb 15, 2024

This was implemented in kaitai-io/kaitai_struct_compiler#264.

Basically the only disadvantage of this whole idea is the BC break, which is not mitigated in kaitai-io/kaitai_struct_compiler#264 in any way. In theory, there are a few things we could do to lessen the impact (which doesn't necessarily mean we should; personally, I don't intend to do any of this unless someone expresses interest):

  • Add a compiler CLI option that brings the old behavior back.

  • Limit the scope of the change by only changing what the format modules generated from .ksy files export, without expecting that opaque types and custom processors follow the same pattern (i.e. exporting a wrapper object, not a function directly).

    It is worth mentioning that expecting that opaque types follow the old pattern (i.e. export the constructor function directly) while following the new pattern in generated modules would be inconsistent and could potentially lead to problems. For example, having one generated module use another generated module as an opaque type would no longer work. So in any case, this is probably not a good state to have, also considering that an opaque type is meant to be a 1:1 replacement for a .ksy type when a certain type cannot be implemented in a .ksy specification for some reason.

    Nevertheless, we could add a bit of runtime code that would detect whether a module we're importing exports an object or a function (using something like typeof ImportedModule === 'function'), and act upon that. I don't see obvious downsides, other than that it could support inconsistencies in users' code or potentially lead to confusion. But I'd still rather not add this compatibility code, or at least remove it at some point.

generalmimon added a commit to kaitai-io/kaitai_struct_webide that referenced this issue Feb 15, 2024
Fixes #59

Fixes #159 -
this is a particularly notable instance of the bug, since `TypeError:
DosDatetime is not a constructor` is an error that initially everyone
gets when they open the Web IDE in a fresh browser (this is because
`zip.ksy` is selected by default, and it is affected by the bug since it
contains an import).

See kaitai-io/kaitai_struct#1074 and
kaitai-io/kaitai_struct_compiler#264 - the
actual fix was done in the KSC-generated JS code, which was changed to
allow interdependent format modules to be loaded (and added to the
global context in the case of the Web IDE) in any order. However, this
was a backwards-incompatible change, so this set of Web IDE changes just
updates our code to work with the new compiler.
generalmimon added a commit to kaitai-io/kaitai_struct_tests that referenced this issue Feb 16, 2024
generalmimon added a commit to kaitai-io/kaitai_struct_compiler that referenced this issue Feb 21, 2024
Resolves kaitai-io/kaitai_struct#1074

This change breaks backward compatibility with 0.10 and older, but
allows for circular imports (in all JavaScript module systems) and
out-of-order module loading in a "browser globals" context (the latter
is particularly relevant in the Web IDE, as explained at
kaitai-io/kaitai_struct#1074). In short, it
does this by switching from the UMD envelope
[returnExports.js](https://github.com/umdjs/umd/blob/36fd113/templates/returnExports.js#L17-L37)
to modified
[commonjsStrict.js](https://github.com/umdjs/umd/blob/36fd113/templates/commonjsStrict.js#L19-L36).

The BC break is that until now the generated modules exported the
constructor function directly, whereas now they export the object
containing the constructor function under the only object key that
matches the format module name. The same behavior is expected from
imported opaque types and custom processors as well.
generalmimon added a commit to kaitai-io/kaitai_struct_webide that referenced this issue Feb 21, 2024
Fixes #59

Fixes #159 -
this is a particularly notable instance of the bug, since `TypeError:
DosDatetime is not a constructor` is an error that initially everyone
gets when they open the Web IDE in a fresh browser (this is because
`zip.ksy` is selected by default, and it is affected by the bug since it
contains an import).

See kaitai-io/kaitai_struct#1074 and
kaitai-io/kaitai_struct_compiler#264 - the
actual fix was done in the KSC-generated JS code, which was changed to
allow interdependent format modules to be loaded (and added to the
global context in the case of the Web IDE) in any order. However, this
was a backwards-incompatible change, so this set of Web IDE changes just
updates our code to work with the new compiler.
generalmimon added a commit to kaitai-io/kaitai_struct_tests that referenced this issue Sep 29, 2024
Related to kaitai-io/kaitai_struct#1074

Until this commit, any generated JS test spec from a `.kst` file
containing an enum literal would have a syntax error because of lines
like `const ./Enum0 = require(...);`. It seems that this bug exists
since 66e75b3. In that commit, I tried
to fix the build error of KST translator due to the compiler API change
in kaitai-io/kaitai_struct_compiler@8007e0e#diff-6164bdf7b558cea6ff36b4a50508205a4569f8e98e398e77c4373679fceafffa.
In JavaScriptSG, I just passed the existing `importList` to
JavaScriptTranslator without thinking about the consequences (which were
exactly the illegal `const ./Enum0 = require(...);` lines). This should
now be fixed.
generalmimon added a commit to kaitai-io/kaitai_struct_tests that referenced this issue Sep 29, 2024
... also in the remaining specs that haven't yet been regenerated in the
previous commit.

This is to indicate that the meaning of this parameter changed in the
previous commit because of the change to
`helpers/javascript/testHelper.js`. Previously, the test body received
the parser constructor function directly, but now it only receives the
wrapper object that contains the constructor under the name of the
format as a key. Appending an underscore `_` is our own convention to
indicate this, see
https://github.com/kaitai-io/kaitai_struct_compiler/blob/cc4e73b89bf120e57a6577dd70f72a4abd3d25e8/shared/src/main/scala/io/kaitai/struct/languages/JavaScriptCompiler.scala#L662-L666

kaitai-io/kaitai_struct#1074 is related.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant