chore: use nx commands #2493

david-luna · 2024-10-18T15:44:52Z

Which problem is this PR solving?

This PR tries to work towards the migration from lerna to nx open-telemetry/opentelemetry-js#4240. Although the issue is on the core repository I think both repos should be aligned in the tooling. I've just started with this one because it seems more straight-forward.

Changes

This PR surfaces the [email protected] package required by lerna. I haven't tried to use a higher version since lerna is not completely removed yet and it might have version conflicts. The package.json's script section has been changed to use the equivalent nx commands for building and testing the packages, with some nuances:

a nx.json file has been added defining task dependencies so some package scripts like precompile are not needed
a cache has been defined for some tasks like version:update, lint and compile. More details below

The workflow does not change. Indeed the development guidelines needs no changes since already recommends to run npm run compile at the root folder before start doing changes in a package.

Cache

nx can cache the output of a task like the compilation. Having it active may come handy for developers since you can switch branches and rebuild only what changed just by npm run compile at the root. The rest will come from the cache. Cache is located inside node_modules so its easy to invalidate via npm run ci or just removing the node_modules/.cache/nx folder.

Using it in CI

For unit and TAV tests there is a compilation step before running the test suite. Turns out that the tests are using the tsconfig.json to compile the tests before running. The file has the following options

{
  "extends": "../../../tsconfig.base",
  "compilerOptions": {
    "rootDir": ".",
    "outDir": "build"
  },
  "include": [
    "src/**/*.ts",
    "test/**/*.ts"
  ]
}

so the tests are compiling again the code in order to run. This is an unnecessary use of resources.

Test jobs in workflows do a clean checkout before testing so running npm run test in a package would fail because one or more of this reasons:

version.ts file is missing
package may depend on other in the same repo (this dependency needs to be compiled)
package is referring to build folder in tests (example)

Test workflows run npm run compile for all packages

And because of that workflows run npm run compile for all packages before testing consuming time.

This PR leverages nx cache to avoid this extra compilation given the assumption that TypeScript is giving the same output regardless of the nodejs version used for compilation:

TypeScript version is pinned and the same in all workflows
TS configuration does not change depending on nodejs version

So we can change the workflows to

have a single compilation step and storing the cache. Compilation is done with the same setup done for publishing (nodejs v16) so the build results are "production like".
restore the cache when test jobs are running. This way the needed compilation takes less time

The gain is significant since the compilation takes around 5~~7min and with the cache is around 3~~6sec (2sec downloading + 2~3sec compiling).

david-luna · 2024-10-22T07:27:19Z

.github/workflows/test-all-versions.yml

@@ -13,8 +13,35 @@ on:
        type: string

 jobs:
+  build:


note for reviewer: this is the build step that is going to be cached. github.run_number is unique for the PR so each time the workflow is run this tasks overwrites the artifact saving space. retention-days is set to 1 so the artifact will be removed the next day after the last run of this workflow.

david-luna · 2024-10-22T07:29:41Z

.github/workflows/unit-test.yml

@@ -5,7 +5,34 @@ on:
  pull_request:

 jobs:
+  build:


note for reviewer: same here for unit tests. I guess a single compile step can be done for both workflows but I think this will add complexity to them. This way both workflows are independent and still we reduce the resource usage

david-luna · 2024-10-22T07:30:59Z

detectors/node/opentelemetry-resource-detector-alibaba-cloud/package.json

@@ -10,7 +10,6 @@
    "compile": "tsc -p .",
    "lint": "eslint . --ext .ts",
    "lint:fix": "eslint . --ext .ts --fix",
-    "precompile": "tsc --version && lerna run version:update --scope @opentelemetry/resource-detector-alibaba-cloud --include-dependencies",


note for reviewer: not necessary since nx.json has defined the task dependencies. Same applies for other packages

david-luna · 2024-10-22T09:53:06Z

nx.json

+    "compile": {
+      "dependsOn": [
+        "version:update",
+        "^compile"


note for reviewer: this is the syntax to tell that dependencies within the monorepo should be compiled before compiling the current package

package.json

trentm · 2024-10-23T16:26:32Z

I'll sponsor to not have the "package unmaintained" logic.

trentm · 2024-10-24T22:41:49Z

The gain is significant since the compilation takes around 5~~7min and with the cache is around 3~~6sec (2sec downloading + 2~3sec compiling).

I got confused by the markup here. Are you trying to say the timing improved from 5-7 minutes to 3-6 seconds? If so, that's pretty nice. :)

david-luna · 2024-10-28T16:39:01Z

The gain is significant since the compilation takes around 5~~7min and with the cache is around 3~~6sec (2sec downloading + 2~3sec compiling).

I got confused by the markup here. Are you trying to say the timing improved from 5-7 minutes to 3-6 seconds? If so, that's pretty nice. :)

Also I think I'm not clear enough. There is improvement in resource (CPU, Memory, ...) usage but not in the total time spent in the workflow. I'll try to elaborate

Currently each job on the matrix does a compilation (necessary to generate version file and have dependencies solved) before testing. Each one of them taking its 5-7 minutes of CPU & Memory consumption.

With the change we compile only once at the beginning, still taking 5-7min but only taking resources from 1 node. Then the job matrix for testing reuses the output from that previous step thanks to the cache. It might be represented like this

Before
test-matrix
├─ node 14/ --- compile (5-7min)---test(?min)---(reports and others)
├─ node 16/ --- compile (5-7min)---test(?min)---(reports and others)
├─ ...

Now
build --- compile with node v16 (5-7min)
├─ test-matrix
   ├─ node 14/ --- compile (3-6sec from cache)---test(?min)---(reports and others)
   ├─ node 16/ --- compile (3-6sec from cache)---test(?min)---(reports and others)
   ├─ ...

so total time is the same but using less resources.

trentm

might want to add 'nx' to ignoreDeps in "renovate.json" for now

One thing that is lost here with nx run-many ... vs lerna run ... is the number of workers to run. When doing npm run compile, for example, with lerna run ... it would run 11 tasks in parallel on my machine. With nx run-many ... it runs 3 tasks in parallel. lerna is setting the "parallel / concurrency" based on the number of CPUs on the machine (https://github.com/lerna/lerna/blob/6.6.2/libs/core/src/lib/command/index.ts#L15). nx is hardcoding the default --parallel value to 3. I think even the latest nx version does that.

I was going to say that this means the initial npm run compile takes longer for me. However, I unscientifically compared time npm run compile (after running npm ci) in git clones with and without this PR, and even with the parallel=3 it is faster with nx.

# with this PR
npm run compile  453.13s user 31.79s system 601% cpu 1:20.59 total

# without this PR (latest main)
npm run compile  655.85s user 86.89s system 880% cpu 1:24.35 total

I'll take it.
For me it means that the initial npm run compile takes longer.

.github/workflows/test-all-versions.yml

trentm · 2024-10-24T22:49:25Z

.github/workflows/unit-test.yml

+      - name: Install
+        run: npm ci
+      - name: Build
+        run: npm run compile


I'm a little scared of the "if npm ci and npm run compile layout is the same for all node versions". :)
Have you done a basic check comparing two separate git clones that do those steps, one with node 16 and one with node 22, say?

I think I'm reasonably confident that for non-binary packages that the node_modules/... tree should be the same between versions -- otherwise the idea of package-lock.json is broken.

I'm not sure if we have binary deps -- i.e. ones that are built at install time via a gyp file or install or post-install npm scripts.

I'm a little scared of the "if npm ci and npm run compile layout is the same for all node versions". :)

We can do even another twist on this because:

ts-node compiles src & tests folders so "in theory" is not necessary to compile, just to have the version.ts file available

but some packages depend on another in the same monorepo like @opentelemetry/resouce-detector-container and these dependencies need to be build.

Since the published packages are compiled using nodejs v16 maybe is better to have a "production" build as tests assets instead of a different build with each nodejs version (14, 18, 20, ...)?

I'm not sure if we have binary deps -- i.e. ones that are built at install time via a gyp file or install or post-install npm scripts.

We don't have them anymore AFAICT. The last was removed with grpc-cencus-binary-propagator (#2276) because it always built on macOS (it had pre-built binaries for linux+glibc so building sometimes does not happen based on which system one is using). I think if there was another one left we'd have noticed it by now as it tends to take a lot of time on install. 🙂

Since the published packages are compiled using nodejs v16 maybe is better to have a "production" build as tests assets instead of a different build with each nodejs version (14, 18, 20, ...)?

I think having the Node.js version matched with the published packages makes sense from a testing standpoint. This way we can preempt surprises if there are any. @david-luna would you mind putting a comment in the publish workflow and here that the versions should always be updated together? 🙂

detectors/node/opentelemetry-resource-detector-container/package.json

plugins/node/opentelemetry-instrumentation-bunyan/package.json

pichlermarc

Looks good % some nits, thanks for working on this. 🎉
Looking forward to seeing it in action 🙂

pichlermarc · 2024-11-05T09:54:55Z

.github/workflows/unit-test.yml

+      - name: Install
+        run: npm ci
+      - name: Build
+        run: npm run compile


I'm not sure if we have binary deps -- i.e. ones that are built at install time via a gyp file or install or post-install npm scripts.

We don't have them anymore AFAICT. The last was removed with grpc-cencus-binary-propagator (#2276) because it always built on macOS (it had pre-built binaries for linux+glibc so building sometimes does not happen based on which system one is using). I think if there was another one left we'd have noticed it by now as it tends to take a lot of time on install. 🙂

Since the published packages are compiled using nodejs v16 maybe is better to have a "production" build as tests assets instead of a different build with each nodejs version (14, 18, 20, ...)?

I think having the Node.js version matched with the published packages makes sense from a testing standpoint. This way we can preempt surprises if there are any. @david-luna would you mind putting a comment in the publish workflow and here that the versions should always be updated together? 🙂

.github/workflows/unit-test.yml

nx.json

…metry-js-contrib into dluna/update-scripts

chore: use nx commands

22e58cd

david-luna added 2 commits October 21, 2024 17:05

chore: add dependency between jobs in TAV workflow

6b5df30

chore: remove old scripts

6cec554

david-luna commented Oct 22, 2024

View reviewed changes

chore: restore script

437637e

david-luna commented Oct 22, 2024

View reviewed changes

Merge branch 'main' into dluna/update-scripts

6aa6084

david-luna marked this pull request as ready for review October 22, 2024 09:53

david-luna requested a review from a team as a code owner October 22, 2024 09:53

david-luna mentioned this pull request Oct 22, 2024

Switch from lerna to nx open-telemetry/opentelemetry-js#4240

Open

david-luna commented Oct 23, 2024

View reviewed changes

package.json Show resolved Hide resolved

trentm added the has:sponsor This package or feature has a sponsor that has volunteered to review PRs and respond to questions label Oct 23, 2024

Merge branch 'main' into dluna/update-scripts

c7e6520

trentm approved these changes Oct 30, 2024

View reviewed changes

david-luna added 2 commits October 31, 2024 16:48

chore: add missing dev dependencies

cef0847

Merge branch 'main' into dluna/update-scripts

7ab5103

trentm approved these changes Oct 31, 2024

View reviewed changes

Merge branch 'main' into dluna/update-scripts

59fb12c

pichlermarc approved these changes Nov 5, 2024

View reviewed changes

david-luna added 4 commits November 5, 2024 11:35

chore: update node versions in test workflows

3ad305c

Merge branch 'dluna/update-scripts' of github.com:david-luna/opentele…

ff866c0

…metry-js-contrib into dluna/update-scripts

Merge branch 'main' into dluna/update-scripts

530fdbb

Merge branch 'main' into dluna/update-scripts

07e998a

david-luna merged commit 6234918 into open-telemetry:main Nov 5, 2024
25 checks passed

david-luna deleted the dluna/update-scripts branch November 5, 2024 17:59

david-luna mentioned this pull request Nov 13, 2024

chore: remove lerna #2526

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore: use nx commands #2493

chore: use nx commands #2493

david-luna commented Oct 18, 2024 •

edited

Loading

david-luna Oct 22, 2024

david-luna Oct 22, 2024

david-luna Oct 22, 2024

david-luna Oct 22, 2024

trentm commented Oct 23, 2024

trentm commented Oct 24, 2024 •

edited

Loading

david-luna commented Oct 28, 2024

trentm left a comment •

edited

Loading

trentm Oct 24, 2024

david-luna Oct 31, 2024

pichlermarc Nov 5, 2024

pichlermarc left a comment •

edited

Loading

pichlermarc Nov 5, 2024

chore: use nx commands #2493

chore: use nx commands #2493

Conversation

david-luna commented Oct 18, 2024 • edited Loading

Which problem is this PR solving?

Changes

Cache

Using it in CI

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

trentm commented Oct 23, 2024

trentm commented Oct 24, 2024 • edited Loading

david-luna commented Oct 28, 2024

trentm left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pichlermarc left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

david-luna commented Oct 18, 2024 •

edited

Loading

trentm commented Oct 24, 2024 •

edited

Loading

trentm left a comment •

edited

Loading

pichlermarc left a comment •

edited

Loading