Potentially the biggest performance improvement that can be done #46

SuperchupuDev · 2024-09-28T14:33:12Z

I'm not even sure how to approach the problem, but implementing this means that all of the weird patterns that currently avoid all optimizations would be really optimized along with literally everything else.

By default, fdir crawls all subdirectories and files of a root, which can result in extra processing work that's not necessary, harming performance. tinyglobby tries to apply some optimizations by inferring a common root.

fdir exposes a exclude function that can be used to exclude directories from crawling. It's being currently used on the ignore patterns to... not crawl those ignored patterns?

What if, we took the matching patterns (basically the patterns that aren't meant to be ignored), we did some weird transformations to them, and used them in the exclude matcher?

For example, let's say we have the following usage:

import { glob } from 'tinyglobby';

await glob(['src/files/index.ts', 'scripts/*.ts']);

with the following file structure:

- node_modules
  ^big
- plugins
  - myPlugins
    | plugin.ts
- scripts
  - utils
    | index.ts
  | deploy.ts
  | run.ts
- src
  - files
    | index.ts
  - other
    | index.ts

Basically, we need a picomatch matcher that returns true for every directory that we don't want to crawl, in this case node_modules, plugins, scripts/utils, and src/other. This could be implemented with the following picomatch usage:

import picomatch from 'picomatch';

const exclude = picomatch('**/*', {
  ignore: ['src/files', 'scripts/*/**']
});

Great! It now only crawls the directories needed (hopefully, I haven't checked). Now the question is how to implement something that converts ['src/files/index.ts', 'scripts/*.ts' into ['src/files', 'scripts/*/**'], which is the whole point of this issue. If we figure it out, tinyglobby should be nearly as fast as possible.

Some notes:

Whatever thing is implemented needs to take into account how <pattern>/** patterns work. It not only matches subdirectories of <pattern>, but it also matches <pattern> itself. this is why the second processed pattern in the example isn't scripts/**, as that would match scripts making fdir not crawl it.
Ideally this processing should happen in the normalizePattern function, so that we don't need to loop through the patterns extra times.

The text was updated successfully, but these errors were encountered:

webpro · 2024-10-25T12:20:48Z

Knip does something similar, but just for .gitignore files only. Pointers:

The key is to have functionality like this deepFilter. tinyglobby uses fdir which has filter which is maybe what we need here?

webpro · 2024-10-25T12:32:43Z

Scratch that, I mixed up filter and exclude.

Still, the deepFilter example might be useful as I think it does the same as fdir#exclude.

SuperchupuDev · 2024-11-28T22:33:02Z

an initial implementation has been merged in #69! the code can be improved further though, which is what i'll try to do before a new release

SuperchupuDev · 2025-01-17T22:41:30Z

current status of this issue: while an initial implementation was merged (and still not released yet!), it broke some use cases (#80). #76 was opened, which not only fixes this use case but it also works better at optimizing. however, there is still one use case that breaks in both: using / inside glob symbols (for example the pattern {a/b,c}. in theory i could just convert those kinds of parts to ** in the optimizer but it wouldn't be ideal as it would mean no optimizations are applied in those cases. i'm looking for solutions and open for help in #76

SuperchupuDev added help wanted Extra attention is needed performance Potential performance improvement labels Sep 28, 2024

SuperchupuDev mentioned this issue Sep 30, 2024

what is the equivalent of nodir? #51

Closed

xiboon mentioned this issue Oct 27, 2024

initial exclude optimizer implementation #69

Merged

SuperchupuDev closed this as completed in #69 Nov 28, 2024

romainmenke mentioned this issue Dec 2, 2024

Refactor to use tinyglobby instead of globby stylelint/stylelint#8051

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Potentially the biggest performance improvement that can be done #46

Potentially the biggest performance improvement that can be done #46

SuperchupuDev commented Sep 28, 2024 •

edited

Loading

webpro commented Oct 25, 2024

webpro commented Oct 25, 2024

SuperchupuDev commented Nov 28, 2024

SuperchupuDev commented Jan 17, 2025

Potentially the biggest performance improvement that can be done #46

Potentially the biggest performance improvement that can be done #46

Comments

SuperchupuDev commented Sep 28, 2024 • edited Loading

webpro commented Oct 25, 2024

webpro commented Oct 25, 2024

SuperchupuDev commented Nov 28, 2024

SuperchupuDev commented Jan 17, 2025

SuperchupuDev commented Sep 28, 2024 •

edited

Loading