Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement legacy collections using glob #11976

Merged
merged 58 commits into from
Oct 4, 2024
Merged

Implement legacy collections using glob #11976

merged 58 commits into from
Oct 4, 2024

Conversation

ascorbic
Copy link
Contributor

@ascorbic ascorbic commented Sep 12, 2024

Changes

Breaking change to legacy content collections

Implements legacy content and data collections using glob() loader. By default, collections that use the the old types (content or data) are now implemented using the glob loader, with extra backward-compat handling. This includes any collection without a loader defined.

Any legacy content collections are handled like this:

  • a glob loader collection is defined, with patterns that match the previous handling (matches src/content/<collection name>/**/*.md and other content extensions depending on installed integrations, with underscore-prefixed files and folders ignored)
  • When used in the runtime, the entries have an ID based on the filename in the same format as legacy collections
  • A slug field is added with the same format as before
  • A render() method is added to the entry, so they can be called using entry.render()
  • getEntryBySlug is supported

Legacy data collections are handled like this:

  • a glob loader collection is defined, with patterns that match the previous handling (matches src/content/<collection name>/**/*{.json,.yaml} and other data extensions, with underscore-prefixed files and folders ignored)
  • Entries have an ID that is not slugified
  • getDataEntryById is supported

While these emulate most of the features of legacy collections, they have these differences:

  • Implicit collections for folders in src/content are only generated if no other collections use content layer. If no content layer collections are defined, and there are folders in src/content that don't match collections in src/content/config.ts then collections will be auto-generated for them. This is not recommended, and a warning will be logged that this is deprecated and that a collection should always be defined in config.ts. For legacy collections these can just be empty declarations: e.g.const blog = defineCollection({}).
  • The layout field is not supported in Markdown
  • Sort order of generated collections is non-deterministic and platform-dependent.
  • image().refine() is not supported
  • the key for getEntry is typed as string, rather than having types for every entry.

A new config flag legacy.collections is added for users that need the old behavior. When set, collections in src/content are processed in the same way as before rather than being implemented with glob. When set, content layer collections are forbidden in src/content, and will fail a build if defined.

Testing

Lots of tests have been updated. Where possible they use the new implementation, but when they need to test unsupported features then they have the legacy flag applied.

Docs

Copy link

changeset-bot bot commented Sep 12, 2024

🦋 Changeset detected

Latest commit: 4eae23f

The changes in this PR will be included in the next version bump.

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@github-actions github-actions bot added pkg: astro Related to the core `astro` package (scope) docs pr pr: preview This PR has a preview release labels Sep 12, 2024
@github-actions github-actions bot added the pkg: example Related to an example package (scope) label Sep 13, 2024
@ascorbic ascorbic changed the title Add experimental support for emulating legacy collections using glob Emulate legacy collections using glob Sep 13, 2024
Copy link
Member

@sarah11918 sarah11918 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ascorbic ! I've left a bunch of comments here, most of which boil down to that because this is a major feature people know and use, I think more of this could be geared towards helping people update with actions to follow to keep their projects in working order. You'll see what I mean!

Some are suggestions, some are questions to prompt towards the kind of guidance I'd expect to find here!

.changeset/quick-onions-leave.md Outdated Show resolved Hide resolved
.changeset/quick-onions-leave.md Outdated Show resolved Hide resolved
.changeset/quick-onions-leave.md Outdated Show resolved Hide resolved

While these emulate most of the features of legacy collections, they have these differences:

- Implicit collections for folders in `src/content` are only defined if no other collections use content layer. If no content layer collections are defined, and there are folders in `src/content` that don't match collections in `src/content/config.ts` then collections will be auto-generated for them. This is not recommended, and a warning will be logged that this is deprecated. A collection should always be defined in `config.ts`. For legacy collections these can just be empty declarations: e.g.`const blog = defineCollection({})`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am honestly having a hard time making sense of this from a usage/what happens to my existing project standpoint:

  • I don't know what an "implicit" collection is. This phrase hasn't been introduced.
  • Why would I have existing collections in src/content/ that do not match my collections in src/content/config.ts. Wouldn't that have already caused a problem for me?
  • It feels like we're describing a slightly convoluted thing that will happen, then saying it's not recommended? We could be a lot more explicit about telling people what they need to do, vs describing things that might be done to them.

I think the "what would make this a problem for someone to warrant even saying anything here?" question is the biggest one I have.

I get that you can now autogenerate collection schemas in certain cases, but existing legacy projects probably already have schemas? (So focusing on ones that don't is distracting when that's not most people's experience.) Is not having a schema the only time this surfaces as a problem? And if so, what fixes that: not having a mix of both old and new collections?

It is also OK to instruct people that "If condition X applies to you, you must now do X" if that is the happy path. These collections will be phased out eventually anyway. They don't need to know workarounds, or things that might be OK for a while. Just decide what you want them to do and tell them they need to do it. If other stuff happens to be possible, that's fine. This is to help people deal with a breaking change in the most straightforward way possible.

For example, if you want to straight up say that you simply can't mix new and legacy collections in a project anymore: that's fine! Say that this is the point at which you either update all your collections accordingly, or you enable the legacy flag if you're unable to do so yet. That is what the flag is for! That is very easy advice to follow, and doesn't require parsing a lot of "if" cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I've rewritten this whole section. The reason I included this behaviour at all is because I do see a lot of people using these auto-generated collections without config. I didn't want everything to break as they upgraded, and I don;t want them to just immediately enabled the legacy flag! Adding this fallback behaviour helped get us to working with most sites, while also loudly warning that they should upgrade.

The only reason I've included the fact that it exists in the docs is because of the scenario where it works, but then stops working if they add new collections. I want the behaviour to be:

  • your config is untouched, most things will Just Work with no changes
  • you start to upgrade, you then should add the one-liner for all of your previously auto-generated collections

I'd love your thoughts on how best to communicate this (or if you think this is totally the wrong approach anyway)

.changeset/quick-onions-leave.md Outdated Show resolved Hide resolved
.changeset/quick-onions-leave.md Outdated Show resolved Hide resolved
.changeset/quick-onions-leave.md Outdated Show resolved Hide resolved
.changeset/quick-onions-leave.md Outdated Show resolved Hide resolved
.changeset/quick-onions-leave.md Outdated Show resolved Hide resolved
@ascorbic
Copy link
Contributor Author

ascorbic commented Oct 2, 2024

Thanks, @sarah11918! I've made lots of changes based on your feedback.

Copy link
Member

@sarah11918 sarah11918 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great @ascorbic ! Thank you for taking the time to do a little rework here. I think it really does make a difference!

Just some tiny formatting things, and one suggestion to consider for the most problematic of the issues (which is now handled spot on, I think!) but I leave it to you what you feel makes the most sense!

I also left a suggestion for the flag's config reference, and am noting that an error was mentioned but I don't see one here in this PR, just fyi!

.changeset/quick-onions-leave.md Outdated Show resolved Hide resolved
.changeset/quick-onions-leave.md Outdated Show resolved Hide resolved
.changeset/quick-onions-leave.md Outdated Show resolved Hide resolved
packages/astro/src/types/public/config.ts Outdated Show resolved Hide resolved
Copy link
Member

@Princesseuh Princesseuh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Big PR! I'm not 100% familiar with everything here, but nothing jumped to me while reading through it. Awesome work!

@ascorbic ascorbic merged commit 4d590a2 into next Oct 4, 2024
14 checks passed
@ascorbic ascorbic deleted the auto-glob branch October 4, 2024 15:10
ascorbic added a commit that referenced this pull request Oct 4, 2024
* feat: support pattern arrays with glob

* wip

* feat: emulate legacy content collections

* Fixes

* Lint

* Correctly handle legacy data

* Fix tests

* Switch flag handling

* Fix warnings

* Add layout warning

* Update fixtures

* More tests!

* Handle empty md files

* Lockfile

* Dedupe name

* Handle data ID unslug

* Fix e2e

* Clean build

* Clean builds in tests

* Test fixes

* Fix test

* Fix typegen

* Fix tests

* Fixture updates

* Test updates

* Update changeset

* Test

* Remove wait in test

* Handle race condition

* Lock

* chore: changes from review

* Handle folders without config

* lint

* Fix test

* Update wording for auto-collections

* Delete legacyId

* Sort another fixture

* Rename flag to `legacy.collections`

* Apply suggestions from code review

Co-authored-by: Sarah Rainsberger <[email protected]>

* Changes from review

* Apply suggestions from code review

Co-authored-by: Sarah Rainsberger <[email protected]>

* lockfile

* lock

---------

Co-authored-by: Sarah Rainsberger <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs pr pkg: astro Related to the core `astro` package (scope) pkg: example Related to an example package (scope) pkg: integration Related to any renderer integration (scope) pr: preview This PR has a preview release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants