Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement legacy collections using glob #11976

Merged
merged 58 commits into from
Oct 4, 2024
Merged
Show file tree
Hide file tree
Changes from 49 commits
Commits
Show all changes
58 commits
Select commit Hold shift + click to select a range
6cfa42d
feat: support pattern arrays with glob
ascorbic Sep 9, 2024
c2f48aa
wip
ascorbic Sep 9, 2024
dd20f95
Merge branch 'next' into auto-glob
ascorbic Sep 12, 2024
0e9b54c
feat: emulate legacy content collections
ascorbic Sep 12, 2024
1ff97b4
Fixes
ascorbic Sep 12, 2024
305a756
Merge branch 'next' into auto-glob
ascorbic Sep 12, 2024
9585b5a
Lint
ascorbic Sep 12, 2024
181c950
Correctly handle legacy data
ascorbic Sep 12, 2024
f47c6ba
Fix tests
ascorbic Sep 12, 2024
b4f14ea
Merge branch 'next' into auto-glob
ascorbic Sep 13, 2024
3967012
Switch flag handling
ascorbic Sep 13, 2024
072dd7e
Fix warnings
ascorbic Sep 13, 2024
75c7181
Add layout warning
ascorbic Sep 13, 2024
9d371a1
Update fixtures
ascorbic Sep 13, 2024
1a27f35
More tests!
ascorbic Sep 13, 2024
e14090a
Handle empty md files
ascorbic Sep 13, 2024
77b7815
Merge branch 'next' into auto-glob
ascorbic Sep 13, 2024
bb15ef5
Lockfile
ascorbic Sep 13, 2024
e42c4b0
Dedupe name
ascorbic Sep 13, 2024
8b7dc59
Handle data ID unslug
ascorbic Sep 13, 2024
ab7918a
Fix e2e
ascorbic Sep 13, 2024
57127ff
Merge branch 'next' into auto-glob
ascorbic Sep 13, 2024
5d1684e
Clean build
ascorbic Sep 13, 2024
032ce4f
Clean builds in tests
ascorbic Sep 13, 2024
f48be22
Test fixes
ascorbic Sep 14, 2024
f337838
Fix test
ascorbic Sep 14, 2024
ec2fcb0
Fix typegen
ascorbic Sep 14, 2024
0aebf32
Fix tests
ascorbic Sep 14, 2024
b0cb038
Merge branch 'next' into auto-glob
ascorbic Sep 16, 2024
8f2e3fe
Fixture updates
ascorbic Sep 16, 2024
a7cccef
Test updates
ascorbic Sep 16, 2024
a030814
Update changeset
ascorbic Sep 16, 2024
67a4253
Test
ascorbic Sep 16, 2024
e1b9491
Remove wait in test
ascorbic Sep 16, 2024
5200a5d
Handle race condition
ascorbic Sep 16, 2024
7add932
Merge branch 'next' into auto-glob
ascorbic Sep 17, 2024
869dc3a
Lock
ascorbic Sep 17, 2024
505b426
Merge branch 'next' into auto-glob
ascorbic Sep 18, 2024
850c3e9
Merge branch 'next' into auto-glob
ascorbic Sep 20, 2024
81be579
Merge branch 'next' into auto-glob
ascorbic Oct 1, 2024
5339675
chore: changes from review
ascorbic Oct 1, 2024
8280450
Handle folders without config
ascorbic Oct 1, 2024
3da71b4
lint
ascorbic Oct 1, 2024
99c823a
Merge branch 'next' into auto-glob
ascorbic Oct 1, 2024
5c6e428
Fix test
ascorbic Oct 1, 2024
ce2c486
Update wording for auto-collections
ascorbic Oct 1, 2024
9d0d306
Delete legacyId
ascorbic Oct 1, 2024
c8e19ed
Sort another fixture
ascorbic Oct 1, 2024
6f4bab0
Rename flag to `legacy.collections`
ascorbic Oct 1, 2024
a8bf7aa
Merge branch 'next' into auto-glob
ascorbic Oct 1, 2024
0e310e9
Apply suggestions from code review
ascorbic Oct 2, 2024
5fb9247
Changes from review
ascorbic Oct 2, 2024
266da4d
Apply suggestions from code review
ascorbic Oct 2, 2024
e0b211e
Merge branch 'next' into auto-glob
ascorbic Oct 4, 2024
7ebcf58
lockfile
ascorbic Oct 4, 2024
7318f1c
Merge branch 'next' into auto-glob
ascorbic Oct 4, 2024
bd9a5f5
lock
ascorbic Oct 4, 2024
4eae23f
Merge branch 'next' into auto-glob
ascorbic Oct 4, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
33 changes: 33 additions & 0 deletions .changeset/quick-onions-leave.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
---
'astro': patch
---

Implements legacy content and data collections using `glob()` loader
ascorbic marked this conversation as resolved.
Show resolved Hide resolved

:warning: **BREAKING CHANGE FOR LEGACY CONTENT COLLECTIONS** :warning:

By default, collections that use the the old types (content or data) are now implemented using the glob loader, with extra backward-compat handling. This includes any collection without a `loader` defined.
ascorbic marked this conversation as resolved.
Show resolved Hide resolved

Any legacy content collections are handled like this:
ascorbic marked this conversation as resolved.
Show resolved Hide resolved

- a `glob` loader collection is defined, with patterns that match the previous handling (matches `src/content/<collection name>/**/*.md` and other content extensions depending on installed integrations, with underscore-prefixed files and folders ignored)
- When used in the runtime, the entries have an ID based on the filename in the same format as legacy collections
- A `slug` field is added with the same format as before
- A `render()` method is added to the entry, so they can be called using `entry.render()`
- `getEntryBySlug` is supported

Legacy data collections are handled like this:

- a `glob` loader collection is defined, with patterns that match the previous handling (matches `src/content/<collection name>/**/*{.json,.yaml}` and other data extensions, with underscore-prefixed files and folders ignored)
- Entries have an ID that is not slugified
- `getDataEntryById` is supported
ascorbic marked this conversation as resolved.
Show resolved Hide resolved

While these emulate most of the features of legacy collections, they have these differences:
ascorbic marked this conversation as resolved.
Show resolved Hide resolved

- Implicit collections for folders in `src/content` are only defined if no other collections use content layer. If no content layer collections are defined, and there are folders in `src/content` that don't match collections in `src/content/config.ts` then collections will be auto-generated for them. This is not recommended, and a warning will be logged that this is deprecated. A collection should always be defined in `config.ts`. For legacy collections these can just be empty declarations: e.g.`const blog = defineCollection({})`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am honestly having a hard time making sense of this from a usage/what happens to my existing project standpoint:

  • I don't know what an "implicit" collection is. This phrase hasn't been introduced.
  • Why would I have existing collections in src/content/ that do not match my collections in src/content/config.ts. Wouldn't that have already caused a problem for me?
  • It feels like we're describing a slightly convoluted thing that will happen, then saying it's not recommended? We could be a lot more explicit about telling people what they need to do, vs describing things that might be done to them.

I think the "what would make this a problem for someone to warrant even saying anything here?" question is the biggest one I have.

I get that you can now autogenerate collection schemas in certain cases, but existing legacy projects probably already have schemas? (So focusing on ones that don't is distracting when that's not most people's experience.) Is not having a schema the only time this surfaces as a problem? And if so, what fixes that: not having a mix of both old and new collections?

It is also OK to instruct people that "If condition X applies to you, you must now do X" if that is the happy path. These collections will be phased out eventually anyway. They don't need to know workarounds, or things that might be OK for a while. Just decide what you want them to do and tell them they need to do it. If other stuff happens to be possible, that's fine. This is to help people deal with a breaking change in the most straightforward way possible.

For example, if you want to straight up say that you simply can't mix new and legacy collections in a project anymore: that's fine! Say that this is the point at which you either update all your collections accordingly, or you enable the legacy flag if you're unable to do so yet. That is what the flag is for! That is very easy advice to follow, and doesn't require parsing a lot of "if" cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I've rewritten this whole section. The reason I included this behaviour at all is because I do see a lot of people using these auto-generated collections without config. I didn't want everything to break as they upgraded, and I don;t want them to just immediately enabled the legacy flag! Adding this fallback behaviour helped get us to working with most sites, while also loudly warning that they should upgrade.

The only reason I've included the fact that it exists in the docs is because of the scenario where it works, but then stops working if they add new collections. I want the behaviour to be:

  • your config is untouched, most things will Just Work with no changes
  • you start to upgrade, you then should add the one-liner for all of your previously auto-generated collections

I'd love your thoughts on how best to communicate this (or if you think this is totally the wrong approach anyway)

- The `layout` field is not supported in Markdown
ascorbic marked this conversation as resolved.
Show resolved Hide resolved
- Sort order of generated collections is non-deterministic and platform-dependent.
ascorbic marked this conversation as resolved.
Show resolved Hide resolved
- `image().refine()` is not supported
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not supported in content layer in general?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. It's one of the things I missed. It'll be tough to do, because of the fact we don't transform it into an object until runtime.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like something we should talk about, I think this is pretty widely used.

ascorbic marked this conversation as resolved.
Show resolved Hide resolved
- the key for `getEntry` is typed as `string`, rather than having types for every entry.

A new config flag `legacy.collections` is added for users that need the old behavior. When set, collections in `src/content` are processed in the same way as before rather than being implemented with glob - including implicit collections. When set, content layer collections are forbidden in `src/content`, and will fail a build if defined.
ascorbic marked this conversation as resolved.
Show resolved Hide resolved
5 changes: 5 additions & 0 deletions examples/with-markdoc/src/content/config.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
import { defineCollection } from 'astro:content';

export const collections = {
docs: defineCollection({})
};
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
import { defineCollection } from "astro:content";


const posts = defineCollection({});

export const collections = { posts };
2 changes: 2 additions & 0 deletions packages/astro/src/content/data-store.ts
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,8 @@ export interface DataEntry<TData extends Record<string, unknown> = Record<string
*/
deferredRender?: boolean;
assetImports?: Array<string>;
/** @deprecated */
legacyId?: string;
}

/**
Expand Down
46 changes: 39 additions & 7 deletions packages/astro/src/content/loaders/glob.ts
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@ function generateIdDefault({ entry, base, data }: GenerateIdOptions): string {
if (data.slug) {
return data.slug as string;
}
const entryURL = new URL(entry, base);
const entryURL = new URL(encodeURI(entry), base);
const { slug } = getContentEntryIdAndSlug({
entry: entryURL,
contentDir: base,
Expand All @@ -55,6 +55,15 @@ function checkPrefix(pattern: string | Array<string>, prefix: string) {
* Loads multiple entries, using a glob pattern to match files.
* @param pattern A glob pattern to match files, relative to the content directory.
*/
export function glob(globOptions: GlobOptions): Loader;
/** @private */
export function glob(
globOptions: GlobOptions & {
/** @deprecated */
_legacy?: true;
},
): Loader;

export function glob(globOptions: GlobOptions): Loader {
if (checkPrefix(globOptions.pattern, '../')) {
throw new Error(
Expand All @@ -80,19 +89,21 @@ export function glob(globOptions: GlobOptions): Loader {
>();

const untouchedEntries = new Set(store.keys());

const isLegacy = (globOptions as any)._legacy;
// If global legacy collection handling flag is *not* enabled then this loader is used to emulate them instead
const emulateLegacyCollections = !config.legacy.collections;
async function syncData(entry: string, base: URL, entryType?: ContentEntryType) {
if (!entryType) {
logger.warn(`No entry type found for ${entry}`);
return;
}
const fileUrl = new URL(entry, base);
const fileUrl = new URL(encodeURI(entry), base);
const contents = await fs.readFile(fileUrl, 'utf-8').catch((err) => {
logger.error(`Error reading ${entry}: ${err.message}`);
return;
});

if (!contents) {
if (!contents && contents !== '') {
logger.warn(`No contents found for ${entry}`);
return;
}
Expand All @@ -103,6 +114,17 @@ export function glob(globOptions: GlobOptions): Loader {
});

const id = generateId({ entry, base, data });
let legacyId: string | undefined;

if (isLegacy) {
const entryURL = new URL(encodeURI(entry), base);
const legacyOptions = getContentEntryIdAndSlug({
entry: entryURL,
contentDir: base,
collection: '',
});
legacyId = legacyOptions.id;
}
untouchedEntries.delete(id);

const existingEntry = store.get(id);
Expand Down Expand Up @@ -132,6 +154,12 @@ export function glob(globOptions: GlobOptions): Loader {
filePath,
});
if (entryType.getRenderFunction) {
if (isLegacy && data.layout) {
logger.error(
`The Markdown "layout" field is not supported in content collections in Astro 5. Ignoring layout for ${JSON.stringify(entry)}. Enable "legacy.collections" if you need to use the layout field.`,
);
}

let render = renderFunctionByContentType.get(entryType);
if (!render) {
render = await entryType.getRenderFunction(config);
Expand Down Expand Up @@ -160,6 +188,7 @@ export function glob(globOptions: GlobOptions): Loader {
digest,
rendered,
assetImports: rendered?.metadata?.imagePaths,
legacyId,
});

// todo: add an explicit way to opt in to deferred rendering
Expand All @@ -171,9 +200,10 @@ export function glob(globOptions: GlobOptions): Loader {
filePath: relativePath,
digest,
deferredRender: true,
legacyId,
});
} else {
store.set({ id, data: parsedData, body, filePath: relativePath, digest });
store.set({ id, data: parsedData, body, filePath: relativePath, digest, legacyId });
}

fileToIdMap.set(filePath, id);
Expand Down Expand Up @@ -222,7 +252,7 @@ export function glob(globOptions: GlobOptions): Loader {
if (isConfigFile(entry)) {
return;
}
if (isInContentDir(entry)) {
if (!emulateLegacyCollections && isInContentDir(entry)) {
skippedFiles.push(entry);
return;
}
Expand All @@ -240,7 +270,9 @@ export function glob(globOptions: GlobOptions): Loader {
? globOptions.pattern.join(', ')
: globOptions.pattern;

logger.warn(`The glob() loader cannot be used for files in ${bold('src/content')}.`);
logger.warn(
`The glob() loader cannot be used for files in ${bold('src/content')} when legacy mode is enabled.`,
);
if (skipCount > 10) {
logger.warn(
`Skipped ${green(skippedFiles.length)} files that matched ${green(patternList)}.`,
Expand Down
42 changes: 16 additions & 26 deletions packages/astro/src/content/mutable-data-store.ts
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ import { Traverse } from 'neotraverse/modern';
import { imageSrcToImportId, importIdToSymbolName } from '../assets/utils/resolveImports.js';
import { AstroError, AstroErrorData } from '../core/errors/index.js';
import { IMAGE_IMPORT_PREFIX } from './consts.js';
import { type DataEntry, ImmutableDataStore, type RenderedContent } from './data-store.js';
import { type DataEntry, ImmutableDataStore } from './data-store.js';
import { contentModuleToId } from './utils.js';

const SAVE_DEBOUNCE_MS = 500;
Expand Down Expand Up @@ -197,7 +197,17 @@ export default new Map([\n${lines.join(',\n')}]);
entries: () => this.entries(collectionName),
values: () => this.values(collectionName),
keys: () => this.keys(collectionName),
set: ({ id: key, data, body, filePath, deferredRender, digest, rendered, assetImports }) => {
set: ({
id: key,
data,
body,
filePath,
deferredRender,
digest,
rendered,
assetImports,
legacyId,
}) => {
if (!key) {
throw new Error(`ID must be a non-empty string`);
}
Expand Down Expand Up @@ -244,6 +254,9 @@ export default new Map([\n${lines.join(',\n')}]);
if (rendered) {
entry.rendered = rendered;
}
if (legacyId) {
entry.legacyId = legacyId;
}
if (deferredRender) {
entry.deferredRender = deferredRender;
if (filePath) {
Expand Down Expand Up @@ -335,30 +348,7 @@ export interface DataStore {
key: string,
) => DataEntry<TData> | undefined;
entries: () => Array<[id: string, DataEntry]>;
set: <TData extends Record<string, unknown>>(opts: {
/** The ID of the entry. Must be unique per collection. */
id: string;
/** The data to store. */
data: TData;
/** The raw body of the content, if applicable. */
body?: string;
/** The file path of the content, if applicable. Relative to the site root. */
filePath?: string;
/** A content digest, to check if the content has changed. */
digest?: number | string;
/** The rendered content, if applicable. */
rendered?: RenderedContent;
/**
* If an entry is a deferred, its rendering phase is delegated to a virtual module during the runtime phase.
*/
deferredRender?: boolean;
/**
* Assets such as images to process during the build. These should be files on disk, with a path relative to filePath.
* Any values that use image() in the schema will already be added automatically.
* @internal
*/
assetImports?: Array<string>;
}) => boolean;
set: <TData extends Record<string, unknown>>(opts: DataEntry<TData>) => boolean;
values: () => Array<DataEntry>;
keys: () => Array<string>;
delete: (key: string) => void;
Expand Down
56 changes: 44 additions & 12 deletions packages/astro/src/content/runtime.ts
Original file line number Diff line number Diff line change
Expand Up @@ -94,7 +94,7 @@ export function createGetCollection({
if (hasFilter && !filter(entry)) {
continue;
}
result.push(entry);
result.push(entry.legacyId ? emulateLegacyEntry(entry) : entry);
}
return result;
} else {
Expand Down Expand Up @@ -162,23 +162,31 @@ export function createGetEntryBySlug({
getEntryImport,
getRenderEntryImport,
collectionNames,
getEntry,
}: {
getEntryImport: GetEntryImport;
getRenderEntryImport: GetEntryImport;
collectionNames: Set<string>;
getEntry: ReturnType<typeof createGetEntry>;
}) {
return async function getEntryBySlug(collection: string, slug: string) {
const store = await globalDataStore.get();

if (!collectionNames.has(collection)) {
if (store.hasCollection(collection)) {
const entry = await getEntry(collection, slug);
if (entry && 'slug' in entry) {
return entry;
}
throw new AstroError({
...AstroErrorData.GetEntryDeprecationError,
message: AstroErrorData.GetEntryDeprecationError.message(collection, 'getEntryBySlug'),
});
}
// eslint-disable-next-line no-console
console.warn(`The collection ${JSON.stringify(collection)} does not exist.`);
console.warn(
`The collection ${JSON.stringify(collection)} does not exist. Please ensure it is defined in your content config.`,
);
return undefined;
}

Expand Down Expand Up @@ -207,22 +215,23 @@ export function createGetEntryBySlug({
export function createGetDataEntryById({
getEntryImport,
collectionNames,
getEntry,
}: {
getEntryImport: GetEntryImport;
collectionNames: Set<string>;
getEntry: ReturnType<typeof createGetEntry>;
}) {
return async function getDataEntryById(collection: string, id: string) {
const store = await globalDataStore.get();

if (!collectionNames.has(collection)) {
if (store.hasCollection(collection)) {
throw new AstroError({
...AstroErrorData.GetEntryDeprecationError,
message: AstroErrorData.GetEntryDeprecationError.message(collection, 'getDataEntryById'),
});
return getEntry(collection, id);
}
// eslint-disable-next-line no-console
console.warn(`The collection ${JSON.stringify(collection)} does not exist.`);
console.warn(
`The collection ${JSON.stringify(collection)} does not exist. Please ensure it is defined in your content config.`,
);
return undefined;
}

Expand Down Expand Up @@ -256,6 +265,21 @@ type DataEntryResult = {

type EntryLookupObject = { collection: string; id: string } | { collection: string; slug: string };

function emulateLegacyEntry(entry: DataEntry) {
// Define this first so it's in scope for the render function
const legacyEntry = {
...entry,
id: entry.legacyId!,
slug: entry.id,
};
delete legacyEntry.legacyId;
return {
...legacyEntry,
// Define separately so the render function isn't included in the object passed to `renderEntry()`
render: () => renderEntry(legacyEntry),
};
}

export function createGetEntry({
getEntryImport,
getRenderEntryImport,
Expand Down Expand Up @@ -303,6 +327,9 @@ export function createGetEntry({
// @ts-expect-error virtual module
const { default: imageAssetMap } = await import('astro:asset-imports');
entry.data = updateImageReferencesInData(entry.data, entry.filePath, imageAssetMap);
if (entry.legacyId) {
return { ...emulateLegacyEntry(entry), collection } as ContentEntryResult;
}
return {
...entry,
collection,
Expand All @@ -311,7 +338,9 @@ export function createGetEntry({

if (!collectionNames.has(collection)) {
// eslint-disable-next-line no-console
console.warn(`The collection ${JSON.stringify(collection)} does not exist.`);
console.warn(
`The collection ${JSON.stringify(collection)} does not exist. Please ensure it is defined in your content config.`,
);
return undefined;
}

Expand Down Expand Up @@ -433,9 +462,12 @@ function updateImageReferencesInData<T extends Record<string, unknown>>(
}

export async function renderEntry(
entry: DataEntry | { render: () => Promise<{ Content: AstroComponentFactory }> },
entry:
| DataEntry
| { render: () => Promise<{ Content: AstroComponentFactory }> }
| (DataEntry & { render: () => Promise<{ Content: AstroComponentFactory }> }),
) {
if (entry && 'render' in entry) {
if (entry && 'render' in entry && !('legacyId' in entry)) {
// This is an old content collection entry, so we use its render method
return entry.render();
}
Expand Down Expand Up @@ -623,13 +655,13 @@ export function createReference({ lookupMap }: { lookupMap: ContentLookupMap })
return { id: lookup, collection };
}

if (!lookupMap[collection] && store.collections().size === 0) {
if (!lookupMap[collection] && store.collections().size < 2) {
// If the collection is not in the lookup map or store, it may be a content layer collection and the store may not yet be populated.
// The store may still have a single collection, which would be the top level metadata collection.
// For now, we can't validate this reference, so we'll optimistically convert it to a reference object which we'll validate
// later in the pipeline when we do have access to the store.
return { id: lookup, collection };
}

const { type, entries } = lookupMap[collection];
const entry = entries[lookup];

Expand Down
Loading
Loading