Implement legacy collections using glob (#11976)
* feat: support pattern arrays with glob

* wip

* feat: emulate legacy content collections

* Fixes

* Lint

* Correctly handle legacy data

* Fix tests

* Switch flag handling

* Fix warnings

* Add layout warning

* Update fixtures

* More tests!

* Handle empty md files

* Lockfile

* Dedupe name

* Handle data ID unslug

* Fix e2e

* Clean build

* Clean builds in tests

* Test fixes

* Fix test

* Fix typegen

* Fix tests

* Fixture updates

* Test updates

* Update changeset

* Test

* Remove wait in test

* Handle race condition

* Lock

* chore: changes from review

* Handle folders without config

* lint

* Fix test

* Update wording for auto-collections

* Delete legacyId

* Sort another fixture

* Rename flag to `legacy.collections`

* Apply suggestions from code review

Co-authored-by: Sarah Rainsberger <[email protected]>

* Changes from review

* Apply suggestions from code review

Co-authored-by: Sarah Rainsberger <[email protected]>

* lockfile

* lock

---------

Co-authored-by: Sarah Rainsberger <[email protected]>
ascorbic and sarah11918 committed Oct 4, 2024
1 parent 953e6e0 commit abf9a89
Showing 117 changed files with 2,167 additions and 506 deletions.
47 changes: 47 additions & 0 deletions .changeset/quick-onions-leave.md
@@ -0,0 +1,47 @@
---
'astro': major
---

Refactors legacy `content` and `data` collections to use the Content Layer API `glob()` loader for better performance and to support backwards compatibility. Also introduces the `legacy.collections` flag for projects that are unable to update to the new behavior immediately.

:warning: **BREAKING CHANGE FOR LEGACY CONTENT COLLECTIONS** :warning:

By default, collections that use the old types (`content` or `data`) and do not define a `loader` are now implemented under the hood using the Content Layer API's built-in `glob()` loader, with extra backward-compatibility handling.

In order to achieve backwards compatibility with existing `content` collections, the following have been implemented:

- A `glob` loader collection is defined, with patterns that match the previous handling (matches `src/content/<collection name>/**/*.md` and other content extensions depending on installed integrations, with underscore-prefixed files and folders ignored)
- When used in the runtime, the entries have an ID based on the filename in the same format as legacy collections
- A `slug` field is added with the same format as before
- A `render()` method is added to each entry, so it can be rendered using `entry.render()`
- `getEntryBySlug` is supported
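
The underscore rule mentioned in the first bullet can be illustrated with a small sketch. The helper name `isIgnoredEntry` and the sample file names are hypothetical and not part of Astro; the sketch assumes a path is ignored when any of its segments (a folder or the file itself) starts with `_`.

```typescript
// Hypothetical helper, not an Astro API: returns true when any folder or
// file name in a collection-relative path starts with an underscore.
function isIgnoredEntry(relativePath: string): boolean {
  return relativePath.split('/').some((segment) => segment.startsWith('_'));
}

// Made-up paths, relative to src/content/<collection name>/
const files = ['post-1.md', '_draft.md', 'guides/intro.md', '_wip/notes.md'];

// Keep only the entries that would still be picked up.
const loaded = files.filter((file) => !isIgnoredEntry(file));
console.log(loaded); // ['post-1.md', 'guides/intro.md']
```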

In order to achieve backwards compatibility with existing `data` collections, the following have been implemented:

- A `glob` loader collection is defined, with patterns that match the previous handling (matches `src/content/<collection name>/**/*{.json,.yaml}` and other data extensions, with underscore-prefixed files and folders ignored)
- Entries have an ID that is not slugified
- `getDataEntryById` is supported
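
As a rough sketch of what "not slugified" means in practice: the example below assumes that a data entry ID is the file path relative to the collection folder with its extension removed and nothing else changed. The helper `dataEntryId` and the file names are hypothetical, and the exact rules inside Astro may differ.

```typescript
// Hypothetical helper, not an Astro API: strips the file extension and
// leaves case, spaces and folder separators untouched (no slugification).
function dataEntryId(relativePath: string): string {
  const lastSlash = relativePath.lastIndexOf('/');
  const lastDot = relativePath.lastIndexOf('.');
  // Only strip a dot that belongs to the file name and is not its first character.
  return lastDot > lastSlash + 1 ? relativePath.slice(0, lastDot) : relativePath;
}

console.log(dataEntryId('Ben Holmes.json')); // 'Ben Holmes'
console.log(dataEntryId('en/Site Nav.yaml')); // 'en/Site Nav'
```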

While this backwards compatibility implementation is able to emulate most of the features of legacy collections, **there are some differences and limitations that may cause breaking changes to existing collections**:

- In previous versions of Astro, collections would be generated for all folders in `src/content/`, even if they were not defined in `src/content/config.ts`. This behavior is now deprecated, and collections should always be defined in `src/content/config.ts`. For existing collections, these can just be empty declarations (e.g. `const blog = defineCollection({})`) and Astro will implicitly define your legacy collection for you in a way that is compatible with the new loading behavior.
- The special `layout` field is not supported in Markdown collection entries. This property is intended only for standalone page files located in `src/pages/` and is unlikely to appear in your collection entries. However, if you were using this property, you must now create dynamic routes that include your page styling.
- The sort order of generated collections is non-deterministic and platform-dependent. This means that when you call `getCollection()`, entries may be returned in a different order than before. If you need a specific order, sort the collection entries yourself.
- `image().refine()` is not supported. If you need to validate the properties of an image you will need to do this at runtime in your page or component.
- The `key` argument of `getEntry(collection, key)` is typed as `string`, rather than having a type for every entry.
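
One way to follow the sorting advice in the list above is sketched here. The `PostEntry` shape, the `pubDate` field and the sample data are assumptions for illustration; in a real project the array would come from `await getCollection(...)`.

```typescript
// Assumed entry shape; real entries carry more fields.
interface PostEntry {
  id: string;
  data: { title: string; pubDate: Date };
}

// Stand-in for the (unordered) result of `await getCollection('blog')`.
const posts: PostEntry[] = [
  { id: 'b-post', data: { title: 'B', pubDate: new Date('2024-03-01') } },
  { id: 'a-post', data: { title: 'A', pubDate: new Date('2024-01-15') } },
  { id: 'c-post', data: { title: 'C', pubDate: new Date('2024-02-10') } },
];

// Copy before sorting so the original array is left untouched.
const newestFirst = [...posts].sort(
  (a, b) => b.data.pubDate.valueOf() - a.data.pubDate.valueOf(),
);
const byId = [...posts].sort((a, b) => a.id.localeCompare(b.id));

console.log(newestFirst.map((post) => post.id)); // ['b-post', 'c-post', 'a-post']
console.log(byId.map((post) => post.id)); // ['a-post', 'b-post', 'c-post']
```

Sorting explicitly at the call site keeps the output stable regardless of the platform the build runs on.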

A new configuration flag, `legacy.collections`, is added for users who want to keep their current legacy (`content` and `data`) collections behavior (available in Astro v2 to v4), or who are not yet ready to update their projects:

```js
// astro.config.mjs
import { defineConfig } from 'astro/config';

export default defineConfig({
  legacy: {
    collections: true
  }
});
```

When set, no changes to your existing collections are necessary, and the restrictions on where new and old collections are stored still apply: legacy collections (only) must remain in `src/content/`, while new collections using a loader from the Content Layer API are forbidden in that folder.

5 changes: 5 additions & 0 deletions examples/with-markdoc/src/content/config.ts
@@ -0,0 +1,5 @@
import { defineCollection } from 'astro:content';

export const collections = {
  docs: defineCollection({})
};
@@ -0,0 +1,6 @@
import { defineCollection } from "astro:content";


const posts = defineCollection({});

export const collections = { posts };
2 changes: 2 additions & 0 deletions packages/astro/src/content/data-store.ts
@@ -34,6 +34,8 @@ export interface DataEntry<TData extends Record<string, unknown> = Record<string
*/
deferredRender?: boolean;
assetImports?: Array<string>;
+/** @deprecated */
+legacyId?: string;
}

/**
46 changes: 39 additions & 7 deletions packages/astro/src/content/loaders/glob.ts
@@ -35,7 +35,7 @@ function generateIdDefault({ entry, base, data }: GenerateIdOptions): string {
if (data.slug) {
return data.slug as string;
}
-const entryURL = new URL(entry, base);
+const entryURL = new URL(encodeURI(entry), base);
const { slug } = getContentEntryIdAndSlug({
entry: entryURL,
contentDir: base,
@@ -55,6 +55,15 @@ function checkPrefix(pattern: string | Array<string>, prefix: string) {
* Loads multiple entries, using a glob pattern to match files.
* @param pattern A glob pattern to match files, relative to the content directory.
*/
+export function glob(globOptions: GlobOptions): Loader;
+/** @private */
+export function glob(
+globOptions: GlobOptions & {
+/** @deprecated */
+_legacy?: true;
+},
+): Loader;
+
export function glob(globOptions: GlobOptions): Loader {
if (checkPrefix(globOptions.pattern, '../')) {
throw new Error(
@@ -80,19 +89,21 @@ export function glob(globOptions: GlobOptions): Loader {
>();

const untouchedEntries = new Set(store.keys());

+const isLegacy = (globOptions as any)._legacy;
+// If global legacy collection handling flag is *not* enabled then this loader is used to emulate them instead
+const emulateLegacyCollections = !config.legacy.collections;
async function syncData(entry: string, base: URL, entryType?: ContentEntryType) {
if (!entryType) {
logger.warn(`No entry type found for ${entry}`);
return;
}
-const fileUrl = new URL(entry, base);
+const fileUrl = new URL(encodeURI(entry), base);
const contents = await fs.readFile(fileUrl, 'utf-8').catch((err) => {
logger.error(`Error reading ${entry}: ${err.message}`);
return;
});

-if (!contents) {
+if (!contents && contents !== '') {
logger.warn(`No contents found for ${entry}`);
return;
}
@@ -103,6 +114,17 @@ export function glob(globOptions: GlobOptions): Loader {
});

const id = generateId({ entry, base, data });
+let legacyId: string | undefined;
+
+if (isLegacy) {
+const entryURL = new URL(encodeURI(entry), base);
+const legacyOptions = getContentEntryIdAndSlug({
+entry: entryURL,
+contentDir: base,
+collection: '',
+});
+legacyId = legacyOptions.id;
+}
untouchedEntries.delete(id);

const existingEntry = store.get(id);
@@ -132,6 +154,12 @@ export function glob(globOptions: GlobOptions): Loader {
filePath,
});
if (entryType.getRenderFunction) {
+if (isLegacy && data.layout) {
+logger.error(
+`The Markdown "layout" field is not supported in content collections in Astro 5. Ignoring layout for ${JSON.stringify(entry)}. Enable "legacy.collections" if you need to use the layout field.`,
+);
+}
+
let render = renderFunctionByContentType.get(entryType);
if (!render) {
render = await entryType.getRenderFunction(config);
@@ -160,6 +188,7 @@ export function glob(globOptions: GlobOptions): Loader {
digest,
rendered,
assetImports: rendered?.metadata?.imagePaths,
+legacyId,
});

// todo: add an explicit way to opt in to deferred rendering
@@ -171,9 +200,10 @@ export function glob(globOptions: GlobOptions): Loader {
filePath: relativePath,
digest,
deferredRender: true,
+legacyId,
});
} else {
-store.set({ id, data: parsedData, body, filePath: relativePath, digest });
+store.set({ id, data: parsedData, body, filePath: relativePath, digest, legacyId });
}

fileToIdMap.set(filePath, id);
@@ -222,7 +252,7 @@ export function glob(globOptions: GlobOptions): Loader {
if (isConfigFile(entry)) {
return;
}
-if (isInContentDir(entry)) {
+if (!emulateLegacyCollections && isInContentDir(entry)) {
skippedFiles.push(entry);
return;
}
@@ -240,7 +270,9 @@ export function glob(globOptions: GlobOptions): Loader {
? globOptions.pattern.join(', ')
: globOptions.pattern;

-logger.warn(`The glob() loader cannot be used for files in ${bold('src/content')}.`);
+logger.warn(
+`The glob() loader cannot be used for files in ${bold('src/content')} when legacy mode is enabled.`,
+);
if (skipCount > 10) {
logger.warn(
`Skipped ${green(skippedFiles.length)} files that matched ${green(patternList)}.`,
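
Two of the small changes in `glob.ts` are easy to check in isolation. The sketch below is only an illustration: the base URL and file name are made up, `decodeURIComponent(url.pathname)` is used as a stand-in for turning the URL back into a file path, and `shouldSkip` is a hypothetical wrapper around the condition from the diff.

```typescript
// Made-up collection base directory.
const base = new URL('file:///project/src/content/blog/');

// 1. A file whose name literally contains the characters "%20".
const entry = 'my%20file.md';
const withoutEncoding = new URL(entry, base);
const withEncoding = new URL(encodeURI(entry), base);

// Without encodeURI the "%20" is read as an escaped space, so the path
// points at a different file name.
console.log(decodeURIComponent(withoutEncoding.pathname));
// '/project/src/content/blog/my file.md'
console.log(decodeURIComponent(withEncoding.pathname));
// '/project/src/content/blog/my%20file.md'

// 2. An empty file is read as '', which is falsy, so `!contents` alone
// would skip it. The extra comparison keeps empty files.
function shouldSkip(contents: string | undefined): boolean {
  return !contents && contents !== '';
}

console.log(shouldSkip(undefined)); // true  (the read failed)
console.log(shouldSkip('')); // false (empty file is still an entry)
console.log(shouldSkip('# Hello')); // false
```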
42 changes: 16 additions & 26 deletions packages/astro/src/content/mutable-data-store.ts
@@ -4,7 +4,7 @@ import { Traverse } from 'neotraverse/modern';
import { imageSrcToImportId, importIdToSymbolName } from '../assets/utils/resolveImports.js';
import { AstroError, AstroErrorData } from '../core/errors/index.js';
import { IMAGE_IMPORT_PREFIX } from './consts.js';
-import { type DataEntry, ImmutableDataStore, type RenderedContent } from './data-store.js';
+import { type DataEntry, ImmutableDataStore } from './data-store.js';
import { contentModuleToId } from './utils.js';

const SAVE_DEBOUNCE_MS = 500;
@@ -197,7 +197,17 @@ export default new Map([\n${lines.join(',\n')}]);
entries: () => this.entries(collectionName),
values: () => this.values(collectionName),
keys: () => this.keys(collectionName),
-set: ({ id: key, data, body, filePath, deferredRender, digest, rendered, assetImports }) => {
+set: ({
+id: key,
+data,
+body,
+filePath,
+deferredRender,
+digest,
+rendered,
+assetImports,
+legacyId,
+}) => {
if (!key) {
throw new Error(`ID must be a non-empty string`);
}
@@ -244,6 +254,9 @@ export default new Map([\n${lines.join(',\n')}]);
if (rendered) {
entry.rendered = rendered;
}
+if (legacyId) {
+entry.legacyId = legacyId;
+}
if (deferredRender) {
entry.deferredRender = deferredRender;
if (filePath) {
@@ -335,30 +348,7 @@ export interface DataStore {
key: string,
) => DataEntry<TData> | undefined;
entries: () => Array<[id: string, DataEntry]>;
-set: <TData extends Record<string, unknown>>(opts: {
-/** The ID of the entry. Must be unique per collection. */
-id: string;
-/** The data to store. */
-data: TData;
-/** The raw body of the content, if applicable. */
-body?: string;
-/** The file path of the content, if applicable. Relative to the site root. */
-filePath?: string;
-/** A content digest, to check if the content has changed. */
-digest?: number | string;
-/** The rendered content, if applicable. */
-rendered?: RenderedContent;
-/**
-* If an entry is a deferred, its rendering phase is delegated to a virtual module during the runtime phase.
-*/
-deferredRender?: boolean;
-/**
-* Assets such as images to process during the build. These should be files on disk, with a path relative to filePath.
-* Any values that use image() in the schema will already be added automatically.
-* @internal
-*/
-assetImports?: Array<string>;
-}) => boolean;
+set: <TData extends Record<string, unknown>>(opts: DataEntry<TData>) => boolean;
values: () => Array<DataEntry>;
keys: () => Array<string>;
delete: (key: string) => void;