Search index and topic improvements #5

Open
wants to merge 2 commits into
base: main

sammeltassen

Search index

  • Metadata values are now added to the plaintext field to be indexed by Typesense.

Note that the indexed text may currently include HTML tags if they are present in the source manifest.
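
As a rough illustration of flattening metadata into a plaintext field, and of one possible way to deal with the HTML tags mentioned above, here is a minimal sketch. The `MetadataPair` shape and the `stripTags` and `metadataToPlaintext` names are assumptions made for this example, not code from this PR, and the regex-based tag removal is deliberately naive.

```typescript
// Assumed shape of a manifest metadata entry: language-keyed arrays of strings.
type LanguageMap = Record<string, string[]>;

interface MetadataPair {
  label: LanguageMap;
  value: LanguageMap;
}

// Naive tag removal (assumption): replaces anything that looks like <...>
// with a space, then collapses runs of whitespace.
function stripTags(input: string): string {
  return input.replace(/<[^>]*>/g, " ").replace(/\s+/g, " ").trim();
}

// Hypothetical helper: join every metadata value into one plaintext string.
function metadataToPlaintext(metadata: MetadataPair[]): string {
  const parts: string[] = [];
  for (const pair of metadata) {
    for (const lang of Object.keys(pair.value)) {
      for (const raw of pair.value[lang]) {
        const text = stripTags(raw);
        if (text.length > 0) {
          parts.push(text);
        }
      }
    }
  }
  return parts.join(" ");
}
```

Stripping at indexing time keeps the manifest itself untouched, at the cost of a cleanup step that may need hardening for real-world markup.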

Topics

  • Added a function to capitalize the first character of each topic

This can be disabled as follows:

    config:
      extract-topics:
        upperCase: false
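
The capitalization behaviour can be sketched roughly as follows. The helper name `capitalizeFirst` is hypothetical; only the `upperCase` option name comes from the config above, and treating it as enabled by default is inferred from the fact that it has to be switched off explicitly.

```typescript
// Hypothetical helper: upper-case only the first character of a topic label.
// `upperCase` mirrors the config option and is assumed to default to true.
function capitalizeFirst(topic: string, upperCase: boolean = true): string {
  if (!upperCase || topic.length === 0) {
    return topic;
  }
  return topic.charAt(0).toUpperCase() + topic.slice(1);
}

// capitalizeFirst("amsterdam")        → "Amsterdam"
// capitalizeFirst("amsterdam", false) → "amsterdam"
```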

  • Added an option to create date ranges

This can be enabled as follows:

    config:
      extract-topics:
        dateRange: ["date"]
        topicTypes:
          date: ['Date', 'Datering']

This processes metadata values such as 1900 - 1925 and creates a collection within the date topic for every year in the range, inclusive: 1900, 1901, ... 1924, 1925.
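
A minimal sketch of that expansion, assuming values are written as two four-digit years separated by a hyphen or en dash. The function name `expandYearRange` and the fallback behaviour for single years and unparseable values are assumptions for illustration, not the PR's actual implementation.

```typescript
// Hypothetical helper: expand "1900 - 1925" into every year in the range, inclusive.
function expandYearRange(value: string): number[] {
  const trimmed = value.trim();
  // Two 4-digit years separated by a hyphen or en dash, with optional spaces.
  const range = trimmed.match(/^(\d{4})\s*[-–]\s*(\d{4})$/);
  if (!range) {
    // Assumed fallback: a lone year yields itself, anything else yields nothing.
    return /^\d{4}$/.test(trimmed) ? [Number(trimmed)] : [];
  }
  const start = Number(range[1]);
  const end = Number(range[2]);
  if (start > end) {
    return [];
  }
  const years: number[] = [];
  for (let year = start; year <= end; year++) {
    years.push(year);
  }
  return years;
}

// expandYearRange("1900 - 1903") → [1900, 1901, 1902, 1903]
```

Each returned year would then become its own collection under the date topic.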

Todo: add an integer field to the Typesense index with the year values.
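
One possible shape for this, sketched under assumptions: the field name `years`, the `IndexedDocument` interface, and the `withYears` helper are all hypothetical. `int32[]` is an array field type supported by Typesense; whether the field should be faceted or optional is a design choice left open here.

```typescript
// Assumed field definition to add to the Typesense collection schema.
// The name "years" is hypothetical.
const yearsField = {
  name: "years",
  type: "int32[]",
  facet: true,
  optional: true,
};

// Assumed minimal shape of an indexed document.
interface IndexedDocument {
  id: string;
  plaintext: string;
  years?: number[];
}

// Hypothetical helper: attach year values to a document before indexing.
function withYears(doc: IndexedDocument, years: number[]): IndexedDocument {
  return years.length > 0 ? { ...doc, years } : doc;
}
```

An integer array would allow numeric range filters on years instead of matching on topic strings.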

Comment on lines -62 to +66
    - let plaintext = '';
    - const keywordsFile = join(api.files, 'keywords.txt');
    - if (existsSync(keywordsFile)) {
    -   plaintext = await readFile(keywordsFile, 'utf-8');
    - }
    + // let plaintext = '';
    + // const keywordsFile = join(api.files, 'keywords.txt');
    + // if (existsSync(keywordsFile)) {
    + //   plaintext = await readFile(keywordsFile, 'utf-8');
    + // }
Member

We can look at this again in the future. It was a way to let scripts extract plaintext (into keywords.txt) for indexing, but it made the output 2-3x larger!

@stephenwf
Member

Really nice addition with the dates too. I'll try and get this running locally to test soon 👍
