Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🎨 Made custom_excerpt searchable via Sodo Search #21998

Closed

Conversation

dkalpakchi
Copy link

no issue
When custom excerpt was defined in the posts, it was currently non-searchable, which is confusing UX-wise. On top of that hash symbols (#) or underscores (_) were not searchable

Got some code for us? Awesome 🎊!

Please include a description of your change & check your PR against this list, thanks!

  • There's a clear use-case for this code change, explained below
  • Commit message has a short title & references relevant issues
  • The build will pass (run yarn test:all and yarn lint)

We appreciate your contribution!

P.S. I couldn't run yarn test:all, got command not found, but nothing should fail with this small of a change.

no issue
When custom excerpt was defined in the posts, it was currently non-searchable, which is confusing UX-wise. On top of that hash symbols (#) or underscores (_) were not searchable
@cathysarisky
Copy link
Contributor

Hey @dkalpakchi , I'm confused by this PR. When search retrieves the excerpt field from the Content API, it receives the custom_excerpt if it is set, or the first bit of the post body if it isn't. There's not a need to retrieve or search the custom_excerpt separately.

I believe you're correct about missing # and _ - if you want to resubmit with just that fix, I'll be happy to review it. Sounds like there's also a hole in the tests there - to check that a # in the search field doesn't break anything, and also to check that a post with a # can be found.

(p.s. to run tests, you need to install ghost from source - here's a link. If you get stuck on getting your dev environment set up, come on over to https://forum.ghost.org for help! )

@dkalpakchi
Copy link
Author

Hey, thank you for your comment! I want to start by saying that it's my first PR and I really appreciate all tips I can get from you 🙂

Regarding searching the custom_excerpt, I am aware that if set it gets returned instead of the post's chunk. My need it to search both of them simultaneously.

Say I want to create a one-liner explaining the whole article as my custom_excerpt, and make put this article behind paywall. Then when someone searches with the current setup, they will only get the hits on my one-liner. But because the search is a regular inverted index without any kind of semantic search or similar, they would need to write in the words I use. In this case I think there is only gain to sodo search if we can broaden the search to both post's begining and custom excerpt.

Hope my explanation didn't confuse you more 🙂

Regarding tests, point taken - I'll have a look at how those are written in Ghost and try to add them.

@cathysarisky
Copy link
Contributor

Regarding searching the custom_excerpt, I am aware that if set it gets returned instead of the post's chunk. My need it to search both of them simultaneously.

As I understand it, that's not going to work, because requesting the 'excerpt' field returns the custom_excerpt if it is set. (That behavior is baked into the content API, not the search app.)

@dkalpakchi
Copy link
Author

dkalpakchi commented Jan 27, 2025

Yeah, it struck me now that that's what you meant. So basically when I add custom excerpt, it will just get duplicated, that was your point then. The reason I added this was that it didn't search custom excerpts on my Ghost instance when they were set, and with this fix it started to do so.

I need to double check why I added this fix. I'll get back to you when I'm in front of the computer.

@dkalpakchi
Copy link
Author

dkalpakchi commented Jan 27, 2025

Now I could recover my thinking back then, and it was indeed a miss on my part that excerpt gets the value of custom_excerpt most of the time. However the case I tested on was when I have a members-only article without public preview, but with custom excerpt. Then the Content API (which Sodo Search uses, I believe) returns empty excerpt, but returns the custom_excerpt string.

Here one could argue in multiple ways. One could make an argument that custom_excerpt should not have been returned from the Content API, so it's potentially a bug in the Content API.

One could also argue that custom_excerpt could be used in many different ways depending on the theme creator, so it's reasonable that custom_excerpt is returned by Content API, and should hence be searchable in such cases. This is the one I would personally argue for.

Obviously my PR doesn't solve correctly any of these problems, but given I'm the proponent of the second approach, I would basically still leave custom_excerpt searchable as per this PR, but then also added plaintext.substring(0, 500), which I believe is the definition of the excerpt. Ofc I won't be able to add it via the API directly, so I'll need to add this in the #updatePostIndex function.

However at this point I'm not sure which of these things is the most beneficial a typical Ghost user, so I'd like to get your opinion on this. Maybe what I'm proposing will break something conventional for Ghost users - not sure how people really use custom excerpts.

What do you think about all of this?

@cathysarisky
Copy link
Contributor

I think it'd make sense to submit the fix for missing # and _ (please!) as a separate PR, and then open a thread over on the Ghost forum to discuss some of these ideas.

@dkalpakchi
Copy link
Author

OK, will do during the weekend and will get back to you.

@dkalpakchi
Copy link
Author

Hi again! Sorry for the delay, PR #22102 is not ready for review, I believe. Closing this one for now.

@dkalpakchi dkalpakchi closed this Feb 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants