Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add changeset comment search api with filtering by author and time #4359

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

AntonKhorev
Copy link
Collaborator

Similar to #4248, but for the api. This is the first part that doesn't require any db changes. The next step is to add searching by user who created the commented changeset, then it will be possible to have a user's discussed changeset list like on https://hdyc.neis-one.org/.

  • There was a choice of route to add this request to, either index (used by changeset search), or search (used by notes search). I chose index because you like resourceful routes and there's no other obvious request to put at index.
  • Common user, time interval from/to, limit methods moved to a mixin.
  • Since limit parameter already existed for changeset comment feeds, added default/max limit settings like the ones for notes and changesets.

@AntonKhorev AntonKhorev force-pushed the changeset-comments-search-api branch from cbbbc07 to f45194d Compare November 27, 2023 00:16
@milan-cvetkovic
Copy link
Contributor

I think it would be useful to be able select changeset comments relevant to a user - which would include comments to all changesets they are subscribed to. Something along the lines of:

select * from changeset_comments
where changeset_id in (
  select changeset_id from changesets_subscribers
  where subscriber_id=%{user_id});

@tomhughes
Copy link
Member

That's a potentially very expensive query though so I'm not sure we want to be doing that in the live API.

@tomhughes
Copy link
Member

I should add that I haven't reviewed this PR at all yet so I have no idea what other queries it is doing or whether they are sustainable but in general "search" apis make me extremely nervous as they've often caused us major performance problems in the past.

@milan-cvetkovic
Copy link
Contributor

That's a potentially very expensive query though so I'm not sure we want to be doing that in the live API.

I understand the reluctance, one has to be careful with search queries.

I was thinking of the usefulness of such a query - I assume users would like to know what comments others wrote to their changesets (and replies to their comments).

Not underestimating the complexity, I think the query can be indexed and use single join.

@tomhughes
Copy link
Member

Oh there is an index so it won't have to do a scan - my concern is more that the set of changesets somebody is subscribed to might be very large.

@milan-cvetkovic
Copy link
Contributor

milan-cvetkovic commented Feb 6, 2024

It shouldn't be too bad. It is the number of comments on their subscribed changesets that is the limiting factor, rather than number of subscribed changesets. Inner join should get rid of the changesets without comments, and we can limit the size of the output. I am not sure what the heavy duty contributors' numbers would be though - I am the only one commenting on my changesets so far :-(

Here is the query rewritten to use JOIN explicitly:

SELECT * FROM changesets_subscribers S
INNER JOIN changeset_comments M
ON S.changeset_id=M.changeset_id
WHERE S.subscriber_id='<userid>' AND M.visible = TRUE
ORDER by M.created_at DESC
LIMIT 100

@tomhughes
Copy link
Member

Yes a join is probably better than a subquery here - the optimiser might do that anyway but join is more what we want so is likely a better choice.

The current record appears to be a user that is subscribed to nearly 650000 changesets ;-)

@milan-cvetkovic
Copy link
Contributor

The current record appears to be a user that is subscribed to nearly 650000 changesets ;-)

Wow! Probably the number of changeset comments for him/her is also in thounsands

@tomhughes
Copy link
Member

Actually not really - they have just made a LOT of changesets (like nearly twice as many as the next biggest user) but they have never had a single comment!

The worst case for number of changesets commented on is a user with about 20000 changesets that attracted comments, basically because DWG reverted their work and commented on all the changesets.

The worst case for number of changeset comments is a user with just over 40000 comments because somebody took a dislike to them and robo-posted 300+ comments on each of their changesets.

@AntonKhorev
Copy link
Collaborator Author

I think it would be useful to be able select changeset comments relevant to a user - which would include comments to all changesets they are subscribed to.

To any user? I don't think your subscriptions are revealed to anyone other than yourself currently.

@milan-cvetkovic
Copy link
Contributor

I think it would be useful to be able select changeset comments relevant to a user - which would include comments to all changesets they are subscribed to.

To any user? I don't think your subscriptions are revealed to anyone other than yourself currently.

I don't think that would be wise. Only to the logged-in user.

In other words, it would be nice to have an ability to ask for "List of all changeset comments I to changesets I am subscribed to, newer than 7 days, maximum of 100 in the list"

@AntonKhorev
Copy link
Collaborator Author

I don't think that would be wise. Only to the logged-in user.

This is usually done with a different endpoint:

@AntonKhorev AntonKhorev force-pushed the changeset-comments-search-api branch 2 times, most recently from e83658c to e2771f2 Compare February 28, 2024 12:42
@AntonKhorev AntonKhorev force-pushed the changeset-comments-search-api branch 2 times, most recently from b270ebb to 04b5213 Compare March 13, 2024 23:17
@AntonKhorev AntonKhorev force-pushed the changeset-comments-search-api branch from 04b5213 to bd08a93 Compare April 1, 2024 22:58
@AntonKhorev AntonKhorev force-pushed the changeset-comments-search-api branch from bd08a93 to fed17ac Compare June 12, 2024 13:11
@AntonKhorev AntonKhorev added api Related to the XML or JSON APIs changesets Related to the Changesets feature labels Jul 3, 2024
@AntonKhorev AntonKhorev force-pushed the changeset-comments-search-api branch from fed17ac to 6c56855 Compare July 15, 2024 12:14
@AntonKhorev AntonKhorev force-pushed the changeset-comments-search-api branch from 6c56855 to 2afcba4 Compare September 6, 2024 05:25
@AntonKhorev AntonKhorev force-pushed the changeset-comments-search-api branch from 2afcba4 to 32dac6f Compare December 20, 2024 12:04
@AntonKhorev AntonKhorev force-pushed the changeset-comments-search-api branch from 32dac6f to 8d058ac Compare December 20, 2024 22:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api Related to the XML or JSON APIs changesets Related to the Changesets feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants