Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Update collection sorting, metadata, stats #2327

Merged
merged 22 commits into from
Jan 23, 2025

Conversation

tw4l
Copy link
Member

@tw4l tw4l commented Jan 21, 2025

Fixes #2326

This slightly reworks the existing page counts in our API to include snapshotCount as distinct from pageCount (which now only counts unique URLs).

Tweaks migration 0040 (will require rerun_from_migration to be set to rerun this on dev or elsewhere where we've run that already) to include snapshot count and adds a new migration 0041 to update collection counts.

@tw4l tw4l requested a review from ikreymer January 21, 2025 20:07
@tw4l tw4l marked this pull request as draft January 21, 2025 20:08
@tw4l tw4l assigned tw4l and SuaYoo Jan 21, 2025
@SuaYoo
Copy link
Member

SuaYoo commented Jan 21, 2025

@tw4l looks like we also need to decrement the counts when archived items are deleted.

@SuaYoo SuaYoo removed their assignment Jan 21, 2025
@tw4l tw4l marked this pull request as ready for review January 21, 2025 21:37
@tw4l
Copy link
Member Author

tw4l commented Jan 21, 2025

@tw4l looks like we also need to decrement the counts when archived items are deleted.

Done, good catch! Looks like that was a bug that's been around for a while.

@tw4l
Copy link
Member Author

tw4l commented Jan 21, 2025

@ikreymer marked this as ready for review. I've fixed the issues Sua found while testing and tested locally that all looks good, but we decided to do the frontend updates in a separate PR.

@SuaYoo
Copy link
Member

SuaYoo commented Jan 21, 2025

Tested locally with #2324, LGTM!

ikreymer pushed a commit that referenced this pull request Jan 22, 2025
- Fixes #2321
- Resolves #2323

Follows #2327, should be
rebased and merged afterwards

## Changes

- Refactors dashboard and org profile preview to use private API
endpoint, to fix public collections not showing the org visibility is
hidden
- Enables sorting collections by `dateLatest`, sorts public collections
by `dateLatest` by default
- Enables sorting collections by page count
- Shows collection period (i.e. `dateEarliest` to `dateLatest`) in
collections list
- Shows same collection metadata in private and public views, updates
private view info bar
- Fixes "Update Org Profile" action item showing for crawler roles





<!-- ## Follow-ups -->

---------

Co-authored-by: Tessa Walsh <[email protected]>
@SuaYoo SuaYoo changed the title Add snapshotCount to org, collection, and archived item models feat: Update collection sorting, metadata, stats Jan 22, 2025
tw4l and others added 14 commits January 21, 2025 18:32
- Fixes #2321
- Resolves #2323

Follows #2327, should be
rebased and merged afterwards

## Changes

- Refactors dashboard and org profile preview to use private API
endpoint, to fix public collections not showing the org visibility is
hidden
- Enables sorting collections by `dateLatest`, sorts public collections
by `dateLatest` by default
- Enables sorting collections by page count
- Shows collection period (i.e. `dateEarliest` to `dateLatest`) in
collections list
- Shows same collection metadata in private and public views, updates
private view info bar
- Fixes "Update Org Profile" action item showing for crawler roles





<!-- ## Follow-ups -->

---------

Co-authored-by: Tessa Walsh <[email protected]>
@ikreymer
Copy link
Member

Per suggestions in discord, I think we should keep pageCount as is and add a uniquePageCount, and not use the term snapshot at all..

I think term snapshot could easily be confused for non-page URLs as well as screenshots..

@SuaYoo SuaYoo marked this pull request as draft January 22, 2025 18:48
@SuaYoo
Copy link
Member

SuaYoo commented Jan 22, 2025

@ikreymer @tw4l Updated frontend, not sure if it's a migration issue but uniquePageCount is returning 0 even though there are pages crawled (running the backend locally):
Screenshot 2025-01-22 at 1 24 40 PM

Also updated this dropdown (cc @Shrinks99 )
Screenshot 2025-01-22 at 1 27 02 PM

@SuaYoo SuaYoo marked this pull request as ready for review January 22, 2025 21:29
@ikreymer ikreymer added this to the Public Collections milestone Jan 22, 2025
@SuaYoo SuaYoo self-requested a review January 22, 2025 22:32
Copy link
Member

@SuaYoo SuaYoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested both backend and frontend locally, looks good to me

Copy link
Member

@ikreymer ikreymer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed small typo in migration, should be good now!

@tw4l tw4l merged commit 763c654 into main Jan 23, 2025
27 checks passed
@tw4l tw4l deleted the issue-2326-page-snapshot-counts branch January 23, 2025 18:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Keep separate counts for page and unique page counts
3 participants