Design and implement activity storing system #311

Open
arkid15r opened this issue Dec 30, 2024 · 7 comments

@arkid15r
Collaborator

arkid15r commented Dec 30, 2024

Is your feature request related to a problem? Please describe.
We get a bunch of different events from the GitHub API: issues, PRs, releases, repositories, users.

Describe the solution you'd like
There is a need to standardize this data within a separate entity. Each record should be linked to a project/repo and user (normally we have this data). After indexing it we can use it for project and user timelines on corresponding Nest pages.
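As a rough illustration of what such a standardized entity could look like, here is a minimal sketch. `ActivityRecord` and `normalize_event` are hypothetical names, not existing Nest code; the input keys (`id`, `type`, `actor.login`, `repo.name`, `created_at`) follow the shape of objects returned by the GitHub events API linked under "Additional context".

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any


@dataclass(frozen=True)
class ActivityRecord:
    """Hypothetical standardized activity entity (not an existing Nest model)."""

    event_id: str
    event_type: str        # e.g. "IssuesEvent", "PullRequestEvent", "ReleaseEvent"
    repository: str        # "owner/name", as reported by the events API
    user_login: str
    occurred_at: datetime


def normalize_event(event: dict[str, Any]) -> ActivityRecord:
    """Map one GitHub events API object onto the standardized record."""
    # GitHub timestamps end in "Z"; replace it so fromisoformat() accepts the
    # value on Python versions older than 3.11 as well.
    occurred_at = datetime.fromisoformat(event["created_at"].replace("Z", "+00:00"))
    return ActivityRecord(
        event_id=str(event["id"]),
        event_type=event["type"],
        repository=event["repo"]["name"],
        user_login=event["actor"]["login"],
        occurred_at=occurred_at,
    )
```

A real model would presumably reference the existing repository and user records by key rather than by name; plain strings keep the sketch self-contained.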

Additional context
https://docs.github.com/en/rest/activity/events?apiVersion=2022-11-28#list-public-events

@arkid15r arkid15r added this to the Main page 🏠 milestone Dec 30, 2024
@github-project-automation github-project-automation bot moved this to Backlog in Project Nest Dec 30, 2024
@arkid15r arkid15r moved this from Backlog to Todo in Project Nest Dec 30, 2024
@yashpandey06
Collaborator

So we'll store this activity in the DB?

@arkid15r
Collaborator Author

So we'll store this activity in the DB?

Yes, and index it using our index engine.

@yashpandey06
Collaborator

yashpandey06 commented Dec 31, 2024

Okay!

@yashpandey06
Collaborator

Can I take up this issue?

@arkid15r
Collaborator Author

Yes, just note it's a purely backend task that will require dealing with models and sync process updates.

@yashpandey06
Collaborator

yashpandey06 commented Jan 1, 2025

Yeah, I understand. This is my approach, broken down into steps.

Note: Earlier I was thinking of going with webhooks for real-time updates, but we already have Celery here, right? So we could go with scheduled polling. I just want to make sure which of the two would be better.

1. Event Data Model

  • Event: GitHub event data linked to repositories and users.
  • Repository: Repository metadata.
  • User: GitHub user information.

2. Data Fetching

  • Fetch events using GitHub API.

3. Data Storage

  • Store data in DB.
  • Link events to repositories and users.

4. Indexing with Algolia

  • Push event data to Algolia indices.

5. Sync Mechanism

  • Monitor any changes.
  • Sync DB and Algolia.
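
The storage and linking steps above (3 and the DB half of 5) could be sketched roughly as follows. This is only an illustration using the standard-library `sqlite3` module, not the project's actual models; the table names, column names, and the `open_store` / `store_event` helpers are all assumptions. Indexing is left out. The point of the sketch is that keying events on GitHub's event ID makes repeated syncs idempotent.

```python
import sqlite3

# Hypothetical schema: every event row links to one repository and one user.
SCHEMA = """
CREATE TABLE IF NOT EXISTS repositories (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE IF NOT EXISTS users (
    id INTEGER PRIMARY KEY,
    login TEXT UNIQUE NOT NULL
);
CREATE TABLE IF NOT EXISTS events (
    github_id TEXT PRIMARY KEY,
    type TEXT NOT NULL,
    created_at TEXT NOT NULL,
    repository_id INTEGER NOT NULL REFERENCES repositories(id),
    user_id INTEGER NOT NULL REFERENCES users(id)
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def _get_or_create(conn: sqlite3.Connection, table: str, column: str, value: str) -> int:
    # table and column are fixed literals supplied by store_event, never user input.
    conn.execute(f"INSERT OR IGNORE INTO {table} ({column}) VALUES (?)", (value,))
    row = conn.execute(f"SELECT id FROM {table} WHERE {column} = ?", (value,)).fetchone()
    return row[0]


def store_event(conn: sqlite3.Connection, github_id: str, event_type: str,
                created_at: str, repo_name: str, user_login: str) -> bool:
    """Store one event linked to its repository and user.

    Returns True if the event was new, False if it had already been stored.
    """
    repo_id = _get_or_create(conn, "repositories", "name", repo_name)
    user_id = _get_or_create(conn, "users", "login", user_login)
    cursor = conn.execute(
        "INSERT OR IGNORE INTO events (github_id, type, created_at, repository_id, user_id) "
        "VALUES (?, ?, ?, ?, ?)",
        (github_id, event_type, created_at, repo_id, user_id),
    )
    conn.commit()
    return cursor.rowcount == 1
```

With this shape, running the same sync twice leaves a single row per event, so only the events reported as new would need to be pushed to the index.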

@arkid15r
Collaborator Author

arkid15r commented Jan 3, 2025

No, we don't have Celery, and in general I'd prefer subscribing to webhook events over polling the API for data, in order to stay within the API rate limit. Eventually we're going to have a fetch (for historical data) + webhook subscription system for activity. At least that's my vision for now, as I haven't looked into implementation details yet.
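
For the webhook side, one piece any receiver would likely need is checking that a delivery really comes from GitHub. A minimal sketch, assuming the webhook is configured with a secret: GitHub then sends an `X-Hub-Signature-256` header of the form `sha256=<hex digest>`, computed as an HMAC-SHA256 over the raw request body. The function name `is_valid_signature` is hypothetical, and how it would plug into the backend is not decided here.

```python
import hashlib
import hmac


def is_valid_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check a webhook delivery against its X-Hub-Signature-256 header value."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Compare as bytes in constant time to avoid leaking where the mismatch is.
    return hmac.compare_digest(expected.encode(), signature_header.encode())
```

The check has to run on the raw request bytes, before any JSON parsing, because re-serializing the payload can change the bytes and therefore the digest.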

So I guess you understand the complexity and are willing to tackle it. Assigning this to you, @yashpandey06.

@arkid15r arkid15r moved this from Todo to In progress in Project Nest Jan 3, 2025
@arkid15r arkid15r assigned arkid15r and unassigned yashpandey06 Jan 9, 2025
Status: In progress
Development

No branches or pull requests

2 participants