
Added onclick for organization icons and limit in github data fetcher #217

Open
wants to merge 4 commits into master

Conversation

KasRoudra
Contributor

Summary of changes

  • Formatted queries (fixed indentation)
  • Added argument support in the fetcher file to fetch more data (more than 100 pull requests and issues, up to a given limit)
  • Added onclick for organization icons

Details on fetcher

Now git_data_fetcher.mjs accepts four arguments: --pr-limit (-pl) for pull requests, --issue-limit (-il) for issues, --org-limit (-ol) for organizations, and --limit (-l) for all three. Example: node git_data_fetcher.mjs -l 200. Until the limit is reached, we keep requesting the GitHub API and appending the data. So user-contributed organizations will now appear properly.
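
A minimal sketch of that fetch-until-limit loop (fetchPage is an illustrative stand-in for one GitHub API call, not a function in git_data_fetcher.mjs):

// Fetch items page by page until `limit` entries have been collected.
async function fetchUpTo(limit, fetchPage) {
  const all = [];
  let cursor = null;
  while (all.length < limit) {
    const { nodes, pageInfo } = await fetchPage(cursor); // one API request
    all.push(...nodes);
    if (!pageInfo.hasNextPage) break; // no more data on GitHub's side
    cursor = pageInfo.endCursor;
  }
  return all.slice(0, limit); // trim so the limit is exact
}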

@saiteja13427 saiteja13427 self-requested a review June 20, 2022 15:35
@saiteja13427 saiteja13427 self-assigned this Jun 20, 2022
@ashutosh1919
Owner

@saiteja13427 I think this one should be blocked until the one you raised is merged.

@saiteja13427
Collaborator

@ashutosh1919 I think this is a better approach than mine, though I will review it properly once and then see which one we go with.

@saiteja13427
Collaborator

@KasRoudra I have tried testing the git script: I ran node git_data_fetcher.mjs -l 30, but I still got all of my 86 PRs.

Did you test the script properly, or am I doing something wrong?

@KasRoudra
Contributor Author

@saiteja13427 You aren't doing anything wrong; that's the current behaviour of the code. The limit isn't sharp yet: if you give a number from 101-199, it will return 100 results, and for 201-299 you will get 200 results. Hope you get the point. I've only implemented the core logic so far, focusing on data fetching. I will try to fix the limit issue in the next commit. Thanks for your review.
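
A rough model of that rounding, for illustration only (not the actual script code):

// Whole pages of 100 are fetched and never trimmed, so the effective
// limit rounds down to a multiple of 100, with one page always fetched.
const effectiveLimit = (limit) => Math.max(100, Math.floor(limit / 100) * 100);
effectiveLimit(30);  // 100 -> one page returns all 86 PRs
effectiveLimit(150); // 100
effectiveLimit(250); // 200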

@saiteja13427
Collaborator

@KasRoudra Yeah sure, do fix this and push a commit!

Let's make it flexible with any number as the limit.

@KasRoudra
Contributor Author

@saiteja13427 I've pushed the commit. Give it a test!

@saiteja13427
Collaborator

@KasRoudra Tested it.

I ran node git_data_fetcher.mjs -pl 30; it did fetch only 30 PRs, but it skipped issues and organisations. It fetches issues and organisations only when I give them a limit.

Also, for orgs and issues, if I give a limit larger than the number of issues and orgs I have, it doesn't fetch anything at all.

This needs to be corrected. I hope you understood both points!

@KasRoudra
Contributor Author

@saiteja13427 The first issue you mentioned isn't reproducible for me. You can check the code:

// Each specific limit falls back to the global --limit/-l value;
// numify converts the raw flag to a number, defaulting to 100 when unset.
const limit = numify(args.limit || args.l);
const pr_limit = numify(args["pr-limit"] || args.pl || limit);
const issue_limit = numify(args["issue-limit"] || args.il || limit);
const org_limit = numify(args["org-limit"] || args.ol || limit);

Even if you skip a value, it will automatically default to 100. Maybe that issue was a temporary GitHub bug. However, as of the last commit, limits larger than the data length are ignored. Let me know if you find any other issues.
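
For example, given the fallbacks above:

// node git_data_fetcher.mjs -pl 150      -> pr_limit = 150, issue_limit = 100, org_limit = 100
// node git_data_fetcher.mjs -l 200 -ol 50 -> pr_limit = 200, issue_limit = 200, org_limit = 50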

@saiteja13427 saiteja13427 removed their request for review June 25, 2022 09:22
@saiteja13427
Collaborator

Yup, now it seems to work well. Basically, both issues were linked; that's why the first one was not reproducible for you.

@ashutosh1919 I have checked this PR. Please do have a look and then we can merge it.

Thanks, @KasRoudra for the changes.

@ashutosh1919
Owner

@KasRoudra @saiteja13427 I think it should depend on the data we have already produced.

So, what I originally expected was this: we first need to check whether we have any PRs already fetched. If we do, then we need to find the PRs which are not yet available locally. For example, if you run git_data_fetcher.mjs -pl 120, it should fetch up to 120 PRs which are not already available in the website's local data. We can distinguish what is available from what is not by the PR id. It is okay to make multiple API calls internally to fetch more than 100 PRs.

The same should happen for issues and orgs.

What do you both think about this expected behaviour?

@KasRoudra
Contributor Author

@ashutosh1919 That would be a nice feature, but I don't get the point of using the local data. The local data will also be available in GitHub's database, and it will be fetched automatically during API calls. So why should we read and append the local data?

@ashutosh1919
Owner

We are storing data locally in order to improve performance; fetching everything from the API on every run would cause latency issues.

@saiteja13427
Collaborator

saiteja13427 commented Jun 26, 2022

I think the behaviour should be like this.

By default, the script should fetch all PRs, issues, and orgs the first time; from the second run onwards, it should just append the new items to the local data.

On top of that, we can keep the limit feature: do the same thing, but cut the data down to the given number of PRs, orgs, or issues, so that users can keep a specific number of their latest contributions if they want (see the sketch below).

What do you think @KasRoudra @ashutosh1919
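
A minimal sketch of that trimming step (names illustrative; assumes the merged data is sorted newest-first):

// After merging old and new data, keep only the `limit` latest items.
const trimmed = limit ? merged.slice(0, limit) : merged;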

@KasRoudra
Contributor Author

@saiteja13427 If we fetch "all" PRs, the fetched data will automatically contain the local data, so we don't need to read or append the local copies. @ashutosh1919 suggests that we should fetch only those PRs which aren't available locally. Currently, I have no idea how to fetch those accurately. If the user doesn't use a limit, we can save startCursor and endCursor locally and later use them in the next API call for pagination. But if the user uses a limit, the data will be inaccurate, because by default a cursor covers 100 entries.
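
For reference, a minimal sketch of the cursor idea against the GitHub GraphQL API (the pageInfo/after shape is standard GraphQL pagination; the cursor.json persistence is illustrative):

import fs from "fs";

// Resume pagination from a previously saved endCursor, if any.
const saved = fs.existsSync("cursor.json")
  ? JSON.parse(fs.readFileSync("cursor.json", "utf8"))
  : { endCursor: null };

const query = `
  query ($cursor: String) {
    viewer {
      pullRequests(first: 100, after: $cursor) {
        nodes { id title }
        pageInfo { endCursor hasNextPage }
      }
    }
  }`;

// Send `query` with variables { cursor: saved.endCursor } to
// https://api.github.com/graphql, then persist pageInfo.endCursor
// for the next run, e.g.:
// fs.writeFileSync("cursor.json", JSON.stringify({ endCursor: pageInfo.endCursor }));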

@ashutosh1919
Owner

@KasRoudra no need to get confused. You don't need to worry about cursors. For each script run, we need to fetch all PRs/issues/orgs, irrespective of the arguments given. And all means all, even if the number of PRs is >>100. Once we have the data, we can compare it against the PR ids we already have.

Also, the GitHub API doesn't return data that is more than 1 year old, IMO. That is why we are appending the new data and not replacing it.

@saiteja13427
Collaborator

Alright, let's make it clear.

So, we are going with the appending approach.

When a user runs the script and has no old data, we just fetch everything.

If the user has data, we fetch all data and keep appending it until the id matches the latest data-id from the old data.
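
A minimal sketch of that append-until-match step (names illustrative; assumes both lists are sorted newest-first and items carry GitHub ids):

// Append freshly fetched items until we hit the newest id already stored.
function appendNew(oldData, fetched) {
  const latestId = oldData[0]?.id; // newest item from the previous run
  const fresh = [];
  for (const item of fetched) {
    if (item.id === latestId) break; // everything from here on is already local
    fresh.push(item);
  }
  return [...fresh, ...oldData];
}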

@KasRoudra
Contributor Author

So, our next goal is now clear. We can already get all PRs, issues, and orgs. We also have the limiting feature. The only remaining piece is appending to the local data. I think @saiteja13427 can do it better, as he has already worked on it. I suggest merging this one for now; @saiteja13427 can then raise a new PR for that feature.

@KasRoudra
Contributor Author

@ashutosh1919 Any decision on the pending pull requests?

@ashutosh1919
Owner

@saiteja13427 @KasRoudra what is the status of this PR? I think it is a much-needed feature and we should get it done with this one.

@saiteja13427
Collaborator

@ashutosh1919 this got left out, actually. Let me test it once again and we will merge this PR first.

Then, on top of it, I will add the appending feature we discussed!
