[Github] Failure to Read > 29 Branches #327

ZoriaRPG · 2019-11-03T05:11:31Z

Related to #82

When importing from GitHub in a new environment, source-integration fails to import all branches/revisions. From what we can observe, the likely culprit is the GitHub API's pagination:

e.g. the first 'page' of our repo:
https://api.github.com/repos/armageddongames/zeldaclassic/branches

The plugin never tries to parse beyond that first page; perhaps at one time, they were one unified page?

See:
https://api.github.com/repos/armageddongames/zeldaclassic/branches?page=2
https://api.github.com/repos/armageddongames/zeldaclassic/branches?page=3
&c.

Beyond this, it seems to always stop at the 29th branch, out of 30. I cannot explain this one unless the software has a hardcoded limit, but we can reproduce it with 100% reliability.

Fetching all branches with '*', we always stop at the 29th branch on the first page, and the plugin believes that there remains naught more to see, or to fetch beyond that, unless we manually direct it at each missing branch.

We tested this from the CLI, to rule out a server or PHP timeout. It happens every time, without variation.

dregad · 2019-12-23T16:02:20Z

Looks like your observation is correct. According to GitHub's API documentation:

"Requests that return multiple items will be paginated to 30 items by default"

"perhaps at one time, they were one unified page"

More likely, nobody ever faced (or reported) this problem before.

"it seems to always stop at the 29th branch, out of 30"

The JSON returned by the API does contain 30 elements, but please consider that the array is 0-based, so the last branch has index 29. Can you please confirm that you are actually getting 29 branches? If so, then something else is possibly broken.
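To make the off-by-one point concrete, here is a minimal sketch. The array is a hypothetical stand-in for one decoded page of the branches endpoint, not real API output, and `array_key_last()` assumes PHP 7.3 or later.

```php
<?php
// Hypothetical stand-in for one decoded page of branches: 30 entries,
// matching GitHub's default page size.
$t_branches = array();
for( $i = 1; $i <= 30; $i++ ) {
    $t_branches[] = array( 'name' => "branch-$i" );
}

$t_count = count( $t_branches );               // number of elements
$t_last_index = array_key_last( $t_branches ); // highest index (PHP >= 7.3)

echo "count: $t_count, last index: $t_last_index\n";
```

A last index of 29 therefore means a full page of 30 branches was read, not that one went missing.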

In any case, there is indeed a bug, because the logic to retrieve branches does not do any pagination.

Proper pagination handling per GitHub's documentation requires reading the response's Link header. That will not be easy to fix, because the low-level MantisBT APIs currently used to retrieve the JSON from GitHub return only the payload, not the headers. This would require some refactoring to consume the API through another method (e.g. GuzzleHttp).
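For illustration, a rough sketch of what reading the Link header involves, in plain PHP without any HTTP library. The helper name `link_header_next_url()` and the example.test URLs are hypothetical, and the regular expression assumes each link is written as `<URL>; rel="name"` with `rel` as the first parameter.

```php
<?php
/**
 * Return the URL tagged rel="next" in a Link header value, or null if
 * there is none. Hypothetical helper, not part of MantisBT or the plugin.
 */
function link_header_next_url( $p_header ) {
    // Capture every <URL>; rel="name" pair found in the header value
    if( !preg_match_all( '/<([^>]*)>\s*;\s*rel="([^"]*)"/', $p_header, $t_matches, PREG_SET_ORDER ) ) {
        return null;
    }
    foreach( $t_matches as $t_match ) {
        if( $t_match[2] === 'next' ) {
            return $t_match[1];
        }
    }
    return null;
}

// Made-up header value in the shape described above
$t_header = '<https://api.example.test/branches?page=2>; rel="next", '
    . '<https://api.example.test/branches?page=19>; rel="last"';
echo link_header_next_url( $t_header ), "\n";
```

On the last page there is no `next` relation, so the helper returns null and the caller can stop.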

Maybe as a workaround we could try (against GitHub's recommendation) to construct the URL ourselves, increasing the page number until the payload is empty.
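The workaround could look roughly like the sketch below. `fetch_all_pages()` and the `$p_fetch_page` callable are hypothetical names, and the HTTP call is replaced by a fake that serves 65 made-up branches 30 per page, so the snippet runs without network access.

```php
<?php
/**
 * Collect items from a paginated endpoint by requesting page 1, 2, 3, ...
 * until a page comes back empty.
 *
 * $p_fetch_page is a hypothetical callable( int $page ): array returning
 * the decoded JSON array for one page. $p_max_pages is a safety cap so a
 * misbehaving endpoint cannot cause an endless loop.
 */
function fetch_all_pages( callable $p_fetch_page, $p_max_pages = 1000 ) {
    $t_items = array();
    for( $t_page = 1; $t_page <= $p_max_pages; $t_page++ ) {
        $t_batch = $p_fetch_page( $t_page );
        if( empty( $t_batch ) ) {
            // Empty payload - all pages have been processed
            break;
        }
        $t_items = array_merge( $t_items, $t_batch );
    }
    return $t_items;
}

// Stand-in for the real HTTP request: 65 fake branches, 30 per page
$t_fake_branches = range( 1, 65 );
$t_fetch = function( $p_page ) use ( $t_fake_branches ) {
    return array_slice( $t_fake_branches, ( $p_page - 1 ) * 30, 30 );
};

$t_all = fetch_all_pages( $t_fetch );
echo count( $t_all ), " branches\n";
```

The cost of this approach is one extra request, since the loop only learns it is done by fetching an empty page.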

dregad · 2019-12-23T16:51:48Z

Just a quick proof-of-concept

// Set to max value allowed by GitHub API, to minimize number of requests
$t_per_page = 100; 
$t_url = "https://api.github.com/repos/armageddongames/zeldaclassic/branches?per_page=$t_per_page";
$t_page = 1;
$t_count_branches = 0;

$t_options = [
    // Whatever is needed here, e.g. proxy, etc.
    // Reference http://docs.guzzlephp.org/en/stable/request-options.html
];
$t_client = new GuzzleHttp\Client( $t_options );

do {
    echo "Processing page ", $t_page++, " - GET $t_url\n";
    $res = $t_client->get( $t_url );

    // Process payload
    $t_json = json_decode( $res->getBody() );
    $t_count_branches += count( $t_json );

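    // Get the next page
    // Note: parse_header() is the guzzlehttp/psr7 1.x function; psr7 2.x
    // replaces it with GuzzleHttp\Psr7\Header::parse()
    $links = GuzzleHttp\Psr7\parse_header( $res->getHeader( 'Link' ) );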
    foreach( $links as $link ) {
        if( $link['rel'] == 'next' ) {
            $t_url = trim( $link[0], '<>' );
            continue 2;
        }
    }
    // There is no "next" link - all pages have been processed
    break;
} while (true);

echo "Total $t_count_branches branches found.\n";

Returns 553 branches, matching the number shown at https://github.com/armageddongames/zeldaclassic/.
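As a quick sanity check on why the proof-of-concept raises per_page to 100, the arithmetic below compares request counts for 553 branches at the default page size and at the maximum. `requests_needed()` is a hypothetical helper used only for this illustration.

```php
<?php
// Number of requests needed to read $p_total items at $p_per_page per page
function requests_needed( $p_total, $p_per_page ) {
    return (int)ceil( $p_total / $p_per_page );
}

$t_total = 553; // branch count reported above
foreach( array( 30, 100 ) as $t_per_page ) {
    echo "per_page=$t_per_page: ", requests_needed( $t_total, $t_per_page ), " requests\n";
}
```

That is 19 requests at the default of 30 per page versus 6 at 100 per page.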

dregad added the bug and GitHub labels on Dec 23, 2019

dregad closed this as completed in 66cec45 Feb 13, 2020
