Tests when prefetch buffer is full #327

Open
alyosha-tas opened this issue Sep 4, 2023 · 6 comments
@alyosha-tas

alyosha-tas commented Sep 4, 2023

I started looking more carefully at the prefetcher, as that seems like the most likely thing wrong with Shrek 2 timing. Shrek 2 uses EWRAM a lot, so there is a lot of time for the prefetcher to actually do something.

Here are some basic tests:

https://github.com/alyosha-tas/gba-tests/tree/master/prefetcher

Currently NanoBoyAdvance fails one test in the 'prefetcher.gba' test ROM: the one that checks what happens when the prefetch buffer is full. For the test to work out, it seems that the prefetch unit must wait until the buffer is empty before it restarts, and when it does restart it uses non-sequential timing. Otherwise you get 51 instead of the required 56.
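
To make the buffer-full behaviour described above concrete, here is a toy sketch of the two candidate models. It is not NanoBoyAdvance code and not a hardware-accurate model: the class name `ToyPrefetcher`, the 8-halfword capacity, and the cycle counts (2 sequential, 4 non-sequential) are placeholder assumptions, and the CPU's own ROM fetches are ignored entirely.

```python
class ToyPrefetcher:
    """Toy model of a ROM prefetch buffer; all cycle counts are placeholders."""

    def __init__(self, capacity=8, seq_cycles=2, nonseq_cycles=4,
                 restart_only_when_empty=True):
        self.capacity = capacity
        self.seq_cycles = seq_cycles
        self.nonseq_cycles = nonseq_cycles
        self.restart_only_when_empty = restart_only_when_empty
        self.count = 0                   # halfwords currently buffered
        self.countdown = nonseq_cycles   # cycles left on the in-flight fetch
        self.halted = False              # True once the buffer has filled

    def tick(self):
        """Advance one cycle while the CPU is busy elsewhere (e.g. in EWRAM)."""
        if self.halted:
            return
        self.countdown -= 1
        if self.countdown == 0:
            self.count += 1
            if self.count == self.capacity:
                self.halted = True       # buffer full: the burst stops
            else:
                self.countdown = self.seq_cycles

    def consume(self):
        """CPU takes one buffered halfword; returns False if the buffer is empty."""
        if self.count == 0:
            return False
        self.count -= 1
        if self.halted:
            if not self.restart_only_when_empty:
                # Naive variant: refill as soon as a slot frees up.
                self.halted = False
                self.countdown = self.seq_cycles
            elif self.count == 0:
                # Variant described above: wait for empty, then restart
                # with a non-sequential access.
                self.halted = False
                self.countdown = self.nonseq_cycles
        return True


hw = ToyPrefetcher()
for _ in range(18):      # 4 + 7 * 2 cycles fills all 8 slots
    hw.tick()
hw.consume()             # one slot frees up, but the unit stays halted
hw.tick()
hw.tick()
print(hw.count)          # 7 in this variant; the naive variant would give 8
```

With `restart_only_when_empty=True` the buffer drains completely before the next fetch starts, and that fetch pays the non-sequential cost. That is the kind of difference that would show up as a few extra cycles in a timing test.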

EDIT: This behaviour actually seems to be important for Shrek 2, as when I implement it I can get much closer to the correct value (25FC vs 2611)

@fleroviux
Member

fleroviux commented Sep 4, 2023

Thanks for this! I'll definitely try to work on this soon. I noticed that your readme says "Tests prefetcher behaviour when branching to nearby addresses". Did you by chance test what happens when branching to the address that the CPU would fetch the next opcode from? For example:

b .label
nop
nop
.label:
...

A couple of months ago I noticed that there can be a penalty related to this, and to some extent it depends on the instructions before the branch. For example, any RAM access or internal cycle right before the branch would make the discrepancy disappear. But I couldn't quite get the behavior right back then.

@alyosha-tas
Author

alyosha-tas commented Sep 5, 2023

Yes, that is what those branch tests do, but I made them without knowing about the buffer-full behaviour, so they inadvertently rely on it, at least in the Thumb version. I'll clean them up to only test branch behaviour.

That said, the Thumb version currently passes in NanoBoyAdvance, which is inconsistent with what I just wrote, so maybe I don't have all the details.

@alyosha-tas
Author

I made some new tests that isolate branching (in Thumb mode), but I didn't see any odd behaviour, and NanoBoyAdvance currently passes those tests.

Do you have any more details about when you saw weird results with branching?

@fleroviux
Member

I can check my notes and test ROM tomorrow or on the weekend. It's been a while and I don't remember the details anymore.

@RetroEdit

RetroEdit commented Sep 8, 2023

Based on the above discussion, the Shrek 2 timing issue is probably related to or dependent on this behavior: #312

@alyosha-tas
Author

alyosha-tas commented Sep 9, 2023

Following up about the nearby branching thing. There is an unintuitive behaviour when branching 2 instructions away as in the above code, but only when the prefetch buffer is empty.

In this case, the prefetch address and the instruction address the CPU is trying to fetch are the same, and both start reading it at the same time (the prefetcher with a sequential access, and the CPU with a non-sequential one since it just branched). When this happens, non-sequential timing is used.
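
As a rough illustration of the collision case described above, the sketch below contrasts the two ways an emulator could time that fetch. The function names, the `quirk` flag, and the cycle counts (2 sequential, 4 non-sequential) are hypothetical. Treating every non-colliding branch as a plain non-sequential fetch is a simplifying assumption, not something established in this thread.

```python
THUMB_OPCODE_SIZE = 2  # bytes per Thumb instruction


def next_pipeline_fetch(branch_addr):
    """Address the CPU would fetch next if the Thumb branch were not taken."""
    # While the branch executes, branch_addr + 2 is being decoded and
    # branch_addr + 4 is being fetched, so the next fetch is one opcode later.
    return branch_addr + 3 * THUMB_OPCODE_SIZE


def fetch_cycles_after_branch(target, prefetch_addr, buffered,
                              seq_cycles=2, nonseq_cycles=4, quirk=True):
    """Cycles for the first opcode fetch after a taken branch (toy model)."""
    # Collision: the buffer is empty and the prefetcher is about to read
    # the very address the branch jumps to.
    collision = buffered == 0 and prefetch_addr == target
    if collision and not quirk:
        # Naive model: ride along on the prefetcher's sequential access.
        return seq_cycles
    # Behaviour described above (and, by assumption, every other case):
    # the fetch is timed as non-sequential.
    return nonseq_cycles


branch = 0x08000100                       # hypothetical address of `b .label`
label = next_pipeline_fetch(branch)       # two nops later: 0x08000106
print(fetch_cycles_after_branch(label, label, buffered=0))               # 4
print(fetch_cycles_after_branch(label, label, buffered=0, quirk=False))  # 2
```

The gap between the two results is the per-branch discrepancy an emulator would accumulate if it let the CPU share the prefetcher's sequential access in this situation.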

This is tested in 'prefetcher_branch_thumb_2.gba' in my repository. NanoBoyAdvance currently fails the test.

In fact this quirk is important for Metroid Fusion, which does a lot of such branches.
