Tests when prefetch buffer is full #327
Thanks for this! I'll definitely try to work on this soon. I noticed that your readme says b .label
nop
nop
.label:
... A couple of months ago I noticed that there can be a penalty related to this, and that to some extent it depends on the instructions before the branch. For example, any RAM access or internal cycle right before the branch would make the discrepancy disappear. But I couldn't quite get the behavior right back then.
Yes, that is what those branch tests do, but I made them without knowing about the buffer-full behaviour, so they inadvertently rely on it, at least in the thumb version. I'll clean them up to only test branch behaviour. That said, the thumb version currently passes in NanoBoyAdvance, which is inconsistent with what I just wrote, so maybe I don't have all the details.
I made some new tests that isolate branching (in thumb mode) but I didn't see any odd behaviour and NanoBoyAdvance currently passes those tests. Do you have any more details about when you saw weird results with branching?
I can check my notes and test ROM tomorrow or on the weekend. It's been a while and I don't remember the details anymore.
Based on above discussion, the Shrek 2 timing issue is probably related/dependent on this behavior: #312
Following up about the nearby branching thing. There is an unintuitive behaviour when branching 2 instructions away as in the above code, but only when the prefetch buffer is empty. In this case, the prefetch address and the address of the instruction the CPU is trying to fetch are the same, and both start reading it at the same time (the prefetcher with a sequential access and the CPU with a non-sequential one, since it just branched). When this happens, non-sequential timing is used. This is tested in 'prefetcher_branch_thumb_2.gba' in my repository. NanoBoyAdvance currently fails the test. In fact this quirk is important for Metroid Fusion, which does a lot of such branches.
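To make the colliding case in the comment above concrete, here is a rough Python sketch. It is not taken from any emulator or from the test ROM. The function names, the `buffer_count` parameter and the assumption that the pipeline has already fetched the two instructions after the branch (so the next prefetch lands at branch address + 6 in thumb mode) are all mine, for illustration only. Cases other than the collision are deliberately left out, since the comment does not describe them.

```python
THUMB_SIZE = 2  # bytes per thumb instruction


def next_prefetch_address(branch_addr: int) -> int:
    """Address the prefetcher would read next while the buffer is empty.

    Assumption: the CPU pipeline has already fetched the two
    instructions that follow the branch (the two nops in the example).
    """
    return branch_addr + 3 * THUMB_SIZE


def target_fetch_cycles(branch_addr: int, target_addr: int,
                        buffer_count: int, n_cycles: int):
    """Cost of the first fetch at the branch target, colliding case only.

    n_cycles is a hypothetical non-sequential access cost passed in by
    the caller. Returns None for cases this sketch does not model.
    """
    collides = (buffer_count == 0
                and target_addr == next_prefetch_address(branch_addr))
    if collides:
        # Prefetcher (sequential) and CPU (non-sequential) start the
        # same read together; non-sequential timing is what is observed.
        return n_cycles
    return None
```

With a branch at 0x08000100 and `.label` at 0x08000106, `target_fetch_cycles(0x08000100, 0x08000106, 0, n_cycles)` returns `n_cycles`, i.e. the access is not charged at the cheaper sequential rate even though the prefetcher was about to read that address sequentially.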
I started looking more carefully at the prefetcher, as that seems like the most likely thing wrong with Shrek 2 timing. Shrek 2 uses EWRAM a lot, so there is a lot of time for the prefetcher to actually do something.
Here are some basic tests:
https://github.com/alyosha-tas/gba-tests/tree/master/prefetcher
Currently NanoBoyAdvance fails a test in the 'prefetcher.gba' test ROM. The failing test checks what happens when the prefetch buffer is full. It seems that for the test to work out, the prefetch unit must wait until the buffer is empty before restarting, and when it does restart it uses non-sequential timing. Otherwise you get 51 instead of the required 56.
EDIT: This behaviour actually seems to be important for Shrek 2, as when I implement it I can get much closer to the correct value (25FC vs 2611)
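For anyone trying to implement the buffer-full rule described above, here is a minimal Python sketch of just the restart policy. It is a toy state machine, not cycle-accurate and not how NanoBoyAdvance or the test ROM is written. The class name `PrefetchModel`, its fields, the 8-halfword capacity and the choice to charge non-sequential timing for the very first fetch are assumptions for illustration. The access costs are passed in by the caller rather than hard-coded.

```python
from dataclasses import dataclass


@dataclass
class PrefetchModel:
    """Toy model of the restart rule only (all names are hypothetical)."""
    capacity: int = 8          # assumed buffer size in halfwords
    count: int = 0             # halfwords currently buffered
    running: bool = True       # False once the buffer has filled
    next_nonseq: bool = True   # the next fetch pays non-sequential timing

    def fetch_cycles(self, n_cycles: int, s_cycles: int):
        """Cost of the prefetcher's next fetch, or None while stopped."""
        if not self.running:
            return None
        return n_cycles if self.next_nonseq else s_cycles

    def complete_fetch(self) -> None:
        """A prefetch finished: store it, stop if the buffer is now full."""
        if not self.running:
            return
        self.count += 1
        self.next_nonseq = False
        if self.count == self.capacity:
            self.running = False

    def cpu_consume(self) -> None:
        """The CPU takes one halfword from the buffer."""
        if self.count == 0:
            return
        self.count -= 1
        # Restart only once the buffer has drained completely, and
        # make the first fetch after the restart non-sequential.
        if not self.running and self.count == 0:
            self.running = True
            self.next_nonseq = True
```

The point of the sketch is the condition in `cpu_consume`: freeing a single slot does not restart the prefetcher, only draining the buffer to empty does, and the restart costs a non-sequential access. A model that resumes sequential prefetching as soon as one slot is free would presumably come out cheaper, which matches the direction of the 51 vs 56 discrepancy.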