[DOS/EGA] Performance Issues #55

Open
sulix opened this issue Aug 20, 2023 · 0 comments

Comments

Owner

sulix commented Aug 20, 2023

The DOS/EGA renderer is way slower than the original Keen one, which causes problems both on real hardware (it's slow-but-playable on a 486 with a fast VLB card, but nigh unusable on a 40 MHz 386 with an ISA VGA card), and potentially under DOSBox (where 3000 cycles is way too slow).

There are a few obvious optimisations we can attempt:

  1. Actually move the tile buffer into video memory. This will speed up erase blocks and animated tiles, as we can copy 4 bytes at a time.
  2. As a result, the tile buffer can scroll properly, rather than being copied around. This should drastically increase scrolling speed, which is the real bottleneck.
  3. Write optimised blitters and call them directly. There's a lot of conditional code and indirection to pick the VL_DOS backend, then check both the source and destination surface types.
  4. (As a bonus, add optimised 16×16 px aligned tile blitters, which would be much faster for the common case, and a 'scale' blitter or similar for the SW-scroller, which is absurdly slow. Maybe we'd need the scaler compiler, too, but baby steps.)
  5. If we split up the functions which write to EGA memory, we can more easily support DJGPP farmem instead of nearmem (slower, but WinNT compatible), and maybe non-SVGA-compat mode wrapping. (The latter may be needed for a scrolling tile buffer anyway).
  6. If we're still too slow, we can maybe investigate doing tilebuffer→tilebuffer copies when scrolling on a new tile which is already onscreen. The cache metadata would probably take enough memory and be slow enough to only make this worthwhile on very slow ISA busses, though.
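
To illustrate why item 1 should help, here is a minimal sketch of a VRAM-to-VRAM tile copy using the EGA latches, modelled in portable C rather than with real port I/O. It is not the project's actual code: `EgaModel`, `ega_read_latch`, `ega_write_latch` and `ega_copy_tile16` are hypothetical names, and the model assumes write mode 1 with all four planes enabled in the map mask. Each read/write pair moves one byte in each of the four planes, so a 16×16 tile (128 bytes) takes 32 pairs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of planar EGA memory: four planes share one address space. */
#define EGA_PLANES 4
#define EGA_PLANE_SIZE 0x10000u /* 64 KiB per plane */

typedef struct {
    uint8_t plane[EGA_PLANES][EGA_PLANE_SIZE];
    uint8_t latch[EGA_PLANES];
} EgaModel;

/* A CPU read loads one byte from each plane into the latches. */
static void ega_read_latch(EgaModel *ega, uint16_t offset) {
    for (int p = 0; p < EGA_PLANES; ++p)
        ega->latch[p] = ega->plane[p][offset];
}

/* In write mode 1, a CPU write stores the latches to every enabled plane
 * (this model assumes all four are enabled). */
static void ega_write_latch(EgaModel *ega, uint16_t offset) {
    for (int p = 0; p < EGA_PLANES; ++p)
        ega->plane[p][offset] = ega->latch[p];
}

/* Copy a 16x16 px tile (2 bytes wide per plane) between two VRAM offsets.
 * `pitch` is the number of bytes from one scanline to the next. */
static void ega_copy_tile16(EgaModel *ega, uint16_t src, uint16_t dst,
                            uint16_t pitch) {
    assert(pitch >= 2);
    for (int row = 0; row < 16; ++row) {
        for (uint16_t col = 0; col < 2; ++col) {
            ega_read_latch(ega, (uint16_t)(src + col));
            ega_write_latch(ega, (uint16_t)(dst + col));
        }
        src = (uint16_t)(src + pitch);
        dst = (uint16_t)(dst + pitch);
    }
}
```

On real hardware the two helper functions would each be a single byte access to video memory, which is where the "4 bytes at a time" saving comes from.
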
sulix added a commit that referenced this issue Nov 11, 2024
The DOS backend has many issues, both performance and correctness, which
need resolving. This change tackles two of them, both related to
ScrollSurface.

The first is a performance/smoothness issue: because we're always doing
the "SVGA compat" mode, we make periodic copies from one end of video
memory to the other. However, we always copied _both_ pages, but we only
really need to handle one. Update this to better place pages, and to
only copy one at a time. To ensure correctness, track the crtc offset
and update this after a copy. This _shouldn't_ be necessary in most
cases, but does get rid of potential overlap (which never actually
occurs) when a wrap occurs.
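
A rough sketch of the "copy one page at a time and track the crtc offset" idea, modelled on a plain byte array. This is an illustration under assumptions, not the code from the commit: `ScrollModel`, `Page` and `scroll_one_page` are hypothetical names, and the choice of `wrap_dest` (where a page lands after a wrap) is left to the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VRAM_SIZE 0x10000u /* one 64 KiB window of video memory */

typedef struct {
    uint32_t start; /* byte offset of the page's top-left corner */
    uint32_t size;  /* bytes the page occupies */
} Page;

typedef struct {
    uint8_t vram[VRAM_SIZE];
    Page pages[2];
    int displayed;       /* index of the page the CRTC is showing */
    uint32_t crtc_start; /* tracked copy of the CRTC start offset */
} ScrollModel;

/* Scroll one page forward by `delta` bytes. If the moved page would run past
 * the end of memory, copy that page alone to `wrap_dest` first.
 * Returns 1 if a copy was needed, 0 otherwise. */
static int scroll_one_page(ScrollModel *m, int idx, uint32_t delta,
                           uint32_t wrap_dest) {
    Page *pg = &m->pages[idx];
    int copied = 0;
    assert(pg->start + pg->size <= VRAM_SIZE);
    if (pg->start + delta + pg->size > VRAM_SIZE) {
        assert(wrap_dest + delta + pg->size <= VRAM_SIZE);
        memmove(&m->vram[wrap_dest], &m->vram[pg->start], pg->size);
        pg->start = wrap_dest;
        copied = 1;
    }
    pg->start += delta;
    if (idx == m->displayed)
        m->crtc_start = pg->start; /* keep the tracked offset in sync */
    return copied;
}
```

The other page is left untouched until it needs its own wrap, so a wrap costs one page copy rather than two.
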

Secondly, fix an issue whereby the edge tiles from one page would
overwrite the currently visible opposite edge from the visible page.
This made a flickery, incorrect edge appear. This is due to there not
being a sufficient "gap" between the pages which we can use during
scrolling. Keen handles this by having the pages actually be very large
(1024px wide). If we try this, though, we still get the problem, as VL
assumes the entire page is valid and needs scrolling, not just the
actually used "port". If we hack around this by adding a gap, everything
works fine. But we then break the terminator intro, which needs more
memory as it's super-wide.

Hack around this with a "set gap" function on DOS. It's not great, and I
vaguely want to replace this by allowing a specific "gap" and "pitch" to
be specified on all surfaces (so VL can be aware of separate used and
unused bits of memory), and maybe redo the interface enough to make the
tile buffer live in VRam and scroll efficiently (and copy efficiently).
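
One way the "gap" and "pitch" idea could look, purely as a sketch: a surface that records both its row pitch and how many bytes of each row are actually used, so a scroll only touches the used part. `GapSurface` and `scroll_used_left` are hypothetical names, not an existing VL interface, and the per-row layout is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical surface descriptor separating used and unused memory. */
typedef struct {
    uint8_t *data;
    uint32_t pitch;  /* bytes from one row to the next, gap included */
    uint32_t used_w; /* bytes per row that hold valid pixels */
    uint32_t height; /* number of rows */
} GapSurface;

/* Shift the valid part of every row left by `dx` bytes. The gap bytes
 * (from used_w up to pitch) are never read or written. */
static void scroll_used_left(GapSurface *s, uint32_t dx) {
    assert(s->used_w <= s->pitch);
    assert(dx <= s->used_w);
    for (uint32_t y = 0; y < s->height; ++y) {
        uint8_t *row = s->data + (size_t)y * s->pitch;
        memmove(row, row + dx, s->used_w - dx);
    }
}
```

The rightmost `dx` bytes of each used row hold stale data after the shift and would need redrawing with the newly exposed tiles.
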