
Huge memory model is non-optimal, suggest memory model change to large 16-bit #6

Open
joncampbell123 opened this issue Dec 12, 2016 · 13 comments

@joncampbell123
Collaborator

Just a heads up: the huge memory model has the benefit of having the compiler adjust pointers so that pointer math is easier, but it also causes performance loss and overhead in your code.

The large model will give you all the benefits of far code and data without the overhead of the compiler and C library's pointer adjustment routines.

You will have to normalize pointers yourself and deal with crossing 64KB segments for large data, but it's worth it.

Also, the compiled binaries for Large model are significantly smaller than Huge model binaries.
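To make the trade-off concrete, here is a minimal sketch (assuming 16-bit Open Watcom, the large model, and the FP_SEG/FP_OFF/MK_FP macros from <i86.h>) of why large-model code must watch 64KB boundaries: ordinary far pointer arithmetic only touches the 16-bit offset, so it wraps within the segment instead of advancing it, while __huge pointers pay extra code on every pointer operation to avoid that.

#include <stdio.h>
#include <i86.h>

int main(void)
{
    /* Point 16 bytes before the end of segment 0xA000. */
    unsigned char __far *fp = (unsigned char __far *)MK_FP(0xA000, 0xFFF0);

    fp += 0x20;   /* large-model math: the offset wraps to 0x0010, the segment stays 0xA000 */
    printf("far pointer after += 0x20: %04X:%04X\n", FP_SEG(fp), FP_OFF(fp));

    /* A __huge pointer would have advanced into the next segment here,
       at the cost of extra arithmetic emitted for every pointer operation. */
    return 0;
}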

@sparky4 sparky4 self-assigned this Dec 15, 2016
@sparky4
Owner

sparky4 commented Dec 15, 2016

ah ok thanks

@sparky4
Owner

sparky4 commented Dec 17, 2016

ah, the large model breaks scroll's ability to draw chikyuu properly, hmmm

@jmalak
Collaborator

jmalak commented Dec 18, 2016

Take into account that if you use arrays longer than one segment, you need the huge memory model (it handles pointers across segment boundaries), or you must handle this case yourself.

@joncampbell123
Collaborator Author

The VRS rendering code assumes that the offset portion of the pointer is such that rendering a scanline never crosses a 64KB boundary. Try normalizing the pointer yourself before rendering.

Basic (slow) example code:

/* fold as much of the address as possible into the segment; cast before shifting so the high bits aren't lost in 16-bit math */
unsigned long a = ((unsigned long)FP_SEG(ptr) << 4) + (unsigned long)FP_OFF(ptr);
ptr = MK_FP((unsigned)(a >> 4), (unsigned)(a & 0x0FUL));

Is there anything in your code that uses an array that exceeds 64KB?
You might shrink the array, or set up the array in segments so that no part of it exceeds 64KB.

@joncampbell123
Collaborator Author

You might also consider dynamically allocating the array segments.
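One possible shape for that, sketched below (the names and sizes are made up, and it assumes 16-bit Open Watcom with _fmalloc()/_ffree() from <malloc.h>): allocate each row of the big buffer separately so every row stays well under 64KB, and large-model far pointers are then safe within any one row.

#include <stddef.h>
#include <malloc.h>

#define ROWS 400
#define COLS 320   /* 400 * 320 = 128000 bytes total, but each row is only 320 bytes */

static unsigned char __far *row[ROWS];

int alloc_rows(void)
{
    int i;

    for (i = 0; i < ROWS; i++) {
        row[i] = (unsigned char __far *)_fmalloc(COLS);
        if (row[i] == NULL)
            return -1;   /* out of memory; the caller should free what was allocated */
    }
    return 0;
}

void free_rows(void)
{
    int i;

    for (i = 0; i < ROWS; i++) {
        if (row[i] != NULL) {
            _ffree(row[i]);
            row[i] = NULL;
        }
    }
}

/* Access is simply row[y][x]; no single element access ever crosses a segment boundary. */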

@Ruedii

Ruedii commented Dec 18, 2016

Might I recommend a solution?
Switch the arrays out for a matrix that uses page alignment for lines.

Where X and Y form the address, X is the pointer (offset) and Y is the segment designator.

For simplified code:
Data Segment = Base+(Y*Pagesize)
Data Pointer = X

If you REALLY need a linear array, you can then virtualize this into pages by:
X = L Modulus 4K (Utilizing truncation of high bits)
Y = Integer Truncation of L/4K (Utilizing truncation of low bits followed by shifting high bits low)

This of course uses 4K segments; you can use any multiple of 4K, but I recommend keeping to powers of two so the math reduces to binary shift/mask operations and no real arithmetic is needed.
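A rough C translation of that scheme (a sketch only; the function name, the base segment parameter, and the use of the <i86.h> macros are my own assumptions, and the block size is kept a power of two so the split is a shift and a mask rather than a division):

#include <i86.h>

#define BLOCK_SHIFT 12U                    /* 4KB blocks */
#define BLOCK_SIZE  (1UL << BLOCK_SHIFT)
#define BLOCK_MASK  (BLOCK_SIZE - 1UL)

/* Convert a linear index L into a far pointer, given the paragraph (segment)
   address of block 0.  Y selects the block, X is the offset inside it. */
void __far *linear_to_ptr(unsigned base_seg, unsigned long L)
{
    unsigned y = (unsigned)(L >> BLOCK_SHIFT);   /* block number            */
    unsigned x = (unsigned)(L & BLOCK_MASK);     /* offset within the block */

    /* Each 4KB block starts 0x100 paragraphs after the previous one. */
    return MK_FP(base_seg + y * (unsigned)(BLOCK_SIZE >> 4), x);
}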

@jmalak
Collaborator

jmalak commented Dec 18, 2016

With OW you can use the large memory model (more efficient), and the appropriate variables/pointers can be marked as __huge so the correct (slow) arithmetic is used only for those variables/pointers.
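A minimal sketch of that mix (assuming 16-bit Open Watcom built with the large model, and that halloc()/hfree() from <malloc.h> are available for allocating a block larger than 64KB; the size here is made up): only the oversized buffer is declared __huge and pays for the slow pointer arithmetic, while everything else keeps the fast large-model far pointers.

#include <stddef.h>
#include <malloc.h>

#define BIG_COUNT 100000UL   /* more than 64KB of bytes, so it must span segments */

int touch_big_buffer(void)
{
    unsigned char __huge *big;
    unsigned long i;

    big = (unsigned char __huge *)halloc((long)BIG_COUNT, 1);
    if (big == NULL)
        return -1;

    for (i = 0; i < BIG_COUNT; i++)
        big[i] = 0;   /* the compiler emits the segment-crossing (huge) arithmetic here */

    hfree(big);
    return 0;
}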

@sparky4
Owner

sparky4 commented Dec 19, 2016

well holy shit it is SIGNIFICANTLY FASTER GOD DAMN!

@joncampbell123
Collaborator Author

Here's hoping you're not joking or being sarcastic. Good luck :)

@Ruedii

Ruedii commented Jan 4, 2017

The method I mentioned is very clean; it's an old technique for fixed-segment databases.

@sparky4
Owner

sparky4 commented Jun 5, 2018

@Ruedii Wolf 3D does a precalculation of the render offset in an array...

@Ruedii

Ruedii commented Apr 4, 2019

Sorry for the late reply; this got lost in my pile, and my life has been quite busy as well, mostly family issues for the past year.

The X and Y variable names are confusing; I should have used Ps and Po for "Pointer Segment" and "Pointer Offset". Of course, in rendering you can convert this as directly as possible by using certain block sizes.

It might be good to dynamically assign the segment block size based on the platform, if it's capable. Each Intel generation buffers in larger and larger chunks; an optional cache was added in the 486, and the L1/L0 data cache slowly grows with later generations and can usually be read on 486 and later CPUs.

Further assembler optimizations of array memory access would be done in the C library itself, or in supplementary math libraries you can add. Arrays simply add one more multiplier (the component object size) when computing an element's pointer. As long as the component object is smaller than the preferred memory page size, you can handle crossing pages.

Since you aren't protecting pages individually, accessing an object that crosses into the next page via an offset from the previous page should only cause a small performance loss, so that should be a non-issue. However, choosing an object size that divides evenly into the page size would prevent the issue altogether, hands-off. That may mean adding a bit of padding, which you could use for some nice added metadata flags or something.

If you wish to add streaming basic math and copy routines, the proper way to handle them is to arrange the stream-handler loop so that the pointer for the next piece of data is loaded into the data pointer register immediately after the current data is pulled into a register, before computing on the data in the register. That way, the time spent operating on the data in the register gives the memory time to have its registers flipped. This will particularly help on 486 and later processors that have a (very small) internal L0/L1 data buffer or cache.
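A loose C rendering of that interleaving idea (a sketch only; the function and parameter names are mine, and a C compiler is free to reschedule these statements, so at this level it mainly documents the intent): fetch the next element right after the current one has been pulled into a local, then do the arithmetic on the current value while that fetch is in flight.

/* Multiply every byte of src by k into dst, reading one element ahead. */
void scale_buffer(const unsigned char __far *src, unsigned char __far *dst,
                  unsigned n, unsigned char k)
{
    unsigned i;
    unsigned char cur, next;

    if (n == 0)
        return;

    cur = src[0];
    for (i = 0; i + 1 < n; i++) {
        next = src[i + 1];                  /* start loading the next byte first */
        dst[i] = (unsigned char)(cur * k);  /* then compute on the current byte  */
        cur = next;
    }
    dst[i] = (unsigned char)(cur * k);      /* last element, nothing left to fetch */
}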

@sparky4
Owner

sparky4 commented Mar 17, 2022

I've honestly been swamped with school work constantly, and the lack of help kept me from working on it in general. A lot has been going on with my mental health, but I'm getting more stable. I'm not dead, and I haven't forgotten the project; I've just been super busy with school, that's all. The biggest problem is that I don't have the XT sitting around to work on the game some more, as it's in storage, so I can't really test it on authentic hardware except a 286. I'll continue once my life is better and I'm not grinding away at school.
