Slow for.zs #352

Open
jespa007 opened this issue Sep 10, 2023 · 3 comments

jespa007 commented Sep 10, 2023

Description

'for.zs' is a performance test that runs 1000000 iterations covering .push and add. It is much slower in ZetScript than in other languages. For example, the test takes 0.07s in Lua, whereas in ZetScript it takes ~64s.

Lua is 64s/0.07s = x914 times faster than ZetScript

In Wren it takes 0.18s

Wren is 64s/0.18s = x355 times faster than ZetScript

Measurements show that the slow parts of the execution are push and load,

elapsed push: 56.667000s
elapsed load: 61.035000s
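
Elapsed figures like the ones above can be collected with a simple wall-clock timer around each phase. The following is only a minimal sketch in C++; `elapsed_seconds` is a hypothetical helper and not part of ZetScript, and how the test was actually timed is not stated in this issue.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper (not a ZetScript API): runs `work` once and
// returns the wall-clock time it took, in seconds.
double elapsed_seconds(const std::function<void()>& work) {
    assert(work && "work must be callable");
    // steady_clock is monotonic, so the result is never negative.
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Wrapping the push loop and the iteration loop in separate calls would give one figure per phase, comparable to the "elapsed push" and "elapsed load" lines.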

Test code

Slow code 1 (60s)

var list=[];
for (var i=0; i < 1000000; i++) {
  list.push(i)
}

Byte code

[0000| 1|01]    NEW_ARRAY
[0001| 1|02]    PUSH_STK_LOCAL          list
[0002|-1|00]    STORE                   n:1 [RST]
[0003| 0|00]    PUSH_SCOPE
[0004| 1|01]    LOAD_INT                0
[0005| 1|02]    PUSH_STK_LOCAL          i
[0006|-1|00]    STORE                   n:1 [RST]
[0007| 1|01]    LT                      Local['i'],1000000
[0008|-1|00]    JNT                     019 (ins+11) 
[0009| 0|00]    PUSH_SCOPE
[0010| 1|01]    LOAD_LOCAL              list
[0011| 0|01]    LOAD_OBJ@ITEM           push 
[0012| 1|02]    PUSH_STK_LOCAL          i
[0013|-1|00]    MEMBER_CALL             arg:1 ret:1 [RST]
[0014| 0|00]    POP_SCOPE
[0015| 1|01]    PUSH_STK_LOCAL          i
[0016| 0|00]    POST_INC                [RST]
[0017| 0|00]    JMP                     007 (ins-10) 
[0018| 0|00]    POP_SCOPE
[0019| 0|00]    POP_SCOPE

Slow code 2 (60s)

var list=[];
var sum = 0
for (var i in list) {
        sum = sum + i
}

Byte code

[0000| 1|01]    NEW_ARRAY
[0001| 1|02]    PUSH_STK_LOCAL          list
[0002|-1|00]    STORE                   n:1 [RST]
[0003| 1|01]    LOAD_INT                0
[0004| 1|02]    PUSH_STK_LOCAL          sum
[0005|-1|00]    STORE                   n:1 [RST]
[0006| 0|00]    PUSH_SCOPE
[0007| 1|01]    LOAD_LOCAL              list
[0008| 1|02]    PUSH_STK_LOCAL          @_iter_0
[0009| 0|00]    IT_INIT                 [RST]
[0010| 1|01]    LOAD_LOCAL              @_iter_0
[0011| 0|01]    LOAD_OBJ@ITEM           _end 
[0012| 0|01]    MEMBER_CALL             arg:0 ret:1 
[0013|-1|00]    JT                      029 (ins+16) 
[0014| 0|00]    PUSH_SCOPE
[0015| 1|01]    LOAD_LOCAL              @_iter_0
[0016| 0|01]    LOAD_OBJ@ITEM           _get 
[0017| 0|01]    MEMBER_CALL             arg:0 ret:1 
[0018| 1|02]    PUSH_STK_LOCAL          i
[0019|-1|00]    STORE                   n:1 [RST]
[0020| 1|01]    ADD                     Local['sum'],Local['i']
[0021| 1|02]    PUSH_STK_LOCAL          sum
[0022|-1|00]    STORE                   n:1 [RST]
[0023| 0|00]    POP_SCOPE
[0024| 1|01]    LOAD_LOCAL              @_iter_0
[0025| 0|01]    LOAD_OBJ@ITEM           _next 
[0026|-1|00]    MEMBER_CALL             arg:0 ret:0 [RST]
[0027| 0|00]    JMP                     010 (ins-17) 
[0028| 0|00]    POP_SCOPE
[0029| 0|00]    POP_SCOPE

Fast code (77ms)

var sum=0;
for (var i=0; i < 1000000; i++) {
  sum=sum+i
}

Byte code

[0000| 1|01]    LOAD_INT                0
[0001| 1|02]    PUSH_STK_LOCAL          sum
[0002|-1|00]    STORE                   n:1 [RST]
[0003| 0|00]    PUSH_SCOPE
[0004| 1|01]    LOAD_INT                0
[0005| 1|02]    PUSH_STK_LOCAL          i
[0006|-1|00]    STORE                   n:1 [RST]
[0007| 1|01]    LT                      Local['i'],1000000
[0008|-1|00]    JNT                     018 (ins+10) 
[0009| 0|00]    PUSH_SCOPE
[0010| 1|01]    ADD                     Local['sum'],Local['i']
[0011| 1|02]    PUSH_STK_LOCAL          sum
[0012|-1|00]    STORE                   n:1 [RST]
[0013| 0|00]    POP_SCOPE
[0014| 1|01]    PUSH_STK_LOCAL          i
[0015| 0|00]    POST_INC                [RST]
[0016| 0|00]    JMP                     007 (ins-9) 
[0017| 0|00]    POP_SCOPE
[0018| 0|00]    POP_SCOPE
@jespa007 jespa007 added this to the ZetScript 2.0.0 milestone Sep 10, 2023
@jespa007 jespa007 self-assigned this Sep 10, 2023
@jespa007

list.push is now about x20 faster because the array doubles its size when it reaches capacity. There are still performance issues:

  1. vm_find_native_function (2.69% overhead of calls): it searches for the C++ function every time before executing the call.
  2. vm_load_field (34%): it does expensive operations like 'getSymbolMemberFunction' and 'ZS_NEW_OBJECT_MEMBER_FUNCTION'
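
The capacity-doubling strategy mentioned above can be sketched as follows. This is an illustration only, not ZetScript's array implementation: `IntArray`, its initial capacity of 8, and the `reallocations` counter are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical growable array (not ZetScript's real class). Doubling the
// capacity when full makes push amortized O(1) instead of reallocating
// on every call.
class IntArray {
public:
    IntArray() = default;
    IntArray(const IntArray&) = delete;
    IntArray& operator=(const IntArray&) = delete;
    ~IntArray() { std::free(data_); }

    void push(long value) {
        if (size_ == capacity_) grow();  // only reallocate when full
        data_[size_++] = value;
    }

    long at(std::size_t index) const {
        assert(index < size_);
        return data_[index];
    }

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }
    std::size_t reallocations() const { return reallocations_; }

private:
    void grow() {
        // Assumed policy: start at 8 slots, then double each time.
        const std::size_t new_capacity = capacity_ == 0 ? 8 : capacity_ * 2;
        void* p = std::realloc(data_, new_capacity * sizeof(long));
        if (p == nullptr) std::abort();  // out of memory
        data_ = static_cast<long*>(p);
        capacity_ = new_capacity;
        ++reallocations_;
    }

    long* data_ = nullptr;
    std::size_t size_ = 0;
    std::size_t capacity_ = 0;
    std::size_t reallocations_ = 0;
};
```

With this policy, pushing 1000000 values triggers only a handful of reallocations rather than one per push, which is the kind of saving the doubling change targets.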


jespa007 commented Sep 15, 2023

Testing shows that if we don't use vm_find_native_function, execution speeds up by x2.69. So with this improvement,

Lua is 0.9/0.04 = x22 times faster than ZetScript
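
One common way to avoid repeating a function search before every call is to resolve the function once and remember the result at the call site. The sketch below only illustrates that idea; `NativeRegistry`, `CallSite`, and `call_native` are hypothetical names, and this is not how vm_find_native_function is actually implemented or was actually bypassed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

using NativeFn = long (*)(long);

// Hypothetical registry of bound C++ functions, searched by name.
struct NativeRegistry {
    std::unordered_map<std::string, NativeFn> functions;
    std::size_t lookups = 0;  // counts how often the search runs

    NativeFn find(const std::string& name) {
        ++lookups;
        auto it = functions.find(name);
        return it == functions.end() ? nullptr : it->second;
    }
};

// Hypothetical call site: stores the resolved function after the first call.
struct CallSite {
    std::string name;
    NativeFn cached = nullptr;
};

long call_native(NativeRegistry& registry, CallSite& site, long arg) {
    if (site.cached == nullptr) {
        // Slow path: search once, then reuse the pointer.
        site.cached = registry.find(site.name);
        assert(site.cached != nullptr && "unknown native function");
    }
    return site.cached(arg);  // fast path: direct call, no search
}

// Example native function used to exercise the sketch.
inline long twice(long x) { return 2 * x; }
```

Calling `call_native` in a loop on the same `CallSite` performs the name search only on the first iteration.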


jespa007 commented Sep 26, 2023

The zs_string constructor has been modified for faster creation. Furthermore, "vm_load_field" was spending a lot of CPU time searching for the member symbol. Because the instruction's value_op2 is not otherwise used, it now stores the last symbol found for that instruction, so the search is not repeated. Overall, performance has increased by x4.
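
The idea of keeping the last symbol found in a spare instruction operand can be sketched as below. The types here (`Symbol`, `Instruction`, `ObjectType`) and the `load_field` function are simplified assumptions for illustration; only the name value_op2 comes from this issue, and the real vm_load_field is more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Simplified stand-in for a member symbol.
struct Symbol {
    std::string name;
    int id;
};

// Simplified stand-in for a bytecode instruction. value_op2 is assumed
// to be a spare operand; 0 means "nothing cached yet".
struct Instruction {
    std::string field_name;
    std::intptr_t value_op2 = 0;
};

// Simplified stand-in for a script type with named members.
struct ObjectType {
    std::unordered_map<std::string, Symbol> members;
    std::size_t searches = 0;  // counts how often the symbol search runs

    const Symbol* find_member(const std::string& name) {
        ++searches;
        auto it = members.find(name);
        return it == members.end() ? nullptr : &it->second;
    }
};

const Symbol* load_field(ObjectType& type, Instruction& ins) {
    assert(!ins.field_name.empty());
    if (ins.value_op2 == 0) {
        // Slow path: search by name and store the result in the instruction.
        const Symbol* found = type.find_member(ins.field_name);
        if (found == nullptr) return nullptr;
        ins.value_op2 = reinterpret_cast<std::intptr_t>(found);
    }
    // Fast path: reuse the symbol stored by a previous execution.
    return reinterpret_cast<const Symbol*>(ins.value_op2);
}
```

This sketch assumes the same instruction always sees the same type. If an instruction can be reached with objects of different types, a cache like this would also need to record and check the type before reusing the stored symbol.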

So, in the measurements, ZetScript now takes 0.6 seconds. Overall it has improved by,

64s/0.6s = x106 times faster
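
The "xN times faster" figures in this thread are simple quotients of elapsed times. A small sketch for recomputing them, where `speedup` is a hypothetical helper written for this example:

```cpp
#include <cassert>

// Hypothetical helper: how many times faster a run taking `fast_seconds`
// is compared with a run taking `slow_seconds`.
double speedup(double slow_seconds, double fast_seconds) {
    assert(fast_seconds > 0.0);
    return slow_seconds / fast_seconds;
}
```

For example, `speedup(64.0, 0.6)` gives roughly 106.7, which matches the x106 quoted above when truncated.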

And now,


Lua
-----

Lua is 0.6s/0.07s = x8 times faster than ZetScript

Wren
-------

Wren is 0.6s/0.17s = x3.5 times faster than ZetScript

@jespa007 jespa007 removed this from the ZetScript 2.1.0 milestone May 6, 2024