Notes on 2D cmd submission:
The IOCTL_KGSL_RINGBUFFER_ISSUEIBCMDS ioctl takes a single ibdesc
entry, with a pointer to the GPU addr of the cmd buffer. The first 320
words restore context state, and are skipped if there has not been a
context switch.

The 2D core reads commands from two primary places. One is the
ringbuffer, which is set up on the kernel side and essentially just
contains jumps. The other is the set of buffers holding the actual
rendering commands, which are mapped by userspace (via
IOCTL_KGSL_SHAREDMEM_FROM_VMALLOC) and submitted via
IOCTL_KGSL_RINGBUFFER_ISSUEIBCMDS. The ringbuffer entry is set up to
branch to the user cmd buffer via addcmd():
  Branch to cmd:
    Z180_STREAM_PACKET_CALL     <-- 7c000275
    gpuaddr of next user cmd
    Z180_CALL_CMD | nextcnt     <-- 00001000 | nextcnt (nextcnt == 5 on non-ctx-switch restore)
    ADDR_VGV3_LAST << 24        <-- 7f000000
    ADDR_VGV3_LAST << 24        <-- 7f000000
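
For illustration only, the five dwords above could be assembled as in the Python sketch below. The constants are the ones quoted in these notes; the helper name branch_to_cmd and the example address are made up here and are not taken from the kernel code.

```python
import struct

# Constants as quoted in these notes.
Z180_STREAM_PACKET_CALL = 0x7C000275
Z180_CALL_CMD = 0x1000
ADDR_VGV3_LAST = 0x7F  # so ADDR_VGV3_LAST << 24 == 0x7f000000


def branch_to_cmd(gpuaddr, nextcnt):
    """Return the five dwords of a branch-to-cmd packet (hypothetical helper)."""
    return [
        Z180_STREAM_PACKET_CALL,
        gpuaddr,                  # gpuaddr of next user cmd
        Z180_CALL_CMD | nextcnt,  # len field OR'd with the call flag
        ADDR_VGV3_LAST << 24,
        ADDR_VGV3_LAST << 24,
    ]


# nextcnt == 5 is the non-ctx-switch case; the address is an arbitrary
# illustrative value, not one observed in a trace.
packet = branch_to_cmd(0x10000500, 5)
print(" ".join("%08x" % dw for dw in packet))

# The hex dumps in these notes show the same dwords as little-endian bytes.
raw = struct.pack("<5I", *packet)
```

Packing with "<5I" gives the byte order seen in the dumps further down, where 7c000275 appears as 75 02 00 7c.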

Preceding this in the ringbuffer slot is a marker, inserted by
addmarker() ahead of the n+1 branch-to-cmd:
    Z180_STREAM_PACKET          <-- 7c000176
    Z180_MARKER_CMD | 5         <-- 00008005
    ADDR_VGV3_LAST << 24        <-- 7f000000
    ADDR_VGV3_LAST << 24        <-- 7f000000
    ADDR_VGV3_LAST << 24        <-- 7f000000
    Z180_STREAM_PACKET          <-- 7c000176
    5                           <-- 00000005 (length of first packet in cmd submitted from userspace??)
    ADDR_VGV3_LAST << 24        <-- 7f000000
    ADDR_VGV3_LAST << 24        <-- 7f000000
    ADDR_VGV3_LAST << 24        <-- 7f000000
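
The marker layout listed above can be written out the same way. This is only a sketch that mirrors the listing; marker() is a made-up name, and nothing here is claimed to match the real addmarker() beyond the ten dwords shown.

```python
# Constants as quoted in these notes.
Z180_STREAM_PACKET = 0x7C000176
Z180_MARKER_CMD = 0x8000
PAD = 0x7F << 24  # ADDR_VGV3_LAST << 24


def marker():
    """Return the ten marker dwords exactly as listed above (hypothetical helper)."""
    return [
        Z180_STREAM_PACKET, Z180_MARKER_CMD | 5, PAD, PAD, PAD,
        Z180_STREAM_PACKET, 5, PAD, PAD, PAD,
    ]


for dw in marker():
    print("%08x" % dw)
```

Note that each half is five dwords long, which matches the 5 in the len field.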

In z180_cmdstream_issueibcmds() the cmd field of the next packet-call is
fixed up to point back to the next ringbuffer entry (which starts with
the marker packet).

Each packet seems to contain the address (or 00000000 to not branch, i.e.
the next packet follows in the same buffer) and length of the following
packet.

Ringbuffer Entry N
  Marker
  Stream Packet Call
    cmd: user cmd gpuaddr (or ctxt restore[1]) --+
    len: 5 (or 320 for ctxt restore[1])          |
                                                 |
                           (if ctxt restore)    / \   (else)
User Cmd N                                     /   |
  Stream Packet Call - ctxt restore <---------+    |
    cmd: 00000000 (next cmd follows)               |
    len: 5                                         |
  Stream Packet Call - preamble (?) <--------------+
    cmd: 00000000 (next cmd follows)
    len: length of actual rendering cmd
  Stream Packet Call - rendering
    cmd: cmd/len set in z180_cmdstream_issueibcmds() to --+
    len: to point back to next ringbuffer entry           |
                                                          |
Ringbuffer Entry N+1                                      |
  Marker <------------------------------------------------+
  Stream Packet Call
    ... and so on ...

NOTES:
  Len field is OR'd with:
    #define Z180_CALL_CMD           0x1000
    #define Z180_MARKER_CMD         0x8000
    #define Z180_STREAM_END_CMD     0x9000
  First dword is one of:
    #define Z180_STREAM_PACKET      0x7C000176
    #define Z180_STREAM_PACKET_CALL 0x7C000275
  And the packet is terminated with some number of 7F000000's.. the
  number varies.. seems to usually be 2, but addmarker() uses 3..

  The marker for ringbuffer entry n+1 is set up on issueibcmds ioctl #N.

  [1] the first 320 words of the buffer submitted in issueibcmds are
      skipped if there was no context switch; they seem to restore the
      state of the core.
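
A small sketch of how the len field and the trailing 7F000000's might be picked apart when reading a trace. The 0xF000 mask is an assumption (the notes only say which values get OR'd in), and split_len / count_terminators are made-up helper names.

```python
# Flag values as quoted in these notes.
Z180_CALL_CMD = 0x1000
Z180_MARKER_CMD = 0x8000
Z180_STREAM_END_CMD = 0x9000

# Assumption: the flags live in bits 12..15 and the count in the bits below.
FLAG_MASK = 0xF000

FLAG_NAMES = {
    0: "none",
    Z180_CALL_CMD: "CALL",
    Z180_MARKER_CMD: "MARKER",
    Z180_STREAM_END_CMD: "STREAM_END",
}


def split_len(dword):
    """Split a len dword into (flag name, count) under the mask assumption."""
    flag = dword & FLAG_MASK
    name = FLAG_NAMES.get(flag, "unknown(%#x)" % flag)
    return name, dword & ~FLAG_MASK


def count_terminators(dwords):
    """Count the 7f000000 dwords at the end of a packet."""
    n = 0
    for dw in reversed(dwords):
        if dw != 0x7F000000:
            break
        n += 1
    return n


print(split_len(0x00001005))  # len field from the branch-to-cmd packet
print(split_len(0x00008005))  # len field from the marker
print(count_terminators([0x7C000176, 0x00008005] + [0x7F000000] * 3))
```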
Dump of preamble (?) plus render packet for solid fill to 64x64 surface:
fill: x=0, y=0, w=64, h=64 color:ff556677
gpuaddr still 6615a000
00000500 75 02 00 7c 00 00 00 00 <1a 00 00 00> 34 01 00 7c <-- 0x1a == 26, if this is the
00000510 00 00 00 00<<75 02 00 7c xx xx xx xx xx xx xx xx size of next packet, then:
00000520 00 00 00 0c 00 00 00 11 00 00 03 d0 40 00 08 d2
00000530 08 70 00 01 00 01 00 7c ,00 a0 15 66, d3 01 00 7c <-- 6615a000 is dst surface gpuaddr
00000540 ,00 a0 15 66, d1 01 00 7c 08 70 00 40 00 00 00 d5
00000550 00 00 04 08 00 00 04 09 08 00 00 0f 08 00 00 0f
00000560 09 00 00 0f 00 00 00 0e 00 00 00 f0 40 00 40 f1
00000570 ff 01 00 7c ;77 66 55 ff; 03 00 00 fe>>00 00 00 7f <-- ff556677 is fill color
00000580 00 00 00 7f aa aa aa aa aa aa aa aa aa aa aa aa
00000590 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
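
The dump shows bytes in memory order, so each dword reads back-to-front. A short sketch of turning dump rows into dwords follows; the two rows are copied from the dump above with the annotation marks stripped, and row_to_dwords is a made-up helper name.

```python
import struct

# Rows 00000530 and 00000570 from the dump above, annotation marks removed.
ROWS = """
00000530 08 70 00 01 00 01 00 7c 00 a0 15 66 d3 01 00 7c
00000570 ff 01 00 7c 77 66 55 ff 03 00 00 fe 00 00 00 7f
"""


def row_to_dwords(row):
    """Parse 'offset b0 b1 ...' into (offset, [little-endian dwords])."""
    fields = row.split()
    offset = int(fields[0], 16)
    data = bytes(int(b, 16) for b in fields[1:])
    return offset, list(struct.unpack("<%dI" % (len(data) // 4), data))


for row in ROWS.strip().splitlines():
    offset, dwords = row_to_dwords(row)
    print("%08x:" % offset, " ".join("%08x" % dw for dw in dwords))
```

Read this way, 00 a0 15 66 becomes 6615a000 (the dst surface gpuaddr) and 77 66 55 ff becomes ff556677 (the fill color).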
A set of two fills to 64x64 surface:
1st fill: x=0, y=0, w=64, h=64 color:ff556677
2nd fill: x=27, y=24, w=10, h=16 color:ff223344
(x2=37 y2=40)
gpuaddr is 6615a000
00000500 75 02 00 7c 00 00 00 00 <30 00 00 00> 34 01 00 7c
00000510 00 00 00 00<<75 02 00 7c xx xx xx xx xx xx xx xx
00000520 00 00 00 0c 00 00 00 11 00 00 03 d0 40 00 08 d2
00000530 08 70 00 01 00 01 00 7c ,00 a0 15 66, d3 01 00 7c <-- 6615a000 is dst surface gpuaddr
00000540 ,00 a0 15 66, d1 01 00 7c 08 70 00 40 00 00 00 d5
00000550 00 00 04 08 00 00 04 09 08 00 00 0f 08 00 00 0f
00000560 09 00 00 0f 00 00 00 0e 00 00 00 f0 40 00 40 f1
00000570 ff 01 00 7c ;77 66 55 ff; 00 00 00 0c 00 00 00 11 <-- ff556677 is 1st fill color
00000580 00 00 03 d0 40 00 08 d2 08 70 00 01 00 01 00 7c
00000590 ,00 a0 15 66, d3 01 00 7c ,00 a0 15 66, d1 01 00 7c <-- 6615a000 is dst surface gpuaddr
000005a0 08 70 00 40 00 00 00 d5 1b 50 02 08 18 80 02 09
000005b0 09 00 00 0f 09 00 00 0f 09 00 00 0f 00 00 00 0e
000005c0 18 00 1b f0 10 00 0a f1 ff 01 00 7c ;44 33 22 ff; <-- ff223344 is 2nd fill color
000005d0 03 00 00 fe>>00 00 00 7f 00 00 00 7f aa aa aa aa
000005e0 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
Hmm, a lot of: xx 01 00 7c
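
Some of the dwords in the 2nd fill line up with the rectangle if they are split into bitfields. The sketch below is a guess based only on these dumps: the bit positions and the variable names are assumptions, not documented register layouts.

```python
def field(dword, low, high):
    """Extract bits low..high (inclusive) of dword."""
    return (dword >> low) & ((1 << (high - low + 1)) - 1)


# Dwords from the 2nd fill in the dump above (little-endian bytes in comments).
clip_x = 0x0802501B  # 1b 50 02 08
clip_y = 0x09028018  # 18 80 02 09
xy = 0xF01B0018      # 18 00 1b f0
wh = 0xF10A0010      # 10 00 0a f1

# Assumed layout for the 08/09 dwords: min in bits 0..11, max in bits 12..23.
x1, x2 = field(clip_x, 0, 11), field(clip_x, 12, 23)
y1, y2 = field(clip_y, 0, 11), field(clip_y, 12, 23)

# Assumed layout for the f0/f1 dwords: x (or w) in bits 16..23,
# y (or h) in bits 0..15.
x, y = field(xy, 16, 23), field(xy, 0, 15)
w, h = field(wh, 16, 23), field(wh, 0, 15)

print("clip x: %d..%d  y: %d..%d" % (x1, x2, y1, y2))
print("rect  x=%d y=%d w=%d h=%d" % (x, y, w, h))
```

Under those assumptions the values come out as x=27, y=24, w=10, h=16 with x2=37 and y2=40, matching the 2nd fill. The same split applied to the 1st fill (08040000, 09040000, f0000000, f1400040) gives 0..64, 0..64 and a 64x64 rect at 0,0.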
Another set of two fills to 128x256 surface:
1st fill: x=0, y=0, w=128, h=256 color:ff556677
2nd fill: x=59, y=120, w=10, h=16 color:ff223344
(x2=69 y2=136)
gpuaddr is 6615e000
00000500 75 02 00 7c 00 00 00 00 30 00 00 00 34 01 00 7c
00000510 00 00 00 00 75 02 00 7c 00 70 55 40 40 91 00 00
00000520 00 00 00 0c 00 00 00 11 00 00 03 d0 80 00 20 d2
00000530 10 70 00 01 00 01 00 7c 00 e0 15 66 d3 01 00 7c
00000540 00 e0 15 66 d1 01 00 7c 10 70 00 40 00 00 00 d5
00000550 00 00 08 08 00 00 10 09 08 00 00 0f 08 00 00 0f
00000560 09 00 00 0f 00 00 00 0e 00 00 00 f0 00 01 80 f1
00000570 ff 01 00 7c 77 66 55 ff 00 00 00 0c 00 00 00 11
00000580 00 00 03 d0 80 00 20 d2 10 70 00 01 00 01 00 7c
00000590 00 e0 15 66 d3 01 00 7c 00 e0 15 66 d1 01 00 7c
000005a0 10 70 00 40 00 00 00 d5 3b 50 04 08 78 80 08 09
000005b0 09 00 00 0f 09 00 00 0f 09 00 00 0f 00 00 00 0e
000005c0 78 00 3b f0 10 00 0a f1 ff 01 00 7c 44 33 22 ff
000005d0 03 00 00 fe 00 00 00 7f 00 00 00 7f aa aa aa aa
000005e0 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
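
Since this dump has no masked bytes, it can be used to check the guess that the third dword (0x30) is the size of the next packet. The sketch below assumes the first packet is 5 dwords long (the len given in the ringbuffer entry when there is no ctxt restore) and that the next packet follows directly, since its cmd field is 00000000.

```python
import struct

BASE = 0x500  # offset of the first dump row

# Rows copied from the dump above (trailing all-aa row left out).
DUMP = """
00000500 75 02 00 7c 00 00 00 00 30 00 00 00 34 01 00 7c
00000510 00 00 00 00 75 02 00 7c 00 70 55 40 40 91 00 00
00000520 00 00 00 0c 00 00 00 11 00 00 03 d0 80 00 20 d2
00000530 10 70 00 01 00 01 00 7c 00 e0 15 66 d3 01 00 7c
00000540 00 e0 15 66 d1 01 00 7c 10 70 00 40 00 00 00 d5
00000550 00 00 08 08 00 00 10 09 08 00 00 0f 08 00 00 0f
00000560 09 00 00 0f 00 00 00 0e 00 00 00 f0 00 01 80 f1
00000570 ff 01 00 7c 77 66 55 ff 00 00 00 0c 00 00 00 11
00000580 00 00 03 d0 80 00 20 d2 10 70 00 01 00 01 00 7c
00000590 00 e0 15 66 d3 01 00 7c 00 e0 15 66 d1 01 00 7c
000005a0 10 70 00 40 00 00 00 d5 3b 50 04 08 78 80 08 09
000005b0 09 00 00 0f 09 00 00 0f 09 00 00 0f 00 00 00 0e
000005c0 78 00 3b f0 10 00 0a f1 ff 01 00 7c 44 33 22 ff
000005d0 03 00 00 fe 00 00 00 7f 00 00 00 7f aa aa aa aa
"""

data = bytes(
    int(b, 16)
    for line in DUMP.strip().splitlines()
    for b in line.split()[1:]  # skip the offset column
)
dwords = struct.unpack("<%dI" % (len(data) // 4), data)

FIRST_PACKET_DWORDS = 5          # assumption, see lead-in
cmd, length = dwords[1], dwords[2]

start = FIRST_PACKET_DWORDS      # index of the next packet's first dword
end = start + length             # index just past its last dword
start_off = BASE + 4 * start
end_off = BASE + 4 * end

print("cmd=%08x len=%d" % (cmd, length))
print("next packet spans %#x..%#x" % (start_off, end_off))
print("last dword %08x, then %08x %08x" % (dwords[end - 1], dwords[end], dwords[end + 1]))
```

With len = 0x30 = 48 the packet runs from 0x514 up to 0x5d4, ending on fe000003 and followed by the two 7f000000 terminators, which is consistent with the len field counting dwords and not including the terminators.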
----------------
Composite Porter-Duff operation notes:

There are 2 to 4 dwords in the cmdstream which seem to differ based on
op, as well as src/dst format (xRGB vs ARGB/A8... A8 seems to be handled
internally the same as ARGB??)

There are some observable patterns.. for example, for a given op the low
16b of column 2 are identical across all format combinations, and
likewise for column 4.

In the tables below 0x00000000 means the dword is omitted:

// xRGB->xRGB
[PictOpSrc] = { 0x7c000114, 0x10002010, 0x00000000, 0x18012210 },
[PictOpIn] = { 0x7c000114, 0xb0100004, 0x00000000, 0x18110a04 },
[PictOpOut] = { 0x7c000114, 0xb0102004, 0x00000000, 0x18112a04 },
[PictOpOver] = { 0x7c000114, 0xd080a004, 0x7c000118, 0x8081aa04 },
[PictOpOutReverse] = { 0x7c000114, 0x80808040, 0x7c000118, 0x80808840 },
[PictOpAdd] = { 0x7c000114, 0x5080a004, 0x7c000118, 0x20818204 },
[PictOpOverReverse] = { 0x7c000114, 0x7090a004, 0x7c000118, 0x2091a204 },
[PictOpInReverse] = { 0x7c000114, 0x80800040, 0x7c000118, 0x80800840 },
[PictOpAtop] = { 0x7c000114, 0xf0908004, 0x7c000118, 0xa0918a04 },
[PictOpAtopReverse] = { 0x7c000114, 0xf0902004, 0x7c000118, 0xa0912a04 },
[PictOpXor] = { 0x7c000114, 0xf090a004, 0x7c000118, 0xa091aa04 },
// ARGB->ARGB
[PictOpSrc] = { 0x00000000, 0x14012010, 0x00000000, 0x18012210 },
[PictOpIn] = { 0x00000000, 0x14110004, 0x00000000, 0x18110a04 },
[PictOpOut] = { 0x00000000, 0x14112004, 0x00000000, 0x18112a04 },
[PictOpOver] = { 0x7c000114, 0x0281a004, 0x7c000118, 0x0281aa04 },
[PictOpOutReverse] = { 0x7c000114, 0x02808040, 0x7c000118, 0x02808840 },
[PictOpAdd] = { 0x00000000, 0x1481a004, 0x00000000, 0x18898204 },
[PictOpOverReverse] = { 0x00000000, 0x1491a004, 0x00000000, 0x1891a204 },
[PictOpInReverse] = { 0x7c000114, 0x02800040, 0x7c000118, 0x02800840 },
[PictOpAtop] = { 0x7c000114, 0x02918004, 0x7c000118, 0x02918a04 },
[PictOpAtopReverse] = { 0x7c000114, 0x02912004, 0x7c000118, 0x02912a04 },
[PictOpXor] = { 0x7c000114, 0x0291a004, 0x7c000118, 0x0291aa04 },
// A8->A8 same as ARGB->ARGB (I guess A8 is expanded to ARGB internally?)
// ARGB->xRGB
[PictOpSrc] = { 0x00000000, 0x14012010, 0x00000000, 0x18012210 },
[PictOpIn] = { 0x7c000114, 0x20110004, 0x00000000, 0x18110a04 },
[PictOpOut] = { 0x7c000114, 0x20112004, 0x00000000, 0x18112a04 },
[PictOpOver] = { 0x7c000114, 0x4281a004, 0x7c000118, 0x0281aa04 },
[PictOpOutReverse] = { 0x7c000114, 0x02808040, 0x7c000118, 0x02808840 },
[PictOpAdd] = { 0x7c000114, 0x4081a004, 0x00000000, 0x18898204 },
[PictOpOverReverse] = { 0x7c000114, 0x6091a004, 0x7c000118, 0x2091a204 },
[PictOpInReverse] = { 0x7c000114, 0x02800040, 0x7c000118, 0x02800840 },
[PictOpAtop] = { 0x7c000114, 0x62918004, 0x7c000118, 0x22918a04 },
[PictOpAtopReverse] = { 0x7c000114, 0x62912004, 0x7c000118, 0x22912a04 },
[PictOpXor] = { 0x7c000114, 0x6291a004, 0x7c000118, 0x2291aa04 },
// A8->xRGB same as ARGB->xRGB
// xRGB->ARGB
[PictOpSrc] = { 0x7c000114, 0x10002010, 0x00000000, 0x18012210 },
[PictOpIn] = { 0x7c000114, 0x90100004, 0x00000000, 0x18110a04 },
[PictOpOut] = { 0x7c000114, 0x90102004, 0x00000000, 0x18112a04 },
[PictOpOver] = { 0x7c000114, 0x9080a004, 0x7c000118, 0x8081aa04 },
[PictOpOutReverse] = { 0x7c000114, 0x80808040, 0x7c000118, 0x80808840 },
[PictOpAdd] = { 0x7c000114, 0x1080a004, 0x7c000118, 0x20818204 },
[PictOpOverReverse] = { 0x7c000114, 0x1090a004, 0x00000000, 0x1891a204 },
[PictOpInReverse] = { 0x7c000114, 0x80800040, 0x7c000118, 0x80800840 },
[PictOpAtop] = { 0x7c000114, 0x90908004, 0x7c000118, 0x80918a04 },
[PictOpAtopReverse] = { 0x7c000114, 0x90902004, 0x7c000118, 0x80912a04 },
[PictOpXor] = { 0x7c000114, 0x9090a004, 0x7c000118, 0x8091aa04 },
// xRGB->A8 same as xRGB->ARGB
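
A quick sketch for poking at the tables above: it drops the omitted (zero) dwords to show what would land in the cmdstream, and checks one reading of the low-16b pattern, namely that for a given op column 2 (and column 4) keeps the same low 16 bits across all format combinations. Only three ops are copied in here; emitted() and low16_sets() are made-up helper names.

```python
# Rows copied from the tables above, keyed by op and then by src->dst format.
ROWS = {
    "PictOpSrc": {
        "xRGB->xRGB": (0x7C000114, 0x10002010, 0x00000000, 0x18012210),
        "ARGB->ARGB": (0x00000000, 0x14012010, 0x00000000, 0x18012210),
        "ARGB->xRGB": (0x00000000, 0x14012010, 0x00000000, 0x18012210),
        "xRGB->ARGB": (0x7C000114, 0x10002010, 0x00000000, 0x18012210),
    },
    "PictOpOver": {
        "xRGB->xRGB": (0x7C000114, 0xD080A004, 0x7C000118, 0x8081AA04),
        "ARGB->ARGB": (0x7C000114, 0x0281A004, 0x7C000118, 0x0281AA04),
        "ARGB->xRGB": (0x7C000114, 0x4281A004, 0x7C000118, 0x0281AA04),
        "xRGB->ARGB": (0x7C000114, 0x9080A004, 0x7C000118, 0x8081AA04),
    },
    "PictOpAdd": {
        "xRGB->xRGB": (0x7C000114, 0x5080A004, 0x7C000118, 0x20818204),
        "ARGB->ARGB": (0x00000000, 0x1481A004, 0x00000000, 0x18898204),
        "ARGB->xRGB": (0x7C000114, 0x4081A004, 0x00000000, 0x18898204),
        "xRGB->ARGB": (0x7C000114, 0x1080A004, 0x7C000118, 0x20818204),
    },
}


def emitted(row):
    """Dwords that actually appear in the cmdstream (0x00000000 == omitted)."""
    return [dw for dw in row if dw != 0]


def low16_sets(op):
    """Distinct low-16b values of column 2 and column 4 across formats."""
    col2 = {row[1] & 0xFFFF for row in ROWS[op].values()}
    col4 = {row[3] & 0xFFFF for row in ROWS[op].values()}
    return col2, col4


for op in ROWS:
    col2, col4 = low16_sets(op)
    counts = sorted({len(emitted(row)) for row in ROWS[op].values()})
    print(op, ["%04x" % v for v in col2], ["%04x" % v for v in col4], counts)
```

For these three ops each set has exactly one member, and the emitted dword counts stay within the 2 to 4 range mentioned above.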