,'''
,,;,, ,,,, ,,,,, ,,, ,,
; ; ; ; ; ;
; ,''''; '''', ; ; ;
; ',,,,;, ,,,,,' ; ; ;
flat assembler 1.73
Programmer's Manual
Table of contents
-----------------
Chapter 1 Introduction
1.1 Compiler overview
1.1.1 System requirements
1.1.2 Executing compiler from command line
1.1.3 Compiler messages
1.1.4 Output formats
1.2 Assembly syntax
1.2.1 Instruction syntax
1.2.2 Data definitions
1.2.3 Constants and labels
1.2.4 Numerical expressions
1.2.5 Jumps and calls
1.2.6 Size settings
Chapter 2 Instruction set
2.1 The x86 architecture instructions
2.1.1 Data movement instructions
2.1.2 Type conversion instructions
2.1.3 Binary arithmetic instructions
2.1.4 Decimal arithmetic instructions
2.1.5 Logical instructions
2.1.6 Control transfer instructions
2.1.7 I/O instructions
2.1.8 Strings operations
2.1.9 Flag control instructions
2.1.10 Conditional operations
2.1.11 Miscellaneous instructions
2.1.12 System instructions
2.1.13 FPU instructions
2.1.14 MMX instructions
2.1.15 SSE instructions
2.1.16 SSE2 instructions
2.1.17 SSE3 instructions
2.1.18 AMD 3DNow! instructions
2.1.19 The x86-64 long mode instructions
2.1.20 SSE4 instructions
2.1.21 AVX instructions
2.1.22 AVX2 instructions
2.1.23 Auxiliary sets of computational instructions
2.1.24 AVX-512 instructions
2.1.25 Other extensions of instruction set
2.2 Control directives
2.2.1 Numerical constants
2.2.2 Conditional assembly
2.2.3 Repeating blocks of instructions
2.2.4 Addressing spaces
2.2.5 Other directives
2.2.6 Multiple passes
2.3 Preprocessor directives
2.3.1 Including source files
2.3.2 Symbolic constants
2.3.3 Macroinstructions
2.3.4 Structures
2.3.5 Repeating macroinstructions
2.3.6 Conditional preprocessing
2.3.7 Order of processing
2.4 Formatter directives
2.4.1 MZ executable
2.4.2 Portable Executable
2.4.3 Common Object File Format
2.4.4 Executable and Linkable Format
Chapter 1 Introduction
-----------------------
This chapter contains the most important information you need to begin
using the flat assembler. If you are an experienced assembly language
programmer, you should read at least this chapter before using this compiler.
1.1 Compiler overview
Flat assembler is a fast assembly language compiler for the x86 architecture
processors, which does multiple passes to optimize the size of generated
machine code. It is self-compilable and versions for different operating
systems are provided. All the versions are designed to be used from the system
command line and they should not differ in behavior.
1.1.1 System requirements
All versions require the x86 architecture 32-bit processor (at least 80386),
although they can produce programs for the x86 architecture 16-bit processors,
too. DOS version requires an OS compatible with MS DOS 2.0 and either true
real mode environment or DPMI. Windows version requires a Win32 console
compatible with 3.1 version.
1.1.2 Executing compiler from command line
To execute flat assembler from the command line you need to provide two
parameters - first should be name of source file, second should be name of
destination file. If no second parameter is given, the name for output
file will be guessed automatically. After displaying short information about
the program name and version, compiler will read the data from source file and
compile it. When the compilation is successful, compiler will write the
generated code to the destination file and display the summary of compilation
process; otherwise it will display the information about error that occurred.
The source file should be a text file, and can be created in any text
editor. Line breaks are accepted in both DOS and Unix standards, tabulators
are treated as spaces.
In the command line you can also include the "-m" option followed by a number,
which specifies the maximum number of kilobytes of memory flat assembler
should use. In case of the DOS version this option limits only the usage of
extended memory. The "-p" option followed by a number can be used to specify
the limit for the number of passes the assembler performs. If code cannot be
generated within the specified number of passes, the assembly will be
terminated with an error message. The maximum value of this setting is 65536,
while the default limit, used when no such option is included in command line,
is 100.
There are no command line options that would affect the output of compiler;
flat assembler requires the source code itself to include all the information
it needs. For example, to select the output format you use the "format"
directive at the beginning of source.
1.1.3 Compiler messages
As stated above, after a successful compilation the compiler displays
the compilation summary. It includes the information of how many passes were
done, how much time it took, and how many bytes were written into the
destination file.
The following is an example of the compilation summary:
flat assembler version 1.73 (16384 kilobytes memory)
38 passes, 5.3 seconds, 77824 bytes.
In case of error during the compilation process, the program will display an
error message. For example, when compiler can't find the input file, it will
display the following message:
flat assembler version 1.73 (16384 kilobytes memory)
error: source file not found.
If the error is connected with a specific part of source code, the source line
that caused the error will also be displayed. The placement of this line in
the source is given as well to help you find the error, for example:
flat assembler version 1.73 (16384 kilobytes memory)
example.asm [3]:
mob ax,1
error: illegal instruction.
It means that in the third line of the "example.asm" file compiler has
encountered an unrecognized instruction. When the line that caused error
contains a macroinstruction, also the line in macroinstruction definition
that generated the erroneous instruction is displayed:
flat assembler version 1.73 (16384 kilobytes memory)
example.asm [6]:
stoschar 7
example.asm [3] stoschar [1]:
mob al,char
error: illegal instruction.
It means that the macroinstruction in the sixth line of the "example.asm" file
generated an unrecognized instruction with the first line of its definition.
1.1.4 Output formats
By default, when there is no "format" directive in the source file, flat
assembler simply puts the generated instruction codes into the output,
creating a flat binary file. By default it generates 16-bit code, but you can
always switch to 16-bit or 32-bit mode by using the "use16" or "use32"
directive. Some of the output formats switch into 32-bit mode when selected -
more information about the formats you can choose can be found in 2.4.
All output code is always in the order in which it was entered into the
source file.
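As a minimal sketch of the defaults described above (the instructions are
chosen only for illustration), a source with no "format" directive produces
a flat binary, and the code size setting can be switched within the same file:

    mov     ax,1            ; assembled as 16-bit code (the default)
    use32
    mov     eax,1           ; assembled as 32-bit code
    use16
    mov     ax,1            ; assembled as 16-bit code again

The three instructions are placed in the output in exactly this order.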
1.2 Assembly syntax
The information provided below is intended mainly for the assembly language
programmers that have been using some other assembly compilers before.
If you are a beginner, you should look for the assembly programming tutorials.
Flat assembler by default uses the Intel syntax for the assembly
instructions, although you can customize it using the preprocessor
capabilities (macroinstructions and symbolic constants). It also has its own
set of the directives - the instructions for compiler.
All symbols defined inside the sources are case-sensitive.
1.2.1 Instruction syntax
Instructions in assembly language are separated by line breaks, and one
instruction is expected to fill one line of text. If a line contains
a semicolon, except for the semicolons inside the quoted strings, the rest of
this line is the comment and compiler ignores it. If a line ends with the "\"
character (a semicolon and comment may optionally follow it), the next line
is attached at this point.
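The rules for comments and line continuation can be sketched as follows (the
values are arbitrary and serve only as an illustration):

    mov     ax,1            ; everything after the semicolon is ignored
    db      'a;b'           ; the semicolon inside the quotes is data
    db      1,2,3,\         ; this line continues on the next one
            4,5,6

The last two lines are processed as the single directive "db 1,2,3,4,5,6".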
Each line in source is a sequence of items, each of which may be one of three
types. One type is the symbol characters, which are the special characters
that are individual items even when they are not spaced from the other ones.
Any of the "+-*/=<>()[]{}:,|&~#`" is a symbol character. The sequence of
other characters, separated from other items with either blank spaces or
symbol characters, is a symbol. If the first character of symbol is either a
single or double quote, it integrates any sequence of characters following it,
even the special ones, into a quoted string, which should end with the same
character, with which it began (the single or double quote) - however if there
are two such characters in a row (without any other character between them),
they are integrated into quoted string as just one of them and the quoted
string continues then. The symbols other than symbol characters and quoted
strings can be used as names, so are also called the name symbols.
Every instruction consists of the mnemonic and a varying number of
operands, separated with commas. The operand can be a register, an immediate
value or data addressed in memory; it can also be preceded by a size operator
to define or override its size (table 1.1). Names of available registers can
be found in table 1.2; their sizes cannot be overridden. An immediate value
can be specified by any numerical expression.
When operand is a data in memory, the address of that data (also any
numerical expression, but it may contain registers) should be enclosed in
square brackets or preceded by "ptr" operator. For example instruction
"mov eax,3" will put the immediate value 3 into the EAX register, instruction
"mov eax,[7]" will put the 32-bit value from the address 7 into EAX and the
instruction "mov byte [7],3" will put the immediate value 3 into the byte at
address 7, it can also be written as "mov byte ptr 7,3". To specify which
segment register should be used for addressing, segment register name followed
by a colon should be put just before the address value (inside the square
brackets or after the "ptr" operator).
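The operand forms described above, collected into one sketch (32-bit mode is
selected here only so that EBX can be used for addressing without any
prefixes; the addresses are arbitrary):

    use32
    mov     eax,3           ; immediate value 3 into EAX
    mov     eax,[7]         ; 32-bit value from address 7 into EAX
    mov     byte [7],3      ; immediate value 3 into the byte at address 7
    mov     byte ptr 7,3    ; the same instruction written with "ptr"
    mov     al,[es:7]       ; byte at address 7, addressed through ES
    mov     eax,[es:ebx+4]  ; segment register combined with a register address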
Table 1.1 Size operators
/-------------------------\
| Operator | Bits | Bytes |
|==========|======|=======|
| byte     | 8    | 1     |
| word     | 16   | 2     |
| dword    | 32   | 4     |
| fword    | 48   | 6     |
| pword    | 48   | 6     |
| qword    | 64   | 8     |
| tbyte    | 80   | 10    |
| tword    | 80   | 10    |
| dqword   | 128  | 16    |
| xword    | 128  | 16    |
| qqword   | 256  | 32    |
| yword    | 256  | 32    |
| dqqword  | 512  | 64    |
| zword    | 512  | 64    |
\-------------------------/
Table 1.2 Registers
/-----------------------------------------------------------------\
| Type    | Bits |                                                |
|=========|======|================================================|
|         | 8    | al    cl    dl    bl    ah    ch    dh    bh   |
| General | 16   | ax    cx    dx    bx    sp    bp    si    di   |
|         | 32   | eax   ecx   edx   ebx   esp   ebp   esi   edi  |
|---------|------|------------------------------------------------|
| Segment | 16   | es    cs    ss    ds    fs    gs               |
|---------|------|------------------------------------------------|
| Control | 32   | cr0   cr2   cr3   cr4                          |
|---------|------|------------------------------------------------|
| Debug   | 32   | dr0   dr1   dr2   dr3   dr6   dr7              |
|---------|------|------------------------------------------------|
| FPU     | 80   | st0   st1   st2   st3   st4   st5   st6   st7  |
|---------|------|------------------------------------------------|
| MMX     | 64   | mm0   mm1   mm2   mm3   mm4   mm5   mm6   mm7  |
|---------|------|------------------------------------------------|
| SSE     | 128  | xmm0  xmm1  xmm2  xmm3  xmm4  xmm5  xmm6  xmm7 |
|---------|------|------------------------------------------------|
| AVX     | 256  | ymm0  ymm1  ymm2  ymm3  ymm4  ymm5  ymm6  ymm7 |
|---------|------|------------------------------------------------|
| AVX-512 | 512  | zmm0  zmm1  zmm2  zmm3  zmm4  zmm5  zmm6  zmm7 |
|---------|------|------------------------------------------------|
| Opmask  | 64   | k0    k1    k2    k3    k4    k5    k6    k7   |
|---------|------|------------------------------------------------|
| Bounds  | 128  | bnd0  bnd1  bnd2  bnd3                         |
\-----------------------------------------------------------------/
1.2.2 Data definitions
To define data or reserve a space for it, use one of the directives listed in
table 1.3. The data definition directive should be followed by one or more of
numerical expressions, separated with commas. These expressions define the
values for data cells of size depending on which directive is used. For
example "db 1,2,3" will define the three bytes of values 1, 2 and 3
respectively.
The "db" and "du" directives also accept the quoted string values of any
length, which will be converted into chain of bytes when "db" is used and into
chain of words with zeroed high byte when "du" is used. For example "db 'abc'"
will define the three bytes of values 61h, 62h and 63h.
The "dp" directive and its synonym "df" accept the values consisting of two
numerical expressions separated with colon, the first value will become the
high word and the second value will become the low double word of the far
pointer value. Also "dd" accepts such pointers consisting of two word values
separated with colon, and "dt" accepts the word and quad word value separated
with colon, the quad word is stored first. The "dt" directive with single
expression as parameter accepts only floating point values and creates data in
FPU double extended precision format.
Any of the above directives allows the usage of the special "dup" operator to
make multiple copies of given values. The count of duplicates should precede
this operator and the value to duplicate should follow - it can even be a
chain of values separated with commas, but such a set of values needs to be
enclosed in parentheses, like "db 5 dup (1,2)", which defines five copies
of the given two byte sequence.
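The definitions discussed so far can be put together in a short sketch (all
values are arbitrary and chosen only for illustration):

    db      1,2,3           ; three bytes: 1, 2 and 3
    db      'abc'           ; three bytes: 61h, 62h and 63h
    du      'abc'           ; three words: 0061h, 0062h and 0063h
    dw      1234h,5678h     ; two words
    dp      1234h:56789ABCh ; far pointer, 1234h becomes the high word
    dd      1234h:5678h     ; far pointer made of two word values
    db      5 dup (1,2)     ; ten bytes: 1,2,1,2,1,2,1,2,1,2
    db      'It''s',0       ; the doubled quote stores a single ' character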
The "file" is a special directive and its syntax is different. This
directive includes a chain of bytes from file and it should be followed by the
quoted file name, then optionally numerical expression specifying offset in
file preceded by the colon, and - also optionally - comma and numerical
expression specifying count of bytes to include (if no count is specified, all
data up to the end of file is included). For example "file 'data.bin'" will
include the whole file as binary data and "file 'data.bin':10h,4" will include
only four bytes starting at offset 10h.
The data reservation directive should be followed by only one numerical
expression, and this value defines how many cells of the specified size should
be reserved. All data definition directives also accept the "?" value, which
means that this cell should not be initialized to any value and the effect is
the same as by using the data reservation directive. The uninitialized data
may not be included in the output file, so its values should be always
considered unknown.
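A brief sketch of reserving data (the label names are hypothetical):

    buffer  rb 64           ; reserves 64 bytes
    counter dd ?            ; one uninitialized double word
    pair    dw ?,?          ; two uninitialized words, same effect as "rw 2"

None of these cells should be assumed to hold any particular value.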
Table 1.3 Data directives
/----------------------------\
| Size    | Define | Reserve |
| (bytes) | data   | data    |
|=========|========|=========|
| 1       | db     | rb      |
|         | file   |         |
|---------|--------|---------|
| 2       | dw     | rw      |
|         | du     |         |
|---------|--------|---------|
| 4       | dd     | rd      |
|---------|--------|---------|
| 6       | dp     | rp      |
|         | df     | rf      |
|---------|--------|---------|
| 8       | dq     | rq      |
|---------|--------|---------|
| 10      | dt     | rt      |
\----------------------------/
1.2.3 Constants and labels
In the numerical expressions you can also use constants or labels instead of
numbers. To define a constant or label you should use the specific
directives. Each label can be defined only once and it is accessible from
any place in the source (even before it was defined). A constant can be
redefined many times, but in this case it is accessible only after it was
defined, and is always equal to the value from the last definition before the
place where it's used. When a constant is defined only once in source, it is -
like the label - accessible from anywhere.
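The effect of redefinition can be sketched as follows. A constant is defined
by writing its name, the "=" character and an expression; the name "count"
and the values are used only as an illustration:

    count = 17
    mov     cx,count        ; assembled as "mov cx,17"
    count = count*2         ; redefinition, calculated at this point
    mov     dx,count        ; assembled as "mov dx,34"

Because "count" is defined more than once here, it could not be used before
its first definition.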
The definition of constant consists of name of the constant followed by the
"=" character and numerical expression, which after calculation will become
the value of constant. This value is always calculated at the time the
constant is defined. For example you can define "count" constant by using the
directive "count = 17", and then use it in the assembly instructions, like
"mov cx,count" - which will become "mov cx,17" during the compilation process.
There are different ways to define labels. The simplest is to follow the
name of label by the colon, this directive can even be followed by the other
instruction in the same line. It defines the label whose value is equal to
offset of the point where it's defined. This method is usually used to label
the places in code. The other way is to follow the name of label (without a
colon) by some data directive. It defines the label with value equal to
offset of the beginning of defined data, and remembered as a label for data
with cell size as specified for that data directive in table 1.3.
The label can be treated as constant of value equal to offset of labeled
code or data. For example when you define data using the labeled directive
"char db 224", to put the offset of this data into BX register you should use
"mov bx,char" instruction, and to put the value of byte addressed by "char"
label to DL register, you should use "mov dl,[char]" (or "mov dl,ptr char").
But when you try to assemble "mov ax,[char]", it will cause an error, because
fasm compares the sizes of operands, which should be equal. You can force
assembling that instruction by using size override: "mov ax,word [char]", but
remember that this instruction will read the two bytes beginning at "char"
address, while it was defined as a one byte.
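The same cases written as a sketch in source form (the data is placed after
the code only to keep the listing tidy):

    mov     bx,char         ; offset of the data into BX
    mov     dl,[char]       ; the byte stored at "char" into DL
    mov     ax,word [char]  ; size override, reads two bytes at "char"
    ; mov   ax,[char]       ; would be rejected, operand sizes differ
char    db 224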
The last and the most flexible way to define labels is to use the "label"
directive. This directive should be followed by the name of label, then
optionally a size operator (it can be preceded by a colon) and then - also
optionally - the "at" operator and the numerical expression defining the
address at which this label should be defined. For example "label wchar word
at char" will define a new label for the 16-bit data at the address of "char".
Now the
instruction "mov ax,[wchar]" will be after compilation the same as
"mov ax,word [char]". If no address is specified, "label" directive defines
the label at current offset. Thus "mov [wchar],57568" will copy two bytes
while "mov [char],224" will copy one byte to the same address.
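A sketch of the "label" directive in use. The extra "db 0" is an assumption
added here so that the word access stays within the defined data:

    mov     ax,[wchar]      ; same as "mov ax,word [char]"
    mov     [wchar],57568   ; stores two bytes
    mov     [char],224      ; stores one byte at the same address
char    db 224
        db 0
label wchar word at char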
The label whose name begins with dot is treated as local label, and its name
is attached to the name of last global label (with name beginning with
anything but dot) to make the full name of this label. So you can use the
short name (beginning with dot) of this label anywhere before the next global
label is defined, and in the other places you have to use the full name.
Labels beginning with two dots are the exception - they are like global, but
they don't become the new prefix for local labels.
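The naming rules can be followed in this sketch, where all label names are
hypothetical and the instructions only give the labels something to mark:

first:
  .loop:                    ; full name is "first.loop"
    dec     cx
    jnz     .loop           ; refers to "first.loop"
..helper:                   ; global, but does not become a prefix
    nop
    jmp     .loop           ; still refers to "first.loop"
second:
  .loop:                    ; full name is "second.loop"
    dec     dx
    jnz     .loop           ; refers to "second.loop"
    jmp     first.loop      ; the full name is needed here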
The "@@" name means anonymous label, you can have defined many of them in
the source. Symbol "@b" (or equivalent "@r") references the nearest preceding
anonymous label, symbol "@f" references the nearest following anonymous label.
These special symbols are case-insensitive.
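A short sketch of anonymous labels (the loop itself is only an illustration):

@@:
    dec     cx
    jz      @f              ; forward, to the anonymous label below
    jmp     @b              ; back, to the anonymous label above
@@:
    ret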
1.2.4 Numerical expressions
In the above examples all the numerical expressions were the simple numbers,
constants or labels. But they can be more complex, by using the arithmetical
or logical operators for calculations at compile time. All these operators
with their priority values are listed in table 1.4. The operations with higher
priority value will be calculated first, you can of course change this
behavior by putting some parts of expression into parentheses. The "+", "-",
"*" and "/" are standard arithmetical operations, "mod" calculates the
remainder from division. The "and", "or", "xor", "shl", "shr", "bsf", "bsr"
and "not" perform the same bit-logical operations as assembly instructions of
those names. The "rva" and "plt" are special unary operators that perform
conversions between different kinds of addresses, they can be used only with
few of the output formats and their meaning may vary (see 2.4).
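A few expressions worked out according to the priorities in table 1.4 (the
constant names are hypothetical):

    x1 = 2+3*4              ; 14, "*" has higher priority than "+"
    x2 = (2+3)*4            ; 20, parentheses change the order
    x3 = 1 shl 4 or 1       ; 17, "shl" has higher priority than "or"
    x4 = 17 mod 5           ; 2, the remainder from division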
The arithmetical and bit-logical calculations are processed as if they
operated on infinite precision 2-adic numbers, and assembler signalizes an
overflow error if because of its limitations it is not able to perform the
required calculation, or if the result is too large a number to fit in either
signed or unsigned range for the destination unit size.
The numbers in the expression are by default treated as decimal. Binary
numbers should have the "b" letter attached at the end, octal numbers should
end with the "o" letter, and hexadecimal numbers should begin with the "0x"
characters (like in C language) or with the "$" character (like in Pascal
language), or they should end with the "h" letter. Also a quoted string, when
encountered in an expression, will be converted into a number - the first
character will become the least significant byte of the number.
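The notations listed above, each giving the value 255, followed by a quoted
string used as a number (a sketch for illustration only):

    mov     al,255          ; decimal
    mov     al,11111111b    ; binary
    mov     al,377o         ; octal
    mov     al,0xFF         ; hexadecimal, C style
    mov     al,$FF          ; hexadecimal, Pascal style
    mov     al,0FFh         ; hexadecimal with the "h" suffix
    mov     eax,'ABCD'      ; 44434241h, 'A' is the least significant byte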
The numerical expression used as an address value can also contain any of
general registers used for addressing, they can be added and multiplied by
appropriate values, as it is allowed for the x86 architecture instructions.
The numerical calculations inside address definition by default operate with
target size assumed to be the same as the current bitness of code, even if
generated instruction encoding will use a different address size.
There are also some special symbols that can be used inside the numerical
expression. First is "$", which is always equal to the value of current
offset, while "$$" is equal to base address of current addressing space. The
other one is "%", which is the number of current repeat in parts of code that
are repeated using some special directives (see 2.2) and zero anywhere else.
There's also "%t" symbol, which is always equal to the current time stamp.
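A sketch of how these symbols are commonly used. The names are hypothetical,
and the "times" directive, described in 2.2.3, appears here only to show "%"
in a repeated part of code:

message db 'Hello'
message_length = $ - message    ; 5, "$" is the offset just after the string
total_size = $ - $$             ; bytes generated since the base address
    times 4 db %                ; four bytes with values 1, 2, 3 and 4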
Any numerical expression can also consist of single floating point value
(flat assembler does not allow any floating point operations at compilation
time) in the scientific notation, they can end with the "f" letter to be
recognized, otherwise they should contain at least one of the "." or "E"
characters. So "1.0", "1E0" and "1f" define the same floating point value,
while simple "1" defines an integer value.
Table 1.4 Arithmetical and bit-logical operators by priority
/-------------------------\
| Priority | Operators    |
|==========|==============|
| 0        | +  -         |
|----------|--------------|
| 1        | *  /         |
|----------|--------------|
| 2        | mod          |
|----------|--------------|
| 3        | and  or  xor |
|----------|--------------|
| 4        | shl  shr     |
|----------|--------------|
| 5        | not          |
|----------|--------------|
| 6        | bsf  bsr     |
|----------|--------------|
| 7        | rva  plt     |
\-------------------------/
1.2.5 Jumps and calls
The operand of any jump or call instruction can be preceded not only by the
size operator, but also by one of the operators specifying type of the jump:
"short", "near" or "far". For example, when assembler is in 16-bit mode,
instruction "jmp dword [0]" will become the far jump and when assembler is
in 32-bit mode, it will become the near jump. To force this instruction to be
treated differently, use the "jmp near dword [0]" or "jmp far dword [0]" form.
When operand of near jump is the immediate value, assembler will generate
the shortest variant of this jump instruction if possible (but will not create
32-bit instruction in 16-bit mode nor 16-bit instruction in 32-bit mode,
unless there is a size operator stating it). By specifying the jump type
you can force it to always generate long variant (for example "jmp near 0")
or to always generate short variant and terminate with an error when it's
impossible (for example "jmp short 0").
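The jump type operators can be compared in this sketch (the label "target" is
hypothetical, and the jumps are not meant to form a sensible program):

    use16
    jmp     dword [0]       ; far jump in 16-bit mode
    jmp     near dword [0]  ; forced to be a near jump
    use32
    jmp     dword [0]       ; near jump in 32-bit mode
    jmp     far dword [0]   ; forced to be a far jump
target:
    jmp     target          ; shortest variant is chosen when possible
    jmp     near target     ; always the long variant
    jmp     short target    ; always the short variant, error if out of range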
1.2.6 Size settings
When an instruction uses some memory addressing, by default the smallest form
of the instruction is generated, using the short displacement whenever the
address value fits in the range. This can be overridden using the "word" or
"dword" operator before the address inside the square brackets (or after the
"ptr" operator), which forces the long displacement of appropriate size to be
made. In case when address is not relative to any registers, those operators
also allow choosing the appropriate mode of absolute addressing.
Instructions "adc", "add", "and", "cmp", "or", "sbb", "sub" and "xor" with
first operand being 16-bit or 32-bit are by default generated in shortened
8-bit form when the second operand is an immediate value fitting in the range
for signed 8-bit values. This can also be overridden by putting the "word" or
"dword" operator before the immediate value. Similar rules apply to the
"imul" instruction with the last operand being an immediate value.
An immediate value as an operand for the "push" instruction without a size
operator is by default treated as a word value if assembler is in 16-bit mode
and as a double word value if assembler is in 32-bit mode. The shorter 8-bit
form of this instruction is used if possible; the "word" or "dword" size
operator forces the "push" instruction to be generated in the longer form for
the specified size. The "pushw" and "pushd" mnemonics force assembler to
generate 16-bit or 32-bit code without forcing it to use the longer form of
the instruction.
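The size settings from this section, side by side in a sketch for 32-bit mode.
The operands are arbitrary, and the comments describe the forms expected from
the rules above rather than verified encodings:

    use32
    mov     eax,[ebx+4]         ; short 8-bit displacement
    mov     eax,[dword ebx+4]   ; long 32-bit displacement forced
    add     eax,4               ; shortened form with 8-bit immediate
    add     eax,dword 4         ; full 32-bit immediate forced
    push    4                   ; double word value, shorter 8-bit form
    push    dword 4             ; double word value, longer form forced
    pushw   4                   ; word value, shorter form still allowed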
Chapter 2 Instruction set
--------------------------
This chapter provides the detailed information about the instructions and
directives supported by flat assembler. Directives for defining labels were
already discussed in 1.2.3, all other directives will be described later in
this chapter.
2.1 The x86 architecture instructions
In this section you can find information about both the syntax and the
purpose of the assembly language instructions. If you need more technical
information, look for the Intel Architecture Software Developer's Manual.
Assembly instructions consist of the mnemonic (instruction's name) and from
zero to three operands. If there are two or more operands, usually first is
the destination operand and second is the source operand. Each operand can be
register, memory or immediate value (see 1.2 for details about syntax of
operands). After the description of each instruction there are examples
of different combinations of operands, if the instruction has any.
Some instructions act as prefixes and can be followed by another instruction
in the same line, and there can be more than one prefix in a line. Each name
of the segment register is also a mnemonic of instruction prefix, although it
is recommended to use segment overrides inside the square brackets instead of
these prefixes.
2.1.1 Data movement instructions
"mov" transfers a byte, word or double word from the source operand to the
destination operand. It can transfer data between general registers, from
the general register to memory, or from memory to general register, but it
cannot move from memory to memory. It can also transfer an immediate value to
general register or memory, segment register to general register or memory,
general register or memory to segment register, control or debug register to
general register and general register to control or debug register. The "mov"
can be assembled only if the size of source operand and size of destination
operand are the same. Below are the examples for each of the allowed
combinations:
mov bx,ax ; general register to general register
mov [char],al ; general register to memory
mov bl,[char] ; memory to general register
mov dl,32 ; immediate value to general register
mov [char],32 ; immediate value to memory
mov ax,ds ; segment register to general register
mov [bx],ds ; segment register to memory
mov ds,ax ; general register to segment register
mov ds,[bx] ; memory to segment register
mov eax,cr0 ; control register to general register
mov cr3,ebx ; general register to control register
"xchg" swaps the contents of two operands. It can swap two byte operands,
two word operands or two double word operands. Order of operands is not
important. The operands may be two general registers, or general register
with memory. For example:
xchg ax,bx ; swap two general registers
xchg al,[char] ; swap register with memory
"push" decrements the stack pointer (ESP register), then transfers
the operand to the top of stack indicated by ESP. The operand can be memory,
general register, segment register or immediate value of word or double word
size. If operand is an immediate value and no size is specified, it is by
default treated as a word value if assembler is in 16-bit mode and as a double
word value if assembler is in 32-bit mode. "pushw" and "pushd" mnemonics are
variants of this instruction that store the values of word or double word size
respectively. If more operands follow in the same line (separated only with
spaces, not commas), compiler will assemble chain of the "push" instructions
with these operands. The examples are with single operands:
push ax ; store general register
push es ; store segment register
pushw [bx] ; store memory
push 1000h ; store immediate value
"pusha" saves the contents of the eight general registers on the stack.
This instruction has no operands. There are two versions of this instruction,
one 16-bit and one 32-bit. The assembler automatically generates the
appropriate version for current mode, but it can be overridden by using
"pushaw" or "pushad" mnemonic to always get the 16-bit or 32-bit version. The 16-bit
version of this instruction pushes general registers on the stack in the
following order: AX, CX, DX, BX, the initial value of SP before AX was pushed,
BP, SI and DI. The 32-bit version pushes equivalent 32-bit general registers
in the same order.
"pop" transfers the word or double word at the current top of stack to the
destination operand, and then increments ESP to point to the new top of stack.
The operand can be memory, general register or segment register. "popw" and
"popd" mnemonics are variants of this instruction for restoring the values of
word or double word size respectively. If more operands separated with spaces
follow in the same line, compiler will assemble chain of the "pop"
instructions with these operands.
pop bx ; restore general register
pop ds ; restore segment register
popw [si] ; restore memory
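As an illustration of the chained form described above, the following sketch
saves three registers with one "push" line and restores them with one "pop"
line. The choice of registers is arbitrary; note that the operands of "pop"
are listed in reverse order, because the value pushed last is popped first:
push ax bx cx ; assembled as push ax, push bx, push cx
pop cx bx ax ; assembled as pop cx, pop bx, pop ax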
"popa" restores the registers saved on the stack by "pusha" instruction,
except for the saved value of SP (or ESP), which is ignored. This instruction
has no operands. To force assembling 16-bit or 32-bit version of this
instruction use "popaw" or "popad" mnemonic.
2.1.2 Type conversion instructions
The type conversion instructions convert bytes into words, words into double
words, and double words into quad words. These conversions can be done using
the sign extension or zero extension. The sign extension fills the extra bits
of the larger item with the value of the sign bit of the smaller item, the
zero extension simply fills them with zeros.
"cwd" and "cdq" double the size of the value in the AX or EAX register
respectively and store the extra bits into the DX or EDX register. The
conversion is done using the sign extension. These instructions have no
operands.
"cbw" extends the sign of the byte in AL throughout AX, and "cwde" extends
the sign of the word in AX throughout EAX. These instructions also have no
operands.
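A short sketch of how these instructions can be combined, assuming 16-bit
operands and the sample value -5:
mov al,-5 ; AL = 0FBh
cbw ; AX = 0FFFBh, still -5 as a word
cwd ; DX:AX = 0FFFFh:0FFFBh, still -5 as a double word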
"movsx" converts a byte to word or double word and a word to double word
using the sign extension. "movzx" does the same, but it uses the zero
extension. The source operand can be general register or memory, while the
destination operand must be a general register. For example:
movsx ax,al ; byte register to word register
movsx edx,dl ; byte register to double word register
movsx eax,ax ; word register to double word register
movsx ax,byte [bx] ; byte memory to word register
movsx edx,byte [bx] ; byte memory to double word register
movsx eax,word [bx] ; word memory to double word register
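The difference between the two instructions shows only when the highest bit
of the source is set. A minimal sketch with an arbitrarily chosen value:
mov bl,80h ; BL = 80h, which is -128 when treated as signed
movsx ax,bl ; AX = 0FF80h, the sign bit is copied into the extra bits
movzx cx,bl ; CX = 0080h, the extra bits are filled with zeros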
2.1.3 Binary arithmetic instructions
"add" replaces the destination operand with the sum of the source and
destination operands and sets CF if a carry out of the highest bit (unsigned
overflow) has occurred. The operands may be bytes, words or double words. The
destination operand can be general register or memory. The source operand can
be general register or immediate value, and it can also be memory if the
destination operand is register.
add ax,bx ; add register to register
add ax,[si] ; add memory to register
add [di],al ; add register to memory
add al,48 ; add immediate value to register
add [char],48 ; add immediate value to memory
"adc" sums the operands, adds one if CF is set, and replaces the destination
operand with the result. Rules for the operands are the same as for the "add"
instruction. An "add" followed by multiple "adc" instructions can be used to
add numbers longer than 32 bits.
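For instance, a 64-bit addition can be sketched as below, assuming (only for
this illustration) that the first number is held in EDX:EAX and the second one
in ECX:EBX:
add eax,ebx ; add the low double words, CF receives the carry
adc edx,ecx ; add the high double words plus the carry from CF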
"inc" adds one to the operand, it does not affect CF. The operand can be a
general register or memory, and the size of the operand can be byte, word or
double word.
inc ax ; increment register by one
inc byte [bx] ; increment memory by one
"sub" subtracts the source operand from the destination operand and replaces
the destination operand with the result. If a borrow is required, the CF is
set. Rules for the operands are the same as for the "add" instruction.
"sbb" subtracts the source operand from the destination operand, subtracts
one if CF is set, and stores the result to the destination operand. Rules for
the operands are the same as for the "add" instruction. A "sub" followed by
multiple "sbb" instructions may be used to subtract numbers longer than 32
bits.
"dec" subtracts one from the operand, it does not affect CF. Rules for the
operand are the same as for the "inc" instruction.
"cmp" subtracts the source operand from the destination operand. It updates
the flags as the "sub" instruction, but does not alter the source and
destination operands. Rules for the operands are the same as for the "sub"
instruction.
"neg" subtracts a signed integer operand from zero. The effect of this
instruction is to reverse the sign of the operand from positive to negative or
from negative to positive. Rules for the operand are the same as for the "inc"
instruction.
"xadd" exchanges the destination operand with the source operand, then loads
the sum of the two values into the destination operand. The destination operand
may be a general register or memory, the source operand must be a general
register.
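A small sketch of the effect, with arbitrarily chosen values:
mov ax,5
mov bx,3
xadd ax,bx ; AX = 8 (the sum), BX = 5 (the previous value of AX)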
All the above binary arithmetic instructions update SF, ZF, PF and OF flags.
SF is always set to the same value as the result's sign bit, ZF is set when
all the bits of result are zero, PF is set when low order eight bits of result
contain an even number of set bits, OF is set if result is too large for a
positive number or too small for a negative number (excluding sign bit) to fit
in destination operand.
"mul" performs an unsigned multiplication of the operand and the
accumulator. If the operand is a byte, the processor multiplies it by the
contents of AL and returns the 16-bit result to AH and AL. If the operand is a
word, the processor multiplies it by the contents of AX and returns the 32-bit
result to DX and AX. If the operand is a double word, the processor multiplies
it by the contents of EAX and returns the 64-bit result in EDX and EAX. "mul"
sets CF and OF when the upper half of the result is nonzero, otherwise they
are cleared. Rules for the operand are the same as for the "inc" instruction.
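A minimal sketch of a word multiplication whose result does not fit in 16
bits (the values are chosen only for illustration):
mov ax,1000h
mov bx,10h
mul bx ; DX:AX = 0001h:0000h, CF and OF are set because DX is nonzero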
"imul" performs a signed multiplication operation. This instruction has
three variations. First has one operand and behaves in the same way as the
"mul" instruction. Second has two operands, in this case destination operand
is multiplied by the source operand and the result replaces the destination
operand. Destination operand must be a general register, it can be word or
double word, source operand can be general register, memory or immediate
value. Third form has three operands, the destination operand must be a
general register, word or double word in size, source operand can be general
register or memory, and third operand must be an immediate value. The source
operand is multiplied by the immediate value and the result is stored in the
destination register. All the three forms calculate the product to twice the
size of operands and set CF and OF when the upper half of the result is
nonzero, but second and third form truncate the product to the size of
operands. So second and third forms can be also used for unsigned operands
because, whether the operands are signed or unsigned, the lower half of the
product is the same. Below are the examples for all three forms:
imul bl ; accumulator by register
imul word [si] ; accumulator by memory
imul bx,cx ; register by register
imul bx,[si] ; register by memory
imul bx,10 ; register by immediate value
imul ax,bx,10 ; register by immediate value to register
imul ax,[si],10 ; memory by immediate value to register
"div" performs an unsigned division of the accumulator by the operand.
The dividend (the accumulator) is twice the size of the divisor (the operand),
the quotient and remainder have the same size as the divisor. If divisor is
byte, the dividend is taken from AX register, the quotient is stored in AL and
the remainder is stored in AH. If divisor is word, the upper half of dividend
is taken from DX, the lower half of dividend is taken from AX, the quotient is
stored in AX and the remainder is stored in DX. If divisor is double word,
the upper half of dividend is taken from EDX, the lower half of dividend is
taken from EAX, the quotient is stored in EAX and the remainder is stored in
EDX. Rules for the operand are the same as for the "mul" instruction.
"idiv" performs a signed division of the accumulator by the operand.
It uses the same registers as the "div" instruction, and the rules for
the operand are the same.
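The following sketch shows one unsigned and one signed division. The values
are arbitrary; the "cwd" instruction (see 2.1.2) is used to prepare the signed
dividend in DX and AX:
mov ax,100 ; dividend in AX
mov bl,7 ; byte divisor
div bl ; AL = 14 (quotient), AH = 2 (remainder)
mov ax,-100 ; lower half of the signed dividend
cwd ; extend the sign of AX into DX
mov bx,7 ; word divisor
idiv bx ; AX = -14 (quotient), DX = -2 (remainder)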
2.1.4 Decimal arithmetic instructions
Decimal arithmetic is performed by combining the binary arithmetic
instructions (already described in the prior section) with the decimal
arithmetic instructions. The decimal arithmetic instructions are used to
adjust the results of a previous binary arithmetic operation to produce a
valid packed or unpacked decimal result, or to adjust the inputs to a
subsequent binary arithmetic operation so the operation will produce a valid
packed or unpacked decimal result.
"daa" adjusts the result of adding two valid packed decimal operands in
AL. "daa" must always follow the addition of two pairs of packed decimal
numbers (one digit in each half-byte) to obtain a pair of valid packed
decimal digits as results. The carry flag is set if carry was needed.
This instruction has no operands.
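For example, adding the packed decimal numbers 38 and 45 can be sketched as:
mov al,38h ; packed decimal 38
add al,45h ; binary sum, AL = 7Dh, not a valid packed decimal value
daa ; AL = 83h, the packed decimal 83, CF is clear as no carry was needed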
"das" adjusts the result of subtracting two valid packed decimal operands
in AL. "das" must always follow the subtraction of one pair of packed decimal
numbers (one digit in each half-byte) from another to obtain a pair of valid
packed decimal digits as results. The carry flag is set if a borrow was
needed. This instruction has no operands.
"aaa" changes the contents of register AL to a valid unpacked decimal
number, and zeroes the top four bits. "aaa" must always follow the addition
of two unpacked decimal operands in AL. The carry flag is set and AH is
incremented if a carry is necessary. This instruction has no operands.
"aas" changes the contents of register AL to a valid unpacked decimal
number, and zeroes the top four bits. "aas" must always follow the
subtraction of one unpacked decimal operand from another in AL. The carry flag
is set and AH decremented if a borrow is necessary. This instruction has no
operands.
"aam" corrects the result of a multiplication of two valid unpacked decimal
numbers. "aam" must always follow the multiplication of two decimal numbers
to produce a valid decimal result. The high order digit is left in AH, the
low order digit in AL. The generalized version of this instruction allows
adjustment of the contents of the AX to create two unpacked digits of any
number base. The standard version of this instruction has no operands, the
generalized version has one operand - an immediate value specifying the
number base for the created digits.
"aad" modifies the numerator in AH and AL to prepare for the division of two
valid unpacked decimal operands so that the quotient produced by the division
will be a valid unpacked decimal number. AH should contain the high order
digit and AL the low order digit. This instruction adjusts the value and
places the result in AL, while AH will contain zero. The generalized version
of this instruction allows adjustment of two unpacked digits of any number
base. Rules for the operand are the same as for the "aam" instruction.
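A minimal sketch of both adjustments with unpacked decimal digits (the values
are chosen only for illustration):
mov al,7
mov bl,9
mul bl ; AX = 003Fh, the binary product 63
aam ; AH = 6, AL = 3, the unpacked digits of 63
aad ; AL = 3Fh (63) and AH = 0 again, ready for a binary division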
2.1.5 Logical instructions
"not" inverts the bits in the specified operand to form a one's complement
of the operand. It has no effect on the flags. Rules for the operand are the
same as for the "inc" instruction.
"and", "or" and "xor" instructions perform the standard logical operations.
They update the SF, ZF and PF flags. Rules for the operands are the same as
for the "add" instruction.
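Typical uses of these instructions are sketched below; the masks are arbitrary
examples:
and al,0Fh ; keep the low four bits of AL, clear the rest
or al,80h ; set the highest bit of AL
xor ax,ax ; clear AX, since each bit xored with itself gives zero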
"bt", "bts", "btr" and "btc" instructions operate on a single bit which can
be in memory or in a general register. The location of the bit is specified
as an offset from the low order end of the operand. The value of the offset
is taken from the second operand, which may be either an immediate byte or
a general register. These instructions first assign the value of the selected
bit to CF. "bt" instruction does nothing more, "bts" sets the selected bit to
1, "btr" resets the selected bit to 0, "btc" changes the bit to its
complement. The first operand can be word or double word.
bt ax,15 ; test bit in register
bts word [bx],15 ; test and set bit in memory
btr ax,cx ; test and reset bit in register
btc word [bx],cx ; test and complement bit in memory
"bsf" and "bsr" instructions scan a word or double word for first set bit
and store the index of this bit into destination operand, which must be
general register. The bit string being scanned is specified by source operand,
it may be either general register or memory. The ZF flag is set if the entire
string is zero (no set bits are found); otherwise it is cleared. If no set bit
is found, the value of the destination register is undefined. "bsf" scans from
low order to high order (starting from bit index zero). "bsr" scans from high
order to low order (starting from bit index 15 of a word or index 31 of a
double word).
bsf ax,bx ; scan register forward
bsr ax,[si] ; scan memory reverse
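A short sketch with a sample value that has bits 3 and 6 set:
mov bx,48h ; BX = 0000000001001000b
bsf ax,bx ; AX = 3, index of the lowest set bit
bsr cx,bx ; CX = 6, index of the highest set bit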
"shl" shifts the destination operand left by the number of bits specified
in the second operand. The destination operand can be byte, word, or double
word general register or memory. The second operand can be an immediate value
or the CL register. The processor shifts zeros in from the right (low order)
side of the operand as bits exit from the left side. The last bit that exited
is stored in CF. "sal" is a synonym for "shl".
shl al,1 ; shift register left by one bit
shl byte [bx],1 ; shift memory left by one bit
shl ax,cl ; shift register left by count from cl
shl word [bx],cl ; shift memory left by count from cl
"shr" and "sar" shift the destination operand right by the number of bits
specified in the second operand. Rules for operands are the same as for the
"shl" instruction. "shr" shifts zeros in from the left side of the operand as
bits exit from the right side. The last bit that exited is stored in CF.
"sar" preserves the sign of the operand by shifting in zeros on the left side
if the value is positive or by shifting in ones if the value is negative.
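The difference can be sketched with the same bit pattern shifted by both
instructions (the value is chosen only for illustration):
mov al,0F8h ; AL = 0F8h, which is -8 when treated as signed
sar al,1 ; AL = 0FCh (-4), the sign bit is preserved, CF = 0
mov bl,0F8h
shr bl,1 ; BL = 7Ch, zero is shifted in from the left, CF = 0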
"shld" shifts bits of the destination operand to the left by the number
of bits specified in third operand, while shifting high order bits from the
source operand into the destination operand on the right. The source operand
remains unmodified. The destination operand can be a word or double word
general register or memory, the source operand must be a general register,
third operand can be an immediate value or the CL register.
shld ax,bx,1 ; shift register left by one bit
shld [di],bx,1 ; shift memory left by one bit
shld ax,bx,cl ; shift register left by count from cl
shld [di],bx,cl ; shift memory left by count from cl
"shrd" shifts bits of the destination operand to the right, while shifting
low order bits from the source operand into the destination operand on the
left. The source operand remains unmodified. Rules for operands are the same
as for the "shld" instruction.
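These instructions are useful for shifting values longer than one register.
A minimal sketch of two separate operations, assuming a 64-bit value held in
EDX:EAX (a register pair chosen only for this illustration):
shld edx,eax,4 ; shift EDX left, filling it with the top 4 bits of EAX
shl eax,4 ; then shift the low double word itself
shrd eax,edx,4 ; shift EAX right, filling it with the low 4 bits of EDX
shr edx,4 ; then shift the high double word itself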
"rol" and "rcl" rotate the byte, word or double word destination operand
left by the number of bits specified in the second operand. For each rotation
specified, the high order bit that exits from the left of the operand returns
at the right to become the new low order bit. "rcl" additionally puts in CF
each high order bit that exits from the left side of the operand before it
returns to the operand as the low order bit on the next rotation cycle. Rules
for operands are the same as for the "shl" instruction.
"ror" and "rcr" rotate the byte, word or double word destination operand
right by the number of bits specified in the second operand. For each rotation
specified, the low order bit that exits from the right of the operand returns
at the left to become the new high order bit. "rcr" additionally puts in CF
each low order bit that exits from the right side of the operand before it
returns to the operand as the high order bit on the next rotation cycle.
Rules for operands are the same as for the "shl" instruction.
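The sketch below contrasts "rol" with "rcl" on the same sample value; "clc"
is used here only to give CF a known initial state:
mov al,81h
rol al,1 ; AL = 03h, the bit that exited returns as bit 0, CF = 1
clc ; CF = 0
mov al,81h
rcl al,1 ; AL = 02h, the old CF enters as bit 0, CF = 1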
"test" performs the same action as the "and" instruction, but it does not
alter the destination operand, only updates flags. Rules for the operands are
the same as for the "and" instruction.
"bswap" reverses the byte order of a 32-bit general register: bits 0 through
7 are swapped with bits 24 through 31, and bits 8 through 15 are swapped with
bits 16 through 23. This instruction is provided for converting little-endian
values to big-endian format and vice versa.
bswap edx ; swap bytes in register
2.1.6 Control transfer instructions
"jmp" unconditionally transfers control to the target location. The
destination address can be specified directly within the instruction or
indirectly through a register or memory, the acceptable size of this address
depends on whether the jump is near or far (it can be specified by preceding
the operand with "near" or "far" operator) and whether the instruction is
16-bit or 32-bit. Operand for near jump should be "word" size for 16-bit
instruction or the "dword" size for 32-bit instruction. Operand for far jump
should be "dword" size for 16-bit instruction or "pword" size for 32-bit
instruction. A direct "jmp" instruction includes the destination address as
part of the instruction (and can be preceded by "short", "near" or "far"
operator), the operand specifying address should be the numerical expression
for near or short jump, or two numerical expressions separated with colon for
far jump, the first specifies selector of segment, the second is the offset
within segment. The "pword" operator can be used to force the 32-bit far jump,
and "dword" to force the 16-bit far jump. An indirect "jmp" instruction
obtains the destination address indirectly through a register or a pointer
variable, the operand should be general register or memory. See also 1.2.5 for
some more details.
jmp 100h ; direct near jump
jmp 0FFFFh:0 ; direct far jump
jmp ax ; indirect near jump
jmp pword [ebx] ; indirect far jump
"call" transfers control to the procedure, saving on the stack the address
of the instruction following the "call" for later use by a "ret" (return)
instruction. Rules for the operands are the same as for the "jmp" instruction,
but the "call" has no short variant of direct instruction and thus it is not
optimized.
"ret", "retn" and "retf" instructions terminate the execution of a procedure
and transfer control back to the program that originally invoked the
procedure using the address that was stored on the stack by the "call"
instruction. "ret" is the equivalent of "retn", which returns from the
procedure that was executed using the near call, while "retf" returns from
the procedure that was executed using the far call. These instructions default
to the size of address appropriate for the current code setting, but the size
of address can be forced to 16-bit by using the "retw", "retnw" and "retfw"
mnemonics, and to 32-bit by using the "retd", "retnd" and "retfd" mnemonics.
All these instructions may optionally specify an immediate operand. By adding
this constant to the stack pointer, they effectively remove any arguments that
the calling program pushed on the stack before the execution of the "call"
instruction.
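A minimal sketch of this usage, assuming 16-bit code and a hypothetical
procedure named "sum" that takes two word arguments on the stack:
push 1 ; first argument
push 2 ; second argument
call sum ; AX = 3 after return, both arguments are already removed
; the rest of the program would follow here and must not fall into "sum"
sum:
push bp
mov bp,sp
mov ax,[bp+4] ; second argument (pushed last)
add ax,[bp+6] ; first argument
pop bp
ret 4 ; return and release 4 bytes of arguments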
"iret" returns control to an interrupted procedure. It differs from "ret" in
that it also pops the flags from the stack into the flags register. The flags
are stored on the stack by the interrupt mechanism. It defaults to the size of
return address appropriate for the current code setting, but it can be forced
to use 16-bit or 32-bit address by using the "iretw" or "iretd" mnemonic.
The conditional transfer instructions are jumps that may or may not transfer
control, depending on the state of the CPU flags when the instruction
executes. The mnemonics for conditional jumps may be obtained by attaching
the condition mnemonic (see table 2.1) to the "j" mnemonic,
for example "jc" instruction will transfer the control when the CF flag is
set. The conditional jumps can be short or near, and direct only, and can be
optimized (see 1.2.5), the operand should be an immediate value specifying
target address.
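A minimal sketch of a conditional jump following a comparison, which leaves
the larger of two unsigned values in AX ("ax_is_larger" is a hypothetical
label name):
cmp ax,bx ; set the flags as "sub" would, without changing AX or BX
jae ax_is_larger ; CF = 0 means that AX is above or equal to BX
mov ax,bx ; otherwise BX is larger, so copy it to AX
ax_is_larger: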
Table 2.1 Conditions
/-----------------------------------------------------------\
| Mnemonic |   Condition tested    |      Description       |
|==========|=======================|========================|
|    o     |        OF = 1         |        overflow        |
|----------|-----------------------|------------------------|
|    no    |        OF = 0         |      not overflow      |
|----------|-----------------------|------------------------|
|    c     |                       |         carry          |
|    b     |        CF = 1         |         below          |
|   nae    |                       |  not above nor equal   |
|----------|-----------------------|------------------------|
|    nc    |                       |       not carry        |
|    ae    |        CF = 0         |     above or equal     |
|    nb    |                       |       not below        |
|----------|-----------------------|------------------------|
|    e     |        ZF = 1         |         equal          |
|    z     |                       |          zero          |
|----------|-----------------------|------------------------|
|    ne    |        ZF = 0         |       not equal        |
|    nz    |                       |        not zero        |
|----------|-----------------------|------------------------|
|    be    |     CF or ZF = 1      |     below or equal     |
|    na    |                       |       not above        |
|----------|-----------------------|------------------------|
|    a     |     CF or ZF = 0      |         above          |
|   nbe    |                       |  not below nor equal   |
|----------|-----------------------|------------------------|
|    s     |        SF = 1         |          sign          |
|----------|-----------------------|------------------------|
|    ns    |        SF = 0         |        not sign        |
|----------|-----------------------|------------------------|
|    p     |        PF = 1         |         parity         |
|    pe    |                       |      parity even       |
|----------|-----------------------|------------------------|
|    np    |        PF = 0         |       not parity       |
|    po    |                       |       parity odd       |
|----------|-----------------------|------------------------|
|    l     |     SF xor OF = 1     |          less          |
|   nge    |                       | not greater nor equal  |
|----------|-----------------------|------------------------|
|    ge    |     SF xor OF = 0     |    greater or equal    |
|    nl    |                       |        not less        |
|----------|-----------------------|------------------------|
|    le    | (SF xor OF) or ZF = 1 |     less or equal      |
|    ng    |                       |      not greater       |
|----------|-----------------------|------------------------|
|    g     | (SF xor OF) or ZF = 0 |        greater         |
|   nle    |                       |   not less nor equal   |