Blame: AssemblyLanguage
Annotated edit history of AssemblyLanguage version 12, including all changes.
Rev Author # Line
7 AristotlePagaltzis 1 AssemblyLanguage is a 1:1 translation of MachineCode into English mnemonics.
5 StuartYeates 2
11 JohnMcPherson 3 The Art of AssemblyLanguage Programming is a delicate topic. By programming in AssemblyLanguage you can hand-optimize code and achieve efficiency that is difficult if not impossible to duplicate in a higher-level language. However, current computers are fast enough that most code can be written in less efficient higher-level languages. AssemblyLanguage is still used for embedded systems (where space and CPU speed are limited), and in parts of an OperatingSystem that are run very frequently or must run fast ([InterruptHandler]s etc.). Some parts of the GNU C library are also written in assembly for the same reasons (for example, some of the maths functions).
5 StuartYeates 4
7 AristotlePagaltzis 5 AssemblyLanguage code is not portable across different [CPU] architectures, of which there are many: Intel [x86], [MIPS], and the Motorola m68000 series, to name but a few. Early versions of [Unix] were written in assembler, and when BellLabs got new machines, they re-wrote their operating system for the new MachineCode, until they finally re-wrote most of it in [C] in 1973.
5 StuartYeates 6
7 AristotlePagaltzis 7 AssemblyLanguage code is difficult to understand and maintain. It is usually easier to start from scratch than to debug faulty code.
5 StuartYeates 8
12 AristotlePagaltzis 9 A Compiler such as [GCC] will hide its generation of AssemblyLanguage code from you as it generates its object files and the executables. It is, however, possible to tell it to emit the AssemblyLanguage code by passing it the <tt>-S</tt> CommandLine option.
5 StuartYeates 10
11 Here is an example. First, the [C] code:
12
12 AristotlePagaltzis 13 <verbatim>
14 #include <stdio.h>
15
16 int main(void) {
17 int i;
18
19 i = 5;
20 i = i * 3;
21 printf("%d\n",i);
22 i = 0xff;
23 return i;
24 }
25 </verbatim>
5 StuartYeates 26
27 Now you can translate this to assembler. If I do this on an [x86] (i.e. [Intel]) machine, I get:
7 AristotlePagaltzis 28
12 AristotlePagaltzis 29 <pre>
30 __$ gcc -S x.c && cat x.s__
31 .file "x.c"
32 .section .rodata
33 .LC0:
34 .string "%d\n"
35 .text
36 .globl main
37 .type main, @function
38 main:
39 pushl %ebp
40 movl %esp, %ebp
41 subl $8, %esp
42 andl $-16, %esp
43 movl $0, %eax
44 addl $15, %eax
45 addl $15, %eax
46 shrl $4, %eax
47 sall $4, %eax
48 subl %eax, %esp
49 movl $5, -4(%ebp)
50 movl -4(%ebp), %edx
51 movl %edx, %eax
52 addl %eax, %eax
53 addl %edx, %eax
54 movl %eax, -4(%ebp)
55 subl $8, %esp
56 pushl -4(%ebp)
57 pushl $.LC0
58 call printf
59 addl $16, %esp
60 movl $255, -4(%ebp)
61 movl -4(%ebp), %eax
62 leave
63 ret
64 .size main, .-main
65 .section .note.GNU-stack,"",@progbits
66 .ident "GCC: (GNU) 3.4.6"
67 </pre>
68
69 <tt>movl</tt>, <tt>jmp</tt>, <tt>addl</tt>, etc. are mnemonics for individual [CPU] instruction OpCodes. <tt>%esp</tt>, <tt>%ebp</tt>, etc. are mnemonics for registers. For example, <tt>%esp</tt> is the [Stack] Pointer: it points to the top of the current process's [Stack]. The first <tt>movl</tt> copies the value in <tt>%esp</tt> into <tt>%ebp</tt>, then the <tt>subl</tt> subtracts 8 from <tt>%esp</tt>, so that the [Stack] has grown by 8 bytes. The instructions from <tt>andl</tt> through the second <tt>subl</tt> align <tt>%esp</tt> to a 16-byte boundary and reserve a further 16 bytes. Then <tt>movl $5, -4(%ebp)</tt> copies the value 5 into the [Stack], 4 bytes below the address held in <tt>%ebp</tt>. This address is where the variable <tt>i</tt> is being stored, so all accesses to <tt>i</tt> in the [C] code become references to this memory location in MachineCode. We can also witness an optimization here: instead of doing i*3, it does i+(i+i). That's the two consecutive <tt>addl</tt> instructions. Below that, it pushes <tt>printf</tt>'s two arguments (the value of <tt>i</tt> and a pointer to the format string at <tt>.LC0</tt>) onto the stack and calls <tt>printf</tt>, which reads its arguments from the stack.
5 StuartYeates 70
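The i+(i+i) rewrite described above can be mirrored in [C] itself. The following is only a sketch: the function names <tt>times3_mul</tt> and <tt>times3_add</tt> are made up for illustration, and the comments map the additions onto the two <tt>addl</tt> lines of the listing above. What a given [GCC] version emits for either function is not guaranteed to match that listing.

```c
/* Multiply by 3 the way the C source states it. */
int times3_mul(int i) {
    return i * 3;
}

/* Multiply by 3 the way the listing above does it: (i + i) + i.
 * The names here are hypothetical; the comments refer to the
 * corresponding instructions in the GCC 3.4.6 output shown earlier. */
int times3_add(int i) {
    int doubled = i + i;   /* addl %eax, %eax : eax now holds 2*i */
    return doubled + i;    /* addl %edx, %eax : add the saved copy of i */
}
```

Saving this to a file and running <tt>gcc -S</tt> on it lets you compare the code generated for the two functions side by side.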
7 AristotlePagaltzis 71 As you can see, explaining what AssemblyLanguage code is doing line by line is tedious. This is how programmers used to write code, and it is a common fact that AssemblyLanguage programmers get paid more per line of code than those who hack away in higher-level languages.
5 StuartYeates 72
10 AristotlePagaltzis 73 We can also note that it is extremely bad for your health to rely on the [GCC] output of some [C] code when learning [x86] AssemblyLanguage. [GCC] generates extremely horrid code on occasion, especially when working with multiplication and division, because [x86] multiplication and division instructions are restricted in the registers they can use.
8 RuudSchramp 74
12 AristotlePagaltzis 75 However, the output of [GCC] can be a tremendously useful resource when optimising [C] code. Especially when mixing different sizes of integers (char, int, long), the resulting MachineCode is sometimes flooded with unexpected typecasting instructions. While concealed at the [C] level, these extra instructions are quite obvious in the AssemblyLanguage (lots of <tt>and</tt> instructions and often additional <tt>mov</tt>).
5 StuartYeates 76
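A small experiment along these lines, as a sketch only: the two hypothetical functions below compute the same byte-sized checksum, one with a <tt>char</tt>-sized accumulator and one with an <tt>int</tt> accumulator that is narrowed once at the end. Both return the same value; whether the narrow version actually picks up extra masking or extension instructions depends on the compiler version and flags, so inspect the <tt>gcc -S</tt> output rather than taking it on faith.

```c
/* Sum bytes with a narrow accumulator. In C, total + buf[k] is computed
 * as int and then converted back to unsigned char on every iteration. */
unsigned char sum_narrow(const unsigned char *buf, int n) {
    unsigned char total = 0;
    int k;
    for (k = 0; k < n; k++) {
        total = (unsigned char)(total + buf[k]);
    }
    return total;
}

/* Same sum with an int accumulator, converted to unsigned char once.
 * Assumes n is small enough that the int total does not overflow. */
unsigned char sum_wide(const unsigned char *buf, int n) {
    int total = 0;
    int k;
    for (k = 0; k < n; k++) {
        total += buf[k];
    }
    return (unsigned char)total;
}
```

For the bytes 200, 100 and 7, both functions return 51 (307 modulo 256); only the generated AssemblyLanguage may differ.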
12 AristotlePagaltzis 77 Another sample piece of AssemblyLanguage code for [Linux] can be found in the HelloWorld page.
7 AristotlePagaltzis 78
79 ----
80 CategoryProgrammingLanguages