Diff: Endianness - Waikato Linux Users Group

Differences between current version and predecessor to the previous major change of Endianness.


Newer page:	version 4	Last edited on Sunday, September 20, 2009 2:06:38 am	by AristotlePagaltzis
Older page:	version 2	Last edited on Saturday, September 19, 2009 10:54:51 pm	by LawrenceDoliveiro

@@ -1,8 +1,8 @@

''“There are 10 kinds of people in the world; those who know binary and those who don't.” – Seen on the net.''

<br> ''“There are 01 kinds of people who know binary; little-endians and everyone else.” – zcat(1)''

-The order of bytes in a word. The names “Big-endian” and “little-endian” originate from the book “Gulliver’s Travels”, where a tribe of tiny people divide themselves into two factions in a ReligiousWar over which end they should cut their eggs open at – the big end, or the little end. In computer terms, big-endian [CPU]s store the most significant byte at the lowest byte address of a word and progress to less significant bytes at higher addresses, while little-endian machines start with the least significant byte and store progressively more significant ones. A [C] program demonstrates this:

+The order of bytes in a word. The names “big-endian” and “little-endian” originate from the book “Gulliver’s Travels”, where a tribe of tiny people divide themselves into two factions in a ReligiousWar over which end they should cut their eggs open at – the big end, or the little end. In computer terms, big-endian [CPU]s store the most significant byte at the lowest byte address of a word and progress to less significant bytes at higher addresses, while little-endian machines start with the least significant byte and store progressively more significant ones. A [C] program demonstrates this:

#include <stdio.h>

int main( void ) {

@@ -29,7 +29,29 @@

Humans using a left-to-right writing system with Arabic numerals are big-endian: when we write “1234,” the 1 means one thousand and the 4 means four, so we write digits in order from the most to the least significant. Humans using right-to-left writing systems with Arabic numerals, such as Arabic and Hebrew, are little-endian, because they approach a number such as 1234 from the right, i.e. from the least significant digit first. Funnily enough, this means that even though the digits have the same geometrical sequence on paper, they are little-endian in one writing system and big-endian in the other.

Another peculiarity is found in spoken German (or written German with spelled-out numbers): tens and ones are arranged in little-endian order, e.g. “einundzwanzig” means “one-and-twenty.” However, the more significant digits are arranged in big-endian order: “dreihunderteinundzwanzig” means “three-hundred-and-one-and-twenty”. This is also sometimes seen in older English usage, e.g. “four-and-twenty blackbirds baked in a pie”.

+

+Ultimately, it comes down to reading order versus logical consistency. Big-endian matches the left-to-right order in which most of us read, while little-endian simplifies the relationship between three different numberings:

+* that of the binary digits of an integer—call this ''i''

+* that of bits within a byte—call this ''b''

+* that of bytes within a word—call this ''B''

+

+For instance, the bits in a byte are numbered 0-7, and a byte can hold an unsigned integer in the range 0 .. 255. Does the bit numbered 0 represent the 2**0 digit, bit 1 the 2**1 digit, and so on? It might or might not; this is a convention defined by the CPU architecture. And what happens with, say, a two-byte integer? Does byte 0 hold bits 0-7 and byte 1 hold bits 8-15, or vice versa? This is where the endianness of the CPU architecture comes in.

+

+Suppose a bit in an ''N''-byte integer has the number ''j''. In little-endian architectures, the following always hold:

+* ''b'' = ''j'' __mod__ 8

+* ''B'' = ''j'' __div__ 8

+* ''i'' = ''j''

+

+In big-endian architectures, the situation is more complicated. For example, in the Motorola 680''x''0 architecture, the first and third of the above equations still hold, while the second one becomes

+* ''B'' = (8 * ''N'' - 1 - ''j'') __div__ 8

+

+while with the IBM PowerPC, all three equations are different:

+* ''b'' = 7 - (''j'' __mod__ 8)

+* ''B'' = (8 * ''N'' - 1 - ''j'') __div__ 8

+* ''i'' = 8 * ''N'' - 1 - ''j''

+

+As you can see, the relationships take their simplest form in the little-endian case.