Big-Endian, Little-Endian: Comparison

Byte Order in Multibyte Values

The terms big-endian and little-endian are derived from the Lilliputians of Gulliver's Travels, for whom the question of whether soft-boiled eggs should be opened at the big end or the little end is a major political issue. Likewise, the big/little-endian computer debate has much more to do with political issues than technological ones.

The Java Virtual Machine, MIPS, HP, IBM and Motorola 68000 systems store multibyte values in Big Endian order, while Intel 80x86 and DEC VAX systems store them in Little Endian order. Big Endian stores the high-order byte at the starting address while Little Endian stores the low-order byte at the starting address. The low-order byte contains the bits for the smallest values (0-255), while the high-order byte contains the bits that specify the larger values (256-65535 in a 16-bit short integer). Swapping binary integer data between computers of different types is a difficult problem unless you convert the information into ASCII characters.

The Berkeley Sockets library (#include <netinet/in.h>, or #include <arpa/inet.h> as specified by POSIX) has functions to convert long (32-bit) and short (16-bit) integers to and from Internet standard byte ordering, which is Big Endian:

htonl, convert host-to-network, long integer
htons, convert host-to-network, short integer
ntohl, convert network-to-host, long integer
ntohs, convert network-to-host, short integer
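
As a rough illustration of how these functions are typically used, the sketch below writes a 32-bit value into a byte buffer in network order and reads it back. The helper names put_u32_net and get_u32_net are hypothetical and only serve this example; the sketch assumes a POSIX system where htonl and ntohl are declared in <arpa/inet.h>.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write a 32-bit host value into buf in
   network (big-endian) byte order. */
static void put_u32_net(unsigned char *buf, uint32_t host_value)
{
    uint32_t net_value = htonl(host_value);
    memcpy(buf, &net_value, sizeof net_value);
}

/* Hypothetical helper: read a 32-bit network-order value from buf
   and return it in host byte order. */
static uint32_t get_u32_net(const unsigned char *buf)
{
    uint32_t net_value;
    memcpy(&net_value, buf, sizeof net_value);
    return ntohl(net_value);
}
```

After put_u32_net(buf, 0x12345678), the buffer holds the bytes 12 34 56 78 on either kind of host, and get_u32_net(buf) returns the original value.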

Big Endian systems such as HP-UX and MPE are already compatible with the network ordering, so they can define these functions as null macros.

The PowerPC is a bi-endian processor; that is, it supports both big- and little-endian addressing modes. This bi-endian architecture enables software developers to choose either mode when migrating OSes and applications from other machines.

There are advantages and disadvantages to each method, which I will explain below. A big-endian architecture stores the most significant byte at the lowest address offset, while a little-endian architecture stores the least significant byte at the lowest address offset. The 32-bit hex value 0x12345678 would be stored in memory as follows:

Address          00    01    02    03
-------------    --    --    --    --
big-endian       12    34    56    78
little-endian    78    56    34    12
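
The layout in the table can be checked on a real machine with a few lines of C. This is only a sketch: the function names bytes_in_memory and host_is_little_endian are made up for this example, and it assumes the host is either plain big-endian or plain little-endian.

```c
#include <stdint.h>
#include <string.h>

/* Copy the in-memory bytes of a 32-bit value into out,
   lowest address first. */
static void bytes_in_memory(uint32_t value, unsigned char out[4])
{
    memcpy(out, &value, 4);
}

/* Return 1 if the host stores the low-order byte at the lowest
   address (little-endian), 0 otherwise. */
static int host_is_little_endian(void)
{
    unsigned char b[4];
    bytes_in_memory(0x12345678u, b);
    return b[0] == 0x78;
}
```

Printing the four bytes produced by bytes_in_memory(0x12345678, b) should reproduce one of the two rows of the table, depending on the machine.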

As you can see above, reading a hex dump is certainly easier in a big-endian machine since we normally read numbers from left to right (lower to higher address). Listed below are the pros and cons of each architecture (as far as I'm concerned).

Big-Endian Pros

1) Testing the arithmetic sign bit yields a correct result even if you assume the wrong word size, integer size or floating point number size, because the sign bit is always in the byte at the starting address.

2) Most network header codes and bitmapped graphics (displays and memory arrangements) are mapped with a "MSB on the left" scheme which means that shifts and stores of graphical elements larger than a byte are handled naturally by the architecture. This is a major performance disadvantage for little-endian machines since you have to keep reversing the byte order when working with large graphical elements.

3) When decoding variable length bit codes (compressed data) such as Huffman and LZW, you can use the code word as an index into a lookup table if it's encoded MSB to LSB (big-endian). The same goes for encoding: on a little-endian machine the shifted bits would have to be mirrored to generate such codes.

4) Easier to read hex dumps.
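
The sign-bit point in item 1 can be sketched in C. To keep the example independent of the machine it runs on, the values are stored into byte buffers in explicit big-endian order; store_be32, store_be16 and be_is_negative are hypothetical names used only here.

```c
#include <stdint.h>

/* Store a 32-bit value big-endian: most significant byte at buf[0]. */
static void store_be32(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char)(v >> 24);
    buf[1] = (unsigned char)(v >> 16);
    buf[2] = (unsigned char)(v >> 8);
    buf[3] = (unsigned char)v;
}

/* Store a 16-bit value big-endian: most significant byte at buf[0]. */
static void store_be16(unsigned char *buf, uint16_t v)
{
    buf[0] = (unsigned char)(v >> 8);
    buf[1] = (unsigned char)v;
}

/* In a big-endian layout the sign bit is bit 7 of the first byte,
   so this test does not need to know the width of the value. */
static int be_is_negative(const unsigned char *buf)
{
    return (buf[0] & 0x80) != 0;
}
```

The same be_is_negative call works on a buffer filled by store_be16 or by store_be32; in a little-endian layout the sign bit sits in the last byte, so the width would have to be known first.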

Big-Endian Cons

1) Reading a value of the wrong word size will result in an incorrect value; on a little-endian architecture, the same mistake can sometimes yield a correct result (when the value is small enough to fit in the size actually read).

2) Most big-endian architectures (non-Intel) do not allow words to be written on non-word address boundaries (odd addresses). Intel allows odd address reads and writes (they get broken into 2 separate operations) which makes it easier for programmers, but more difficult for hardware designers.
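
The wrong-word-size case in item 1 can be illustrated with a small sketch. A 32-bit value is stored in a byte buffer in each order, and then only the first two bytes are read back as a 16-bit value. The helper names are hypothetical, and explicit byte stores are used so the result does not depend on the host.

```c
#include <stdint.h>

/* Store a 32-bit value little-endian: least significant byte first. */
static void store_le32(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char)v;
    buf[1] = (unsigned char)(v >> 8);
    buf[2] = (unsigned char)(v >> 16);
    buf[3] = (unsigned char)(v >> 24);
}

/* Store a 32-bit value big-endian: most significant byte first. */
static void store_be32(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char)(v >> 24);
    buf[1] = (unsigned char)(v >> 16);
    buf[2] = (unsigned char)(v >> 8);
    buf[3] = (unsigned char)v;
}

/* Read only the first two bytes as a little-endian 16-bit value. */
static uint16_t load_le16(const unsigned char *buf)
{
    return (uint16_t)(buf[0] | (buf[1] << 8));
}

/* Read only the first two bytes as a big-endian 16-bit value. */
static uint16_t load_be16(const unsigned char *buf)
{
    return (uint16_t)((buf[0] << 8) | buf[1]);
}
```

With the value 0x42 stored as 32 bits, the little-endian 16-bit read still returns 0x42, while the big-endian 16-bit read returns 0 because it picks up the two high-order bytes. For a value that does not fit in 16 bits, both reads are wrong.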

The pros and cons of little-endian are basically the opposites of those listed above. Intel added a nice instruction to help this situation starting with the 486. The BSWAP instruction reverses the byte order within a 32-bit data register. That's about all that needs to be said on this topic. There are some who have turned this into a religious battle, but if you look at it objectively, there aren't any really important differences between the two architectures. As a side note, the Internet committee chose big-endian order for use within network data packets; they call it 'network byte order'.
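
For reference, the effect of BSWAP on a 32-bit value can be written in portable C with shifts and masks. This is a minimal sketch of the byte reversal itself, not of how any particular compiler or library implements it; the name swap32 is made up for this example.

```c
#include <stdint.h>

/* Reverse the order of the four bytes in a 32-bit value,
   which is the effect BSWAP has on a 32-bit register. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24)
         | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u)
         | (v << 24);
}
```

For example, swap32(0x12345678) gives 0x78563412, and applying it twice returns the original value.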