Blame: FloatingPoint - Waikato Linux Users Group

Annotated edit history of FloatingPoint version 1, including all changes.

Rev	Author	#	Line
1	CraigBox	1	`Floating point is a number representation consisting of a mantissa, M, an exponent, E, and an (assumed) radix (or "base"), R. The number represented is M*R^E, where R is the`
		2	`radix: usually 2 in computer hardware, but sometimes ten or 16.`
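As a rough illustration of the M*R^E form with R=2, the sketch below uses Python's standard math.frexp to split a number into a mantissa and an exponent. The helper name split_float is made up for this example. Note that frexp normalizes the mantissa to the range 0.5 <= |M| < 1, which differs from the "1.F" convention used by IEEE 754 further down this page.

```python
import math

def split_float(x):
    """Return (M, E) such that x == M * 2**E, as reported by math.frexp."""
    mantissa, exponent = math.frexp(x)  # 0.5 <= |mantissa| < 1 for nonzero x
    return mantissa, exponent

m, e = split_float(6.5)
# Rebuilding the value from its parts gives back the original number.
assert m * 2**e == 6.5
assert math.ldexp(m, e) == 6.5
```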
		3
		4	`Many different representations are used for the mantissa and exponent themselves. The [IEEE] specifies a standard representation that is used by many hardware floating-point`
		5	`systems. This is [IEEE 754\|http://grouper.ieee.org/groups/754/]. Further documentation is available at http://cch.loria.fr/documentation/IEEE754/.`
		6
		7	`The opposite is fixed-point.`
		8
		9	`!Single Precision`
		10
		11	`The IEEE single precision floating point standard representation requires a 32-bit word, whose bits may be numbered from 0 to 31, left to right. The first bit is the sign bit, 'S', the next eight bits are the exponent bits, 'E', and the final 23 bits are the fraction, 'F':`
		12
		13	`S EEEEEEEE FFFFFFFFFFFFFFFFFFFFFFF`
		14	`0 1      8 9                    31`
		15
		16	`The value V represented by the word may be determined as follows:`
		17
		18	`* If E=255 and F is nonzero, then V=NaN ("Not a number")`
		19	`* If E=255 and F is zero and S is 1, then V=-Infinity`
		20	`* If E=255 and F is zero and S is 0, then V=Infinity`
		21	`* If 0<E<255 then V=(-1)**S * 2 ** (E-127) * (1.F) where "1.F" is intended to represent the binary number created by prefixing F with an implicit leading 1 and a binary point.`
		22	`* If E=0 and F is nonzero, then V=(-1)**S * 2 ** (-126) * (0.F). These are "unnormalized" (also called denormalized) values.`
		23	`* If E=0 and F is zero and S is 1, then V=-0`
		24	`* If E=0 and F is zero and S is 0, then V=0`
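The rules above can be sketched as a small decoder. This is a minimal illustration, not a reference implementation: the function name decode_single is invented for this example, and it takes the 32-bit word as a Python integer.

```python
import math

def decode_single(bits):
    """Decode a 32-bit integer as an IEEE 754 single, following the rules above."""
    s = (bits >> 31) & 0x1   # sign bit S
    e = (bits >> 23) & 0xFF  # eight exponent bits E
    f = bits & 0x7FFFFF      # 23 fraction bits F
    sign = -1.0 if s else 1.0
    if e == 255:
        # All-ones exponent: NaN if F is nonzero, otherwise signed infinity.
        return math.nan if f else sign * math.inf
    if e == 0:
        # Zero or unnormalized value: 0.F * 2**(-126); the sign of zero is kept.
        return sign * (f / 2**23) * 2.0**-126
    # Normalized value: 1.F * 2**(E-127).
    return sign * (1 + f / 2**23) * 2.0**(e - 127)

assert decode_single(0x40D00000) == 6.5   # 0 10000001 1010...0
assert decode_single(0xC0D00000) == -6.5  # 1 10000001 1010...0
```

Python floats are double precision, so every single precision value, including the unnormalized ones, is represented exactly by the result.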
		25
		26	`In particular,`
		27
		28	`0 00000000 00000000000000000000000 = 0`
		29	`1 00000000 00000000000000000000000 = -0`
		30
		31	`0 11111111 00000000000000000000000 = Infinity`
		32	`1 11111111 00000000000000000000000 = -Infinity`
		33
		34	`0 11111111 00000100000000000000000 = NaN`
		35	`1 11111111 00100010001001010101010 = NaN`
		36
		37	`0 10000000 00000000000000000000000 = +1 * 2**(128-127) * 1.0 = 2`
		38	`0 10000001 10100000000000000000000 = +1 * 2**(129-127) * 1.101 = 6.5`
		39	`1 10000001 10100000000000000000000 = -1 * 2**(129-127) * 1.101 = -6.5`
		40
		41	`0 00000001 00000000000000000000000 = +1 * 2**(1-127) * 1.0 = 2**(-126)`
		42	`0 00000000 10000000000000000000000 = +1 * 2**(-126) * 0.1 = 2**(-127)`
		43	`0 00000000 00000000000000000000001 = +1 * 2**(-126) * `
		44	`0.00000000000000000000001 =`
		45	`2**(-149) (Smallest positive value)`
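The bit patterns listed above can be reproduced with Python's standard struct module, which packs a float into its IEEE 754 single precision bytes. The helper name single_fields is an assumption made for this sketch; it simply prints the word in the same "S E F" grouping used on this page.

```python
import struct

def single_fields(x):
    """Return 'S EEEEEEEE FFF...F' for x stored as an IEEE 754 single."""
    # Pack as a big-endian single, then reinterpret the same bytes as an integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    b = format(bits, "032b")
    return f"{b[0]} {b[1:9]} {b[9:]}"

print(single_fields(6.5))    # 0 10000001 10100000000000000000000
print(single_fields(-6.5))   # 1 10000001 10100000000000000000000
print(single_fields(2.0))    # 0 10000000 00000000000000000000000
```

Values that are not exactly representable in 32 bits are rounded by struct.pack, so the pattern shown is that of the nearest single precision value.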
		46
		47	`!Double Precision`
		48
		49	`The IEEE double precision floating point standard representation requires a 64-bit word, whose bits may be numbered from 0 to 63, left to right. The first bit is the sign bit, 'S', the next eleven bits are the exponent bits, 'E', and the final 52 bits are the fraction, 'F':`
		50
		51	`S EEEEEEEEEEE FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF`
		52	`0 1        11 12                                                63`
		53
		54	`The value V represented by the word may be determined as follows:`
		55
		56	`* If E=2047 and F is nonzero, then V=NaN ("Not a number")`
		57	`* If E=2047 and F is zero and S is 1, then V=-Infinity`
		58	`* If E=2047 and F is zero and S is 0, then V=Infinity`
		59	`* If 0<E<2047 then V=(-1)**S * 2 ** (E-1023) * (1.F) where "1.F" is intended to represent the binary number created by prefixing F with an implicit leading 1 and a binary point.`
		60	`* If E=0 and F is nonzero, then V=(-1)**S * 2 ** (-1022) * (0.F). These are "unnormalized" (also called denormalized) values.`
		61	`* If E=0 and F is zero and S is 1, then V=-0`
		62	`* If E=0 and F is zero and S is 0, then V=0`
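The double precision rules can be sketched the same way and checked against Python's own floats, which are IEEE 754 doubles on common platforms. The names decode_double and double_bits are invented for this example; struct is used only to obtain the 64-bit pattern of an existing float.

```python
import math
import struct

def decode_double(bits):
    """Decode a 64-bit integer as an IEEE 754 double, following the rules above."""
    s = (bits >> 63) & 0x1        # sign bit S
    e = (bits >> 52) & 0x7FF      # eleven exponent bits E
    f = bits & ((1 << 52) - 1)    # 52 fraction bits F
    sign = -1.0 if s else 1.0
    if e == 2047:
        # All-ones exponent: NaN if F is nonzero, otherwise signed infinity.
        return math.nan if f else sign * math.inf
    if e == 0:
        # Zero or unnormalized value: 0.F * 2**(-1022).
        return sign * (f / 2**52) * 2.0**-1022
    # Normalized value: 1.F * 2**(E-1023).
    return sign * (1 + f / 2**52) * 2.0**(e - 1023)

def double_bits(x):
    """Return the 64-bit pattern of a Python float as an integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

# Round trip: decoding the bit pattern of a float gives the float back.
for value in (6.5, -6.5, 0.1, 5e-324):
    assert decode_double(double_bits(value)) == value
```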

Last edited on Sunday, August 10, 2003 5:42:22 pm by CraigBox
