Here, we focus on binary number representation only.
To store a real number (a number with a fractional component), there are two major approaches: fixed-point representation and floating-point representation.
First, what is a binary point?
A binary point is like the decimal point in a decimal system. It acts as a divider between the integer and the fractional part of a number.
The bit immediately to the left of the binary point is the coefficient of the term 2^0 = 1. Moving left from the binary point, the bits carry weights of 2^0, 2^1, 2^2, and so on. Moving right from the binary point, the bits carry weights of 2^-1, 2^-2, 2^-3, and so on.
For example, the number 11010.1 (base 2) =
1 * 2^4 + 1 * 2^3 + 0 * 2^2 + 1 * 2^1 + 0 * 2^0 + 1 * 2^-1
= 16 + 8 + 2 + 0.5
= 26.5
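The positional weights above can be checked with a short Python sketch. The function name binary_to_decimal is made up for this illustration, and the code assumes the input contains only the characters 0, 1, and at most one point.

```python
def binary_to_decimal(bits: str) -> float:
    """Evaluate a binary numeral such as '11010.1' by summing bit weights."""
    # Split the numeral at the binary point; fraction_part is '' if there is none.
    integer_part, _, fraction_part = bits.partition(".")

    value = 0.0
    # Bits to the left of the point carry weights 2^0, 2^1, 2^2, ...
    for position, bit in enumerate(reversed(integer_part)):
        value += int(bit) * 2 ** position
    # Bits to the right of the point carry weights 2^-1, 2^-2, 2^-3, ...
    for position, bit in enumerate(fraction_part, start=1):
        value += int(bit) * 2 ** -position
    return value


print(binary_to_decimal("11010.1"))  # 26.5
```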
Fixed point
This representation implicitly fixes the binary point at a predetermined position within the numeral. Both the integer part and the fractional part therefore have a fixed number of bits.
Negative fixed-point numbers are not hard to represent as long as we use two's complement: the bit pattern is treated as a signed integer, and the position of the binary point stays the same.
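As a rough sketch of how this might look in practice, the Python code below stores values in an 8-bit word with 4 fractional bits. That split, along with the names TOTAL_BITS, FRACTION_BITS, to_fixed, and from_fixed, is an assumption chosen for illustration and does not come from any particular standard.

```python
TOTAL_BITS = 8      # assumed word size for this example
FRACTION_BITS = 4   # assumed number of bits to the right of the binary point


def to_fixed(value: float) -> int:
    """Encode value as an 8-bit two's complement fixed-point bit pattern."""
    scaled = round(value * (1 << FRACTION_BITS))  # shift the binary point right
    lowest = -(1 << (TOTAL_BITS - 1))
    highest = (1 << (TOTAL_BITS - 1)) - 1
    if not lowest <= scaled <= highest:
        raise OverflowError(f"{value} does not fit in the chosen format")
    # Masking a negative integer yields its two's complement bit pattern.
    return scaled & ((1 << TOTAL_BITS) - 1)


def from_fixed(pattern: int) -> float:
    """Decode an 8-bit two's complement fixed-point bit pattern."""
    if pattern & (1 << (TOTAL_BITS - 1)):  # sign bit set, so the value is negative
        pattern -= 1 << TOTAL_BITS
    return pattern / (1 << FRACTION_BITS)  # shift the binary point back


print(format(to_fixed(2.5), "08b"))   # 00101000
print(format(to_fixed(-2.5), "08b"))  # 11011000
print(from_fixed(to_fixed(-2.5)))     # -2.5
```

With only 4 fractional bits, values are rounded to the nearest multiple of 1/16, which is the usual trade-off of a fixed-point format: a small, fixed range and resolution in exchange for plain integer arithmetic.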
Floating point
This representation reserves a certain number of bits for the digits of the number (called the mantissa or significand) and a certain number of bits to indicate where within that number the binary point sits (called the exponent).
In the 32-bit format (IEEE 754 single precision), a floating-point number contains three parts:
– Sign (1 bit)
– Exponent (8 bits)
– Mantissa (23 bits), which can be interpreted as a fraction or as an integer
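To see the three fields in an actual value, the sketch below splits a number into sign, exponent, and mantissa using Python's standard struct module. It assumes the 32-bit layout is IEEE 754 single precision, which is what the struct format code "f" produces; the function name decompose_float32 is made up for this example.

```python
import struct


def decompose_float32(value: float) -> tuple[int, int, int]:
    """Return the (sign, exponent, mantissa) bit fields of a 32-bit float."""
    # Pack as a big-endian 32-bit float, then reread the same bytes as an integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, stored with a bias of 127
    mantissa = bits & 0x7FFFFF       # 23 bits
    return sign, exponent, mantissa


sign, exponent, mantissa = decompose_float32(26.5)
print(sign, exponent, format(mantissa, "023b"))
# 0 131 10101000000000000000000
```

Here 26.5 is 11010.1 in binary, or 1.10101 * 2^4 once the binary point is moved next to the leading 1. The stored exponent is 4 + 127 = 131, and the mantissa holds the bits after the leading 1, which is not stored for normalized numbers in this format.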