# CSE 245 Lecture Notes

CSE 246: Computer Arithmetic Algorithms and Hardware Design, Fall 2006

Lecture 9: Floating Point Numbers

Instructor: Prof. Chung-Kuan Cheng

## Motivation

- Maximal information with a given number of bits.
- Arithmetic with proper precision.
- Fairness of rounding.
- These features come at the expense of the complexity of the operations.

## Topics: Floating Point Numbers (IEEE P754)

- Standard
- Operations
- Exceptional Situations
- Rounding Modes

Reference: Michael L. Overton, *Numerical Computing with IEEE Floating Point Arithmetic*, SIAM.

## Standard

- Typically $2^{32}$ bit patterns are available.
- Goal: dynamic range = largest number / smallest number.
- If the dynamic range is too large, there are holes between numbers.

ulp (unit in the last place): the difference between two consecutive values of the significand.

A number has three parts, $x = \pm s \times b^{e}$ (sign, significand, exponent):

- Sign bit
- 23-bit significand
- 8-bit exponent

Bit layout: $\pm \; e_1 e_2 e_3 e_4 e_5 e_6 e_7 e_8 \; s_1 s_2 s_3 \dots s_{22} s_{23}$

- $1.s_1 s_2 s_3 \dots s_{22} s_{23}$: normalized number
- $0.s_1 s_2 s_3 \dots s_{22} s_{23}$: denormalized number

| $e_1 e_2 \dots e_8$ | Decimal | Value |
|---|---|---|
| 00000000 | 0 | $x = \pm 0.s_1 s_2 s_3 \dots s_{22} s_{23} \times 2^{-126}$ |
| 00000001 | 1 | $x = \pm 1.s_1 s_2 s_3 \dots s_{22} s_{23} \times 2^{-126}$ |
| 00000010 | 2 | $x = \pm 1.s_1 s_2 s_3 \dots s_{22} s_{23} \times 2^{-125}$ |
| ... | ... | ... |
| 01111111 | 127 | $x = \pm 1.s_1 s_2 s_3 \dots s_{22} s_{23} \times 2^{0}$ |
| ... | ... | ... |
| 11111101 | 253 | $x = \pm 1.s_1 s_2 s_3 \dots s_{22} s_{23} \times 2^{126}$ |
| 11111110 | 254 | $x = \pm 1.s_1 s_2 s_3 \dots s_{22} s_{23} \times 2^{127}$ |
| 11111111 | 255 | $x = \pm\text{Inf}$ if $(s_1 \dots s_{23}) = 0$, NaN otherwise |

NaN: Not a Number.

Normalization: $0.01 \times 2^{-3} = 0.001 \times 2^{-2}$ is the same number written two ways, so normalize to remove the redundancy. Use a default 1 in front for one more bit of precision.

Smallest number: $0.00\dots01 \times 2^{-126} = 1.0 \times 2^{-23} \times 2^{-126} = 1 \times 2^{-149}$
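The exponent-field cases above can be turned into a small decoder. The following is a minimal sketch, not part of the lecture: the function names `decode_single` and `hardware_single` are hypothetical, and the comparison assumes Python's `struct` module packs the `f` format as an IEEE single, which is the case on common platforms.

```python
import struct


def decode_single(bits: int) -> float:
    """Decode a 32-bit pattern by following the exponent-field cases."""
    sign = -1.0 if (bits >> 31) & 1 else 1.0
    exp_field = (bits >> 23) & 0xFF      # e1..e8 as an integer 0..255
    frac = bits & 0x7FFFFF               # s1..s23 as an integer

    if exp_field == 255:                 # Inf if the significand is 0, NaN otherwise
        return sign * float("inf") if frac == 0 else float("nan")
    if exp_field == 0:                   # denormalized: 0.s1...s23 x 2^-126
        return sign * (frac / 2**23) * 2.0**-126
    # normalized: 1.s1...s23 x 2^(exp_field - 127)
    return sign * (1 + frac / 2**23) * 2.0 ** (exp_field - 127)


def hardware_single(bits: int) -> float:
    """Reinterpret the same 32 bits as a single via struct, for comparison."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


# Smallest positive number: only s23 is set, exponent field is 0.
smallest = decode_single(0x00000001)
print(smallest == 2.0**-149)
print(decode_single(0x00800000) == hardware_single(0x00800000))
```

The decoder works in double precision, which represents every single-precision value exactly, so the comparison with `hardware_single` is exact rather than approximate.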

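## Standard: Example

| Sign | Exponent | Significand | Value |
|---|---|---|---|
| 0 | 00000000 | 00000000000000000000000 | $0.00\dots00 \times 2^{-126}$ |
| 1 | 00000000 | 00000000000000000000000 | $-0.00\dots00 \times 2^{-126}$ |
| 0 | 00000000 | 00000000000000000000001 | $0.00\dots01 \times 2^{-126} = 2^{-149}$ |
| 0 | 00000001 | 00000000000000000000000 | $1.00\dots00 \times 2^{-126}$ (normalized minimum) |
| 0 | 00000001 | 00000000000000000000001 | $1.00\dots01 \times 2^{-126}$ |
| ... | ... | ... | ... |
| 0 | 01111111 | 00000000000000000000000 | $1.00\dots00 \times 2^{0}$ |
| 0 | 01111111 | 00000000000000000000001 | $1.00\dots01 \times 2^{0}$ |
| 0 | 10000000 | 00000000000000000000001 | $1.00\dots01 \times 2^{1}$ |
| ... | ... | ... | ... |
| 0 | 11111110 | 00000000000000000000000 | $1.00\dots00 \times 2^{127}$ |
| 0 | 11111110 | 00000000000000000000001 | $1.00\dots01 \times 2^{127}$ |
| 0 | 11111110 | 11111111111111111111111 | $1.11\dots11 \times 2^{127}$ (normalized maximum) |
| 0 | 11111111 | 00000000000000000000000 | Inf |

$N_{min} = 1.0 \times 2^{-126}$

$N_{max} = (2 - 2^{-23}) \times 2^{127}$

## Double Floating Point

Bit layout: $\pm \; e_1 e_2 \dots e_{11} \; s_1 s_2 \dots s_{52}$

| Sign | $e_1 e_2 \dots e_{11}$ | Value |
|---|---|---|
| 0 | 00...000 | $x = 0.s_1 s_2 \dots s_{52} \times 2^{-1022}$ |
| 0 | 00...001 | $x = 1.s_1 s_2 \dots s_{52} \times 2^{-1022}$ |
| ... | ... | ... |
| 0 | 01...111 | $x = 1.s_1 s_2 \dots s_{52} \times 2^{0}$ |
| 0 | 10...000 | $x = 1.s_1 s_2 \dots s_{52} \times 2^{1}$ |
| ... | ... | ... |
| 0 | 11...110 | $x = 1.s_1 s_2 \dots s_{52} \times 2^{1023}$ |
| 0 | 11...111 | $x = \text{Inf}$ if $(s_1 \dots s_{52}) = 0$ |

## Overflow/Underflow

[Figure: number line marked at $N_{min}$ and $N_{max}$. Magnitudes below $N_{min}$ underflow and magnitudes above $N_{max}$ overflow. Representable numbers are denser near $N_{min}$ and sparser near $N_{max}$.]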
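The "denser" and "sparser" labels can be checked numerically by measuring the gap between neighboring single-precision values at different magnitudes. This is a minimal sketch under the assumption that `struct` packs `f` as an IEEE single; the helper names `single_from_bits` and `gap_above` are hypothetical.

```python
import struct


def single_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit pattern as a single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def gap_above(bits: int) -> float:
    """Distance from the single with pattern `bits` to the next larger single."""
    return single_from_bits(bits + 1) - single_from_bits(bits)


N_MIN_BITS = 0x00800000   # exponent field 1, significand 0
ONE_BITS = 0x3F800000     # exponent field 127, significand 0
N_MAX_BITS = 0x7F7FFFFF   # exponent field 254, significand all ones

n_min = single_from_bits(N_MIN_BITS)
n_max = single_from_bits(N_MAX_BITS)

gap_near_min = gap_above(N_MIN_BITS)                      # spacing just above Nmin
gap_near_one = gap_above(ONE_BITS)                        # spacing just above 1.0
gap_near_max = n_max - single_from_bits(N_MAX_BITS - 1)   # spacing just below Nmax

print(n_min, n_max)
print(gap_near_min, gap_near_one, gap_near_max)
```

The gap scales with the exponent, so the same 23-bit significand step is tiny near $N_{min}$ and very large near $N_{max}$.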

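## Addition/Multiplication

Addition aligns the operand with the smaller exponent to the larger exponent $e_1$:

$$
\begin{aligned}
(\pm s_1 \times b^{e_1}) + (\pm s_2 \times b^{e_2})
&= (\pm s_1 \times b^{e_1}) + \left(\pm \frac{s_2}{b^{e_1 - e_2}}\right) \times b^{e_1} \\
&= \left(\pm s_1 \pm \frac{s_2}{b^{e_1 - e_2}}\right) \times b^{e_1} \\
&= \pm s \times b^{e}
\end{aligned}
$$

Multiplication multiplies the significands and adds the exponents:

$$
(\pm s_1 \times b^{e_1}) \times (\pm s_2 \times b^{e_2}) = \pm (s_1 \times s_2) \times b^{e_1 + e_2}
$$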
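The two identities above can be exercised with exact rational arithmetic. This is an illustrative sketch only: `fp_add`, `fp_mul`, and `normalize` are hypothetical helper names, significands are exact `Fraction` values rather than fixed-width bit fields, and no rounding is performed.

```python
from fractions import Fraction


def fp_add(s1, e1, s2, e2, b=2):
    """Add (s1 * b^e1) and (s2 * b^e2) by aligning to the larger exponent."""
    if e1 < e2:                                  # make operand 1 the larger exponent
        s1, e1, s2, e2 = s2, e2, s1, e1
    s = s1 + s2 / Fraction(b) ** (e1 - e2)       # shift s2 right by e1 - e2 digits
    return s, e1


def fp_mul(s1, e1, s2, e2):
    """Multiply significands and add exponents."""
    return s1 * s2, e1 + e2


def normalize(s, e, b=2):
    """Shift the significand until 1 <= |s| < b, adjusting the exponent."""
    if s == 0:
        return s, e
    while abs(s) >= b:
        s, e = s / b, e + 1
    while abs(s) < 1:
        s, e = s * b, e - 1
    return s, e


# 1.1 (binary) x 2^3 plus 1.1 (binary) x 2^1, i.e. 12 + 3
total = normalize(*fp_add(Fraction(3, 2), 3, Fraction(3, 2), 1))
product = normalize(*fp_mul(Fraction(3, 2), 3, Fraction(3, 2), 1))
print(total, product)
```

Signed significands work unchanged because the sign is carried inside `s`. A hardware adder would keep only a fixed number of significand bits after the shift, which is where rounding enters.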

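## Exceptions

- $a / 0 = \text{Inf}$ if $a > 0$
- $a / \text{Inf} = 0$ if $a \neq 0$
- $a \times 0 = 0$
- $a \times \text{Inf} = \text{Inf}$ if $a > 0$
- $a + \text{Inf} = \text{Inf}$
- $0 \times \text{Inf}$ = invalid operation (NaN)
- $0 / 0$ = invalid operation (NaN)
- $\text{Inf} - \text{Inf} = \text{NaN}$
- $\text{NaN} \;\text{op}\; a = \text{NaN}$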
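Most of these rules can be observed directly with Python floats, which are IEEE doubles on common platforms (an assumption of this sketch). One difference to note: Python raises `ZeroDivisionError` for float division by zero instead of returning Inf or NaN, so the two division-by-zero rules are shown through the exception rather than a result.

```python
import math

inf = math.inf
nan = math.nan
a = 3.0  # an arbitrary positive finite value chosen for illustration

results = {
    "a / Inf": a / inf,
    "a * 0": a * 0.0,
    "a * Inf": a * inf,
    "a + Inf": a + inf,
    "0 * Inf": 0.0 * inf,      # invalid operation
    "Inf - Inf": inf - inf,    # invalid operation
    "NaN + a": nan + a,        # NaN propagates through arithmetic
}

# Python traps division by zero rather than following the IEEE default result.
try:
    results["a / 0"] = a / 0.0
except ZeroDivisionError:
    results["a / 0"] = "ZeroDivisionError"

for name, value in results.items():
    print(f"{name:10s} -> {value}")
```

NaN is the only value that compares unequal to itself, which is why `math.isnan` (or `x != x`) is used to test for it instead of `==`.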

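## Rounding Mode

Adder output: $c_{out} \; z_1 z_0 . z_{-1} z_{-2} \dots z_{-l} \; G \; R \; S$

- G: guard bit
- R: round bit
- S: sticky bit, the OR of all bits below bit R

Addition example:

$$
1.101 \times 2^{3} + 1.110 \times 2^{3} = 11.011 \times 2^{3} = 1.1011 \times 2^{4}
$$

After normalization the result has one more bit than the operands, so it needs to be rounded.

## Rounding

Subtraction with equal exponents, followed by normalization:

$$
1.110 \times 2^{3} - 1.101 \times 2^{3} = 0.001 \times 2^{3} = 1.000 \times 2^{0}
$$

Subtraction with different exponents. Aligning the smaller operand shifts one bit past the last place, and the guard bit holds it so that it survives normalization:

$$
1.101 \times 2^{3} - 1.101 \times 2^{2} = 1.101 \times 2^{3} - 0.1101 \times 2^{3} = 0.1101 \times 2^{3} = 1.101 \times 2^{2}
$$

Rounding $1.10111$ to four fractional bits under each mode:

| Mode | Result |
|---|---|
| Round to the nearest even | $1.1100$ (a tie, so the even neighbor is chosen) |
| Toward 0 | $1.1011$ |
| Toward +Inf | $1.1100$ |
| Toward -Inf | $1.1011$ |

## Conventional Rounding Error

| Value | Rounded | Error (ulp) |
|---|---|---|
| 1.10100 | 1.101 | 0 |
| 1.10101 | 1.101 | -0.25 |
| 1.10110 | 1.110 | +0.5 |
| 1.10111 | 1.110 | +0.25 |

Average error = 0.5 / 4 = 0.125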
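The rounding modes and the average-error calculation can be reproduced with exact fractions. This is a sketch under stated assumptions: `from_binary`, `round_binary`, and `average_error_ulps` are hypothetical names, "conventional" is taken to mean round-half-up, and the eight-value comparison at the end extends the slide's four values to show why ties-to-even is the fairer choice.

```python
from fractions import Fraction
import math


def from_binary(text: str) -> Fraction:
    """Parse a binary string such as '1.10111' into an exact fraction."""
    whole, frac = text.split(".")
    return Fraction(int(whole + frac, 2), 2 ** len(frac))


def round_binary(x: Fraction, frac_bits: int, mode: str) -> Fraction:
    """Round x to frac_bits fractional bits under the given mode."""
    scaled = x * 2**frac_bits            # x measured in units of one ulp
    lo = math.floor(scaled)
    rem = scaled - lo                    # discarded part, in [0, 1)
    half = Fraction(1, 2)
    if mode == "toward_zero":
        q = lo if scaled >= 0 else math.ceil(scaled)
    elif mode == "toward_pos_inf":
        q = math.ceil(scaled)
    elif mode == "toward_neg_inf":
        q = lo
    elif mode == "conventional":         # assumption: ties always round up
        q = lo + (1 if rem >= half else 0)
    elif mode == "nearest_even":         # ties go to the even neighbor
        if rem != half:
            q = lo + (1 if rem > half else 0)
        else:
            q = lo if lo % 2 == 0 else lo + 1
    else:
        raise ValueError(f"unknown mode: {mode}")
    return Fraction(q, 2**frac_bits)


def average_error_ulps(values, frac_bits, mode):
    """Mean signed error (rounded minus exact), in ulps of the result."""
    errors = [(round_binary(v, frac_bits, mode) - v) * 2**frac_bits for v in values]
    return sum(errors) / len(errors)


slide_values = [from_binary(t) for t in ("1.10100", "1.10101", "1.10110", "1.10111")]
print(average_error_ulps(slide_values, 3, "conventional"))

# Eight consecutive values, so one tie rounds up and the other rounds down under ties-to-even.
wider = [Fraction(n, 32) for n in range(52, 60)]   # 1.10100 through 1.11011
print(average_error_ulps(wider, 3, "conventional"), average_error_ulps(wider, 3, "nearest_even"))
```

Over the eight-value range the conventional rule keeps its positive bias, while the ties-to-even errors cancel, which is the "fairness of rounding" listed in the motivation.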
