Let us say that one has some function of measurable quantities. For each of those measurable quantities you have decided that your measurement has some uncertainty. Furthermore, and this is crucial, you believe that those uncertainties are independent. That is, if you measure two quantities, call them x and y, and you believe that the uncertainty in x is Dx and in y is Dy, you also believe that the uncertainty in y does not depend on that in x. In other words, in making the measurement you do not believe that if you find a value of x that is "too big", the value of y will also tend to be "too big"; there is nothing in the measurements which makes the errors in x and y related in their sign to each other.
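As a small sketch of what independence means in practice (the uniform error interval here is an assumed toy choice, not anything from a real measurement): if the errors dx and dy are independent, their sample covariance should be close to zero, i.e. the sign of dx tells you nothing about the sign of dy.

```python
import random

random.seed(0)

# Simulate many independent measurement errors dx and dy,
# each uniform on [-D, D] (an assumed toy uncertainty interval).
N = 100000
dx = [random.uniform(-1.0, 1.0) for _ in range(N)]
dy = [random.uniform(-1.0, 1.0) for _ in range(N)]

# Sample covariance: for independent errors it should be close to zero.
mean_dx = sum(dx) / N
mean_dy = sum(dy) / N
cov = sum((a - mean_dx) * (b - mean_dy) for a, b in zip(dx, dy)) / N
print(abs(cov) < 0.01)  # near zero: knowing dx says nothing about dy
```

If instead dy were generated from dx (say dy = dx plus a little noise), the covariance would be far from zero and the combination rule below would no longer apply.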

Now consider some measurement in which x is measured to have value x0+dx, and y to have value y0+dy, where dx lies somewhere in the interval from -Dx to Dx, and similarly for dy. (I take x0 and y0 to be the "true values" that you would get if you did an infinitely precise measurement. Of course you have no idea what x0 or y0 are.) What you are trying to determine is the value of some function f(x,y) of the two variables; with your measured values you would calculate f(x0+dx, y0+dy).

I can write this as

f(x0+dx, y0+dy) = f(x0,y0) + ((f(x0+dx,y0+dy) - f(x0,y0+dy))/dx) dx + ((f(x0,y0+dy) - f(x0,y0))/dy) dy

From the definition of a derivative, the limit as dx -> 0 of (f(x0+dx,y0+dy) - f(x0,y0+dy))/dx is the derivative of f(x,y) with respect to x evaluated at x = x0, y = y0+dy. From what the limit means, if dx is small, then (f(x0+dx,y0+dy) - f(x0,y0+dy))/dx is approximately df(x0,y0+dy)/dx. Also, since lim dy -> 0 of df(x0,y0+dy)/dx = df(x0,y0)/dx, we can write the above value as

f(x0+dx, y0+dy) = f(x0,y0) + (df(x0,y0)/dx) dx + (df(x0,y0)/dy) dy

This is not exact, but it should be a pretty good approximation if dx and dy are very small, and thus the value of f should be pretty well approximated by this expression.

Now, in combining our uncertainties, if we took dx in the above to be its estimated magnitude Dx, and dy to be Dy, we would typically be overestimating the uncertainty in f, since sometimes dx would be positive and sometimes negative, and similarly for dy. Thus although sometimes the errors in f would add, sometimes they would partially cancel. If the uncertainties in x and y are independent as above, one takes this partial cancellation into account by taking the square root of the sum of squares:

Df = sqrt( (df/dx)^2 Dx^2 + (df/dy)^2 Dy^2 )
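The combination rule above can be sketched numerically. The function f(x,y) = x*y and the measured values and uncertainties here are made up purely for illustration; the partial derivatives are estimated by small finite differences so the same code works for any smooth f.

```python
import math

def f(x, y):
    # Hypothetical example function; any smooth f(x, y) would do.
    return x * y

# Made-up measured values and their estimated uncertainties.
x, Dx = 10.0, 0.1
y, Dy = 5.0, 0.2

# Numerical partial derivatives via small central differences.
h = 1e-6
df_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
df_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)

# Square root of the sum of squares:
# Df = sqrt( (df/dx)^2 Dx^2 + (df/dy)^2 Dy^2 )
Df = math.sqrt((df_dx * Dx) ** 2 + (df_dy * Dy) ** 2)
print(Df)  # sqrt((5*0.1)^2 + (10*0.2)^2) = sqrt(4.25) ≈ 2.06
```

Note that this is smaller than the naive worst-case estimate |df/dx| Dx + |df/dy| Dy = 0.5 + 2.0 = 2.5, reflecting the partial cancellation of independent errors.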

If we define our uncertainty by the average square uncertainty, and one assumes that the distribution of the uncertainties is Gaussian, then this is a good approximation to the "right" value. It is anyway what tends to be used in the literature, even at times when it is not a good approximation to the "right" value. (Taleb has written a whole book, The Black Swan, on cases where this is used in situations where it is inappropriate.)
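One can check this claim with a simulation. Under the assumption of independent Gaussian errors (the function and numbers below are made-up illustrations), the spread of f over many simulated measurements should agree closely with the square-root-of-sum-of-squares formula:

```python
import math
import random

random.seed(1)

def f(x, y):
    # Hypothetical example function for illustration.
    return x * y

# Made-up true values and Gaussian uncertainties.
x0, Dx = 10.0, 0.1
y0, Dy = 5.0, 0.2

# Prediction from Df = sqrt((df/dx)^2 Dx^2 + (df/dy)^2 Dy^2),
# using df/dx = y0 and df/dy = x0 for this particular f.
Df_quad = math.sqrt((y0 * Dx) ** 2 + (x0 * Dy) ** 2)

# Monte Carlo: draw independent Gaussian errors, look at the spread of f.
N = 200000
vals = [f(x0 + random.gauss(0, Dx), y0 + random.gauss(0, Dy)) for _ in range(N)]
mean = sum(vals) / N
std = math.sqrt(sum((v - mean) ** 2 for v in vals) / N)

print(Df_quad, std)  # the two agree to within about a percent
```

With heavy-tailed (non-Gaussian) error distributions, the simulated spread can differ badly from the formula, which is exactly the kind of failure the parenthetical remark warns about.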