{"items": [{"author": "Edward", "source_link": "https://www.facebook.com/jefftk/posts/145102848912347?comment_id=145113248911307", "anchor": "fb-145113248911307", "service": "fb", "text": "Jeff: that's the variance. The standard deviation is sqrt( mean( (x-mu)^2 )). A couple of reasons to square: pythagoras' theorem generalized, aka Euclidean distance, says that the sum of squares is a natural measure of distance in a variety of situations. This makes some statistical inferences, such as linear models, quite easy to implement. Also for a range of common probability distributions, most prominently the normal distribution/gaussian distribution/bell curve, the mean and variance most easily describe the distribution. However, there are some situations where |x-mu| is more useful. For example, the variance can very sensitive to a handful of outliers for which (x-mu) is large, because you square them, but |x-mu| is less sensitive to these. There is an area called 'robust statistics' dedicated to similar ideas.", "timestamp": "1314878442"}, {"author": "Peter", "source_link": "https://www.facebook.com/jefftk/posts/145102848912347?comment_id=145118892244076", "anchor": "fb-145118892244076", "service": "fb", "text": "http://en.wikipedia.org/wiki/Moment_generating_function", "timestamp": "1314879500"}, {"author": "Peter", "source_link": "https://www.facebook.com/jefftk/posts/145102848912347?comment_id=145119072244058", "anchor": "fb-145119072244058", "service": "fb", "text": "Variance is just the 2nd central moment.", "timestamp": "1314879532"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/145102848912347?comment_id=145121432243822", "anchor": "fb-145121432243822", "service": "fb", "text": "@Edward: I just fixed it; I'd forgotten the square root", "timestamp": "1314880001"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/145102848912347?comment_id=145123532243612", "anchor": "fb-145123532243612", 
"service": "fb", "text": "@Edward: thinking about euclidian distance.  You're saying we can think of our N data points as N dimensions.  So we have an N-tuple: (x_1-mu, x_2-mu, ..., x_N-mu).  The distance between that N-tuple and the origin would be the standard deviation.  Which means the origin represents no deviation.  Does that work?", "timestamp": "1314880383"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/145102848912347?comment_id=145182518904380", "anchor": "fb-145182518904380", "service": "fb", "text": "@Edward: I had that a little wrong.  Actually there are two tuples: the data tuple (x_1, x_2, x_3, ..., x_N) and the imaginary tuple representing everything being at the mean: (mu, mu, .... , mu).  Then the standard deviation is the euclidian distance between the data tuple and the mean tuple.", "timestamp": "1314890024"}, {"author": "David&nbsp;Chudzicki", "source_link": "https://plus.google.com/106120852580068301475", "anchor": "gp-1314921491381", "service": "gp", "text": "Edward's answer on FB is good, and your distance-minimizing formulation seems right, though I haven't seen before.\n<br>\n<br>\nA related question is about \"mean\" vs. \"median\" (say you're summarizing a set of 1-dimensional measurement): It's nice to think of mean and median in terms of what they each accomplish -- median minimizes the sum of the absolute values of the differences (sum of |x_i  -  mu| -- but actually the solution isn't unique if the number of data points is even), where mean minimizes the sum of the \nsquared\n differences (x_i - mu)^2.\n<br>\n<br>\nIf your process is unknown to be drawing from a normally distributed random variable with unknown mean  MU  , then the above is part of why the mean is the maximum likelihood estimate for   MU.  
The squaring penalizes outliers extra, because with a normal distribution, outliers are assumed to be very unlikely.", "timestamp": 1314921491}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://plus.google.com/103013777355236494008", "anchor": "gp-1314930955351", "service": "gp", "text": "@David&nbsp;Chudzicki\n \"median minimizes the sum of the absolute values of the differences\"\n<br>\n<br>\nReally?  I hadn't thought of that.  But with some playing around with numbers it does seem to be true.", "timestamp": 1314930955}, {"author": "David&nbsp;Chudzicki", "source_link": "https://plus.google.com/106120852580068301475", "anchor": "gp-1314932141233", "service": "gp", "text": "Yeah -- I guess you can think of it like this: moving left one small unit increases the distance to all points on the right (by that same amount, one small unit), and moving right increases the distance to all points to the left. So if you're trying to minimize the sum of the distances, moving one way or the other will help, \nunless\n you have equal numbers of points on both sides.  (So if you have an even number of points, anywhere between the middle two points is equally good. By convention, I guess we adopt the midpoint between them.)", "timestamp": 1314932141}, {"author": "David&nbsp;Chudzicki", "source_link": "https://plus.google.com/106120852580068301475", "anchor": "gp-1314932409032", "service": "gp", "text": "On the original question, it probably works better to ask \"Why is variance a good concept?\" The answer will have to do with all sorts of nice mathematical properties. 
And then we often take the square root, to have something that's in the same units as what we originally cared about.\n<br>\n<br>\nBut I'll admit that your suggestions might be more intuitive/useful for some of the applications that we use standard deviations for...\n<br>\n<br>\nAlso, note that variance and standard deviation are, to statisticians, more naturally properties of probability distributions than of a set of points. (Your concept could be defined that way too.) We're often trying to estimate the variance of a distribution from a finite sample.", "timestamp": 1314932409}, {"author": "Edward", "source_link": "https://www.facebook.com/jefftk/posts/145102848912347?comment_id=145458695543429", "anchor": "fb-145458695543429", "service": "fb", "text": "\"Then the standard deviation is the euclidian distance between the data tuple and the mean tuple.\" exactly. Which isn't to say it's always the best thing to measure, but is an argument for thinking about it. This is also how least squares fitting works, an argument I learnt teaching college linear algebra.", "timestamp": "1314932508"}]}