Statistics

To understand one of the key ideas of statistics, consider the following.

 

[Figure: black dots and red dots scattered around a blue best-fit line on x-y axes]

Here we have a coordinate system with x and y axes (black lines), on which we draw black dots. Let us look at these black dots. The blue line is the best-fit line through them. Some dots lie above the line and some below, and their signed vertical distances from the line average to zero. Once we have drawn this line, we can use it to predict where a new dot is likely to fall: we pick a value on the x-axis and read off the probable value of the dot on the y-axis.
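
For readers who want to try this on a computer, here is a small sketch of fitting a line and predicting from it. It assumes the best-fit line is the usual least-squares line, which the page does not state explicitly. The dot coordinates and the name fit_line are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the dots."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up "black dots", for illustration only.
black_x = [1, 2, 3, 4, 5]
black_y = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept = fit_line(black_x, black_y)

# Signed vertical distance of each dot from the line (positive = above).
residuals = [y - (slope * x + intercept) for x, y in zip(black_x, black_y)]
mean_residual = sum(residuals) / len(residuals)

# Prediction: go to x = 6 and read off the probable y value.
predicted_at_6 = slope * 6 + intercept

print("slope:", slope, "intercept:", intercept)
print("average signed distance:", mean_residual)
print("predicted y at x = 6:", predicted_at_6)
```

The average signed distance comes out as zero, apart from tiny rounding error, because the dots above the line cancel the dots below it.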

 

Now let us look at the red dots. The blue line is also their best-fit line. Yet there is something different about the red dots: they are spread out further from the line. The way we measure this is to take the average of the squares of the distances and then take the square root. This way dots above and below the line both contribute and do not cancel, because squares are never negative. This is called the standard deviation (here, of the distances from the line). The standard deviation of the red dots is larger than that of the black dots. It also means that when we predict a value for the red dots, we should expect a larger error.
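
The calculation just described can be sketched as follows. The two sets of dots are made up, and the names fit_line and spread_about_line are hypothetical helpers. The sketch divides by the number of dots, matching "the average of the squares" in the text; statistics packages often divide by a slightly smaller number instead, which gives a somewhat larger result.

```python
import math


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the dots."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def spread_about_line(xs, ys):
    """Square root of the average squared vertical distance from the best-fit line."""
    slope, intercept = fit_line(xs, ys)
    squared = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up dots, for illustration only.
xs = [1, 2, 3, 4, 5]
black_y = [2.1, 3.9, 6.2, 7.8, 10.0]  # close to the line
red_y = [3.5, 2.0, 8.0, 5.5, 11.0]    # similar trend, more scatter

black_sd = spread_about_line(xs, black_y)
red_sd = spread_about_line(xs, red_y)

print("black dots:", black_sd)
print("red dots:", red_sd)
```

With these numbers the red dots give a much larger value than the black dots, which is the difference we can see by eye in the picture.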

 

For more examples, see Percentages and blonde girls.

 

Students must be told that statistics is very different from other subjects. It is not possible to take simple examples that can be calculated with pencil and paper. Students have to spend time at home thinking the ideas through.