Microsoft Excel spreadsheets have become something of a standard for data storage, at least for smaller data sets. This, along with the program often being packaged with new computers, naturally encourages its use for statistical analyses. This is unfortunate, since Excel is most decidedly not a statistical package.

Now, here are the values obtained when using the regression program available in the Analysis ToolPak of Microsoft Excel 2002 (the same results came from earlier versions of Excel; I will say something about Excel 2003 later).

Each of the nine numbers given above is incorrect! The slope estimate has the wrong sign, the estimated standard errors of the coefficients are zero (making it impossible to construct t-statistics), and the values of R², F, and the regression sum of squares are negative! The output here is obviously garbage (even Excel seems to know this, as the #NUM! entries imply), but what if the numbers that came out weren’t absurd, just wrong? Unless Excel does better at addressing these computational problems, it cannot be considered a serious candidate for use in statistical analysis.
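The text does not say which algorithm Excel uses, so the following is only a sketch of one well-known way that sums of squares can come out badly wrong in floating-point arithmetic. It compares the one-pass "calculator" formula with a two-pass formula that centers the data first. The data are hypothetical (ten values with a large constant offset), and the function names are made up for illustration.

```python
# Hypothetical data (not from the article): ten predictor values with a
# large constant offset and a small spread.
x = [1e10 + i for i in range(1, 11)]


def sum_sq_one_pass(values):
    """Corrected sum of squares via the 'calculator' formula:
    sum(v^2) - (sum v)^2 / n. Subtracts two huge, nearly equal numbers."""
    n = len(values)
    total = sum(values)
    total_sq = sum(v * v for v in values)
    return total_sq - total * total / n


def sum_sq_two_pass(values):
    """Corrected sum of squares by centering first: sum((v - mean)^2)."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    return sum(d * d for d in deviations)


# The true value is the sum of (i - 5.5)^2 for i = 1..10, which is 82.5.
print("one-pass:", sum_sq_one_pass(x))
print("two-pass:", sum_sq_two_pass(x))
```

On these data the two-pass version returns exactly 82.5, while the one-pass version loses every significant digit, because the squares are of order 10^20 and double precision cannot resolve differences that small relative to their size. Quantities built from such a sum (slopes, standard errors, R², F) would inherit the error.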
