Disgruntled PhD: Jun 25, 2010

I use R for most of my statistical work these days. Why, you say? Well, it's open source, free, and has the most comprehensive set of add-on packages I have ever seen.

It's also sometimes incomprehensible and annoying. Take, for instance, an error message I got a while back: Error in cov.wt(z) : 'x' must contain finite values only

I was attempting to do a factor analysis when the above popped up. Naturally I was a little confused, as I hadn't allowed participants in my surveys to respond "infinite". However, upon Googling, I discovered that factor analysis (along with most statistical methods) is highly sensitive to missing values, and R's error message lumps these in with non-finite values. I would prefer to call them undetermined, but I didn't write R.
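For anyone who hits the same message, here is a minimal sketch of what seems to be going on. The survey data frame and its column names are made up for illustration. A missing value (NA) is neither finite nor infinite as far as R is concerned, and cov.wt() refuses any input that is not entirely finite.

```r
# Hypothetical survey data: three items, one missing response in q2
survey <- data.frame(
  q1 = c(1, 2, 3, 4, 5),
  q2 = c(2, NA, 4, 4, 5),
  q3 = c(5, 4, 3, 2, 1)
)

# NA is not finite, but it is not infinite either
na_is_finite   <- is.finite(NA)
na_is_infinite <- is.infinite(NA)

# cov.wt() stops on any non-finite entry; capture the message instead of failing
err_msg <- tryCatch(
  cov.wt(survey),
  error = function(e) conditionMessage(e)
)
print(err_msg)

# One blunt workaround: drop every row that has a missing value
complete <- na.omit(survey)
cw <- cov.wt(complete)
print(cw$n.obs)
```

Dropping incomplete rows is only one option, and it throws away data, so whether it is appropriate depends on how much is missing and why.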

The same thing happened to me today, when I was working out some correlations and kept getting NA as the result. Given that I knew these correlations should exist in some form, I was confused. However, the problem was again missing values, and when I used "pairwise.complete.obs" I got some sensible results.
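A small sketch of the correlation case, again with invented numbers. By default cor() returns NA for any pair of variables where either one has a missing value; the use argument changes how missing values are handled.

```r
# Hypothetical data: x and y each have one missing value, in different rows
d <- data.frame(
  x = c(1, 2, 3, 4, NA),
  y = c(2, 4, 6, NA, 10),
  z = c(5, 3, 4, 1, 2)
)

# Default behaviour: any pair touched by a missing value comes back as NA
default_cor <- cor(d)
print(default_cor)

# Pairwise deletion: each pair uses every row where both variables are present
pairwise_cor <- cor(d, use = "pairwise.complete.obs")
print(pairwise_cor)

# Listwise deletion: only rows with no missing values at all are used
listwise_cor <- cor(d, use = "complete.obs")
print(listwise_cor)
```

Note that with pairwise deletion each correlation can be based on a different subset of rows, so the entries in the resulting matrix are not all computed from the same participants.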

The point (insofar as I have one) is that I had been using SPSS for many years, and never really copped to how much of an issue missing values were. The convenience of the GUI was preventing me from learning about the methods I was using.

And that, ladies and gentlemen, is one of the many reasons why I will continue to use R. (Don't worry, I'll go into excruciating detail about the benefits thereof at another time.)



Friday, June 25, 2010

Error messages and their value