Thursday, January 13, 2011

There is no such thing as ESP

There is something of a major controversy brewing over an article set to be published in a highly respected social psychology journal that purportedly demonstrates scientific evidence for the existence of extra-sensory perception.  Now, I've read the actual paper... and I don't buy it, not for a second.  Here is a decidedly strong opinion from someone who's trained not to have them about potential behavioral phenomena: There is no such thing as ESP.

Though the author of the paper has published work on ESP before, the catalyst of the current controversy seems to be an article published in the New York Times.  The issue at hand isn't the existence of ESP, which most people dismiss outright, but the use of significance testing as the primary statistical tool in psychology.  I can't really get into the ins and outs of significance testing here, but it is the method most commonly used to determine whether the differences observed in psychology studies are due to experimental manipulation or simply chance.  The vast majority of my formal statistics training has been in learning how to properly apply these methods, and much of the results section of my master's thesis is devoted to reporting how my experiments failed to show statistical differences using significance testing.
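
For anyone who hasn't seen what this actually looks like in practice, here's a minimal sketch in Python (using scipy) of the kind of test I'm talking about.  The "data" are invented purely for illustration and have nothing to do with the paper:

```python
# A minimal illustration of significance testing (independent-samples t-test).
# The scores here are made up for demonstration; they are not from any study.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Two hypothetical groups, e.g. a control condition and an experimental condition
control = rng.normal(loc=50.0, scale=10.0, size=30)
treatment = rng.normal(loc=55.0, scale=10.0, size=30)

# The t-test asks: how likely is a difference this large if both groups
# really came from the same population (i.e., the manipulation did nothing)?
t_stat, p_value = stats.ttest_ind(treatment, control)

print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# By convention, p < 0.05 gets reported as a "statistically significant" difference.
```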

A follow-up article in the Times discusses this in more detail and ends up condemning an entire field for using a form of statistics that's been widely accepted for almost a century.  Hilariously, this same article implies that psychology should adopt statistical methods similar to those used in medical studies as well as newer methods such as Bayesian statistics.  This is hilarious for a few reasons.  First (and I'm not going to cite specific people, departments, or papers here as a professional courtesy), in my experience the closer a study is to examining anything medically related, the worse the statistics.  Second, any psychology researcher worth his or her salt is already using a variety of statistical tools, including both significance testing AND Bayesian methods.  Psychologists may be behind in a lot of things, but learning how to use (or abuse) new methods in statistics is not one of them (especially for those of us who use fMRI).  Also.  Bayesian statistics, for all their current popularity, are not without their own myriad of problems.
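
Just to be concrete about what a Bayesian alternative even looks like, here's a toy sketch of my own (nothing from the paper or the Times piece): estimating a hit rate against the 50% chance level with a conjugate beta-binomial model, which gives you a posterior distribution over the rate instead of a single p-value.  The counts are made up.

```python
# Toy Bayesian estimate of a hit rate (e.g., guessing which of two locations
# hides the target). Numbers are invented for illustration only.
from scipy import stats

hits, trials = 53, 100            # hypothetical data: 53 hits out of 100, vs. 50% chance

# Flat Beta(1, 1) prior on the true hit rate; beta-binomial conjugacy gives
# the posterior in closed form.
posterior = stats.beta(1 + hits, 1 + (trials - hits))

# Posterior probability that the true hit rate exceeds chance (0.5),
# plus a 95% credible interval.
p_above_chance = 1 - posterior.cdf(0.5)
lo, hi = posterior.ppf([0.025, 0.975])

print(f"P(rate > 0.5 | data) = {p_above_chance:.2f}")
print(f"95% credible interval: [{lo:.2f}, {hi:.2f}]")
```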

In terms of statistics in psychology (or really, statistics in anything), the issue isn't which method is better than another, but when any given method should be applied.  More succinctly, you need to know how to use statistics to properly use statistics.  Different methods are useful for different things.  Standard significance testing may be wholly inappropriate for some medical studies, but Bayesian statistics is not at all useful for answering questions in most branches of psychology (I actually apply it in fMRI analysis, but that's beside the point).  Also, and I realize this may be particularly jarring to people in "hard" sciences, no statistics are objective.  This may be especially evident with significance testing, but any number generated in any scientific pursuit is subject to interpretation.  In significance testing, where p=0.05 is supposedly the threshold denoting statistical significance, results are commonly reported as significant when they are as "insignificant" as p=0.1.  It's not the math that's potentially problematic, it's the scientists using it.  Incidentally, that previous sentence is why I think philosophy of science is so interesting.
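
To illustrate why a p-value hovering around that threshold deserves a skeptical reading, here's a quick simulation of my own (again, made-up data, nothing to do with the paper): when there is no real effect at all, p-values are spread roughly uniformly between 0 and 1, so a predictable fraction of null experiments still land near or under 0.05.

```python
# Simulation of the threshold point: with NO true effect, some fraction of
# experiments will still produce p-values near (or under) the 0.05 cutoff.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_experiments, n_subjects = 10_000, 30

p_values = []
for _ in range(n_experiments):
    a = rng.normal(size=n_subjects)   # both groups drawn from the SAME population
    b = rng.normal(size=n_subjects)
    p_values.append(stats.ttest_ind(a, b).pvalue)

p_values = np.array(p_values)
print(f"p < 0.05 in {np.mean(p_values < 0.05):.1%} of null experiments")
print(f"p < 0.10 in {np.mean(p_values < 0.10):.1%} of null experiments")
# The 0.05 cutoff is a convention; whether p = 0.06 "counts" is an interpretive
# call made by the researcher, not something the math decides for you.
```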

Back to the ESP paper... I don't think the statistical methods are the source of the result.  From what I can tell, the proper statistics were used given the experimental design.  ESP obviously isn't real, so where do the results come from?  Reading between the lines of the actual paper, it seemed to me that a lot of the experiments featured rather lax experimenter controls (poor RA oversight, subjects not blind to the purpose of the study) that likely contributed to the statistical error that pushed performance ever so slightly above chance level, which could potentially be interpreted as possible evidence for a phenomenon that may or may not be ESP.  So really the controversy should be about poor laboratory controls in research published in top-tier journals and overstating what your results actually mean, phenomena that I feel are startlingly widespread, rather than the (mis)use of a particular statistical methodology.

Also. There is no such thing as ESP.

Obviously.
