The first-level coloring decision is typically based on calculating a test statistic (e.g., a T-, F-, or Z-score) for each voxel or brain region from the fMRI data. Under the null hypothesis that no true activation has occurred, a p-value can be determined, representing the probability of obtaining, by chance alone, a test statistic at least as large as the one observed. Whenever the p-value falls below a preselected (and essentially arbitrary) level of significance, we conclude that the measurement is unlikely to have occurred by chance and classify the voxel as "activated/correlated".
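The thresholding step described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming the test statistic is a Z-score with a standard normal null distribution and a one-sided test. The function names, the example Z-scores, and the uncorrected threshold of 0.001 are hypothetical and not taken from any particular fMRI package.

```python
import math

def z_to_p_one_sided(z):
    """Upper-tail probability P(Z >= z) under a standard normal null."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def classify_voxels(z_scores, alpha=0.001):
    """Flag each voxel whose uncorrected p-value falls below alpha."""
    return [z_to_p_one_sided(z) < alpha for z in z_scores]

# Hypothetical Z-scores for four voxels (illustrative values only).
z_scores = [0.3, 1.7, 3.5, 5.0]
p_values = [z_to_p_one_sided(z) for z in z_scores]
active = classify_voxels(z_scores, alpha=0.001)

for z, p, a in zip(z_scores, p_values, active):
    label = "activated" if a else "not activated"
    print(f"Z = {z:4.1f}  p = {p:.2e}  {label}")
```

Note that this sketch applies the threshold to each voxel separately, with no correction for the number of voxels tested.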
Advanced Discussion
Perhaps the most widely known procedure to account for multiple comparison errors in standard statistics is the Bonferroni correction. In its simplest form, the Bonferroni method merely divides the required Type I error level (α) by the number of independent tests (N) performed. Thus, if one wishes to maintain an α = 0.05 error level across 10 tests, the p-value threshold for each individual test would need to be set at 0.05/10 = 0.005. For an fMRI data set with N ≈ 100,000 voxels being tested, the required p-value would be on the order of 5 × 10⁻⁷, an extremely stringent requirement. Using such a strict criterion to avoid Type I errors would severely reduce the power of the fMRI data analysis, leading to an increased number of false negative results (Type II errors). Accordingly, several Bonferroni variants (Holm, Hochberg, Simes), including step-wise sequential testing, have been devised. An alternative and increasingly popular approach is to control the false discovery rate (FDR), the expected proportion of false positives among the voxels declared active.
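The arithmetic above, and the difference between the approaches, can be illustrated with a short Python sketch. It compares the plain Bonferroni threshold, Holm's step-down variant, and the Benjamini-Hochberg step-up procedure as one common way of controlling the FDR (the text does not name a specific FDR method, so that choice is an assumption). The function names and the ten example p-values are hypothetical, and the Benjamini-Hochberg procedure as written assumes independent or positively dependent tests.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold that keeps the familywise error at alpha."""
    return alpha / n_tests

def holm_reject(p_values, alpha=0.05):
    """Holm step-down: test sorted p-values against alpha/(n - rank), stop at first failure."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    reject = [False] * n
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (n - rank):
            reject[idx] = True
        else:
            break
    return reject

def benjamini_hochberg_reject(p_values, q=0.05):
    """Benjamini-Hochberg step-up: reject the k smallest p-values,
    where k is the largest rank with p_(k) <= q * k / n."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / n:
            largest_k = rank
    reject = [False] * n
    for idx in order[:largest_k]:
        reject[idx] = True
    return reject

# Thresholds from the text: 10 tests and roughly 100,000 voxels.
print(bonferroni_threshold(0.05, 10))
print(bonferroni_threshold(0.05, 100_000))

# Hypothetical p-values for ten tests (illustrative values only).
p_values = [0.001, 0.008, 0.012, 0.03, 0.04, 0.2, 0.5, 0.6, 0.7, 0.9]
thr = bonferroni_threshold(0.05, len(p_values))
n_bonferroni = sum(p <= thr for p in p_values)
n_holm = sum(holm_reject(p_values))
n_fdr = sum(benjamini_hochberg_reject(p_values))
print(n_bonferroni, n_holm, n_fdr)
```

On these example values the FDR procedure declares more tests significant than either familywise method, which reflects the trade-off described above: it tolerates a controlled proportion of false positives in exchange for greater power.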