Flexible handling of time-correlation

We proceed to model jointly signals and dependence so as to obtain sharper test statistics after eliminating, as much as possible, the impact of dependence. This impact is particularly obvious in the following illustrative example. The dataset ‘simerp’, which is provided in the ERP package, has been simulated using the same covariance structure as observed in the ‘erpcz’ dataset. A relationship with an arbitrary continuous covariate has been added by means of a simple regression model at each timepoint between 450ms and 550ms. Outside of this peak interval, the ERPs are not correlated with the covariate. Therefore, it is expected that any multiple testing procedure finds a significant peak around 500ms.

First, let us load the simulated dataset:

data(simerp)

In ‘simerp’, the covariate is names ‘y’. Therefore, the BH procedure can be implemented like this:

tests = erptest(simerp[,1:251],design=model.matrix(~y,data=simerp), method="BH")

with corresponding commands for plotting the F-test process:

plot(frames,tests$test,type="l",xlab="Time (ms)",ylab="F-test")#,ylim=c(-1,1))
points(frames[tests$significant],rep(-1,length(tests$significant)),
   pch=16,col="blue")
title("Simulation")

As revealed by the resulting plot below, the procedure clearly fails at finding the peak around 500ms.

erptestsim

This suggests to handle dependence in the multiple testing procedure. As suggested in Friguet et al. (2009) and adapted to the ERP framework in Sheu et al. (2014), latent factors are introduced to capture linearly the time-dependence among residuals.
Similar approaches with different estimating strategies can be found in Leek and Storey (2008), Sun et al. (2012) for multiple testing in high-dimensional settings. As in Friguet et al. (2009), the estimating algorithm is EM-like, alternating the estimation of covariate effects and the factors. Because the estimation of covariate effects is severely affected by dependence, estimation takes advantage of a prior knowledge of timepoints in which no signal is exepected to better separate the covariate effect and the noise.

If, for example, it is conjectured that the covariate should not be related to the ERPs in the 100 first and last milliseconds, then let us store this set of timepoints in ‘s0’:

s0 = c(1:25,226:251) # Remember that the lag between two timepoints is 4ms

The ‘erfatest’ can now be used to calculated decorrelated F-tests:

tests = erpfatest(simerp[,1:251],design=model.matrix(~y,data=simerp),s0=s0)

The following commands generate a plot of the decorrelated F-test process:

plot(frames,tests$test,type="l",xlab="Time (ms)",ylab="F-test")#,ylim=c(-1,1))
points(frames[tests$significant],rep(-1,length(tests$significant)),
   pch=16,col="blue")
title("Simulation")

The resulting plot shows a clear improvement w.r.t the above approach ignoring dependence:

erpfatestsim1

Note that the ERP package is provided with the method proposed by Friguet et al. (2009) to determine the number of factors. By default, the argument ‘nbf’ is set to NULL, which means that the number of factors should be deduced from the data. Also, if you do not want to introduce a prior knowledge on ‘no-signal’ intervals, you can choose ‘s0=NULL’. In this case, ‘s0’ will be deduced from a previous standard analysis ignoring dependence. Try these commands, you will see that the result is almost as good:

tests = erpfatest(simerp[,1:251],design=model.matrix(~y,data=simerp),s0=NULL)
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s