Causal Mediation in Additive Survival Models with the multimediate package

We provide a short toy example that allows interested readers to reproduce the analysis from Ruiz-Hernandez et al. (2017) using the multimediate package which implements counterfactual multiple mediation analysis in the setting of additive survival.

Now, we load the needed packages. We can verify that the package correctly uploaded by running packageVersion("multimediate"). Which should return [1] ‘0.1.4’.

The toy dataset provided in this vignette is a subset of the study population used in the manuscript by Ruiz-Hernandez et al. (2017). The original dataset included imputation of missing completely at random urine cadmium values and blood lead values below the detection limit (for a deeper understanding of the imputation, see Ruiz-Hernandez et al. (2017). Here we provided the subset of actually measured (i.e. unimputed) urine cadmium values. Also we extended the followup for mortality data collection to 10 years.

Multiple mediation analysis with `correlated=FALSE`

In Ruiz-Hernandez et al. (2017), the estimated absolute reduction in cardiovascular deaths in the US comparing 1999-2004 to 1988-94 was 230.7 deaths/100,000 person-years in models adjusted for cardiovascular risk factors. The product of coefficients method estimated that, among those avoided deaths, 52.0 (95% CI 84.4, 96.7) and 19.4 (4.3, 36.4) deaths/100,000 person-years were attributable to changes in lead and cadmium, respectively.

The multimediate() function with correlated=FALSE option will reproduce the Ruiz-Hernandez et al. (2017) results, using instead the counterfactual frame, yielding fairly consistent findings.

# Please provide the corresponding path or place file in working directory.
nhanes <- read.csv("251003_Nhanes_Multimediate.csv")

# Multimediate requires that in the Aalen model, factors are recoded as dummy variables
nhanes$race.1 <- as.factor(ifelse(nhanes$race == 1, 1, 0))
nhanes$race.2 <- as.factor(ifelse(nhanes$race == 2, 1, 0))
nhanes$race.3 <- as.factor(ifelse(nhanes$race == 3, 1, 0))
nhanes$race.4 <- as.factor(ifelse(nhanes$race == 4, 1, 0))
nhanes$riagendr.1 <- as.factor(ifelse(nhanes$riagendr == 1, 1, 0))
nhanes$riagendr.2 <- as.factor(ifelse(nhanes$riagendr == 2, 1, 0))
nhanes$smoking.1 <- as.factor(ifelse(nhanes$smoking == 1, 1, 0))
nhanes$smoking.2 <- as.factor(ifelse(nhanes$smoking == 2, 1, 0))
nhanes$smoking.3 <- as.factor(ifelse(nhanes$smoking == 3, 1, 0))
nhanes$bmi_cat <- as.factor(I(nhanes$bmxbmi >= 30))
nhanes$lbxtc_cat <- as.factor(I(nhanes$lbxtc >= 200))
nhanes$lowhdl <- as.factor(nhanes$lowhdl)
nhanes$cholmed <- as.factor(nhanes$cholmed)
nhanes$hbp <- as.factor(nhanes$hbp)
nhanes$sedent <- as.factor(nhanes$sedent)
nhanes$diab <- as.factor(nhanes$diab)

# In this example the multiple mediators are continuous variables.
# We log-transform to improve normality.

nhanes$log.cr.ucd <- log(nhanes$cr.ucd) # Urine cadmium
nhanes$log.bpb <- log(nhanes$lbxbpb) # Blood lead

# Cautionary note: treatment variables must NOT BE factors
# In this example treatment is time period.

nhanes$wv.gr.1 <- ifelse(nhanes$wv.gr == 1, 1, 0) # Indicator for 1988-1994 survey
nhanes$wv.gr.2 <- ifelse(nhanes$wv.gr == 2, 1, 0) # Indicator for 1999-2004 survey

dim(nhanes)

# We first run the outcome model including indicator for 1999-2004 (treatment)
# the two mediators and confounders:

variables_y <- paste0(
  "Surv(ridageyr,peryr.age.10yr,cvd.10yr==1) ~ const(wv.gr.2) +
       const(log.bpb) + const(log.cr.ucd) + const(race.2) + 
       const(race.3)+const(race.4) + const(diab) + 
       const(bmi_cat) + const(riagendr.2) + const(smoking.2) + 
       const(smoking.3) + const(lbxtc_cat) + const(lowhdl) + 
       const(cholmed) + const(hbp)+const(sedent)"
)

model.y = aalen(
  as.formula(variables_y),
  start.time = 40,
  weights = nhanes$wtshm12yr,
  data = nhanes,
  robust = T
)

We subsequently run the mediator models. Cautionary note: the mediators must be entered in the same order as in the outcome model.

variables1 <- paste0("log.bpb ~ wv.gr.2 + riagendr.2 +  ridageyr +  race.2 + race.3 +
                           race.4 + smoking.2 + smoking.3")
variables2 <- paste0("log.cr.ucd ~ wv.gr.2 + riagendr.2 +  ridageyr + race.2 + race.3 +
                           race.4 + smoking.2 + smoking.3")
      
# NOTE: Here we have not adjusted one given mediator model for other correlated 
# mediators, since `multimediate` internally simulates marginal pair-wise correlations 
# between the multiple mediators and we want to compare results with correlated=TRUE
# option in next section. Generally, if we are to set the correlated=TRUE option it is 
# not recommended to adjust one given mediator for other correlated mediators.

modM1 = lm(variables1, data = nhanes)
modM2 = lm(variables2, data = nhanes)

lmodel.m=list(modM1, modM2)
      
est_1 <- as.data.frame(paste0(round(summary(modM1)$coef[2, 1], 2), ' (', round((
  summary(modM1)$coef[2, 1] - (summary(modM1)$coef[2, 2] * 1.96)
), 2), ', ', round((
  summary(modM1)$coef[2, 1] + (summary(modM1)$coef[2, 2] * 1.96)
), 2), ')'))

# Note that exp(est_1) will estimate the relative change in blood lead levels 
# comparing time periods

est_2 <- as.data.frame(paste0(round(summary(modM2)$coef[2,1],2), ' (', round((summary(modM2)$coef[2,1] -
                    (summary(modM2)$coef[2,2]*1.96)),2), ', ', round((summary(modM2)$coef[2,1] +
                    (summary(modM2)$coef[2,2]*1.96)),2), ')'))

# Note that exp(est_2) will estimate the relative change in urine cadmium levels 
# comparing time periods

Once we have fitted the outcome and the mediator models we can call the multimediate() function to estimate direct, indirect and total effects.

      
multim=multimediate(lmodel.m,correlated=FALSE,model.y,treat="wv.gr.2",treat.value=1,
                        control.value=0,J=1000,conf.level=0.95,data=nhanes)
     

sum <- suppressWarnings(summary(multim, option='avg'))

.	Estimation	IC.inf	IC.sup	P.val
ACME.joint.treat	-94.3997	-147.0478	-42.2592	0.000
PM(treat)	0.5207	0.1863	1.1778	0.000
ACME.joint.control	-94.3997	-147.0478	-42.2592	0.000
PM(control)	0.5207	0.1863	1.1778	0.000
ACME.treat.log.bpb	-66.0982	-116.9757	-15.9155	0.006
PM(treat).log.bpb	0.3509	0.0724	0.8268	0.006
ACME.control.log.bpb	-66.0982	-116.9757	-15.9155	0.006
PM(control).log.bpb	0.3509	0.0724	0.8268	0.006
ACME.treat.log.cr.ucd	-28.3015	-48.3761	-9.3259	0.002
PM(treat).log.cr.ucd	0.1586	0.0388	0.3838	0.002
ACME.control.log.cr.ucd	-28.3015	-48.3761	-9.3259	0.002
PM(control).log.cr.ucd	0.1586	0.0388	0.3838	0.002
ADE.treat	-111.5327	-241.2674	19.1708	0.098
ADE.control	-111.5327	-241.2674	19.1708	0.098
Total Effect	-205.9324	-320.6998	-94.3693	0.000

Note that we set option='avg' for obtaining average mediation effects in the absence of exposure-mediator interactions. Also note the output table shows identical mediation effects for treatment and control because we have not included exposure-mediator terms in the outcome model.

We next run and outcome model without the mediator.


variables_y2 <- paste0("Surv(ridageyr,peryr.age.10yr,cvd.10yr==1) ~ const(wv.gr.2) +
              const(race.2)+const(race.3)+const(race.4)+const(diab)+const(bmi_cat)+const(riagendr.2)+
              const(smoking.2)+const(smoking.3)+const(lbxtc_cat)+const(lowhdl)+const(cholmed)+const(hbp)+const(sedent)")
 
model.y2 = aalen(as.formula(variables_y2), start.time=40,
                 weights=nhanes$wtshm12yr, data = nhanes, robust=T)
              
    m_y2 <- as.data.frame(as.matrix(paste0(round(model.y2$gamma[1,1]*100000,2), ' (', round((model.y2$gamma[1,1] - (sqrt(model.y2$robvar.gamma[1,1])*1.96))*100000,2), ', ', round((model.y2$gamma[1,1] + (sqrt(model.y2$robvar.gamma[1,1])*1.96))*100000,2), ')')))

We expect the coefficient for time period will equal the total effect in the multimediate output (sanity check). Happily it does as shown next:

V1
-201.65 (-317.7, -85.61)

Next, we show interesting extensions of multiple mediation that we could not carry out with the difference and product of coefficient methods in the original Ruiz-Hernandez et al. (2017) paper.

Multiple mediation analysis with `correlated=TRUE`

We will allow for correlation between multiple mediators (in this case blood lead and urine cadmium, which are exposure biomarkers), which may share common sources of exposure and metabolic pathways.

multim=multimediate(lmodel.m,correlated=TRUE,model.y,treat="wv.gr.2",treat.value=1,
                        control.value=0,J=1000,conf.level=0.95,data=nhanes)
     

sum <- suppressWarnings(summary(multim, option='avg'))

.	Estimation	IC.inf	IC.sup	P.val
ACME.joint.treat	-95.5885	-146.3312	-44.0272	0.000
PM(treat)	0.5137	0.1805	1.1810	0.000
ACME.joint.control	-95.5885	-146.3312	-44.0272	0.000
PM(control)	0.5137	0.1805	1.1810	0.000
ACME.treat.log.bpb	-67.0958	-116.6899	-15.8575	0.004
PM(treat).log.bpb	0.3580	0.0759	0.8411	0.004
ACME.control.log.bpb	-67.0958	-116.6899	-15.8575	0.004
PM(control).log.bpb	0.3580	0.0759	0.8411	0.004
ACME.treat.log.cr.ucd	-28.4927	-48.7361	-9.3144	0.002
PM(treat).log.cr.ucd	0.1528	0.0390	0.3529	0.002
ACME.control.log.cr.ucd	-28.4927	-48.7361	-9.3144	0.002
PM(control).log.cr.ucd	0.1528	0.0390	0.3529	0.002
ADE.treat	-112.5341	-246.3659	17.7682	0.088
ADE.control	-112.5341	-246.3659	17.7682	0.088
Total Effect	-208.1226	-320.3532	-95.7705	0.000

We observed that, in this case, the estimated mediation effects do not substantially change after allowing for correlation between multiple mediators. Thus, supporting the correlation between mediators is not relevant in the original Ruiz-Hernandez et al. (2017) paper.

Multiple mediation analysis including exposure-mediator interactions

Interestingly, multimediate can also accommodate exposure-mediator interactions because the counterfactual framework enables so.


variables_y <- paste0("Surv(ridageyr,peryr.age.10yr,cvd.10yr==1) ~ const(wv.gr.2) +
              const(log.bpb) + const(log.cr.ucd)+const(wv.gr.2*log.bpb)  +const(race.2)+
              const(race.3)+ const(race.4)+ const(diab)+ const(bmi_cat)+ const(riagendr.2)+
              const(smoking.2)+const(smoking.3)+ const(lbxtc_cat)+const(lowhdl)+const(cholmed)+
                const(hbp)+const(sedent)")


model.y = aalen(as.formula(variables_y),start.time=40,
                weights=nhanes$wtshm12yr, data = nhanes, robust=T)
                      
                      
 
# Warning: at the current moment multimediate only takes one exposure-mediator 
# interaction at a time (work in progress to accomodate more). Note that the first 
# mediator model in the lmodel.m list must be the one with the interaction, and this 
# ordering must be consistent with the order mediators have been included in the outcome
# model.
        
variables1 <- paste0("log.bpb ~  wv.gr.2 + riagendr.2 +  ridageyr +  race.2 + race.3 +
                     race.4 + smoking.2 + smoking.3")
variables2 <- paste0("log.cr.ucd ~ wv.gr.2 + riagendr.2 +  ridageyr + race.2 + race.3 +
                     race.4 + smoking.2 + smoking.3")
      
    
modM1 = lm(variables1, data = nhanes)
modM2 = lm(variables2, data = nhanes)
lmodel.m=list(modM1, modM2)
      
    
multim=multimediate(lmodel.m,correlated=TRUE,model.y,treat="wv.gr.2",treat.value=1,
                             control.value=0,J=1000,conf.level=0.95,data=nhanes)

sum <- suppressWarnings(summary(multim, option='navg'))

Note the option option='navg' for obtaining average mediation effects in the presence of exposure-mediator interactions. Now, we obtain different mediation effects estimates for both NHANES periods.

.	Estimation	IC.inf	IC.sup	P.val
ACME.joint.treat	-58.3744	-143.3592	24.1236	0.140
PM(treat)	0.3390	-0.0947	1.0828	0.140
ACME.joint.control	-119.7350	-170.3306	-67.5683	0.000
PM(control)	0.6118	0.2874	1.2170	0.000
ACME.treat.log.bpb	-30.1140	-108.2133	47.5489	0.476
PM(treat).log.bpb	0.1542	-0.2536	0.6806	0.476
ACME.control.log.bpb	-91.4746	-141.8467	-40.3954	0.000
PM(control).log.bpb	0.4674	0.1645	1.0498	0.000
ACME.treat.log.cr.ucd	-28.2603	-47.8338	-9.0230	0.002
PM(treat).log.cr.ucd	0.1508	0.0399	0.3480	0.002
ACME.control.log.cr.ucd	-28.2603	-47.8338	-9.0230	0.002
PM(control).log.cr.ucd	0.1508	0.0399	0.3480	0.002
ADE.treat	-97.2185	-217.6141	22.8343	0.126
ADE.control	-158.5790	-333.0215	7.8943	0.072
Total Effect	-216.9534	-337.8046	-100.3597	0.000

The estimated mediation effects are similar as in the model not including the interaction. Indeed, the interaction p-value in the outcome model is far from being statistically significant (equal to 0.96). Thus, our results do not suppor exposure-mediator interaction for cadmium models.

Causal Mediation in Additive Survival Models with the `multimediate` package

Tellez-Plaza, María

2025-10-08

Multiple mediation analysis with `correlated=FALSE`

Multiple mediation analysis with `correlated=TRUE`

Multiple mediation analysis including exposure-mediator interactions

`multimediate` can do more!

References

Causal Mediation in Additive Survival Models with the multimediate package

Tellez-Plaza, María

2025-10-08

Multiple mediation analysis with correlated=FALSE

Multiple mediation analysis with correlated=TRUE

Multiple mediation analysis including exposure-mediator interactions

multimediate can do more!

References

Causal Mediation in Additive Survival Models with the `multimediate` package

Multiple mediation analysis with `correlated=FALSE`

Multiple mediation analysis with `correlated=TRUE`

`multimediate` can do more!