
Table 3 Deviations from the registered stage 1 protocol, following the Preregistration Deviation Table Template (https://osf.io/et6km/)

From: Assessing causal links between age at menarche and adolescent mental health: a Mendelian randomisation study

Deviations

 

No

Details

Original wording

Deviation description

Reader impact

 

1

Type

Variables

In the analysis plan, we stated that we would run negative control MR analyses using diagnoses during childhood as the outcome (ages 0–8). We also proposed to include diagnostic status prior to puberty as a covariate in observational analyses if rates were sufficiently high

Due to the low numbers of cases in childhood (7 depression, 85 anxiety, 21 DBD, and 87 ADHD cases), negative control analyses could not be conducted with these as outcomes. In addition, depression case status prior to puberty could not be included as a covariate

We can assure readers that we proposed to “co-vary for depression status prior to puberty (ages 0–8) if rates of pre-puberty diagnoses are sufficiently high and evenly distributed across levels of the outcome variable to allow model convergence”. This criterion was not met for childhood depression cases

 

Reason

Plan not possible

 

Timing

After data access

 

2

Type

Covariates

In the “Methods” section, on the MR models: “We will include covariates in these analyses to increase statistical efficiency, and to control for any residual population stratification”

After seeking expert guidance, we did not include covariates in the MR models, since this may bias results. Including covariates in MR is not common practice; the statement in the protocol was the result of a miscommunication

This deviation avoids introducing bias into the MR analyses (which assume that the instrument–outcome association is not confounded), and so readers should trust results more because of this change

 

Reason

Miscommunication

 

Timing

After results known

 

3

Type

Covariates

In the “Methods” section, we stated that we would include the number of children in the household (age 8) as a covariate

Due to a high number of missing values at age 8, parity (based on birth registry data with no values missing) was used instead

This change to a similar but more complete covariate is unlikely to influence results, beyond improving the imputation model performance

 

Reason

Typo/error

 

Timing

After data access

 

4

Type

Variables

In the “Methods” section, on two-sample MR instruments: “We will infer the forward strand alleles using allele frequency information for palindromic SNPs (SNPs with minor allele frequency > 0.3 will be discarded, as these cannot be reliably inferred)”

Since the summary statistics did not include the effect allele frequency, we could not infer the forward strand for palindromic SNPs. Instead, we contacted the senior author of the GWAS, who confirmed that the summary statistics were formatted to be on the forward strand

Assuming that summary statistics are formatted to be on the forward strand has become a common approach in the field. Moreover, since we knew the age at menarche summary statistics were on the forward strand, this deviation should have little impact on the readers’ interpretation of the results

 

Reason

Plan not possible

 

Timing

After data access
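The palindromic-SNP rule quoted in deviation 4 can be illustrated with a minimal sketch. The function names and the three status labels below are hypothetical and are not taken from the study's code; the sketch only shows why a missing allele frequency leaves A/T and C/G SNPs unresolved.

```python
from typing import Optional

# Complementary base pairs; a SNP is palindromic when its two alleles are complements.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_palindromic(allele1: str, allele2: str) -> bool:
    """Return True for A/T or C/G SNPs, which look identical on both strands."""
    return COMPLEMENT.get(allele1.upper()) == allele2.upper()


def strand_status(allele1: str, allele2: str,
                  effect_allele_freq: Optional[float],
                  maf_threshold: float = 0.3) -> str:
    """Classify a SNP as 'unambiguous', 'inferable', or 'ambiguous' (hypothetical labels)."""
    if not is_palindromic(allele1, allele2):
        return "unambiguous"  # strand can be read directly from the alleles
    if effect_allele_freq is None:
        return "ambiguous"  # no frequency information, so the strand cannot be inferred
    maf = min(effect_allele_freq, 1.0 - effect_allele_freq)
    # Frequencies close to 0.5 do not tell the two strands apart reliably.
    return "ambiguous" if maf > maf_threshold else "inferable"
```

With the frequency column absent, every palindromic SNP falls into the "ambiguous" branch, which is the situation the deviation describes.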

 

5

Type

Analysis

We stated that we would run the observational analyses in the largest available sample of 14-year questionnaire responders and restrict these analyses to only genotyped individuals as a sensitivity analysis

We did not restrict to genotyped individuals only, as all variables in the analytic dataset (including the genetic instrument) were multiply imputed. The multiple imputation itself followed the stage 1 protocol

Restricting to genotyped individuals only would likely have added little information, so this should not affect readers’ interpretation of the results

 

Reason

Typo/error

 

Timing

After data access

 

6

Type

Data preparation

We proposed to impute: “early-life diagnoses for the oldest MoBa participants (because linkage is only available since 2008). In addition, at the time of carrying out the analyses, the linked registries will have missing data about diagnoses in the later years of adolescence for younger MoBa participants (because they will not have yet turned 18 by 2021)”

Attempts to multiply impute diagnostic outcomes were unsuccessful, likely due to the sparse and binary nature of the data. Censoring of early-life diagnoses was a minor issue, since numbers were too low for inclusion in analyses. However, the censoring of diagnoses in the later years remains a limitation

Since we did not impute diagnoses, the precision of the MR analyses was lower than anticipated for these outcomes. Readers should take this into account when interpreting the results. Censoring may lead to reduced precision and, potentially, underestimation of effect sizes. This is mentioned as a limitation

 

Reason

Plan not possible

 

Timing

After data access

 

7

Type

Analysis

In the protocol, we stated: “When outcomes are excessively skewed (based on the skewness test implemented in the moments package in R [76]) or for binary outcomes (for which a logistic model will be used in the second stage), we will apply a post-estimation correction of the standard errors (the HC1 option in the sandwich R package [77])”

After seeking expert input, we calculated robust standard errors for both continuous and binary outcomes. In addition, non-normality was handled by transforming all symptom outcomes, using log or square root transformation, depending on the impact of each transformation on skewness and kurtosis

These deviations from the registered protocol led to small differences in the results, except for conduct disorder. For this highly skewed outcome, the MR point estimate became more extreme after transformation. Our deviations ensured that the 2SLS analyses accounted for uncertainty in the first stage and that the data were appropriate for linear regression

 

Reason

New knowledge

 

Timing

After results known
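As an illustration of the transformation choice described in deviation 7, the sketch below compares the untransformed, square root, and log versions of an outcome and keeps the one with the smallest absolute skewness, breaking ties by absolute excess kurtosis. This selection rule, the use of log(x + 1), and the function names are assumptions made for illustration; the study's own procedure was carried out in R and is not reproduced here. Scores are assumed to be non-negative.

```python
import math
from statistics import fmean


def central_moment(values, order):
    """Population central moment of the given order."""
    mean = fmean(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Moment-based skewness: m3 / m2^1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2^2 - 3."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


def pick_transformation(values):
    """Return 'none', 'sqrt', or 'log' for non-negative scores (illustrative rule)."""
    candidates = {
        "none": list(values),
        "sqrt": [math.sqrt(v) for v in values],
        # log(x + 1) keeps zero scores defined; this offset is an assumption.
        "log": [math.log(v + 1.0) for v in values],
    }
    return min(
        candidates,
        key=lambda name: (abs(skewness(candidates[name])),
                          abs(excess_kurtosis(candidates[name]))),
    )
```

For example, `pick_transformation([1, 2, 3, 4, 5])` keeps the data untransformed because they are already symmetric.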

 

8

Type

Data preparation

In the section on outliers: “For other phenotype data, values > 4 standard deviations from the mean were treated as outliers and coded as missing (e.g., to remove implausible height/weight values used to calculate BMI)”

In line with the quality control procedures for MoBa phenotypic data implemented in the phenotools package, we used > 3 standard deviations from the mean to define outliers

This deviation, using a stricter criterion for defining outliers in the calculation of BMI, was very minor and should not influence readers’ interpretation of the results

 

Reason

Others (please explain)

 

Timing

After data access
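The difference between the registered (> 4 SD) and applied (> 3 SD) outlier rules in deviation 8 can be sketched as follows. The function name is hypothetical, missing values are represented as `None`, and the use of the sample standard deviation is an assumption; this is not the phenotools implementation.

```python
from statistics import fmean, stdev


def code_outliers_missing(values, n_sd=3.0):
    """Set values more than n_sd sample SDs from the mean to None (missing)."""
    observed = [v for v in values if v is not None]
    mean, sd = fmean(observed), stdev(observed)
    return [
        None if v is not None and abs(v - mean) > n_sd * sd else v
        for v in values
    ]


# Toy data: eleven identical values and one extreme value, which lies
# about 3.2 SDs from the mean.
toy_bmi = [20.0] * 11 + [60.0]
stricter = code_outliers_missing(toy_bmi, n_sd=3.0)  # extreme value coded as missing
registered = code_outliers_missing(toy_bmi, n_sd=4.0)  # extreme value retained
```

In this toy example the extreme value is removed under the 3 SD rule and kept under the 4 SD rule, which shows in what sense the applied criterion is stricter.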

 

9

Type

Variables

We stated: “We will therefore conduct an MVMR analysis with genetic instruments for age at menarche, childhood body size and adult BMI included in the same model”

Since conditional F-statistics were less than 10 when including either childhood body size or adult BMI in MVMR analyses, we did not include both as additional instruments due to low power

It should be apparent to the readers that the decision to not add further complexity when simpler models did not satisfy the criterion for instrument strength (F > 10) was well justified, to avoid weak instrument bias

 

Reason

Plan not possible

 

Timing

After results known
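The instrument-strength criterion (F > 10) mentioned in deviation 9 can be illustrated with the ordinary first-stage F-statistic computed from the variance explained by the instruments. This is a simplified sketch: the conditional F-statistic used in MVMR additionally accounts for the other exposures in the model and is not reproduced here. The function names and input values are hypothetical.

```python
def first_stage_f(r_squared: float, n_samples: int, n_instruments: int) -> float:
    """Ordinary first-stage F-statistic: (R^2 / k) / ((1 - R^2) / (n - k - 1))."""
    k = n_instruments
    return (r_squared / k) / ((1.0 - r_squared) / (n_samples - k - 1))


def is_weak(f_statistic: float, threshold: float = 10.0) -> bool:
    """Apply the conventional rule of thumb that F below the threshold signals a weak instrument."""
    return f_statistic < threshold


# Hypothetical inputs: a single instrument in a sample of 10,000.
strong = first_stage_f(r_squared=0.01, n_samples=10_000, n_instruments=1)
weak = first_stage_f(r_squared=0.0005, n_samples=10_000, n_instruments=1)
```

Under these made-up inputs, the instrument explaining 1% of the variance clears the threshold comfortably, while the one explaining 0.05% does not.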

 

10

Type

Analysis

In the analysis plan, we stated that MVMR analyses including BMI and estradiol as additional exposures would be conducted for all other symptom domains, as part of the sensitivity analyses

We did not conduct MVMR analyses with BMI and estradiol as additional exposures for the other symptom domains, since (a) precision was low, (b) one-sample MR estimates were consistent with the null for most domains, and (c) these analyses were especially relevant to depression

This deviation should have a limited impact on readers’ interpretation of the results, since the addition of these analyses would not be very informative, both due to low precision and because the primary interest was in whether BMI confounded the observed causal relationship with depression

 

Reason

Typo/error

 

Timing

After results known

 
 

11

Type

Variables

About MVMR: “This will be based on the latest GWAS meta-analysis of major depressive disorder, which identified 223 variants independently associated with depression [72]”

The summary statistics from the GWAS we intended to use were not available; therefore, we used summary statistics from Howard et al. [70] for depression

These summary statistics are similar to those we intended to use and form part of a relatively minor sensitivity analysis, so this change should not affect readers’ interpretation of the results

 

Reason

Plan not possible

 
 

Timing

After data access