Trial design
We performed a ‘split-manuscript’ RCT with blinded outcome assessment. The methods section of each manuscript was divided into six domains: ‘trial design’, ‘randomization’, ‘blinding’, ‘participants’, ‘interventions’, and ‘outcomes’. Each participant was randomly allocated a different real RCT protocol. Over a 4-hour period, participants had to write the six domains of the methods section of the manuscript for the protocol they received. They had access to the writing aid tool for three randomly selected domains out of the six. Thus, the unit of randomization was the domain, embedded within the manuscript (Fig. 1).
Authorization was obtained from the CNIL (“Commission Nationale de l’Informatique et des Libertés”; file number 1753007), whose remit is to protect participants’ personal data, and from the institutional review board of the INSERM ethics committee (IRB 00003888). The study protocol was registered at ClinicalTrials.gov (http://clinicaltrials.gov NCT02127567).
Informed consent was obtained electronically from all participants. All parts of the trial were conducted in Paris.
Randomization
Sequence generation
The randomization sequence was computer-generated using SAS 9.2. For each participant-manuscript, three domains were allocated to the ‘writing aid tool’ group and the three remaining domains were allocated to the ‘usual writing’ group. Because there are 20 possible combinations of domains (i.e. three of six domains with the tool and three without), randomization was performed with permuted blocks of 20.
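As an illustration only, the allocation scheme described above can be sketched in a few lines of Python. This is not the SAS 9.2 program used in the trial. The function names and the seed are hypothetical, and the sketch assumes that each block of 20 contains each of the 20 possible combinations exactly once, in random order.

```python
import itertools
import random

DOMAINS = [
    "trial design", "randomization", "blinding",
    "participants", "interventions", "outcomes",
]

def all_allocations(domains=DOMAINS, n_tool=3):
    """Return every set of n_tool domains that could receive the tool."""
    return [frozenset(c) for c in itertools.combinations(domains, n_tool)]

def permuted_block_sequence(n_participants, seed=None):
    """Build an allocation list from shuffled blocks of all combinations.

    Assumption: a block holds each combination once, so with six domains
    and three tool domains the block size is 20.
    """
    rng = random.Random(seed)
    allocations = all_allocations()
    sequence = []
    while len(sequence) < n_participants:
        block = allocations[:]      # copy, so the master list stays ordered
        rng.shuffle(block)          # random order within the block
        sequence.extend(block)
    return sequence[:n_participants]

# Hypothetical usage: 40 participants, arbitrary seed for reproducibility.
sequence = permuted_block_sequence(40, seed=2014)
tool_domains = sequence[0]                   # domains written with the tool
usual_domains = set(DOMAINS) - tool_domains  # remaining three domains
```

Under this assumption, every block of 20 consecutive participants is balanced across domains: each domain is written with the tool by exactly 10 of them.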
Implementation
Only the independent statistician and the computer programmer who developed the online writing aid tool and the website had access to the randomization list. The statistician who generated the list (BG) provided the list to the programmer, who uploaded it on the study’s secure website. The list was not available to the researchers who enrolled the participants and were present at the various study sessions (CB, IB).
Allocation concealment
The sequence was concealed by a computer interface.
Blinding
Participants could not be blinded to intervention assignments. However, outcome assessors were blinded to intervention assignments.
Participants
Study participants were master’s or doctoral students in the field of public health and medical research who were based in Paris and affiliated with Paris Descartes University, Pierre and Marie Curie University, Paris Diderot University, or the Mailman School of Public Health of Columbia University in New York.
An e-mail advertisement was sent to students to invite them to participate in a writing session. The advertisement did not mention the study. Before giving their consent, participants attended a small informational session with a PowerPoint presentation describing the writing task. Participants were instructed to complete six sections of a manuscript describing the study protocol they were provided; they were told that they would have assistance for three sections and no assistance for the other three, but not which sections. They were informed that this design had a pedagogical purpose, as they could see how useful the writing aid tool and reporting guidelines were when writing the first draft of a manuscript. They were also told that we would use their results to evaluate the impact of the tool. Before beginning the exercise, participants provided their consent electronically.
Selection of protocols
We retrieved all protocols of RCTs published between January 1, 2013, and March 28, 2014, in the New England Journal of Medicine or the Journal of Clinical Oncology. We chose these journals because they provide access to the protocol for all the RCTs they publish.
One researcher (CB) searched MEDLINE via PubMed (search strategy is reported in Additional file 1) and screened all titles and abstracts retrieved to select all reports of two-arm parallel-group RCTs. All available protocols published in English were retrieved for all identified reports of RCTs. Then, we constituted a sample of protocols reporting various pharmacologic interventions and non-pharmacologic treatments (surgery, implantable devices, rehabilitation, education, etc.; see sample size below).
Experimental interventions
Objective of the tool
The writing aid tool, based on CONSORT, was developed to provide guidance to authors when writing a manuscript of an RCT evaluating a pharmacologic or non-pharmacologic treatment. The tool was individualized according to the type of treatment evaluated (drug, surgery, or participative interventions such as rehabilitation and education).
Content of the tool
The content of the tool was based on the checklist and the Explanation and Elaboration document for CONSORT 2010 [3] and the checklist and the Explanation and Elaboration document of the CONSORT extension for non-pharmacologic treatment [7]. For each domain, the tool comprised the corresponding CONSORT checklist item(s), bullet points with the key elements that need to be reported (extracted from the Explanation and Elaboration documents of CONSORT 2010 and of the non-pharmacologic treatment extension), and one or more examples of good reporting. For the domain dedicated to the intervention, the bullet points and examples of adequate reporting were individualized according to the treatment evaluated (i.e. medication or treatment strategy; surgical procedures or devices; or participative interventions such as rehabilitation, education, behavioral treatment, or psychotherapy). For example, when the experimental treatment was a surgical procedure, the bullet points with the key elements that needed to be reported were specific to surgical procedures (e.g. anesthesia, preoperative care, postoperative care) and the examples of adequate reporting concerned surgical trials.
An example is included in Fig. 2. The entire tool is available at [28] (enter any username) and in Additional file 2.
Format of the tool
For each domain, the online writing aid tool consisted of a single or several large text boxes in which the participants could write the corresponding part of the methods section. Above each text box was a reminder of the information that should be reported. This reminder consisted of the related CONSORT item followed by the statement “Please describe” and bullet points with all the information that needed to be reported.
Depending on the domain and the complexity of the CONSORT item, the tool could contain one or several text boxes. For example, two boxes were dedicated to the domain trial design: one to describe the trial design and one to report important changes to methods after trial commencement, with reasons.
Control intervention
For each domain, the control intervention consisted of a large text box in which the participant could write this part of the methods section. Participants were not shown the corresponding CONSORT checklist item.
Writing session
Interventions were administered in the context of a practice writing session. Participants were asked to write, over a 4-hour period, six sections of a manuscript based on a protocol describing an RCT. Each participant was provided with a protocol randomly selected from our sample of protocols, in both electronic and paper copies. Two study monitors (CB and IB) supervised these writing sessions after providing a brief explanation of the task to be performed. Participants were aware that they would have access to the writing aid tool for some of the sections. They were not allowed to use any other materials. Participants were told that all data would be anonymous and confidential.
For both the experimental and control interventions, participants were instructed to indicate any important or necessary information they would have wished to report that was not available in the provided study protocol (Fig. 2).
Outcomes
Primary outcome
The primary outcome was the mean global score for completeness of reporting (scale 0–10) for all domains written with or without the writing aid tool.
For each domain, within each protocol selected, we pre-specified a series of keywords that should be reported. For example, in a study using 1:1 randomization with a computer-generated randomization list with blocks of four and stratification on the study site, the following keywords were pre-specified: ‘Computer generated’, ‘blocks of 4’, ‘1:1’, and ‘stratification on site’. We also pre-specified a weight for each keyword.
For each protocol, completeness of reporting was determined by the presence or absence of the pre-specified keywords and their respective pre-specified weights. If the information was not available in the protocol but was described by the writer as missing, it was rated as completely reported.
Because the number of keywords varied with the domain, the type of treatment evaluated, and the context of the protocol, we standardized the scores for each domain on a scale of 0–10. An example of the scoring system for completeness of reporting is in Additional file 3. We thus obtained six scores for completeness of reporting for each participant-protocol pair: three associated with domains written with the writing tool and three with domains written without the tool. These scores were the unit of analysis (cf. Statistical methods section), and the statistical analysis allowed for estimating mean scores for completeness of reporting with and without the writing tool.
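To make the scoring principle concrete, the following minimal Python sketch computes a standardized domain score from weighted keywords. It is an assumption-laden illustration, not the scoring system of Additional file 3: the linear rescaling (credited weight divided by total weight, times 10), the function name, and the weights in the example are all hypothetical.

```python
def domain_score(keyword_weights, reported, declared_missing=frozenset()):
    """Standardized completeness score (0-10) for one domain.

    keyword_weights:  dict mapping each pre-specified keyword to its weight.
    reported:         keywords the assessors found in the participant's text.
    declared_missing: keywords whose information was absent from the protocol
                      but flagged as missing by the writer; these are credited
                      as completely reported.
    Assumption: scores are rescaled linearly by the total weight.
    """
    total = sum(keyword_weights.values())
    if total <= 0:
        raise ValueError("total keyword weight must be positive")
    credited = sum(
        weight
        for keyword, weight in keyword_weights.items()
        if keyword in reported or keyword in declared_missing
    )
    return 10.0 * credited / total

# Randomization example from the text, with hypothetical weights.
weights = {
    "computer generated": 2,
    "blocks of 4": 1,
    "1:1": 1,
    "stratification on site": 1,
}
score = domain_score(weights, reported={"computer generated", "1:1"})
# Credited weight is 3 out of 5, so the score is 6.0 on the 0-10 scale.
```

Dividing by the total weight is what makes scores comparable between domains with different numbers of keywords, which is the stated purpose of the standardization.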
Two independent researchers blinded to intervention assignment and to the writer’s identity assessed the presence of these keywords for all protocols by domain. To maintain blinding, outcome assessors measured the outcome for each of the six domains separately, with all texts presented in random order and in the same format (same font and text size). After the researchers had assessed all the domains, they met to resolve any disagreements by consensus.
Secondary outcomes
Secondary outcomes were (1) the scores for completeness of reporting for each individual domain of the participants’ reports (trial design, randomization, blinding, participants, interventions, and outcomes) and (2) the mean score for completeness of reporting of pre-specified essential elements of each domain (Additional file 4).
Ancillary study
We aimed to compare the mean global scores for completeness of reporting (scale 0–10) for all domains of the manuscripts written by the participants, with and without the tool, with those of the methods sections of the published articles. For this purpose, we retrieved all the published reports corresponding to the selected protocols and their related appendices. The same two outcome assessors were asked to read the methods section of the published articles as well as all appendices referenced in the article and to evaluate the presence or absence of the same pre-determined keywords. They were not blinded to the journal or authors’ names.
Sample size calculation
The sample size was calculated using the same method as for a cluster randomized cross-over trial [29]. We assumed a mean score of 4 (0–10 scale) for the domains written without the tool (i.e. control group) and considered a standard deviation (SD) of 4 (a rather conservative assumption). Our hypothesis was that the mean score would be 6 for the domains written with the tool (i.e. experimental group). We specified an intraclass correlation coefficient of 0.8 (i.e. the correlation between scores of reports of two domains with the same interventional assignment, written by the same student). Such a conservative value was motivated by the nature of the design: two domains within a cluster are actually two domains completed by the same participant. We hypothesized that the interclass correlation coefficient (i.e. the correlation between scores of two domains with different interventional assignments, written by the same student) was half the intraclass correlation (0.4), and we considered a two-sided 5% Type I error and a nominal power of 90%. The inflation factor was then 1.4, the required number of observations (i.e. domains) per group 120, and the required number of participants 40.
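The reported figures can be checked with a short calculation. The Python sketch below is a reconstruction under stated assumptions, not the authors' computation. It assumes a normal-approximation sample size for comparing two means, and an inflation factor of the form 1 + (m − 1)ρ − mη used for cluster randomized cross-over designs, where m = 3 is the number of domains per participant in each group, ρ = 0.8 is the correlation between domains with the same assignment, and η = 0.4 is the second correlation given above (called the interperiod correlation here). The function and variable names are hypothetical.

```python
import math
from statistics import NormalDist

def inflation_factor(m, icc, interperiod_corr):
    """Variance inflation for a cluster randomized cross-over design.

    Assumed form: 1 + (m - 1) * icc - m * interperiod_corr.
    """
    return 1 + (m - 1) * icc - m * interperiod_corr

def n_per_group_individual(delta, sd, alpha=0.05, power=0.90):
    """Observations per group for two independent means (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided Type I error
    z_beta = z.inv_cdf(power)
    return 2 * ((z_alpha + z_beta) * sd / delta) ** 2

m = 3                                             # domains per participant per group
factor = inflation_factor(m, icc=0.8, interperiod_corr=0.4)
n_unadjusted = n_per_group_individual(delta=6 - 4, sd=4)
participants = math.ceil(n_unadjusted * factor / m)
domains_per_group = participants * m
```

Under these assumptions the factor is 1.4, the unadjusted requirement is about 84 observations per group, and rounding up to whole participants gives 40 participants and 120 domains per group, in agreement with the values reported above.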
Statistical methods
Descriptive statistics were reported as number and percentage for categorical variables and median and interquartile range (IQR) for quantitative variables. The statistical unit of analysis was the section, which was embedded in the participant-protocol pair (because each participant had a different protocol, there is no distinction between participants and protocols). Therefore, we had six observations for each participant: three in the experimental group, in which the writing aid tool was used, and three in the control group. For the main analysis, sections were considered exchangeable (i.e. the intervention effect was assumed to be the same whatever the domain of the section). This data structure is the same as the classical data structure encountered in split-mouth designs or cluster randomized cross-over trials. Therefore, data were analyzed using a mixed model, which included a fixed intervention effect, a random participant effect, and a random participant-group effect [30].
Furthermore, because the assumption of an intervention effect common to all six domains was strong, we completed the primary analysis with a series of six independent substudies (i.e. one for each domain). For each of these substudies, we had only one statistical unit associated with each participant-protocol pair, which implies independence between the statistical units. Therefore, we performed classical Student t-tests. To evaluate the robustness of our results, we performed a sensitivity analysis with simulations of different possible weighting systems (Additional file 5).
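For a single domain, each participant contributes one score, written either with or without the tool, so the comparison reduces to a two-sample Student t-test. The sketch below shows the pooled-variance statistic in plain Python. The function name and the scores are hypothetical and serve only to illustrate the calculation; the p-value, which requires the t distribution with the returned degrees of freedom, is not computed here.

```python
import math
from statistics import mean, variance

def student_t(tool_scores, control_scores):
    """Pooled-variance two-sample t statistic and its degrees of freedom.

    Assumes independent observations and equal variances in both groups.
    Raises ZeroDivisionError if every score is identical (zero variance).
    """
    n1, n2 = len(tool_scores), len(control_scores)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two scores")
    df = n1 + n2 - 2
    pooled_var = (
        (n1 - 1) * variance(tool_scores) + (n2 - 1) * variance(control_scores)
    ) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(tool_scores) - mean(control_scores)) / standard_error, df

# Hypothetical scores (0-10 scale) for one domain, not trial data.
t_stat, df = student_t([6, 7, 8], [4, 5, 6])
```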
For the ancillary study, we also considered the domain’s score for completeness of reporting as the unit of analysis. For each domain, within each protocol, we had one score for the participant of the present study and another for the authors of the published report. These paired data were split by whether the participant used the writing tool or not. Differences in paired scores were then analyzed in the framework of mixed models with no other fixed effect than an intercept and with the protocol as a random effect.