Blinded Reader Training for Clinical Trial Imaging

Joseph Pierro and David Raunig

Our past blogs have addressed the underlying principles related to the operationalization of blinded independent central reads (BICR) and expert reader selection. This article provides insights into our thinking with regard to reader training.

ICH E8(R1) guidelines define the purpose of a clinical study as “to generate reliable information to answer key questions and support decision-making while protecting study subjects. The quality of the information generated should, therefore, be sufficient to support good decision-making.” [1] Identifying the factors that are perceived as important to image interpretation, and controlling for those factors, are fundamental to that quality, and both may be handled with reader training.

Reader training provides quality control in clinical trial imaging

It’s important to remember that many of the response assessments or scoring methods used in clinical research studies are not used in routine imaging (e.g. RECIST, Barkhof criteria, RANO, Genant score). The primary purpose of reader training is to standardize readers’ assessments to fulfill the quality objectives of the study.

The Importance of Reader Training

The importance of reader training is defined in the Imaging Study Documents (i.e. Imaging Charter and Reader Performance Manuals) which provide information to the reader on key protocol elements such as:

  • Study eligibility criteria
  • Trial design
  • Study endpoints (efficacy and safety) and assessments

These study documents define the operational BICR process to guide the reader’s performance during the study.

Training components provided to readers may include a review of the principles of ICH-Good Clinical Practice (GCP) guidelines; however, the primary focus of reader training will align around the study imaging endpoints, image scoring, and response criteria assessments. Training will include additional protocol-specific definitions, explanations or modifications to the response assessment/scoring criteria as well as any important updates or modifications that may have been released after the initial response criteria publication. For example, the RECIST Expert Working Group released several update and clarification publications in the years following the initial RECIST 1.1 publication in 2009.

Reader training is also very important when trials include novel interpretation methodologies, whether these are being developed de novo or result from newly approved methods or products (e.g., recent approvals of PET imaging agents for prostate or Alzheimer’s imaging). In these cases, training allows new readers to achieve an acceptable level of performance in terms of diagnostic accuracy and reporting reproducibility.

Reader Training: How and When

Reader training is optimally performed before starting study-specific reads to help readers become familiar with the read platform, display software, and study assessment methods. Typically, the initial training meeting is performed in a group session (assuming there is more than one reader) where representatives of the imaging provider and clinical experts (often the most experienced reader or adjudicator) lead a case review and per-protocol assessment of several representative cases.

When additional cases are available for review, the Imaging Charter may require individual readers (i.e. the entire reader pool) to perform additional “test reads” using a standard set of non-study-related cases. These are often referred to as calibration reading sessions, since all of the readers review the same set of images. The cases are read and scored individually by each reader, and the results are then discussed among the entire group of readers. The overall goal is to ensure readers understand the study objectives, to improve understanding of the response endpoints and assessments, and to discuss the subjective aspects of assessments where the scoring or response endpoints are somewhat ambiguous. Following this calibration read review, the project team will document the discussions and the readers’ agreements on performing assessments, and will also include relevant instructions on handling unique imaging or data issues (e.g. how to read non-contrast CT exams, or a study protocol rule that excludes previously irradiated lesions from target lesion selection). These reader rules are provided to all readers and are used for reads throughout the entire duration of the trial.

Initial reader training focuses on:

  • Standardizing and improving the precision of reader measurements/assessments (ensures signal detection)
  • Maintaining a high level of reader performance and consistency
  • Lowering data variability by periodically monitoring the readers during the study

The Plan for Reader Monitoring

The Imaging Charter defines the reader monitoring plan, which is typically performed at periodic intervals after the readers have completed an initial number of study cases. These assessments may be time-based (such as quarterly or annually), based on enrollment thresholds, or scheduled at specific times agreed with the study sponsor (e.g. a sponsor may want to skew reader monitoring time points toward the early stages of study enrollment to align reader performance and data quality as early as possible).

Reader performance monitoring includes the analysis of study-level reads to provide inter-reader variability metrics when two or more readers are used. Intra-reader variability analysis may be performed by re-reading cases selected from either the study image datasets or a standard set of test images. This same set of cases will be periodically re-read over the life of the trial (provided that an acceptable amount of time has passed between assessments to minimize memory recall) to monitor each reader’s consistency with their own earlier reads (i.e. the intra-reader performance of each reader on the study).
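As an illustration only, the sketch below computes two commonly used agreement measures for categorical reads: simple percent agreement and Cohen's kappa, which corrects for agreement expected by chance. The choice of these two metrics, the function names, and the example response categories are our assumptions for this sketch; the Imaging Charter for a given study defines the metrics actually used. The same functions apply to intra-reader analysis if the two lists hold one reader's original reads and re-reads of the same cases.

```python
from collections import Counter


def percent_agreement(reads_a, reads_b):
    """Fraction of cases where the two sets of reads give the same category."""
    if len(reads_a) != len(reads_b) or not reads_a:
        raise ValueError("need two equal-length, non-empty lists of reads")
    matches = sum(a == b for a, b in zip(reads_a, reads_b))
    return matches / len(reads_a)


def cohens_kappa(reads_a, reads_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(reads_a)
    p_observed = percent_agreement(reads_a, reads_b)
    counts_a = Counter(reads_a)
    counts_b = Counter(reads_b)
    # Chance agreement: product of each reader's marginal category frequencies.
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both readers used one identical category for every case
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical overall response calls from two readers on the same 8 cases.
reader_1 = ["PR", "SD", "PD", "SD", "CR", "PD", "SD", "PR"]
reader_2 = ["PR", "SD", "PD", "PR", "CR", "SD", "SD", "PR"]

agreement = percent_agreement(reader_1, reader_2)  # 6 of 8 cases match
kappa = cohens_kappa(reader_1, reader_2)
print(f"agreement={agreement:.2f}, kappa={kappa:.2f}")
```

Percent agreement is easy to explain but can look high when one category dominates, which is why a chance-corrected measure is often reported alongside it.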

Thoughtfully crafted reader monitoring plans include an assessment of reader discordance (e.g. differences in the discordance percentages between individual readers when using an adjudicator model), a discussion and analysis of the expected and unacceptable rates of discordance, and the types of reader errors that may occur (e.g., incorrect use of the read system or misapplication of the response assessment criteria), together with predefined reader performance actions or decisions to handle these issues.
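A minimal sketch of a discordance summary for a two-reader model with an adjudicator follows. It assumes each case has one result per reader and, when the readers disagree, a record of which reader the adjudicator sided with. The record layout (CaseRead), the field names, and the example data are hypothetical and are not taken from any particular read system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaseRead:
    case_id: str
    reader_1: str
    reader_2: str
    # "reader_1" or "reader_2" when the case was adjudicated, otherwise None.
    adjudicator_pick: Optional[str] = None


def discordance_summary(cases):
    """Overall discordance rate and how often each reader was endorsed."""
    discordant = [c for c in cases if c.reader_1 != c.reader_2]
    picks = [c.adjudicator_pick for c in discordant if c.adjudicator_pick]
    summary = {
        "n_cases": len(cases),
        "n_discordant": len(discordant),
        "discordance_rate": len(discordant) / len(cases) if cases else 0.0,
    }
    for reader in ("reader_1", "reader_2"):
        # Share of adjudicated cases in which the adjudicator chose this reader.
        summary[f"{reader}_endorsed"] = (
            picks.count(reader) / len(picks) if picks else None
        )
    return summary


# Hypothetical reads: two of five cases are discordant and adjudicated.
cases = [
    CaseRead("001", "PR", "PR"),
    CaseRead("002", "SD", "PD", adjudicator_pick="reader_1"),
    CaseRead("003", "PD", "PD"),
    CaseRead("004", "SD", "PR", adjudicator_pick="reader_2"),
    CaseRead("005", "CR", "CR"),
]

summary = discordance_summary(cases)
print(summary)
```

A large gap between the two endorsement shares is the kind of difference between individual readers that a monitoring plan would flag for review.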

Best practice is to establish expected performance thresholds before initiating study reads, drawing on past study results, published reader performance metrics, or experience provided by your imaging partner. The plan should define what falls within and outside of expected study performance parameters and include instructions on reader re-training (individual or group) and reader replacement.
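One way to make predefined actions explicit is to encode the thresholds before reads begin, as in the sketch below. The metric (the share of adjudicated cases in which the adjudicator endorsed the reader), the cut-off values, and the action labels are placeholders chosen for illustration; actual thresholds would come from the sources listed above and be documented in the Imaging Charter.

```python
def performance_action(endorsement_rate, expected_min=0.40, action_min=0.25):
    """Map a reader's adjudicator endorsement rate to a predefined action.

    expected_min and action_min are placeholder thresholds, not recommendations.
    """
    if not 0.0 <= endorsement_rate <= 1.0:
        raise ValueError("endorsement_rate must be between 0 and 1")
    if endorsement_rate >= expected_min:
        return "within expected range"
    if endorsement_rate >= action_min:
        return "re-train"
    return "review for replacement"


# Hypothetical monitoring results for three readers.
for reader, rate in {"R1": 0.55, "R2": 0.31, "R3": 0.18}.items():
    print(reader, performance_action(rate))
```

Fixing the rules in advance keeps decisions about re-training or replacement from being made ad hoc once study data are visible.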


In closing, reader training addresses some of the interpretative imaging challenges by designing quality control into the clinical study. The next installment in this blog series will discuss various statistical approaches that may be appropriate when reporting reader performance.

Joseph Pierro is the Medical Director of Imaging at ERT and David Raunig is the Senior Principal Imaging Statistician at ERT.