Background: High mammographic density is an independent risk factor for breast cancer and carries a higher associated risk than most other known risk factors. Reproducibility remains a major issue in the assessment of breast parenchymal patterns, and misclassification of mammographic pattern can lead to substantial underestimation of risk. The purpose of this study was to assess the inter-rater and intra-rater reliability of visual, subjective mammographic density measurements. Method: Three density measures were investigated: the Wolfe parenchymal pattern, the Boyd classification scale, and the percentage of density in the total breast. The study included 101 women who participated in the International Breast Cancer Intervention Study I (IBIS I) for up to 7 years. Seven sets of mammograms were collected for each woman. Left breast mediolateral oblique films were digitized, and the scanned images were independently reviewed by two readers. One reader reassessed the images after a year. Agreement was evaluated with kappa statistics (Wolfe and Boyd scales) and the intraclass correlation coefficient (percentage density). Results: For inter-rater agreement, the weighted kappa was 0.89 (P < 0.0001) for the Wolfe scale and 0.84 (P < 0.0001) for the Boyd scale, and the intraclass correlation coefficient for percentage density was 0.94. For intra-rater agreement, the weighted kappa was 0.87 (P < 0.0001) for the Wolfe scale and 0.86 (P < 0.0001) for the Boyd scale, and the intraclass correlation coefficient for percentage density was 0.96. Conclusion: Both qualitative and quantitative visual measurements of mammographic density are highly reproducible in breast cancer research studies if appropriate training is provided. The method is appropriate for risk assessment in a prevention trial.
Breast Cancer Research and Treatment Vol. 108, Issue 1, p. 121-127
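For readers unfamiliar with the two agreement statistics named in the abstract, the following is a minimal sketch of how a weighted kappa (for ordinal scales such as Wolfe or Boyd) and an intraclass correlation coefficient (for percentage density) could be computed. It is not the study's analysis code. The abstract does not say which kappa weighting scheme or which ICC form was used, so the linear weights and the one-way random-effects ICC shown here are assumptions, as are the function names and the example ratings, which are invented purely for illustration.

```python
def weighted_kappa(ratings_a, ratings_b, categories, weights="linear"):
    """Cohen's weighted kappa for two sets of ordinal ratings.

    `categories` lists the scale levels in order. The weighting scheme
    ("linear" or "quadratic") is an assumption; the abstract does not
    state which one was used.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")
    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)

    # Observed joint proportions of (rating A, rating B).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    power = 1 if weights == "linear" else 2
    disagreement_obs = 0.0
    disagreement_exp = 0.0
    for i in range(k):
        for j in range(k):
            w = (abs(i - j) / (k - 1)) ** power  # disagreement weight
            disagreement_obs += w * observed[i][j]
            disagreement_exp += w * row[i] * col[j]

    if disagreement_exp == 0.0:
        raise ValueError("kappa is undefined when all ratings are identical")
    return 1.0 - disagreement_obs / disagreement_exp


def icc_oneway(measurements):
    """One-way random-effects ICC(1,1) for continuous ratings.

    `measurements` holds one row per subject and one value per rating.
    The one-way form is an assumption; the abstract does not specify it.
    """
    n = len(measurements)
    k = len(measurements[0])
    if n < 2 or k < 2 or any(len(r) != k for r in measurements):
        raise ValueError("need >= 2 subjects with the same number (>= 2) of ratings")
    grand = sum(sum(r) for r in measurements) / (n * k)
    means = [sum(r) / k for r in measurements]

    # Between-subject and within-subject mean squares.
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for r, m in zip(measurements, means) for x in r
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical example data (not from the study).
wolfe_levels = ["N1", "P1", "P2", "DY"]
reader_1 = ["N1", "P1", "P2", "DY", "P2", "P1"]
reader_2 = ["N1", "P1", "P2", "DY", "DY", "P1"]
percent_density = [[10.0, 12.0], [35.0, 33.0], [60.0, 65.0], [80.0, 78.0]]

print(weighted_kappa(reader_1, reader_2, wolfe_levels))
print(icc_oneway(percent_density))
```

Weighting matters for ordinal scales because a disagreement between adjacent categories counts less than one between distant categories; an unweighted kappa would treat both the same.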