Abstract
Background/Aim: To investigate whether quantitative analysis of diffusion weighted images allows for improved risk stratification of transition zone lesions in prostate magnetic resonance imaging (MRI) evaluated according to PI-RADSv2.1 [Prostate Imaging Reporting and Data System, target variable: clinically significant prostate cancer (csPCa)]. Patients and Methods: Consecutive patients with transition zone lesions in 3T prostate MRI were enrolled in the study. All lesions on MRI were histopathologically verified by transperineal MRI-TRUS fusion biopsy. Two blinded radiologists re-evaluated all lesions according to PI-RADSv2.1. A consensus reading was performed after reading of all cases. Additionally, mean apparent diffusion coefficient values (mADC) were derived from blinded lesion segmentation. ROC analysis was performed for PI-RADS categories and PI-RADS categories with separate subcategories and diffusion coefficient values (ADC). Data were examined for optimal mADC cut-off values that improve stratification of csPCa and benign lesions. Results: Among 85 patients (mean age=66.2 years), 98 transition zone lesions were detected. Biopsy confirmed csPCa in 24/98 cases. Area under the curve (AUC) was 0.89/0.90 for reader 1, 0.92/0.91 for reader 2 and 0.92/0.91 for the consensus reading (5 category analysis/analysis with subcategories separately). Inter-reader agreement was substantial, with lower PI-RADS categories assigned by the more experienced reader (p<0.05). AUC for mADC alone was 0.81. When a cut-off threshold of 950 μm2/s mADC is used to downgrade PI-RADS 3 lesions to PI-RADS 2, biopsy could be avoided in all benign PI-RADS 3 cases. Conclusion: Quantitative analysis of diffusion weighted images may help avoid unnecessary biopsies of transition zone PI-RADS 3 lesions.
Multiparametric magnetic resonance imaging of the prostate (mpMRI) has been established as the de facto standard for imaging assessment of possible prostate cancer (PCa). It is implemented in European and American guidelines for diagnosis in biopsy-naive patients, patients with prior negative biopsy and for PCa patients before inclusion into active surveillance (1, 2). Prostate lesions are classified into five categories according to the latest Prostate Imaging Reporting and Data System lexicon (PI-RADS), version 2.1 (3). Higher categories indicate a higher probability of clinically significant cancer. The cancer detection rates (CDR) for each category do not differ significantly between the peripheral zone (PZ) and the transition zone (TZ), and are very low (<5%) for clinically significant prostate cancer (csPCa; International Society of Urological Pathology grading of prostate cancer [ISUP] >1) in PI-RADS 1 and PI-RADS 2 categories (4, 5). However, inter-reader agreement regarding PI-RADS assessment categories is lower for TZ lesions than for PZ lesions, especially for low PI-RADS categories (4, 6-8). Kappa values range between 0.42 and 0.70 for overall lesion classification (9). This is, at least partly, due to the appearance of the enlarged prostate (with the enlargement originating in the TZ), which leads to a pattern referred to as organized chaos (10). This often creates a dilemma for referring physicians, who must decide whether to perform a biopsy when low PI-RADS categories (especially PI-RADS 3) are assigned to TZ lesions. Thus, further parameters that allow unnecessary biopsies to be avoided without the risk of overlooking clinically significant prostate cancer are desirable.
Recent studies have shown that assessing mpMRI quantitatively using radiomic features and mean ADC (apparent diffusion coefficient) values can improve diagnostic accuracy regarding TZ lesions compared to qualitative PI-RADS reading alone (11-13). For example, improving specificity and preserving sensitivity is feasible by downgrading PI-RADS lesions ≥4 based on mean ADC values or machine learning algorithms (13).
Thus, the aim of this study was to determine whether transition zone lesions can be downgraded depending on their mean ADC values without increasing false-negative results, thereby avoiding unnecessary biopsies.
Patients and Methods
Study population. This retrospective single-center cohort study was approved by the Institutional Review Board without the requirement for written informed consent (application number 20-1256). The analysis included all patients examined between April 2017 and May 2020 with multiparametric prostate MRI at the Department of Diagnostic and Interventional Radiology, Medical Center - University of Freiburg and subsequent MRI-TRUS fusion biopsy (target and volume adapted systematic biopsy, overall, 343 patients with 508 lesions). All prostate MRI examinations were acquired because of suspicious PSA values or abnormal findings at digital rectal examination. First, all patients without known PCa with target lesions located in the transition zone were identified (86 patients with 100 lesions). One patient with two lesions was excluded because the MRI examination was performed with a 1.5 T MRI scanner. The study population, thus, comprised 85 patients with 98 lesions. For each patient PSA serum level was recorded before the MRI examination.
Magnetic resonance imaging. All mpMRI examinations were acquired with the same 3T scanner (MAGNETOM VIDA, Siemens Healthineers, Erlangen, Germany) according to the PI-RADSv2.1 acquisition protocol with the following parameters: T2 weighted images had 3 mm slice thickness and no gap (2D TSE, TR: 7500 ms, TE: 104 ms, FA: 160°). The field-of-view was 200 × 200 mm (768×768 matrix). The in-plane resolution was 0.26 × 0.26 mm. A diffusion weighted sequence was acquired with 3 mm slice thickness and no gap, TR 4300 ms, TE 74 ms, flip angle 90° and number of averages: 1/2/5/9 at b-values of 0/100/400/1,000 s/mm2. B-value images of 1,400 s/mm2 were calculated from acquired lower b-values. The field-of-view was 200×200 mm (228×228 matrix). Axial dynamic contrast-enhanced (DCE) 3D gradient echo volume interpolated sequences were acquired with TR 4.09 ms, TE 1.83 ms, flip angle 12°, field-of-view 260×260 mm, matrix 224×224, slice thickness 3 mm and temporal resolution <5 s. Endorectal coils were not employed. Butylscopolamine was injected intravenously prior to the scan with dose adjustment considering body weight. Acquired images were then prepared for MRI-TRUS fusion biopsies. Both the prostate gland and scored lesions were segmented manually by a trained radiologist in axial T2 weighted slices.
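Calculated high b-value images such as the b=1,400 s/mm2 images mentioned above are commonly derived from a mono-exponential signal decay model, S(b) = S0 · exp(−b · ADC). The following Python sketch illustrates that idea for a single voxel. It is an assumption that this model applies here, since the text does not state how the scanner software computes these images; the function names and signal values are hypothetical.

```python
import math

def adc_from_two_b_values(s_low, s_high, b_low, b_high):
    """Estimate ADC (mm^2/s) from two signals, assuming S(b) = S0 * exp(-b * ADC)."""
    return math.log(s_low / s_high) / (b_high - b_low)

def extrapolate_signal(s_ref, b_ref, b_target, adc):
    """Extrapolate the signal at b_target from a reference signal at b_ref."""
    return s_ref * math.exp(-(b_target - b_ref) * adc)

# Hypothetical voxel signals (arbitrary units), not taken from the study data.
s_b100 = 1000.0
adc_true = 0.001  # mm^2/s, i.e. 1,000 um^2/s
s_b1000 = s_b100 * math.exp(-(1000 - 100) * adc_true)

adc = adc_from_two_b_values(s_b100, s_b1000, 100, 1000)
s_b1400 = extrapolate_signal(s_b1000, 1000, 1400, adc)

# Convert mm^2/s to um^2/s (1 mm^2 = 1e6 um^2) to match the units used in the text.
adc_um2_per_s = adc * 1e6
```

With more than two acquired b-values, a log-linear least-squares fit over all b-values would typically replace the two-point estimate shown here.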
Transperineal MRI-TRUS fusion biopsy. The biopsy technique was MRI-guided TRUS fusion transperineal biopsy performed with the MonaLisa system (Biobot Surgical Pte Ltd, Singapore). The procedure was carried out by a trained urologist in a sterile environment in the operating room. For the duration of the procedure, patients remained under laryngeal mask anesthesia. A three-dimensional model of the prostate gland was generated with the help of an endorectal ultrasound probe. The ultrasound model was then fused with the MR volume derived by manual segmentation of the axial T2 weighted images. Lesions previously defined and scored by the radiologist were transferred to the in-vivo ultrasound model. Deformations of the gland were corrected by the software’s algorithm. Transperineal access for biopsy was established by two perineal incisions. The software automatically provided biopsy angle and biopsy depth. An automated robotic arm was guided by imaging data and moved into position for biopsy. A multi-use biopsy device (Uromed, Germany, REF6020) and trocar-like needles (Uromed, REF 6025.10) were triggered manually. First, fusion biopsies of the target lesions were performed; then, saturated systematic biopsies of the whole gland were taken according to the volume adapted Ginsburg study scheme (14). The mean number of all target lesions was 2.0 (SD=0.8, range=1-4), and the mean number of target lesions in the TZ was 1.2 (SD=0.4, range=1-3). The mean number of cores was 35.5 including saturated systematic biopsy and target lesion biopsy (SD=6.1, range=23-54). The mean number of biopsy cores of target lesions in the TZ was 3.1 (SD=1.2, range=1-7).
Histopathological work up. After biopsy, all tissue specimens were immediately transferred into buffered formalin for fixation. Formalin fixation was conducted for at least 12 hours. Next, all tissue specimens were paraffin embedded and stained with hematoxylin and eosin. All slides were examined under the microscope by at least one board certified pathologist. In case of tumor diagnosis, a second board certified pathologist was consulted for the confirmation of diagnosis. All biopsies were evaluated according to national guidelines for prostate cancer with regard to tumor cell content and tumor cell grading (according to Gleason). Additional immunohistochemical stainings were performed in cases of histologically inconclusive invasive growth to confirm the diagnosis of invasive cancer. For this purpose, the relevant sections were stained for cytokeratin 5/6, p504s and p63. All steps described above were carried out according to routine protocols.
PI-RADS assessment. Target lesions were marked in T2 weighted images in our Institution’s PACS (Impax, Agfa HealthCare, Agfa-Gevaert Group, Mortsel, Belgium) by a radiologist with knowledge of the manual segmentation for the MRI-TRUS fusion biopsy, the pathological report, and the clinical information (H.E.). This radiologist was not involved in the subsequent blinded reading. Then, each lesion was reviewed independently and retrospectively by two radiologists with different experience levels in prostate MRI interpretation (reader 1, B.O., radiology resident and reader 2, M.B., board-certified radiologist, with 3 and 7 years of experience in reading prostate MRI, respectively). Before the blinded evaluation, the readers formally reviewed the PI-RADSv2.1 scoring guidelines. Patients were pseudonymized and readers were blinded to the pathologic results and all clinical information. Each lesion was scored in T2 weighted images, high b-value images, and ADC maps, e.g., 2+1 for atypical transition zone lesions that were upgraded because of a marked diffusion restriction, differing from the background signal. After the initial reading session, a consensus-reading session with reader 1 and reader 2 was performed, during which the lesions with different scores were re-evaluated and given a final score. Readers agreed in 54 scorings. Differences in 44 lesions were resolved.
Lesion segmentation. For lesion segmentation, T2 weighted images and ADC maps of all patients were imported in pseudonymized form into a web-based framework for medical image analysis developed in-house at our medical physics department (www.nora-imaging.com). All target lesions were marked by a radiologist with knowledge of the manual segmentation for the MRI-TRUS fusion biopsy, the pathological report, and the clinical information (H.E.). Then, all target lesions were manually segmented in T2 weighted images and ADC maps by an experienced radiologist, who was blinded to the pathological report and all clinical information.
Data analysis and statistics. For estimation of the inter-observer agreement on PI-RADSv2.1 assessment category assignment, weighted (squared) kappa statistics were calculated. Interpretation of kappa values was based on the work of Landis and Koch (15). Systematic deviation in lesion scoring between reader 1 and reader 2 was evaluated by the sign test (16). To analyze the diagnostic performance of the PI-RADSv2.1 assessment categories, receiver-operating-characteristics (ROC) analyses of a) reader 1, b) reader 2 and c) consensus reading were performed. Moreover, an analysis with upgraded lesion scores as separate classifications (e.g., 2+1 vs. 3+0) was carried out. Resulting areas under the curve (AUC) were compared using the DeLong method (17). Statistical analyses were implemented in R version 4.1.1 (18). A p-value of 0.05 or less was considered statistically significant. Data was analyzed to derive a mean ADC (mADC) cut-off to improve the stratification of lesions into benign and clinically significant prostate cancer (csPCa).
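To make the agreement statistic concrete, the following Python sketch computes Cohen's kappa with squared (quadratic) weights for two readers. It is a minimal illustration only: the study itself used R, the function name is hypothetical, and the ratings shown are invented examples rather than study data.

```python
def quadratic_weighted_kappa(ratings_a, ratings_b, categories):
    """Cohen's kappa with squared disagreement weights for two raters.

    `categories` lists the ordered rating levels; ratings must be drawn from it.
    At least two different levels must occur, otherwise the expected
    disagreement is zero and kappa is undefined.
    """
    index = {c: i for i, c in enumerate(categories)}
    k = len(categories)
    n = len(ratings_a)

    # Observed contingency table of rater A (rows) versus rater B (columns).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1

    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2  # squared distance between levels
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row_totals[i] * col_totals[j] / n

    return 1.0 - weighted_observed / weighted_expected

# Hypothetical PI-RADS categories assigned by two readers to six lesions.
reader_1 = [1, 2, 3, 3, 4, 5]
reader_2 = [1, 2, 2, 3, 4, 5]
kappa = quadratic_weighted_kappa(reader_1, reader_2, categories=[1, 2, 3, 4, 5])
```

Squared weights penalize a disagreement of two categories four times as heavily as a disagreement of one category, which suits an ordinal scale such as PI-RADS.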
External validation of proposed mADC-cutoff values/algorithms. Zhang et al. proposed a downgrading algorithm for PI-RADS ≥4 lesions with an mADC cut-off at a threshold value of 732 μm2/s (13). By applying this downgrading algorithm, they increased specificity and accuracy without sacrificing sensitivity for detection of csPCa (ISUP >1) in their patient cohort. We applied this threshold to our population for external validation of the findings of Zhang et al.
Results
Study population demographics. The analysis included 85 patients with 98 transition zone lesions. Mean patient age was 66.2 years (SD=7.7 years) and mean PSA level was 12.2 ng/ml (SD=8.8 ng/ml). Mean PSA level of csPCa (ISUP >1) lesions was 14.0 ng/ml (SD=9.5 ng/ml), and mean PSA level of lesions without csPCa was 12.4 ng/ml (SD=9.5 ng/ml). Mean prostate volume was 76 ml (SD=40 ml) and mean TZ lesion volume was 1.2 ml (SD=1.6 ml). The number of TZ lesions with csPCa was 24, the number of TZ lesions with clinically non-significant cancer (ISUP grade 1) was 9, and the number of benign TZ lesions was 65. The number of patients with csPCa overall (both TZ and PZ lesions and systematic biopsy) was 34.
Inter-reader reliability. Weighted (squared) kappa statistics showed substantial inter-reader agreement with a systematic tendency of reader 2 (more experienced) to assign lower PI-RADS categories compared to reader 1 (sign test, p<0.0001). Scores mostly differed in lower categories (≤3+0). Higher categories (≥3+1) showed excellent agreement. The readers’ distribution of scores was as follows (reader 1 vs. reader 2): PI-RADS 1: 3% vs. 11%, 2+0: 14% vs. 27%, 2+1: 15% vs. 7%, 3+0: 24% vs. 12%, 3+1: 8% vs. 4%, 4+0: 10% vs. 14%, 5: 24% vs. 23%. Cohen’s weighted squared Kappa for the overall reading was 0.75 for 5 PI-RADS categories (p<0.001) and 0.77 for analysis with 7 scores differentiating between upgraded and non-upgraded lesions within the same PI-RADS category (p<0.001).
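The sign test used for the systematic difference between readers considers only the lesions on which the two readers disagree and asks whether one direction of disagreement occurs more often than expected by chance. The following Python sketch shows an exact two-sided version; the function name and the counts are hypothetical, since the direction of the individual disagreements is not reported in the text.

```python
from math import comb

def sign_test_p_value(n_reader1_higher, n_reader2_higher):
    """Exact two-sided sign test on discordant pairs (ties are excluded beforehand)."""
    n = n_reader1_higher + n_reader2_higher
    k = min(n_reader1_higher, n_reader2_higher)
    # Probability of k or fewer "successes" under Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)

# Hypothetical split of discordant lesions, not taken from the study data.
p_value = sign_test_p_value(n_reader1_higher=30, n_reader2_higher=5)
```

A small p-value indicates that one reader systematically assigns higher categories than the other, independent of how large the individual differences are.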
ROC analyses. ROC analyses regarding PI-RADS scores 1-5 and csPCa as outcome showed AUC values of 0.89 for reader 1, 0.92 for reader 2, and 0.92 for the consensus reading. Analyses of upgraded lesions as separate classifications with csPCa as outcome also showed that AUC values differed only slightly between reader 1 (0.90), reader 2 (0.91), and the consensus reading (0.91). The AUC for mADC values alone was 0.81. All ROC curves are shown in Figure 1. There was no significant difference between the ROC curves of reader 1 and reader 2 (p=0.23 for 5 categories and p=0.56 for 7 categories). The ROC curve of the consensus reading with 5 categories was significantly different from the ROC curve of the mADC values (p=0.04); for 7 categories, the p-value was 0.06. The distributions of PI-RADS categories, outcomes and mADC for reader 1, reader 2 and the consensus reading are provided in Table I.
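For an ordinal score such as the PI-RADS category, the AUC can be read as the probability that a randomly chosen csPCa lesion receives a higher score than a randomly chosen lesion without csPCa, with ties counted as one half. The following Python sketch computes the AUC in this way. It is an illustration of the concept, not the R implementation used in the study; the function name and scores are hypothetical.

```python
def auc_from_scores(scores_positive, scores_negative):
    """AUC as the Mann-Whitney probability that a positive case outranks a negative one."""
    wins = 0.0
    for p in scores_positive:
        for q in scores_negative:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties contribute one half
    return wins / (len(scores_positive) * len(scores_negative))

# Hypothetical PI-RADS categories, not taken from the study data.
cspca_scores = [5, 5, 4, 3]
other_scores = [1, 2, 2, 3, 4]
auc_pirads = auc_from_scores(cspca_scores, other_scores)

# For mADC, lower values point towards cancer, so the values are negated
# before ranking (hypothetical values in um^2/s).
cspca_madc = [700.0, 820.0]
other_madc = [1100.0, 990.0, 800.0]
auc_madc = auc_from_scores([-v for v in cspca_madc], [-v for v in other_madc])
```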
Figure 1. Receiver operating characteristic (ROC) curves with area under the curve (AUC) values for standard Prostate Imaging Reporting and Data System version 2.1 (PI-RADSv2.1) categories for the detection of clinically significant prostate cancer. PI-RADSv2.1 categories: 5 categories. Subcategories 2+1 and 3+1 analyzed separately: 7 categories.
Table I. Distribution of Prostate Imaging Reporting and Data System (PI-RADS) categories for reader 1, reader 2 and the consensus reading with outcome [benign, International Society of Urological Pathology grading of prostate cancer (ISUP) 1, ISUP >1] and associated mean apparent diffusion coefficient (mADC) values in μm2/s.
Results when applying a downgrading algorithm to PI-RADS 3 lesions. Applying our proposed downgrading algorithm, in which all PI-RADS 3 lesions with mADC values >950 μm2/s are downgraded to PI-RADS 2, resulted in 22 benign lesions being downgraded to PI-RADS 2 and 3 PCa lesions remaining in PI-RADS category 3 (1 ISUP >1 and 2 ISUP=1 lesions). mADC values, PI-RADS categories and cut-off values are shown in Figure 2. Figure 3 provides imaging examples of the application of the proposed mADC cut-off.
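The downgrading rule described above can be written as a short function. The following Python sketch is one possible formulation; the function and parameter names are hypothetical, and the two example lesions use the mADC values given for the illustrative cases in the figure captions.

```python
def apply_madc_downgrade(pirads_category, madc_um2_per_s, cutoff_um2_per_s=950.0):
    """Downgrade a PI-RADS 3 transition zone lesion to PI-RADS 2 if its mADC exceeds the cut-off.

    All other categories are returned unchanged. The comparison is strict
    (mADC > cut-off), as stated in the text.
    """
    if pirads_category == 3 and madc_um2_per_s > cutoff_um2_per_s:
        return 2
    return pirads_category

# PI-RADS 3 lesion with mADC 1,109 um^2/s (benign on biopsy): downgraded.
benign_example = apply_madc_downgrade(3, 1109.0)

# PI-RADS 3 (2+1) lesion with mADC 819 um^2/s (ISUP 2 on biopsy): stays in category 3.
cancer_example = apply_madc_downgrade(3, 819.0)
```

Any cut-off used with such a rule would need validation on the scanner and protocol in question, since ADC values are known to vary between scanners.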
Figure 2. Mean apparent diffusion coefficient (mADC) values of target lesions depending on the Prostate Imaging Reporting and Data System (PI-RADS) category. The color of the lesions indicates the International Society of Urological Pathology grading of prostate cancer (ISUP) classification; green: benign (ISUP 0), blue: not clinically significant prostate cancer (ncsPCa) (ISUP 1), red: clinically significant prostate cancer (csPCa) (ISUP >1). The green box includes the benign PI-RADS 3 lesions that are downgraded if our proposed mADC cut-off value of >950 μm2/s is applied. The grey box contains all PI-RADS 4 lesions that would be downgraded if the downgrading algorithm by Zhang et al. with an mADC cut-off value of >732 μm2/s is applied.
Figure 3. A 64-year-old patient with prostate-specific antigen (PSA) 5.2 ng/ml. (a) Axial T2 weighted image (T2w); (b) calculated b1400; (c) apparent diffusion coefficient (ADC). T2 weighted image shows a Prostate Imaging Reporting and Data System (PI-RADS) 3 lesion with a mean ADC value of 1,109 μm2/s (arrows). Biopsy yielded no prostate cancer. If our proposed cut-off value of 950 μm2/s were applied, this lesion could be downgraded to PI-RADS 2.
External validation of proposed mADC cutoff values. If the downgrading algorithm for PI-RADS ≥4 lesions with an mADC cutoff value of ≥732 μm2/s proposed by Zhang et al. were applied to our study population, 37 PI-RADS ≥4 lesions would be downgraded, of which 21 lesions (56.8%) were csPCa and thus would be falsely classified as negative. In our study population only 3 lesions with a PI-RADS score ≥4 had mADC values <732 μm2/s: 2 csPCa and 1 ncsPCa (Figure 4). Applying the proposed downgrading algorithm to our study population would result in an extremely low sensitivity of 0.083 and a specificity of 0.986.
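The reported sensitivity and specificity can be reproduced from the lesion counts given in the text. The following Python sketch does so under one assumption that is not stated explicitly in the source: a lesion counts as test-positive only if it remains in PI-RADS category ≥4 after downgrading. The function name is hypothetical.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    return tp / (tp + fn), tn / (tn + fp)

total_lesions = 98   # transition zone lesions in the study population
cspca_lesions = 24   # lesions with csPCa (ISUP >1)

# Lesions remaining in PI-RADS >=4 after downgrading: 2 csPCa and 1 ncsPCa.
true_positives = 2
false_positives = 1

false_negatives = cspca_lesions - true_positives                    # 22
true_negatives = (total_lesions - cspca_lesions) - false_positives  # 73

sens, spec = sensitivity_specificity(
    true_positives, false_negatives, false_positives, true_negatives
)
print(round(sens, 3), round(spec, 3))  # 0.083 0.986
```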
Figure 4. An 80-year-old patient with prostate-specific antigen (PSA) 14 ng/ml. (a) Axial T2 weighted image (T2w); (b) coronal T2w; (c) calculated b1400; (d) apparent diffusion coefficient (ADC). T2 weighted images show a Prostate Imaging Reporting and Data System (PI-RADS) 3 (2+1) lesion with a mean ADC value of 819 μm2/s (arrows). Biopsy revealed clinically significant prostate cancer (ISUP 2).
Discussion
In this study, the diagnostic accuracy of the PI-RADSv2.1 lexicon for TZ lesions was high, with AUC values of 0.92 both for the consensus results and the experienced reader (reader 2). Furthermore, the diagnostic accuracy did not differ greatly between the more experienced and the less experienced reader, with an AUC value of 0.89 for reader 1. This indicates that the PI-RADSv2.1 lexicon is a helpful tool also for less experienced readers. Inter-reader agreement in this study seemed to be slightly higher than previously reported, with Cohen’s weighted squared Kappa of 0.75 for PI-RADSv2.1 TZ lesions compared to the range of 0.42-0.70 for all PI-RADSv2.1 lesions in the meta-analysis of Lee et al. (9).
Compared to the peripheral zone, the evaluation of the transition zone of the prostate is more challenging. This is because benign prostatic hyperplasia is almost always present to a certain degree, resulting in the so-called organized chaos of the transition zone. In addition to easily recognizable hyperplastic glandular nodules (T2w hyperintense and/or encapsulated), there is also hyperplastic stromal tissue, some of which cannot be differentiated from malignant changes (T2w hypointense and/or restricted diffusion) (10). These difficulties result in lower inter-reader agreement in the transition zone compared to the peripheral zone for lower PI-RADS categories (19). Bhayana et al. described that especially the ADC signal descriptor “markedly hypointense” had only a low inter-reader agreement (K=0.26) (19). The cancer detection rates (CDR) for transition zone lesions with PI-RADS categories <3 are very low, ranging from 0.00 to 0.11 (5). Consequently, in clinical routine a biopsy is usually omitted for these lesions. However, PI-RADS 3 lesions are often a diagnostic challenge. CDR for PI-RADS 3 lesions (2+1/3+0) in the transition zone range from 0.09 to 0.33 (5). In addition, the meta-analysis by Lee et al. suggests that the specificity for TZ lesions has decreased in PI-RADSv2.1 compared to v2.0 (9). Overall, it appears desirable to establish further criteria for TZ lesions to avoid unnecessary biopsies.
It has been shown that the diagnostic accuracy of prostate MRI examinations can be improved by considering PSA density (PSAD), quantitative image criteria such as mean ADC values and radiomic features, as well as by using new techniques like MR fingerprinting and machine learning (11, 13, 20-24). Furthermore, 68Ga-PSMA PET/CT also showed good accuracy in the primary diagnosis of csPCa and could be used additionally in case of uncertainty whether a biopsy should be performed or not (25). A combination of PI-RADSv2.1 categories with PSAD values can avoid unnecessary biopsies, although various cut-off values for PSAD are currently under discussion. Wang et al. found that CDR of PI-RADS categories ≤2 are very low, independent of PSAD (4). Therefore, it is particularly important to distinguish between PI-RADS 2 and PI-RADS 3 lesions, and further differentiation using another independent criterion such as mADC may improve diagnostic accuracy. We showed that a cutoff mADC value of >950 μm2/s (values above the threshold indicate benign lesions) could result in downgrading of all benign PI-RADS 3 lesions (22/25), which would avoid an unnecessary biopsy in 88% of all PI-RADS 3 lesions in our study population. To externally validate proposed mADC cutoff values, we applied the proposed downgrading algorithm by Zhang et al. for PI-RADS ≥4 lesions to our study population, which resulted in a false-negative classification of more than half of all PI-RADS ≥4 lesions (Figure 2). This is likely because the mADC values of our study population were generally higher than the mADC values of their study population, both for benign lesions and for PCa (13).
A limitation of this retrospective single-center study is the fact that all MRI examinations were performed with the same MRI scanner. On the positive side, quantitatively analyzing the data of only one scanner with the same parameters leads to a homogeneous dataset and robust results. However, it is known that ADC values differ markedly between scanners (26), and external validation of thresholds with the employment of various scanners is necessary. Based on the findings mentioned above, we propose a cut-off mADC for lesion downgrading that differs from the work of Zhang et al., which demonstrates that multi-center studies with MRI scanner calibration for ADC values are desirable to establish robust and reproducible mADC cutoff thresholds. Nevertheless, our proposed cut-off value for mADC can be considered conservative: PI-RADSv2.1 explicitly states that “using a threshold of 750-900 μm2/s, may assist differentiation between benign and malignant prostate tissues, with ADC values below the threshold correlating with clinically significant cancers” (3). Choosing a threshold of 950 μm2/s is therefore unlikely to sacrifice sensitivity and should consequently keep the negative predictive value high.
Although implementation of clinical parameters and quantitative imaging features increases accuracy in lesion detection, qualitative evaluation remains a rather subjective decision made by the reader. Scoring multiple lesions in one patient may have led the readers not to rate each lesion independently but to increase or decrease a score according to the other lesions’ scores. As this error is likely to occur in most clinical settings as well, we consider it to reflect clinical reality. However, we acknowledge the need for future studies addressing this topic.
In summary, this study showed that the diagnostic accuracy of the PI-RADSv2.1 lexicon for TZ lesions is high both for experienced and less experienced readers, with substantial inter-reader agreement. In addition, we propose a downgrading algorithm in which PI-RADS 3 lesions in the TZ with mADC values >950 μm2/s are downgraded to PI-RADS 2, which would avoid all unnecessary biopsies of benign PI-RADS 3 lesions in our study population. Our proposed cut-off value can be considered conservative, as sensitivity and the negative predictive value are kept high, as desired for the prostate MRI diagnostic test. Our study provides further evidence that quantitative information from diffusion weighted images could be included in the PI-RADS decision algorithm for lesion classification in order to improve risk stratification.
Footnotes
Authors’ Contributions
HE was responsible for data collection, data analysis and interpretation and writing the report. BO was responsible for reading the transition zone lesions, contributed to data extraction, data analysis and writing the report. MR and EK programmed NORA and contributed to data extraction. AS, CG, PB, TK and FB contributed to data analysis, interpreting the results, and provided feedback on the report. MB was responsible for designing the study, reading the transition zone lesions, writing the report, analyzing data, and interpreting the results. All Authors reviewed the results and approved the final version of the article.
Conflicts of Interest
The Authors report no conflicts of interest concerning the materials or methods used in this study or the findings reported in this article.
- Received July 12, 2022.
- Revision received July 23, 2022.
- Accepted July 27, 2022.
- Copyright © 2022, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY-NC-ND) 4.0 international license (https://creativecommons.org/licenses/by-nc-nd/4.0).