Abstract
Background: Preservation of organ function is important in cancer treatment. The ‘watch-and-wait’ strategy is an important approach in the management of esophageal cancer. However, clinical imaging cannot accurately determine the presence or absence of residual tumor after neoadjuvant chemoradiation. As a result, using radiomics to predict complete pathological response in esophageal cancer has gained popularity in recent years. Given that the characteristics of patients and sites vary considerably, a meta-analysis is needed to investigate the predictive power of radiomics in esophageal cancer. Patients and Methods: PRISMA guidelines were used to conduct this study. PubMed, Cochrane, and Embase were searched for relevant literature. The quality of the selected studies was evaluated by the radiomics quality score. I2 score and Cochran’s Q test were used to evaluate heterogeneity between studies. A funnel plot was used for evaluation of publication bias. Results: A total of seven articles were collected for this meta-analysis. The pooled area under the receiver operating characteristics curve of the seven selected articles for predicting pathological complete response in patients with esophageal cancer was 0.813 (95% confidence interval=0.761-0.866). The radiomics quality score ranged from −2 to 16 (maximum score: 36 points). Three out of the seven studies used machine learning algorithms, while the others used traditional biostatistics methods. One of the seven studies used morphology class features, while four studies used first-order features, and five used second-order features. Conclusion: Using radiomics to predict complete pathological response after neoadjuvant chemoradiotherapy in esophageal cancer is feasible. In the future, prospective, multicenter studies should be carried out for predicting pathological complete response in patients with esophageal cancer.
Esophageal cancer is a common gastrointestinal malignancy that causes more than 500,000 cancer deaths per year (1). The 5-year survival rate of patients with esophageal cancer is less than 25% (1). Developing a better treatment strategy for esophageal cancer is important.
The current treatment strategy for locally advanced esophageal cancer is neoadjuvant chemoradiotherapy followed by surgery. If complete metabolic response follows neoadjuvant treatment, then an active surveillance strategy can be considered (2). However, with the commonly used metabolic imaging methods, such as positron-emission tomography/computed tomography (PET/CT), it is difficult to differentiate between inflammation and residual malignancy after concurrent chemoradiotherapy. In addition, residual malignancy is not always detected by PET/CT (3, 4). Other imaging methods, such as magnetic resonance imaging and computed tomography alone, have the same problem (3, 4). For this reason, attention has turned to using radiomics to predict complete pathological response in order to help patients and physicians choose the best treatment approach when the watch-and-wait strategy is being considered.
Radiomics is a quickly growing field in which clinical images are transformed into quantitative features. These radiomics features can then be used to predict clinical outcomes. In esophageal cancer, radiomics has been widely used, as a previous review has reported (5). However, radiomics features were found to be associated with the imaging equipment used, technical setting, and processing kernel (6). As a result, a meta-analysis is needed to investigate the pooled predictive power of current models. The results can be used as a benchmark for future large-scale radiomics studies for predicting complete pathological response in esophageal cancer.
Patients and Methods
PRISMA guidelines were used to conduct this study (7).
Eligibility criteria. Articles were included according to the following rules: i) Patients with esophageal cancer received neoadjuvant chemoradiation. ii) Radiomics was used to predict complete pathological response versus other responses, and the area under the receiver operating characteristics curve (AUC) was reported. Radiomics features were those defined by the Image Biomarker Standardization Initiative (8). iii) Full text article. iv) Article in English. v) Published up to 2021.
We excluded articles according to the rules provided below: i) Abstract only or short communication (9). ii) Deep learning only was used to generate features.
Literature search. The Authors searched PubMed, Cochrane, and Embase using key words: (esophageal cancer) AND (radiomics OR texture OR histogram) AND (response OR remission).
Data extraction. We extracted the primary endpoint, which was set as the highest AUC in the validation set (testing set). When there was no external validation set, the result from the internal validation set was chosen. When there was no internal validation, the result from the training set (development set) was chosen. When the result reported the C-index, it was used as the AUC. The collected model had to contain radiomics-related features but was also allowed to contain other features, such as clinical or pathological features.
We also extracted other information, including the region of interest, patient number, image type, radiomics feature type, and algorithm used in the outcome prediction.
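The endpoint selection rule described above (external validation first, then internal validation, then the training set) can be sketched as a small lookup. This is only an illustration: the record layout, the key names, and the AUC values are hypothetical and are not taken from the included studies.

```python
from typing import Optional, Tuple

# Hypothetical record layout: each key holds the highest AUC reported for that
# data split, or None (or a missing key) when the study did not report it.
PRIORITY = ("external_validation", "internal_validation", "training")


def select_primary_auc(reported: dict) -> Optional[Tuple[str, float]]:
    """Return (split, auc) for the highest-priority split that was reported."""
    for split in PRIORITY:
        auc = reported.get(split)
        if auc is not None:
            return split, auc
    return None  # the study reported no usable AUC


# Illustrative values only: no external validation, so the internal one is used.
example = {"external_validation": None, "internal_validation": 0.78, "training": 0.85}
print(select_primary_auc(example))
```

Under this rule a study reporting only a C-index would have that value stored in the corresponding split before the lookup is applied.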
Statistical analysis. A random-effects model was used to calculate the pooled AUC and to construct the forest plot. The calculation of the pooled AUC requires the standard error of each AUC; therefore, when this value was not reported, we calculated it using the formula proposed by Hanley and McNeil (10). We also evaluated study heterogeneity using the I2 score and Cochran’s Q test. An I2 score of less than 50% was taken to indicate low to moderate heterogeneity between studies; otherwise, heterogeneity was considered high. p-Values of less than 0.05 were considered statistically significant. Confidence intervals for the AUC were calculated using Excel (Microsoft, Redmond, WA, USA); all other statistical analyses were performed using MedCalc (version 19.6.1; MedCalc, Acacialaan, Ostend, Belgium).
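The calculations described in this paragraph can be sketched as follows. The first function implements the Hanley and McNeil standard error of an AUC; the second pools estimates with a random-effects model and reports Cochran's Q and I2. The DerSimonian-Laird estimator of between-study variance is an assumption made here for illustration, since the text does not state which estimator the software applied, and the function and argument names are hypothetical. The analysis itself was carried out in Excel and MedCalc, not with this code.

```python
import math


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Standard error of an AUC (Hanley and McNeil).

    n_pos: number of patients with the outcome (e.g. complete response),
    n_neg: number of patients without it.
    """
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


def pool_random_effects(estimates, std_errors):
    """Random-effects pooling (DerSimonian-Laird, assumed) with Q and I2."""
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1

    # Between-study variance, truncated at zero.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum(re_weights)
    se_pooled = math.sqrt(1.0 / sum(re_weights))

    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {
        "pooled": pooled,
        "se": se_pooled,
        "ci95": (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }
```

A study reporting only an AUC and its group sizes would first be passed through `hanley_mcneil_se`, and the resulting standard errors would then be supplied to `pool_random_effects` together with the AUC values.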
Risk of bias. Publication bias among the selected studies was assessed using a funnel plot. Since fewer than 10 articles were collected, we did not use Egger’s test, in accordance with the recommendation of the Cochrane handbook (11).
Quality assessment. Study quality was evaluated by the radiomics quality score (RQS) (12). The quality of the selected literature was evaluated by both Authors. The inter-rater intra-class correlation was calculated based on the total RQS in order to evaluate inter-rater reliability (13).
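As an illustration of how an intra-class correlation between two raters can be computed from total scores, the sketch below uses the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1). The choice of this form is an assumption, because the text does not state which ICC variant was used, and the function name is hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table of ratings: one row per study, one column per rater."""
    n = len(ratings)        # number of rated studies
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares from the two-way ANOVA decomposition.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

With two raters who assign identical totals to every study the function returns 1, and the value falls as the raters disagree more.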
Results
A total of 210 articles were gathered. The article selection flowchart is provided in Figure 1. After the selection process, 10 articles remained for qualitative analysis. Three of these articles were published by Beukinga et al. (14-16); therefore, we chose only one of them for further analysis. Two articles were published by Hu et al. (17, 18); therefore, we selected only one of them for the subsequent analysis. As a result, seven articles were chosen for quantitative meta-analysis. The details of the collected literature are listed in Table I. All articles extracted radiomics features from the gross tumor volume. In addition, one article also extracted radiomics features from the peri-tumoral area.
Figure 1. The article collection flowchart.
Table I. The details of articles in the quantitative meta-analysis.
Overall literature assessment. The pooled AUC for the seven collected studies was 0.813 (95% confidence interval=0.761-0.866). The forest plot is provided in Figure 2. The I2 score was 70.28% (Cochran’s Q test: p=0.0026), which indicates high heterogeneity among the included studies. The funnel plot appeared asymmetric (Figure 3). However, because only seven studies were included in our meta-analysis, we cannot determine from the funnel plot whether publication bias exists.
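The reported I2 score and p-value can be checked against each other with a short back-calculation. The sketch below assumes the usual definition I2 = (Q − df)/Q with df equal to the number of studies minus one; the Q statistic itself is not reported in the text, so the value obtained here is derived, not quoted. The function names are hypothetical.

```python
import math


def q_from_i2(i2_percent, n_studies):
    """Back-calculate Cochran's Q from I2 (valid only when I2 > 0)."""
    df = n_studies - 1
    return df / (1.0 - i2_percent / 100.0)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability, closed form for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k      # builds half**k / k!
        total += term
    return math.exp(-half) * total


q = q_from_i2(70.28, 7)          # seven studies, so df = 6
p = chi2_sf_even_df(q, 6)
print(round(q, 2), round(p, 4))
```

Under these assumptions the implied Q is about 20.19 and the corresponding p-value rounds to 0.0026, which agrees with the reported result of Cochran's Q test.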
Figure 2. Forest plot for the area under the receiver operating characteristics (ROC) curve for predicting pathological complete response in patients with esophageal cancer.
Figure 3. Funnel plot.
Quality of radiomics studies. The quality of each of the seven studies was assessed by RQS, as seen in Table II. The intra-class correlation between the two reviewers who independently evaluated the articles was 0.9317 (95% confidence interval=0.6027-0.9883). This high intra-class correlation indicates that our quality assessment was reliable. Inconsistencies were discussed by the reviewers, and the finalized results (Table II) were reached by consensus. The RQS ranged from −2 to 16 in the selected studies. Given that the maximum was 36 points, the highest-rated study received only about 44% of the points (16/36).
Table II. Radiomics Quality Score table.
Review of type of radiomics feature and other features in selected studies. According to Image Biomarker Standardization Initiative (IBSI) standards, radiomics features can be divided into morphology class, first-order class, and second-order class. The second-order class includes the gray-level (GL) co-occurrence matrix, GL run-length matrix, GL size-zone matrix, GL distance-zone matrix, neighborhood gray tone difference matrix, and neighboring GL dependence matrix (8). We reviewed the radiomics feature types and other types of features used in the selected articles, and the results are provided in Table III. The feature types listed in the table are those used in the model with the highest AUC. One of the seven studies used morphology class features (19), four studies used first-order features (17, 20-22), and five studies used second-order features (14, 17, 21-23).
Table III. Features used in the model with the highest area under the curve of the selected studies.
Review of the algorithms used in predictive models. The details of the algorithms used in the selected studies are provided in Table IV. Of the selected studies, three used machine learning algorithms (LASSO and SVM) (14, 17, 21), while the others used traditional biostatistical methods (19, 20, 22, 23).
Table IV. Algorithms used in the predictive model.
Discussion
To our knowledge, this study is the first meta-analysis of the use of radiomics to predict a complete pathological response after neoadjuvant chemoradiotherapy in esophageal cancer. The pooled AUC of the seven selected articles was 0.813 (95% CI=0.761-0.866), and the I2 score was 70.28% (p=0.0026). High heterogeneity existed between the studies, which is reasonable given that radiomics features are influenced by imaging equipment, technical setting, and processing kernel (6). The funnel plot appeared asymmetric, but with only seven studies, the presence or absence of publication bias could not be determined.
Although the predictive power of radiomics was good, the overall quality of the selected studies was poor. The radiomics quality score ranged from −2 to 16 (maximum score: 36 points). Although RQS is widely used in evaluation of the quality of radiomics studies, some of its items are difficult to achieve. For example, it is difficult to conduct analysis of cost-effectiveness for radiomics in cancer. In a radiomics review about lung cancer, among the 14 collected studies, none had performed a cost-effectiveness analysis (24). In addition, carrying out phantom studies for all scanners for radiomics studies is not routine. In a review of renal cell carcinoma studies, none of the 57 studies had performed phantom studies for use of different scanners (25). None of the studies included in this meta-analysis of esophageal cancer performed a cost-effectiveness analysis or carried out phantom studies.
Five studies in our analysis used second-order radiomics features in the predictive model. Second-order radiomics features describe the inter-voxel relationship, which can be used to evaluate intra-tumoral heterogeneity (26). Intra-tumoral heterogeneity has been shown to be correlated with tumor resistance in many cancer types, such as glioblastoma (27) and lung cancer (28).
A machine learning algorithm was used in three out of the seven studies. The machine learning approach has become highly popular in recent years. However, none of the selected studies used deep-learning techniques. This is not surprising, as deep learning requires a large sample size to demonstrate its power (29), and in a single-institution setting it is difficult to achieve sample sizes of 500 or more patients. A prospective, multicenter radiomics study with deep-learning techniques is anticipated to be launched soon.
A limitation of this study is that no prospective radiomics study was found. Moreover, we included only seven studies in this meta-analysis, so evaluation of publication bias was not feasible. Radiomics features may be influenced by imaging equipment, technical settings, reconstruction kernel, contrast infusion speed, tumor delineation ability, and radiomics software (6). A multi-institutional, prospective study should be conducted to further investigate the predictive power of radiomics.
Conclusion
Using radiomics to predict complete pathological response after neoadjuvant chemoradiotherapy in esophageal cancer is feasible.
Footnotes
Authors’ Contributions
Concept design: Kao Y.S.; Data collection: Kao Y.S. and Hsu Y.; Statistical analysis: Kao Y.S.; Manuscript writing: Kao Y.S. and Hsu Y. Final approval of manuscript: Kao Y.S. and Hsu Y.
This article is freely accessible online.
- Received February 3, 2021.
- Revision received March 13, 2021.
- Accepted March 18, 2021.
- Copyright© 2021, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved