Abstract
Background/Aim: MicroRNAs (miRNAs) are small non-coding RNA molecules that regulate gene expression and have been associated with the development of various cancers, including epithelial ovarian cancer (EOC). Accurate quantification of miRNA levels is important for determining their role in tumorigenesis and as biomarkers. Currently, U6 is widely used as a normalization control when investigating miRNAs in EOC; however, its variable expression across cancers has been reported. As only a few studies have been published to date on the identification of endogenous miRNA controls in EOC, our aim was to identify stable miRNAs based on global microarray profiling of 197 EOC patients and verify their stability in external datasets. Materials and Methods: We collected miRNA-microarray data from four datasets: the in-house “Pelvic Mass”, and three public datasets with primary EOC patients: The Cancer Genome Atlas, GSE47841, and GSE73581. The expression stability of endogenous control candidates was evaluated by their coefficient of variation. Results: All miRNA data in the cohorts used were produced by either Affymetrix or Agilent technologies, which show similar intra-platform patterns. Nonetheless, a clear difference in a cross-platform comparison was observed. We identified hsa-miR-92b-5p and hsa-miR-106b-3p as stable candidates shared among the four datasets. Moreover, we investigated the stability performance of eight miRNAs that have been previously reported as stable endogenous controls in EOC, and variable performance was observed across the four datasets. Conclusion: The selection of suitable endogenous miRNA normalization controls in EOC remains to be resolved, as variability in miRNA performance between platforms might have a crucial impact on the biological interpretation of data.
Ovarian cancer (OC) is a heterogeneous disease comprising several histologic subtypes with approximately 90-95% of cases being of epithelial origin (1-3). The major subclasses of epithelial OC (EOC) include serous (75%), endometrioid (10%), clear cell (10%), and mucinous (3%) (4). Finding key molecular differences among these subtypes could help to develop new approaches for early detection and treatment. MicroRNAs (miRNAs) are small non-coding RNA molecules that function in transcriptional and post-transcriptional regulation of gene expression and have been associated with cancer development, including EOC (1, 5). However, the lack of standardized protocols for performing miRNA detection has hampered research and the possible application of miRNAs in the clinic (6-10). Quantification of miRNAs is not a trivial task because of their short length, close sequence similarities within miRNA families, as well as the occurrence of isoforms and O-methyl modifications (11).
Real-time qRT-PCR is considered one of the most powerful techniques to analyze miRNAs and is widely employed to validate findings from large-scale microarray profiling (12). However, the results might be biased by the use of inappropriate normalizers (13). Currently, U6 (RNU6-1), a small nuclear RNA (snRNA), is the most common endogenous control in the research of miRNAs in OC tissues and cells (14-22), despite the reported high inter-individual variances and expression instability in cancers (13, 23-29).
To our knowledge, only a few reports on the identification of endogenous miRNAs in OC have been published (30-32). Yokoi et al. aimed to develop a screening strategy to discriminate cancer patients from healthy women based on miRNA profiling of 4,046 serum samples, which included 333 ovarian cancers, 66 borderline ovarian tumors, 29 benign ovarian tumors, 859 other solid cancers, and 2,759 non-cancer controls (30). The signals among the microarrays were normalized by use of three control miRNAs: hsa-miR-149-3p, hsa-miR-2861, and hsa-miR-4463. Bignotti et al. tested the stability of eleven putative endogenous miRNA candidates on a total of 75 high-grade serous OC (HGS-OC) and 30 normal tissues by using qRT-PCR. Hsa-miR-191-5p was identified as the best reference for miRNA studies, with prognostic intent on HGS-OC tissues (31). Elgaaen et al. analyzed the differences in miRNA expression between high-grade serous OC (HGS-OC, n = 12), clear cell OC (CCC, n = 9), and ovarian surface epithelium (OSE, n = 9) by global miRNA profiling and found that hsa-miR-24 and hsa-miR-26a had the lowest expression variation (32).
The careful choice of endogenous miRNA controls is essential to produce reliable miRNA data, as it drastically reduces the differences resulting from sampling and the quality of RNA, thus leading to identification of real changes in miRNA expression levels (33). Therefore, our aim was to identify stably expressed miRNAs based on global miRNA expression patterns derived from Affymetrix microarray profiling of 197 EOC patients. As the capability to detect miRNAs was reported to be platform-dependent (11), we validated our findings using three external datasets obtained either from Affymetrix or Agilent platforms, retrieved from the NCBI Gene Expression Omnibus database and from The Cancer Genome Atlas (TCGA) database. Moreover, we conducted a literature search to find potential endogenous control miRNAs that have been employed in qRT-PCR validation studies in OC (30, 31, 34, 35). The stability performance of eight previously reported reference miRNAs was assessed in four independent microarray profiling datasets.
Materials and Methods
Datasets. We collected data from four independent datasets: 1) one in-house dataset, Pelvic Mass (PM), and three publicly available datasets from patients with primary EOC: 2) The Cancer Genome Atlas (TCGA) (36), 3) GSE47841 (32), and 4) GSE73581 (37). For detailed information regarding biospecimen collection, clinical data, and sample processing, we refer to the original publications for each dataset.
PM dataset. MicroRNA microarray profiling was performed on 197 EOC patients (162 serous carcinomas, 15 endometrioid carcinomas, 11 mucinous carcinomas, and 9 clear cell carcinomas) using the Affymetrix GeneChip miRNA 1.0 Array platform (Affymetrix, Santa Clara, CA, USA), as described previously (38-40). Processing of raw data by the robust multi-array average (RMA) method (41) resulted in 854 miRNAs. These miRNA data are deposited in the GEO database under reference number GSE94320.
The Cancer Genome Atlas (TCGA) (36). MicroRNA array profiling was performed on patients with ovarian serous adenocarcinoma using the Agilent 8 x 15K Human miRNA platform, as previously described (36). The processed data (“Level 3”) were made available to the public through the Genomic Data Commons (GDC) Data Portal (42). We downloaded the file: OV.Merge_mirna_h_mirna_8x15kv2 unc_edu Level_3 unc_DWD_Batch_adjusted data.Level_3.2016012800.0.0.tar.gz using the RTCGA R package (43). The clinical data were obtained through The Cancer Imaging Archive (TCIA) Public Access database (44). From the original dataset, we retained only samples that met both of the following criteria: 1) the tissue was derived from the ovary, and 2) a FIGO stage was assigned.
GSE47841 (32). Elgaaen et al. analyzed the differences in miRNA expression between high-grade serous OC (HGS-OC, n = 12), clear cell OC (CCC, n = 9), and ovarian surface epithelium (n = 9) by global miRNA profiling with the Affymetrix GeneChip miRNA 2.0 Array platform. We acquired the raw microarray data files for 12 HGS-OC and 9 CCC patients through GEO Series accession number GSE47841 and processed them using the Affy R package and the RMA method (45).
GSE73581 (37). A total of 179 primary ovarian cancer samples were profiled on Agilent SurePrint 8x60K human miRNA arrays, as previously described (37).
miRNA stability ranking. The coefficient of variation (CoV) was defined as the ratio between the standard deviation and the mean of each miRNA’s expression value after normalization. The lower the CoV, the more stable the expression of the miRNA (46). We employed the R package miRBaseConverter to convert the miRNA annotation from all datasets to the latest miRBase version (version 22) (47). MiRNA entries removed from the miRBase database were excluded from the datasets. A total of 499 miRNAs were shared among the four datasets. For each dataset, two ranking lists were prepared: 1) a list with all miRNAs included in the dataset, and 2) a stability-ranked list of the 499 miRNAs common to all four datasets.
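The ranking step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the function names and the toy expression values are hypothetical, and the sample standard deviation (n − 1 denominator) is assumed because the text does not specify which estimator was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return SD / mean for one miRNA's normalized expression values.

    Assumption: the sample standard deviation (n - 1) is used.
    """
    m = mean(values)
    if m == 0:
        raise ValueError("CoV is undefined when the mean is zero")
    return stdev(values) / m


def rank_by_stability(expression):
    """Rank miRNAs from most stable (lowest CoV) to least stable.

    `expression` maps a miRNA name to its normalized expression values
    across samples. Returns (rank, miRNA, CoV) tuples, rank starting at 1.
    """
    covs = {name: coefficient_of_variation(vals) for name, vals in expression.items()}
    ordered = sorted(covs.items(), key=lambda item: item[1])
    return [(rank, name, cov) for rank, (name, cov) in enumerate(ordered, start=1)]


# Toy values invented for illustration only (not data from the study).
toy_expression = {
    "miR-A": [10.0, 10.2, 9.8, 10.1],
    "miR-B": [6.0, 9.0, 3.0, 8.0],
    "miR-C": [12.0, 12.5, 11.5, 12.0],
}
ranking = rank_by_stability(toy_expression)
for rank, name, cov in ranking:
    print(f"{rank}\t{name}\tCoV={cov:.4f}")
```

Because the CoV divides by the mean, it is only meaningful for expression values on a scale where the mean is clearly positive, which is why the sketch refuses a zero mean rather than returning an arbitrary value.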
Results
Table I provides an overview of the size of cohorts, histological type of tumors, FIGO stage, and the platform used for miRNA profiling. To assess the performance of microarray platforms in various studies, we calculated the coefficient of variation (CoV) for each miRNA. Figure 1A shows the differences in mean expression levels as a function of CoVs for the miRNAs, whereas in Figure 1B the frequency distribution of CoVs in each dataset is presented.
Table I. Characteristics of the four datasets used in the study.
Figure 1. The performance of four datasets measured by the coefficient of variation: (A) miRNA mean expression (log2) vs. coefficient of variation. (B) Distribution of coefficient of variation values for each dataset.
We identified the 10 most stable miRNA candidates when considering all miRNAs available within each individual dataset (Table II and Figure 2A). No miRNAs were shared among all four datasets; however, hsa-miR-24-3p, hsa-let-7b, hsa-miR-107, and hsa-miR-320c were shared between the two cohorts profiled on the Affymetrix platform (PM and GSE47841).
Table II. Top 10 most stable candidates in each dataset.
Through literature search, we found eight previously reported miRNAs that have been used as endogenous control miRNAs in qRT-PCR validation studies in OC (30, 31, 34, 35). For each of these miRNAs, the rank position (if available) in the four datasets used in this study is presented in Table III.
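A lookup of this kind can be sketched as follows. It is a hedged illustration only: the function name, the dataset labels, and the miRNA names in the toy input are hypothetical, and each ranking is assumed to be a list of miRNA names ordered from most to least stable.

```python
def rank_positions(candidates, rankings):
    """Report each candidate's position as 'rank/total' in every dataset.

    `rankings` maps a dataset label to a list of miRNA names ordered from
    most stable to least stable. A candidate that is absent from a dataset
    is reported as None (rank position not available).
    """
    table = {}
    for mirna in candidates:
        row = {}
        for dataset, ordered in rankings.items():
            if mirna in ordered:
                row[dataset] = f"{ordered.index(mirna) + 1}/{len(ordered)}"
            else:
                row[dataset] = None
        table[mirna] = row
    return table


# Hypothetical rankings, invented for illustration only.
toy_rankings = {
    "dataset_1": ["miR-A", "miR-B", "miR-C", "miR-D"],
    "dataset_2": ["miR-C", "miR-D", "miR-A"],
}
positions = rank_positions(["miR-A", "miR-B"], toy_rankings)
print(positions)
```

Reporting the total alongside the rank matters because the datasets contain different numbers of miRNAs, so a bare rank is not comparable across platforms.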
The cohorts were filtered to include only the miRNAs common to all four datasets, resulting in 499 targets. Next, we compared the top 100 candidates in each dataset to identify any shared miRNAs (Figure 2). We found that two miRNAs, hsa-miR-106b-3p and hsa-miR-92b-5p, were among the top 100 candidates in all datasets (Table IV).
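The filtering and overlap steps can be sketched as below. This is an illustrative sketch under assumptions, not the study's implementation: the function name and the toy CoV values are hypothetical, and ties in CoV are broken alphabetically, which the text does not specify.

```python
def shared_top_candidates(cov_by_dataset, top_n):
    """Find miRNAs that are among the `top_n` most stable in every dataset.

    `cov_by_dataset` maps a dataset label to a {miRNA: CoV} dictionary.
    Each dataset is first restricted to the miRNAs present in all datasets,
    then re-ranked by CoV (lowest first) before the top lists are compared.
    Returns (shared_mirnas, shared_top_mirnas) as two sets.
    """
    if not cov_by_dataset:
        raise ValueError("at least one dataset is required")
    shared = set.intersection(*(set(covs) for covs in cov_by_dataset.values()))
    top_sets = []
    for covs in cov_by_dataset.values():
        # Assumption: ties in CoV are broken by miRNA name for determinism.
        ranked = sorted(shared, key=lambda name: (covs[name], name))
        top_sets.append(set(ranked[:top_n]))
    return shared, set.intersection(*top_sets)


# Toy CoV values, invented for illustration only.
toy_covs = {
    "dataset_1": {"a": 0.01, "b": 0.02, "c": 0.30, "d": 0.05},
    "dataset_2": {"a": 0.04, "b": 0.01, "c": 0.02, "e": 0.03},
    "dataset_3": {"a": 0.02, "b": 0.01, "c": 0.10, "f": 0.50},
}
shared, stable_everywhere = shared_top_candidates(toy_covs, top_n=2)
print(sorted(shared), sorted(stable_everywhere))
```

Restricting to the shared miRNAs before ranking means that a candidate's position depends only on targets measurable in every dataset, which is the design choice that distinguishes this second list from the full per-dataset ranking.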
Discussion
Identifying differentially expressed miRNA panels among subgroups of EOC may help to develop tools for clinical management and potentially early detection. Unfortunately, a consensus regarding optimal methods for miRNA quantification and validation across studies has not yet been reached, which results in contradictory reports. This could be due to small cohort sizes, high tumor heterogeneity, different morphologies, and stages of disease, but can also be caused by technical factors, such as the normalization method and the miRNA control employed. All of these may significantly impact the interpretation of results (48, 49). Inconsistency in miRNA expression levels or patterns has been previously observed between platforms [real-time qRT-PCR, microarray, next generation sequencing (NGS)] or even within the same platform provided by different vendors (11, 49-54). Mestdagh et al. found significant inter-platform differences with respect to reproducibility, specificity, sensitivity, and accuracy while investigating 12 commercially available platforms, including qPCR, microarray (Affymetrix, Agilent, Nanostring), and NGS (50). Interestingly, low concordance of differential miRNA expression, with only a 54.6% average validation rate between any two platform combinations, was observed, which emphasizes the need for awareness when choosing a platform for miRNA-based studies.
To perform our study, we collected data from four independent datasets: one in-house dataset, PM, and three publicly available cohorts from patients with primary EOC. All miRNA profiling in the cohorts used was performed on either Affymetrix or Agilent platforms, which showed similar intra-platform patterns with regard to mean expression vs. CoV and the frequency distribution of CoV. Nonetheless, a clear difference was observed when comparing the two platforms (Figure 1). Given that no miRNAs were shared among the top 10 candidates drawn from all miRNAs available in each individual dataset (Table II), we investigated the top 100 candidates from the 499 miRNAs common to the four datasets. Two candidates were found: hsa-miR-106b-3p and hsa-miR-92b-5p (Table IV and Figure 2B). To our knowledge, these miRNAs have not been previously reported as endogenous controls for miRNA research.
Figure 2. Identification of stable miRNAs in four datasets: (A) Venn diagram with the top 10 stable candidates chosen from all miRNAs within each dataset. (B) Venn diagram with the top 100 stable candidates chosen from the 499 miRNAs common to the four datasets. Two miRNAs are shared among the four datasets: hsa-miR-106b-3p and hsa-miR-92b-5p.
We investigated how eight previously reported miRNA candidates perform on the ranking lists for each dataset (Table III). All of them were in the top 50 candidates from the full list of available miRNAs for both Affymetrix datasets, in spite of the fact that none of them has been reported as most stable in the Agilent-based studies. Yokoi et al. performed miRNA profiling of 4,046 serum samples, including 333 ovarian cancers, 66 borderline ovarian tumors, 29 benign ovarian tumors, 859 other solid cancers, and 2,759 non-cancer controls (30). The microarray signals were normalized by using three miRNAs: hsa-miR-149-3p, hsa-miR-2861, and hsa-miR-4463. These internal controls were chosen based on a previous study related to breast cancer research, though the details of the selection were not provided. In the current study, the stability of these controls was not in full agreement across the four datasets. For example, hsa-miR-149-3p ranked as follows: 7/826 in PM, 48/1,079 in GSE47841, but 700/899 in GSE73581 and 321/712 in TCGA. Bignotti et al. suggested hsa-miR-191-5p as the best normalization control for miRNA-based prognostic studies in HGS-OC. In our study, hsa-miR-191-5p ranked 18/826 in PM and 12/1,079 in GSE47841, but 286/712 in TCGA.
Table III. Ranking results for miRNA candidates selected based on the literature search.
Table IV. Mean, coefficient of variation (CoV), and rank position for the two miRNAs shared among the top 100 stable candidates chosen from the 499 miRNAs common to the four datasets.
Cohort size and subgroup characteristics might also influence the outcome of the studies (49). The four datasets varied in the number of samples included and in the distribution of histological types and FIGO stages. PM (Affymetrix) and GSE73581 (Agilent) are similar in terms of cohort size and FIGO stages, but do not show a similar panel of most stable miRNAs. Both Affymetrix datasets (PM and GSE47841) share 4 miRNAs among the top 10 stable miRNAs, although the cohort sizes differed (197 and 21, respectively).
Conclusion
Our study emphasizes the need for awareness in the choice of normalization control, which is not a trivial task. It is crucial to achieve consensus on stable endogenous miRNA controls to make validation possible across studies. We found two miRNAs, hsa-miR-106b-3p and hsa-miR-92b-5p, to be stable and recommend that they be considered as endogenous miRNA controls in future miRNA studies in EOC. Nonetheless, further validation studies will be crucial to confirm their performance.
Acknowledgements
The Authors thank the Danish CancerBiobank and the Danish Gynecologic Cancer Database for providing data presented in this study. This work received financial support from: The Mermaid Foundation, available at: http://www.mermaidprojektet.dk/ (JLJ, CKH and EVH), Danish Cancer Research Foundation, available at: http://www.dansk-kraeftforsknings-fond.dk/ (EVH), and Herlev Hospital Research Council, available at: https://www.herlevhospital.dk/forskning/ (EVH).
Footnotes
Authors’ Contributions
JLJ conceived of the presented idea and performed the computations. All Authors participated in data analysis, discussed the results, and contributed to the writing of the final manuscript.
Conflicts of Interest
The Authors declare that there are no conflicts of interest in relation to this study.
- Received February 17, 2022.
- Revision received March 22, 2022.
- Accepted March 23, 2022.
- Copyright © 2022, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY-NC-ND) 4.0 international license (https://creativecommons.org/licenses/by-nc-nd/4.0).