Abstract
Background/Aim: Although the expression of mucin 1 (MUC1) and prostate stem cell antigen (PSCA) genes is correlated with gastric cancer development and progression, the utility of these two genes as biomarkers of gastric cancer prognosis still needs to be confirmed in clinical practice. This study aimed to develop a predictive model of gastric cancer that integrates several significant single nucleotide polymorphisms (SNPs) of the MUC1 and PSCA genes and some health-risk behavior factors in a Vietnamese population. Patients and Methods: A total of 302 patients with primary gastric carcinoma and 304 healthy persons were included in a case–control study. A generalized linear model was fitted using age, sex, history of smoking and alcohol use, personal and family medical history of stomach diseases, and the SNPs of MUC1 and PSCA. The predictive value of the model was assessed by the area under the receiver operating characteristic curve (AUC) and the Akaike Information Criterion (AIC). Results: In male participants, the final model, consisting of age, history of smoking and alcohol use, personal and family medical history of stomach diseases, and SNP MUC1 rs4072037, provided acceptable discrimination, with an AUC of 0.6374 and the lowest AIC value (539.53). In female participants, the predictive model including age, history of smoking and alcohol use, personal and family medical history of stomach diseases, and MUC1 SNPs rs4072037 and rs2070803 had an AUC of 0.6937 and an AIC of 266.80. The calibration plot of the male model approximately fitted the ideal calibration line. Conclusion: The predictive model based on age, sex, medical history, and genetic and health-risk behavior factors has high potential for determining gastric cancer susceptibility. Further studies that elucidate other genetic variants should be carried out to define high-risk gastric cancer groups and propose appropriate personalized prevention.
According to the GLOBOCAN 2020 report, gastric cancer ranks as the fourth most common cancer in men and seventh in women worldwide (1). Although the incidence of gastric cancer in Western countries is gradually decreasing, it is still relatively high in Southeast Asian countries, including Vietnam (2). The pathogenesis of gastric cancer is complicated, and many contributing factors have been discovered, such as Helicobacter pylori infection, subsequent chronic atrophic gastritis, health-risk behaviors, and genetic factors (3). A previous study showed that genetic variants contribute to cancer development, accounting for 15 to 20% of cases. Data on single-nucleotide polymorphisms (SNPs) and the polygenic risk score, which combines multiple SNPs, can be applied in risk stratification to define high-risk populations (4).
The increased function of oncogenes and the loss of function of tumor-suppressor genes can be attributed to genetic alterations, which ultimately trigger gastric cancer (5). The first SNP association with gastric cancer was reported in 2008, linking rs2976392 of the prostate stem cell antigen (PSCA) gene to the risk of developing gastric cancer (6). Another gene widely studied in recent years is mucin 1 (MUC1). This transmembrane glycoprotein is commonly overexpressed in various epithelial adenocarcinomas, such as those of the lung, breast, colon, liver, and stomach (7-10). In gastric cancer, MUC1 plays a vital role in forming a mucosal barrier on the gastric epithelium and is essential in intracellular signaling (11). MUC1 has been shown to modulate chronic gastritis caused by H. pylori (9). In particular, two SNPs, rs2070803 and rs4072037, that control the functional determinants of MUC1 have been shown to be associated with gastric cancer (12, 13). The PSCA gene encodes a membrane glycoprotein and is expressed in the epithelium of the stomach (14). Two SNPs of PSCA, rs2976392 and rs2294008, reduced the transcriptional activity of this gene (15).
Smoking is considered a risk factor for many types of cancer, including gastric cancer (16). The mechanism associating smoking with gastric cancer is well-established. The generation of free radicals and increased apoptosis from smoking can cause precancerous changes in the gastric mucosa and promote carcinogenesis (17). Tobacco smoke contains several carcinogens associated with human gastric cancer. DNA adducts in gastric mucosal DNA have been found in smokers with gastric cancer (18). N-Nitroso compounds in cigarette smoke may also be associated with stomach cancer (19). The mechanism underlying alcohol-induced carcinogenesis seems to involve a chronic inflammatory response to direct toxic products of ethanol metabolism, and cytokines. The chronic inflammatory response can also disrupt the gastric mucosal barrier and enhance the absorption of nitrosamines (20). Toxins, including acetaldehyde and acetate, are generated during ethanol metabolism by the enzymes alcohol dehydrogenase and aldehyde dehydrogenase. Multiple gene mutations can increase the concentrations of these substances (21).
Studies analyzing the association of SNPs and other risk factors with gastric cancer have been performed in several countries with a high prevalence of gastric cancer, such as Japan, Korea, and China (22-24). In Vietnam, which has the highest incidence of gastric cancer in Southeast Asia, research determining the relationship between SNPs and stomach cancer is still scarce (25). According to GLOBOCAN 2018, Vietnam had 18,000 new cases and 15,000 deaths from gastric cancer (2). Therefore, this study aimed to develop a gastric cancer predictive model integrating several significant SNPs and health-risk behavior factors. Findings from this study are crucial to elucidate the characteristics of gastric cancer pathogenesis and propose novel methods for screening, diagnosis, and risk stratification of gastric cancer in the Vietnamese population.
Patients and Methods
Study settings and subjects. A total of 606 participants were included in a case–control study from January 2016 to December 2018. The case group comprised 302 patients with gastric cancer from four hospitals: Hanoi Medical University Hospital, National Cancer Hospital, 108 Military Central Hospital, and Viet Duc Hospital. The control group included 304 healthy persons. The inclusion criteria for the case group were: (i) primary gastric cancer diagnosed on the basis of pathology results; (ii) age ≥18 years; (iii) newly diagnosed disease with no treatment received at the time of blood sample collection. The exclusion criteria were: other malignant tumors, secondary gastric cancer, a history of other cancer, complicated heart diseases, kidney disease, and pregnancy. The control group was matched to the case group on age and sex. The healthy participants had normal gastric endoscopy results.
Data collection. A questionnaire for patients enrolled in the hospital was designed to record data on the risk factors of gastric cancer. The survey content included: age, sex, educational level, occupation, personal history of gastric diseases, family history of gastric cancer, and smoking and alcohol consumption. Smoking status was classified according to the definition of the US Centers for Disease Control and Prevention: smoker: having smoked at least 100 cigarettes in his/her life; non-smoker: never smoked or smoked fewer than 100 cigarettes in his/her life (26). Alcohol status was assessed using the Alcohol Use Disorders Identification Test-Consumption (AUDIT-C), approved by the World Health Organization. A harmful level was identified when the total score of the three questions was ≥4 in men and ≥3 in women (27).
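The two exposure definitions above can be expressed as simple classification rules. The following is a minimal sketch, not the study's own code: the function names and the "male"/"female" labels are hypothetical, and it assumes the standard AUDIT-C scoring in which each of the three items is scored from 0 to 4.

```python
def is_smoker(lifetime_cigarettes: int) -> bool:
    """Smoker if at least 100 cigarettes smoked in a lifetime (definition quoted in the text)."""
    return lifetime_cigarettes >= 100


def harmful_alcohol_use(audit_c_items, sex: str) -> bool:
    """Apply the sex-specific AUDIT-C cut-offs from the text (>=4 in men, >=3 in women).

    audit_c_items: three item scores, each assumed to lie between 0 and 4.
    sex: "male" or "female" (hypothetical labels chosen for this sketch).
    """
    if len(audit_c_items) != 3 or any(not 0 <= s <= 4 for s in audit_c_items):
        raise ValueError("AUDIT-C expects three item scores, each between 0 and 4")
    cutoffs = {"male": 4, "female": 3}
    return sum(audit_c_items) >= cutoffs[sex]
```

With this rule, the same total score of 3 is classified as harmful for a woman but not for a man, which is why alcohol status has to be derived together with sex.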
Sample collection. Peripheral blood was drawn into an EDTA tube. The samples were stored at −20°C until DNA analysis. DNA was extracted using an Exgene™ Blood SV Kit (GeneAll, Seoul, Republic of Korea). A total of 406 samples were assayed at the Quality Control Center for Medical Laboratories (Hanoi Medical University); the remaining 200 samples (150 cases and 50 healthy subjects) were analyzed at the Department of Applied Biology (Kyoto Institute of Technology, Japan) using the same method. Ten percent of the total samples were randomly selected for repeat analysis at both laboratories to ensure the accuracy of the results.
Identification of MUC1 and PSCA SNPs in patient DNA. MUC1 and PSCA genotypes in DNA from patients’ blood samples were determined by the polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) technique with the primer sequences presented in Table I. PCR products were incubated with restriction enzymes (New England BioLabs, Beverly, MA, USA) as follows: MUC1 rs4072037: AlwNI at 37°C for 8 h; MUC1 rs2070803: TaqαI at 65°C for 8 h; PSCA rs2294008: NlaIII at 37°C for 8 h; PSCA rs2976392: PvuII at 37°C for 8 h. The products were analyzed by electrophoresis on a 1.5% agarose gel at 130 V for 30 minutes. To confirm the PCR-RFLP results, 10% of samples were randomly selected for gene sequencing on an ABI 3500 system (Thermo Fisher Scientific, Waltham, MA, USA).
Primer sequences used in the polymerase chain reaction for determining single nucleotide polymorphisms (SNPs) of mucin 1 (MUC1) and prostate stem cell antigen (PSCA).
Data analysis. Data were entered into Epidata 3.0 software (EpiData Association, Odense, Denmark) and analyzed by R software 3.6.2. The differences between the case and control groups were tested using the chi-squared test for qualitative variables and Student’s t-test for quantitative variables. Univariate and multivariate logistic regression models were applied to determine factors associated with gastric cancer.
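As an illustration of the group comparison for a binary variable, the sketch below computes Pearson's chi-squared statistic for a 2×2 table and its p-value with one degree of freedom. It is not the study's R code: the function name and the counts in the usage example are hypothetical, and no continuity correction is applied (R's chisq.test applies Yates' correction to 2×2 tables by default, so its output would differ slightly).

```python
import math


def chi2_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-squared test for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g. cases, controls); columns are exposed / unexposed.
    Returns (statistic, p_value) without continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, for illustration only (not data from this study).
stat, p = chi2_2x2(60, 40, 45, 55)
print(f"chi2 = {stat:.3f}, p = {p:.4f}")
```

The closed-form p-value only holds for one degree of freedom; tables with more categories, such as three genotypes per SNP, need the general chi-squared distribution.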
The model for gastric cancer prediction was built from the independent factors of the study, including age, sex, history of smoking, alcohol use, personal medical history of stomach diseases, family history of stomach cancer, and the SNPs of MUC1 and PSCA, using the “glm” (generalized linear model) function in R. Candidate models were inspected with the “summary” function, and the optimal model was defined as the one with the lowest Akaike Information Criterion (AIC) and statistically significant independent variables. The discriminatory value of the model was assessed by the area under the receiver operating characteristic curve (AUC), with a value of ≥0.6 considered acceptable. Calibration was assessed to determine the model’s accuracy, with a Brier score of <0.25 considered acceptable. Finally, a dynamic nomogram was developed using the “DynNom” package.
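The three criteria named above can be computed directly from observed outcomes and predicted probabilities. The sketch below is an illustration in Python rather than the R workflow used in the study; the function names and the toy outcome and probability vectors are hypothetical, and the probabilities are made up rather than taken from a fitted model. AIC is computed as 2k − 2 ln L, AUC as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, and the Brier score as the mean squared difference between prediction and outcome.

```python
import math


def log_likelihood(y, p):
    """Bernoulli log-likelihood of outcomes y (0/1) under predicted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1 - pi) for yi, pi in zip(y, p))


def aic(y, p, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood(y, p)


def auc(y, p):
    """Area under the ROC curve via pairwise comparison; ties count as 0.5."""
    cases = [pi for yi, pi in zip(y, p) if yi == 1]
    controls = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in cases for b in controls)
    return wins / (len(cases) * len(controls))


def brier(y, p):
    """Brier score: mean squared error of the predicted probabilities."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


# Toy data, for illustration only (not data from this study).
y = [1, 1, 1, 0, 0, 0]
p = [0.8, 0.6, 0.4, 0.5, 0.3, 0.2]
print(auc(y, p), brier(y, p), aic(y, p, n_params=2))
print(brier(y, [0.5] * len(y)))  # an uninformative model predicting 0.5 for everyone
```

An uninformative model that assigns a probability of 0.5 to every participant has a Brier score of exactly 0.25, which gives some intuition for why a score below 0.25 is used as the acceptance threshold.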
Ethical approval statement. This study was approved by the Ethics Committee of Hanoi Medical University (decision number 198/HĐĐĐĐHYHN dated September 21, 2016).
Results
Risk factors for gastric cancer. Table II presents the participants’ general characteristics, and the epidemiological and clinical risk factors related to gastric cancer. There were no significant differences in age, sex, and history of smoking between the case and control groups. The history of alcohol use, history of gastric disease, and family history of gastric cancer were statistically significantly different between the groups (p<0.05). In the multivariate logistic regression model, history of alcohol use, history of gastric disease, and family history of gastric cancer were independent predictors of gastric cancer (all p<0.05).
Analysis of general characteristics of participants, and epidemiological and clinical risk factors associated with gastric cancer.
Profile of MUC1 and PSCA SNPs. The genotype distributions of the four SNPs (MUC1 rs4072037 and rs2070803; PSCA rs2294008 and rs2976392) in the case and control groups are shown in Table III. For MUC1 rs4072037, the AA genotype accounted for the highest percentage in the case group (49.4%), and its frequency was significantly higher than in the control group (34.5%). For MUC1 rs2070803, the AG genotype was the most frequent in the control group (54.0%), and its frequency was significantly higher than in the case group (38.1%).
Genotypic characteristics of participants.
Table IV presents the association between the four polymorphisms and the risk of gastric cancer. Only two polymorphisms, MUC1 rs4072037 and rs2070803, were associated with gastric cancer risk at p<0.01. For MUC1 rs4072037, people with the AA genotype had a higher risk of gastric cancer than those with the AG genotype [odds ratio (OR)=2.15, 95% confidence interval (CI)=1.51-3.05] and those carrying a G allele (AG+GG; OR=1.87, 95% CI=1.35-2.60). In addition, for MUC1 rs2070803, the GG genotype was related to a higher likelihood of gastric cancer than A allele-bearing genotypes (OR=2.01, 95% CI=1.41-2.86 vs. AG and OR=1.74, 95% CI=1.25-2.43 vs. AG+AA).
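For readers unfamiliar with how an odds ratio and its confidence interval are obtained from genotype counts, the sketch below uses the standard log-odds (Woolf) method on a 2×2 table. This is an assumption for illustration: the text does not state how the reported ORs were computed (they may come from logistic regression), and the function name and counts are hypothetical.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Woolf confidence interval for a 2x2 table.

    a: cases with the risk genotype      b: cases without it
    c: controls with the risk genotype   d: controls without it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR) under the Woolf approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not data from this study).
or_, lo, hi = odds_ratio_ci(60, 40, 45, 55)
print(f"OR = {or_:.2f}, 95% CI = {lo:.2f}-{hi:.2f}")
```

An interval that excludes 1, as in the associations reported above, indicates that the genotype is associated with gastric cancer at roughly the 5% significance level.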
Associations of single nucleotide polymorphisms (SNPs) with gastric cancer susceptibility.
Predictive models of gastric cancer. The assessment of models for gastric cancer prediction is shown in Table V for men and in Table VI for women. Among male participants, the inclusive model (model 1), which consisted of genetic factors and some risk factors, provided acceptable discrimination, with an AUC of 0.6422 (95% CI=0.5886-0.6957). Of the four models compared, model 4 (age, four lifestyle/history risk factors, and MUC1 SNP rs4072037) had the lowest AIC value (539.53), with an AUC of 0.6374 (95% CI=0.5836-0.6912). Among female participants, the inclusive model provided significantly higher discriminatory ability than the model of genetic factors with some risk factors (AUC=0.7073, 95% CI=0.6359-0.7787). Finally, age, four lifestyle/history risk factors, and MUC1 SNPs rs4072037 and rs2070803 were chosen for the predictive model, which had an AUC of 0.6937 (95% CI=0.6208-0.7665) and the lowest AIC (266.80).
Assessing the performance of the predictive models among male participants.
Assessing the performance of the predictive models among female participants.
Figure 1 presents the discriminatory ability of the model for gastric cancer prediction in males and females. In men, the model included age, history of smoking, history of alcohol use, history of stomach disease, family history of stomach cancer, and MUC1 rs4072037. The AUC of the model for men was 0.6374 (95% CI=0.5836-0.6912) with an accuracy of 0.6035. Furthermore, the women’s model included age, history of smoking, history of alcohol use, history of gastric disease, family history of gastric cancer, MUC1 SNPs rs4072037 and rs2070803. The AUC of the model for women was 0.6937 (95% CI=0.6208-0.7665) with an accuracy of 0.6231.
Area under the receiver operating characteristics curve (AUC) for predictive model 4 among male (A) and predictive model 3 among female (B) participants. CI: Confidence interval.
Calibration of the predictive models is shown in Figure 2. The logistic calibration curve was consistent with the ideal line for both sexes. The Brier scores of model 4 for men and model 3 for women were 0.232 and 0.261, respectively; only the former was below the predefined threshold of 0.25.
Calibration of the gastric cancer predictive models in men (A) and women (B).
Using the DynNom package in R, a dynamic nomogram representing the predictive models of gastric cancer risk based on the independent variables (regardless of sex) was built and is shown in Figure 3. The model with age, history of smoking, history of alcohol use, family history of gastric cancer, and MUC1 SNP rs4072037 had the highest predictive value for gastric cancer.
The 10 models predicting gastric cancer using various variables including age, personal history of gastric disease, family history of gastric cancer and mucin 1 (MUC1) single nucleotide polymorphism rs4072037. Data shown in the chart are probabilities for each model with the 95% confidence interval.
Discussion
To our knowledge, this is the first study in Vietnam to develop a gastric cancer predictive model combining genetic risk factors and health-risk behaviors. The model for men included one SNP (MUC1 rs4072037). Two SNPs were present in the female model (MUC1 rs4072037 and rs2070803), and this risk assessment model showed a higher discriminative ability than the male model. Discriminatory power was slightly higher in the final model, which integrated genetic and other factors (history of smoking, alcohol use, gastric disease, and family history of gastric cancer), than in the single-domain models. Thus, this finding suggests the critical role of genetic and lifestyle-related factors in estimating the risk of gastric cancer.
Several previous studies presented predictive models for gastric cancer in large-scale populations. Charvat et al. conducted a cohort study on 19,028 Japanese individuals to develop a simple risk scoring system for stomach cancer, which covered lifestyle-related factors (age, sex, smoking status, salty-food habit) and the ABC method, a gastric cancer risk classification (gastritis A, B, C and D) based on a combined assay for serum H. pylori IgG antibody and serum pepsinogen levels (28). In addition, another study developing a gastric cancer risk assessment tool for the Japanese population revealed good discrimination and calibration with the combination of age, sex, clinical features (hemoglobin A1c level, obesity, smoking, and alcohol use) and biological markers (H. pylori antibody and atrophic gastritis) (29). Taninaga et al. suggested that a model of H. pylori antibody status and the presence of atrophic gastritis (serum pepsinogen level) can accurately assess the risk of gastric cancer (30). A study by Cai et al. in China developed a gastric cancer estimation model using participants from multiple nationwide centers based on age, sex, serum pepsinogen I/II, gastrin-17 level, H. pylori infection and consumption of pickled and fried food, which provided good performance (31). However, these studies relied solely on clinical and subclinical characteristics and did not take genetic factors into account. In Japan, Ishikura et al. selected 14 gastric cancer-susceptible SNPs reported in previous research as candidate genetic risk factors. These SNPs were genotyped using TaqMan Single Nucleotide Polymorphism Genotyping Assays (Applied Biosystems, Foster City, CA, USA). Based on the genotyping results, three SNPs, MUC1 rs4072037, PSCA rs2294008, and histo-blood group alpha 1-3-N-acetylgalactosaminyltransferase and alpha 1-3-galactosyltransferase (ABO) gene rs7849280, were chosen to develop a predictive model combined with some environmental and lifestyle factors of gastric cancer (smoking, alcohol consumption, energy-adjusted fruit and vegetable intake, family history of gastric cancer, and the ABC classification method) (22). In a case–control study on a Korean population, 12 SNPs analyzed from peripheral blood were shown to be associated with gastric cancer, including PSCA rs2294008. When these SNPs were added to a sex-specific gastric cancer risk assessment model developed for the Korean population (including age, BMI, some eating behaviors, alcohol consumption, smoking amount, and physical activity), the predictive value was improved (23).
In this study, we chose four SNPs, MUC1 rs4072037 and rs2070803, and PSCA rs2294008 and rs2976392, as genetic risk factors. In the final models, only MUC1 rs4072037 was included in the male model, and both MUC1 rs4072037 and rs2070803 were selected in the female model. This finding is in agreement with previous studies which showed that MUC1 rs4072037 was an important SNP in a gastric cancer-predictive model (15, 22). MUC1 is a type of mucin that is primarily distributed on the apical surface of epithelial cells of the gastrointestinal and respiratory tracts (32). rs4072037 is an SNP located in the second exon of the gene (33). Membranous MUC1 is known to be a ligand for H. pylori in the stomach (34). In a previous study, abnormal expression of MUC1 was reported to be positively related to the development and metastasis of tumors (35), such as stomach, colon, lung, and breast cancer (36); MUC1 rs4072037 was especially associated with the risk of gastric cancer (34, 37). Along with rs4072037, rs2070803 is another representative SNP of MUC1. However, the role of rs2070803 in gastric cancer is still controversial. A previous study revealed a positive association between MUC1 rs2070803 and diffuse-type gastric cancer in Japanese and Korean populations (13), whereas MUC1 rs2070803 played a protective role in gastric cancer among a Chinese population (24).
The discriminatory powers of PSCA rs2294008 and rs2976392 in the genetic factor model were lower than those of MUC1 rs4072037 and rs2070803 for both sexes. PSCA is a cell-surface antigen expressed on differentiating gastric epithelial cells (14). PSCA rs2294008 might reduce the transcriptional activity of the PSCA promoter by modulating its upstream region (38). A two-stage genome-wide association study for stomach cancer in Japan and Korea showed that two SNPs in PSCA (rs2976392 G>A and rs2294008 C>T) were significantly associated with the occurrence of gastric cancer, especially of the diffuse type (6). Moreover, previous studies on Chinese populations also confirmed the relationship between these two SNPs and stomach cancer susceptibility (14, 39). In our study, the combination of four SNPs in the genetic factor model had a moderate ability to detect gastric cancer, and its AUC was higher than that of the model including lifestyle/history factors. This suggests that including gene polymorphisms adds value to risk prediction beyond other health risk factors.
Our study has several strengths. This is the first study developing a predictive model for gastric cancer for the Vietnamese population with information on genetics and lifestyle characteristics. Moreover, our findings can improve the identification of individuals susceptible to gastric cancer in the population. In addition, we controlled for potential confounding by matching the age and sex of participants in the two groups. However, some limitations should be acknowledged. Information on medical history and health-risk behaviors was collected retrospectively, which may have led to recall bias. Although we built a nomogram to represent a predictive model of gastric cancer, cut-offs for classifying high and low risk were not calculated. Consumption of salty food and pickled food was not considered, although both are known risk factors for stomach cancer. Finally, we only included four SNPs in the predictive model, so the analysis lacks other well-established susceptibility loci.
Conclusion
In conclusion, our predictive model based on age, sex, medical history, and genetic and health-risk behavior factors in the Vietnamese population has high potential in determining gastric cancer susceptibility. From a clinical perspective, physicians and health personnel now have one more predictive model for estimating gastric cancer risk in patients before gastroscopy results are available. Further studies that elucidate other genetic variants should be carried out to define high-risk gastric cancer groups and propose appropriate personalized prevention.
Acknowledgements
The Authors would like to acknowledge the graduate students of the Department of Biochemistry, Hanoi Medical University; the Center for Quality Control of Medical Tests, Hanoi Medical University; and Professor Masamitsu Yamaguchi of the Department of Applied Biology, Advanced Insect Research Promotion Center, Kyoto Institute of Technology, Kyoto, Japan. This research was funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 106-YS.02-2015.37. The funding covered research design, samples, data collection, and gene analysis.
Footnotes
Authors’ Contributions
Conceptualization: Ngoc Lan Nguyen, Ngoc Dung Dang and Van Ta. Data curation: Ngoc Lan Nguyen and Quy Vu. Formal analysis: Ngoc Lan Nguyen and Quy Vu. Investigation: Ngoc Lan Nguyen and Ngoc Dung Dang. Methodology: Ngoc Lan Nguyen and Ngoc Dung Dang. Project administration: Van Ta. Resources: Ngoc Lan Nguyen and Quy Vu. Software: Ngoc Lan Nguyen and Quy Vu. Supervision: Van Ta. Validation: Ngoc Lan Nguyen and Quy Vu. Visualization: Quy Vu and Anh Dang. Writing – original draft: Ngoc Lan Nguyen, Quy Vu, and Anh Dang. Writing – review and editing: Ngoc Lan Nguyen, Ngoc Dung Dang, Quy Vu, Anh Dang, and Van Ta. All Authors have read and agreed to the published version of the article.
Conflicts of Interest
The Authors report that there are no competing interests to declare.
- Received June 3, 2023.
- Revision received July 2, 2023.
- Accepted July 5, 2023.
- Copyright © 2023, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY-NC-ND) 4.0 international license (https://creativecommons.org/licenses/by-nc-nd/4.0).