Abstract
Background/Aim: Although a negative appendectomy in female patients with acute abdominal pain (AAP) can be twice as frequent as in male patients, the accuracy of diagnostic scores (DSs) in acute appendicitis (AA) is rarely considered among patients with AAP. The aim was to study the gender-specific performance of a DS in AA. Patients and Methods: As an extension of the World Organisation of Gastro-Enterology Research Committee (OMGE) AAP study, 1,333 patients presenting with AAP were included in the study. The clinical history and diagnostic symptoms (n=22), signs (n=14) and laboratory tests (n=3) were recorded in each patient. Results: The most significant diagnostic predictors were used to construct DS formulas for AA diagnosis, separately for both genders. The formulas were tested at 6 different cut-off levels to find the best diagnostic performance for AA in females and males. The highest specificities of the DSLC− [DS without leucocyte count (LC)] and DSLC+ (DS with LC) scores in detecting AA were 98% (95% CI=97-99%) and 98% (95% CI=96-99%), respectively. In the ROC comparison test, there was no statistically significant difference in the performance of DSLC− and DSLC+ in female and male patients. Conclusion: Our gender-specific DS reached very high AUC values for AA (0.948-0.956) in both genders, and there was no statistically significant difference in the AUC values of DSLC− and DSLC+ between women and men with AAP.
- Acute appendicitis
- acute abdominal pain
- diagnostic score
- female
- male
- leucocyte count
- ROC
- HSROC
- diagnostic accuracy
Acute appendicitis (AA) with appendectomy is the most common cause of acute surgical abdomen in Western countries (1). In a recent meta-analysis, Ferris et al. (1) showed that the pooled incidence of AA with appendectomy was 100/100,000 person years (py) (95% CI=91-100%) in the 2000s in Northern America, and they estimated that the number of AA cases in 2015 was 378,614. In Europe, the pooled incidence ranged from 151/100,000 py in Western Europe to 105/100,000 py in Eastern Europe. The incidence of AA has remained quite stable in most Western countries, but in the newly industrialized countries it is rising rapidly, with the pooled incidence of AA being 206/100,000 py in South Korea, 160/100,000 py in Turkey and 202/100,000 py in Chile.
We recently designed a diagnostic score (DS) to improve the diagnostic precision in distinguishing AA from non-specific abdominal pain (NSAP) (2). Our experience suggests that DS could assist the clinician in differentiating AA from NSAP and other causes of acute abdominal pain (AAP), although leucocyte count (LC) does not improve the diagnostic performance of a DS in AA (2). Given that the incidence of AA is higher in female patients presenting with AAP than in male patients (3) and to get more insight into the important symptoms, signs and tests related to the clinical diagnosis of AA, we assessed the performance of the DS model i) without leucocyte count (DSLC−) and ii) with leucocyte count (DSLC+), separately in female and male patients with AA.
Patients and Methods
Criteria for inclusion in this study and the diagnostic criteria were those set out by the Research Committee of the World Organization of Gastroenterology (OMGE) (2, 4, 5, 6, 7, 8). There were 636 men (47.7%) and 697 women (52.3%) with mean age (±SD) of 38.0±22.1 years (2).
The examination of the clinical symptoms, signs and tests was conducted using a standard technique and the results were graded positive or negative as previously described (2) (Tables I and II). The diagnosis of AA was made by considering all symptoms, signs and results of the laboratory tests together with the diagnostic criteria of AA (2, 4-7).
Identifying the DS models. In the computation of the diagnostic score (DS), a multivariate logistic (stepwise) regression analysis (SPSS Statistics 26.0.0.1; IBM, NY, USA) was used to disclose the variables with an independent predictive value. All the variables presented in Tables I and II were included in the analysis as binary data, e.g. AA=1 and other diagnosis of AAP=0. Using the coefficients of the regression model, a DS was built and its predictive value for AA was studied. The coefficient of the multivariate analysis shows the relative risk (RR=e^n, where n=β, the regression coefficient) of a patient with a given symptom or sign to have an AA.
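The relation between a regression coefficient and its exponentiated value can be illustrated with a minimal sketch. The coefficients are those reported in this section for the female DSLC− formula; the dictionary keys and the function name `exponentiate` are hypothetical labels introduced for illustration only. Note that the exponentiated coefficient of a logistic model is formally an odds ratio, which the text interprets as a relative risk.

```python
import math

# Coefficients (beta) of the female DSLC- formula as quoted in the text.
# The key names are illustrative labels, not identifiers from the study.
coefficients = {
    "tenderness": 2.98,
    "rigidity": 2.45,
    "guarding": 2.08,
    "pain_at_diagnosis": 1.33,
    "renal_tenderness": 0.88,
}


def exponentiate(coefs):
    """Return exp(beta) for every coefficient in the mapping."""
    return {name: math.exp(beta) for name, beta in coefs.items()}


ratios = exponentiate(coefficients)
for name, value in ratios.items():
    print(f"{name}: beta = {coefficients[name]:.2f}, exp(beta) = {value:.2f}")
```

A larger coefficient therefore corresponds to a multiplicatively larger ratio, which is why tenderness dominates the female score.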
The formula without LC (DSLC−) in women. The formula without LC, showing the highest diagnostic performance for AA in HSROC analysis, is as follows: DSLC−=2.98×tenderness (positive endpoint=1, negative endpoint=0)+2.45×rigidity (positive endpoint=1, negative endpoint=0)+2.08×guarding (positive endpoint=1, negative endpoint=0)+1.33×pain at diagnosis (positive endpoint=1, negative endpoint=0)+0.88×renal tenderness (positive endpoint=1, negative endpoint=0)−7.22. The mean (SD) DSLC− value for AA in women (n=121) was 0.61 (1.81) and the DSLC− mean (SD) value for all female patients with AAP (n=697) was −3.41 (2.84) (Table III).
The formula with LC (DSLC+) in women. The formula with LC, showing the highest diagnostic performance for AA in women, is as follows: DSLC+=3.17×tenderness (positive endpoint=1, negative endpoint=0)+2.39×rigidity (positive endpoint=1, negative endpoint=0)+2.00×guarding (positive endpoint=1, negative endpoint=0)+1.63×LC+1.45×pain at diagnosis (positive endpoint=1, negative endpoint=0)+0.77×renal tenderness (positive endpoint=1, negative endpoint=0)−7.80. The mean (SD) DSLC+ value for AA in women (n=111) was 1.32 (1.81) and the DSLC+ mean (SD) value for all women with AAP (n=575) was −3.10 (3.21) (Table IV).
The formula without LC (DSLC−) in men. The formula without LC, showing the highest diagnostic performance for AA in HSROC analysis, is as follows: DSLC−=1.97×tenderness (positive endpoint=1, negative endpoint=0)+1.88×previous abdominal surgery (positive endpoint=1, negative endpoint=0)+1.61×rebound (positive endpoint=1, negative endpoint=0)+1.43×rigidity (positive endpoint=1, negative endpoint=0)+1.30×pain at diagnosis (positive endpoint=1, negative endpoint=0)+1.14×guarding (positive endpoint=1, negative endpoint=0)+1.05×body temperature (positive endpoint=1, negative endpoint=0)−7.69. The mean (SD) DSLC− value for AA in males (n=149) was −1.13 (1.74) and the DSLC− mean (SD) value for all men with AAP (n=636) was −2.63 (3.05) (Table V).
The formula with LC (DSLC+) in men. The formula with LC, showing the highest diagnostic performance for AA in men, is as follows: DSLC+=2.51×previous abdominal surgery (positive endpoint=1, negative endpoint=0)+2.18×LC (positive endpoint=1, negative endpoint=0)+1.58×pain at diagnosis (positive endpoint=1, negative endpoint=0)+1.41×tenderness (positive endpoint=1, negative endpoint=0)+1.25×rigidity (positive endpoint=1, negative endpoint=0)+1.04×rebound (positive endpoint=1, negative endpoint=0)+0.97×guarding+0.81×rectal digital tenderness+0.74×body temperature−8.86. The mean (SD) DSLC+ value for AA in males (n=133) was 1.58 (1.77) and the DSLC+ mean (SD) value for all male patients with AAP (n=476) was −2.33 (2.21) (Table VI).
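The four formulas share one structure: a sum of weights for the positive findings plus a constant. The following is a minimal sketch of that structure, not the authors' software. The weights and constants are copied from the formulas above; the names `MODELS` and `diagnostic_score`, the key spellings, and the assumption that every term is additive and every variable is coded 1 (positive) or 0 (negative) are illustrative.

```python
# Constant and weights of each formula, as quoted in the text.
# Key names are hypothetical labels chosen for this sketch.
MODELS = {
    ("women", "DSLC-"): (-7.22, {
        "tenderness": 2.98, "rigidity": 2.45, "guarding": 2.08,
        "pain_at_diagnosis": 1.33, "renal_tenderness": 0.88,
    }),
    ("women", "DSLC+"): (-7.80, {
        "tenderness": 3.17, "rigidity": 2.39, "guarding": 2.00,
        "leucocyte_count": 1.63, "pain_at_diagnosis": 1.45,
        "renal_tenderness": 0.77,
    }),
    ("men", "DSLC-"): (-7.69, {
        "tenderness": 1.97, "previous_abdominal_surgery": 1.88,
        "rebound": 1.61, "rigidity": 1.43, "pain_at_diagnosis": 1.30,
        "guarding": 1.14, "body_temperature": 1.05,
    }),
    ("men", "DSLC+"): (-8.86, {
        "previous_abdominal_surgery": 2.51, "leucocyte_count": 2.18,
        "pain_at_diagnosis": 1.58, "tenderness": 1.41, "rigidity": 1.25,
        "rebound": 1.04, "guarding": 0.97,
        "rectal_digital_tenderness": 0.81, "body_temperature": 0.74,
    }),
}


def diagnostic_score(sex, model, findings):
    """Sum the weights of the positive findings and add the constant.

    `findings` maps a variable name to True (positive endpoint) or
    False (negative endpoint); variables left out count as negative.
    """
    constant, weights = MODELS[(sex, model)]
    unknown = set(findings) - set(weights)
    if unknown:
        raise KeyError(f"variables not in the {sex} {model} formula: {sorted(unknown)}")
    return constant + sum(w for name, w in weights.items() if findings.get(name, False))


# Example: a man with tenderness and rebound only, scored without LC.
print(round(diagnostic_score("men", "DSLC-", {"tenderness": True, "rebound": True}), 2))
```

Keeping the weights in a table rather than in four separate functions makes it easy to check each number against the printed formula.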
Statistical analysis. The other statistical analyses were performed using STATA/SE version 16.1 (StataCorp, College Station, TX, USA). Statistical tests presented were two-sided, and a P-value <0.05 was considered statistically significant. Using 2×2 tables, we calculated sensitivity (Se) and specificity (Sp) with 95% confidence intervals (95% CI) for each symptom, sign or test, and created separate forest plots for each diagnostic variable. We calculated the summary estimates of Se and Sp, positive (LR+) and negative likelihood ratio (LR−) and diagnostic odds ratio (DOR), using a random effect bivariate model, and fitted the hierarchical summary receiver operating characteristic (HSROC) curves, including all diagnostic variables in the DSLC− and DSLC+ models, using the AA endpoint.
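For a single diagnostic variable, the quantities named above follow directly from the 2×2 table. The sketch below shows only these textbook definitions; it does not reproduce the random effect bivariate model or the confidence intervals, and the function name `diagnostic_metrics` and the counts in the example are hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return Se, Sp, LR+, LR- and DOR from the cells of a 2x2 table.

    tp: test positive, AA present     fp: test positive, AA absent
    fn: test negative, AA present     tn: test negative, AA absent
    All four cells must be non-zero for the ratios to be defined.
    """
    if min(tp, fp, fn, tn) <= 0:
        raise ValueError("all four cells must be positive")
    se = tp / (tp + fn)            # sensitivity
    sp = tn / (tn + fp)            # specificity
    lr_pos = se / (1 - sp)         # positive likelihood ratio
    lr_neg = (1 - se) / sp         # negative likelihood ratio
    dor = (tp * tn) / (fp * fn)    # diagnostic odds ratio
    return {"Se": se, "Sp": sp, "LR+": lr_pos, "LR-": lr_neg, "DOR": dor}


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = diagnostic_metrics(tp=90, fp=20, fn=10, tn=80)
for name, value in example.items():
    print(f"{name} = {value:.3f}")
```

The DOR is computed from the cell counts rather than as LR+/LR− to avoid an extra rounding step; the two expressions are algebraically identical.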
Using STATA's predict tool, we also made posterior predictions [Empirical Bayes (EB) estimates] of the Se and Sp for each variable in both female and male AA patients in the DSLC− and DSLC+ models. Analogous to its use in meta-analysis, EB estimates here give the best estimates of the true Se and Sp for each diagnostic variable, the variable-specific point estimates usually shrinking toward the summary point of the HSROC. We explored the statistical heterogeneity between diagnostic variables and DS models through visual examination of the forest plots and the HSROC curves. To study the potential bias, we used Cook's distance to check for particularly influential variables, together with a scatter plot of the standardised (level 2) residuals to identify variables that are distinct outliers.
Results
Diagnostic performance of the symptoms. The pooled overall gender-specific (F vs. M) Se of the diagnostic symptoms for detecting AA was 80% (95% CI=67%-90%) and 81% (95% CI=82-94%), respectively (Figures 1 and 2). In women Se was higher than 80% for 12 diagnostic symptoms, while in men Se was higher than 81% for 11 diagnostic symptoms. The five best diagnostic symptoms in women (vertigo, jaundice, micturition, drugs for abdominal pain and use of alcohol) showed 95-99% Se, whereas in men (vertigo, jaundice, micturition, drugs for abdominal pain and use of alcohol) showed 99-100% Se in diagnosis of AA (Figure 2). The pooled overall Sp of the diagnostic symptoms for detecting AA was 30% (95% CI=19%-42%) and 31% (95% CI=20-43%) for women and men, respectively (Figures 3 and 4). In women 9 diagnostic symptoms showed Sp higher than 30%, while in men Sp exceeded 31% for 10 diagnostic symptoms. The five best diagnostic symptoms in diagnosis of AA among women (location of initial pain, location of pain at diagnosis, type of pain, relieving factors, vomiting) showed 60-84% Sp while those (location of initial pain, location of pain at diagnosis, type of pain, relieving factors, vomiting) in men showed Sp of 51-89% (Figure 4).
Diagnostic performance of the signs and tests. The pooled overall Se of the diagnostic signs and tests for detecting AA was 86% (95% CI=79%-92%) and 88% (95% CI=82-94%) for women and men, respectively (Figures 5 and 6). In women 10 diagnostic signs and tests had Se exceeding 86%, while in men Se was higher than 88% for 10 diagnostic signs and tests. In diagnosis of AA the five best diagnostic signs and tests in women (distension, tenderness, mass, Murphy's sign, urine) showed 93-99% Se whereas those (scar, distension, mass, Murphy's sign, urine) in men showed 95-99% Se (Figure 6). The pooled overall Sp of the signs and tests was 34% (95% CI=20%-50%) and 34% (95% CI=20-51%) for women and men, respectively (Figures 7 and 8). In women 8 diagnostic signs and tests showed Sp higher than 34%, whereas in men Sp was higher than 34% for 7 diagnostic signs and tests. The seven best diagnostic signs and tests in women showed 59-90% Sp whereas those in men showed 53-86% Sp in diagnosis of AA (Figure 8).
Diagnostic performance of the DS without leucocytes (DSLC−) in women. The most important predictors of AA in women without LC (n=697) were tenderness, rigidity, guarding, location of pain at diagnosis, and renal tenderness. The significant predictors were used to construct the DSLC− formula for AA diagnosis. In practice, the DS formula is relatively simple to use, as the following example shows: a female patient is admitted to the emergency room with abdominal pain; at diagnosis the pain is localized in the right lower quadrant (RLQ) (1 point×1.33); clinical examination shows RLQ tenderness (1 point×2.98), rigidity (1 point×2.45) and guarding (1 point×2.08), and the renal tenderness test is positive (1 point×0.88). The best diagnostic performance level for the DSLC− formula in females (Se=93%, Sp=92%) in AA diagnosis was reached when the patients with a DSLC− value between −2.03 and −0.49 were considered as “grey area” patients, i.e. follow-up required before the decision to operate (n=123). The formula was tested at six different cut-off levels to disclose the best diagnostic performance in women (Figures 9 and 10). The pooled overall Se and Sp of these six DSLC− formulas were 88% (95% CI=83-92%) and 89% (95% CI=84-94%), respectively (Figures 9 and 10). Three of these formulas showed Se >88% and four formulas had Sp >89%. The best diagnostic DSLC− formula in women (formula DS VI, Figures 9 and 10) showed Se of 93% (95% CI=87-97%) and Sp of 92% (95% CI=89-94%).
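Worked through, the example patient above, in whom all five findings are positive, obtains the following score. The arithmetic uses only the weights and the constant (−7.22) of the female DSLC− formula given in Patients and Methods.

```latex
\begin{align*}
\mathrm{DSLC}^{-} &= 1.33 + 2.98 + 2.45 + 2.08 + 0.88 - 7.22 \\
                  &= 9.72 - 7.22 \\
                  &= 2.50
\end{align*}
```

Since 2.50 lies above the upper limit of the grey area (−0.49), this patient falls on the positive side of the score rather than in the follow-up range.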
Diagnostic performance of the DS with leucocytes in women (DSLC+). As for the DSLC− formulas, the significant independent predictors were used to build the six different DSLC+ formulas. The pooled overall Se and Sp of these six DSLC+ models in women were 90% (95% CI=85-95%) and 85% (95% CI=74-94%), respectively (Figures 11 and 12). Four formulas showed Se >90% and four formulas had Sp >85%. The best diagnostic DSLC+ formula (formula DS XII, Figures 11 and 12) showed Se of 93% (95% CI=87-97%) and Sp of 91% (95% CI=88-94%).
Diagnostic performance of the DS without leucocytes (DSLC−) in men. The most important predictors of AA in male patients without LC (n=636) were location of pain at diagnosis, previous abdominal surgery, tenderness, rebound, rigidity, guarding and body temperature (the DSLC− formula is shown in the Patients and Methods section). In male patients the DSLC− formula reached Se of 95% with Sp of 89% when the male patients with a DS value between −2.00 and −0.48 were considered as “grey area” patients, i.e. follow-up required before the decision to operate (n=75). The DSLC− formula was tested at six different cut-off levels to find the best diagnostic performance for AA in men (Figures 13 and 14). The pooled overall Se and Sp of these six DSLC− formulas were 94% (95% CI=90-96%) and 79% (95% CI=68-88%), respectively (Figures 13 and 14). Four of these formulas showed Se >94% and three formulas had Sp >79%. The best diagnostic DSLC− formula in men (formula DS XI, Figures 13 and 14) showed Se of 95% (95% CI=89-98%) and Sp of 89% (95% CI=86-92%).
Diagnostic performance of the DS with leucocytes in men (DSLC+). The pooled overall Se and Sp of the six DSLC+ formulas in men were 93% (95% CI=88-96%) and 84% (95% CI=74-92%), respectively (Figures 15 and 16). Four formulas showed Se >93% and three formulas had Sp >84%. The best diagnostic DSLC+ formula in men (formula DS XII, Figures 15 and 16) showed Se of 93% (95% CI=87-97%) and Sp of 93% (95% CI=90-96%).
HSROC analyses and empirical Bayes (EB) estimates in both genders. STATA (metandiplot algorithm) was used to draw the HSROC curves and EB estimates to visualise the comparison of the pooled overall diagnostic performance of the different DS formulas in detecting AA in women (Figures 17 and 18) and men (Figures 19 and 20). In the HSROC analysis in women, there is no statistically significant difference between the DSLC− and DSLC+ formulas, with AUC=0.949 (95% CI=0.921-0.968) and AUC=0.953 (95% CI=0.923-0.969) (p=0.631, ROC comparison test). The same is true with the HSROC analysis in men, with no difference between the DSLC− and DSLC+ formulas, with AUC=0.948 (95% CI=0.920-0.964) and AUC=0.956 (95% CI=0.930-0.969) (p=0.321, ROC comparison test).
Discussion
We studied patients presenting with AAP as a part of the survey by the OMGE Committee (4-8) and estimated the diagnostic accuracy of a combined history-taking, clinical examination and laboratory testing in verified AA (5), NSAP (5), acute small bowel obstruction (7) and in acute renal stone disease (8). Although there are several different DS systems designed for AAP diagnosis (5, 9-15) and the international guidelines recommend routine diagnostic scoring to improve the diagnosis of AA (16, 17), a debate continues on the shortcomings of the specific DS models in women and men with AAP. Thus, it was appropriate to compare the performance of our gender-specific DS models in both genders, using DSs with and without LC.
Comparison of the symptoms, signs and laboratory tests in women vs. men. There was no gender-specific difference in the clinical symptoms, since the five diagnostic symptoms with highest diagnostic accuracy were identical in women and men (vertigo, jaundice, micturition, drugs for abdominal pain and use of alcohol), showing 95-100% Se in diagnosis of AA. The same applies to gender-specific difference in Sp, the five most relevant symptoms being identical in both genders (location of initial pain, location of pain at diagnosis, type of pain, relieving factors and vomiting) presenting with 51-89% Sp in diagnosis of AA.
Similarly, there was no significant difference in signs and laboratory test results between women and men with confirmed AA in their pooled Se, because the five best diagnostic signs and tests were very similar in both genders; women (distension, tenderness, mass, Murphy's sign, urine) and men (scar, distension, mass, Murphy's sign, urine) showing 93-99% Se in diagnosis of AA. Also, the pooled Sp of the signs and tests in AA detection was equal in both genders.
Female DSLC− and DSLC+. It was of interest to assess whether the addition of LC would give any added value to our DSs, which was done by comparing the diagnostic accuracy of the DSLC− and DSLC+ scores. The present analysis suggests that female patients with DSLC− below −2.03 should not be operated on, while women with DSLC− falling between −2.03 and −0.49 should be followed up before the final decision. According to our data, only the women with AAP whose DSLC− values exceed −0.49 should be operated on without delay. In women whose LC was calculable (n=575) the important predictors of AA were the same as in women without LC, but LC is added to the DSLC+ formula (LC≥10,000 μl) (see the methods for formula details). In women, the highest diagnostic accuracy for the DSLC+ formula (Se=93%, Sp=91%) in AA diagnosis was reached when the patients with DSLC+ values falling between −2.33 and −0.41 were considered as “grey zone” patients, for whom follow-up was appropriate before the decision to operate (n=77). Taken together, i) the female AAP patients with a DSLC+ value below −2.33 should not be operated on, ii) those with a DSLC+ value between −2.33 and −0.41 should be followed up and iii) all women with AAP and a DSLC+ value higher than −0.41 should be operated on without delay.
Male DSLC− and DSLC+. The same considerations can be made among male patients with AAP. Our present data suggest that male patients with a DSLC− value below −2.00 should not be operated on. Those men with DSLC− values between −2.00 and −0.48 should be followed up, whereas all those with DSLC− values above −0.48 should be operated on with no delay. In males with LC available (n=476), the AA predictors were the same as earlier, with the addition of LC and rectal digital examination. When the men with a DSLC+ value between −1.74 and −0.14 were considered equivocal (n=67, follow-up required), the Se of this DSLC+ in AA was 93%, with a Sp of 93% and an efficiency of 93%. As to the males and DSLC+, the present analysis implies that the patients with a DSLC+ value below −1.74 should not be operated on, while those with DSLC+ values between −1.74 and −0.14 could be safely followed up. This leaves only the male patients with DSLC+ values above −0.14, who should be operated on immediately.
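The three-way decision rule described for the female and male scores can be summarised in a short sketch. The limits are those quoted in the text; the names `GREY_ZONES` and `triage`, the category labels, and the choice to count a score exactly on a limit as belonging to the grey zone are assumptions made here, since the text does not state how boundary values are handled. The sketch illustrates the rule only and is not a clinical tool.

```python
# Lower and upper limits of the "grey zone" for each score, from the text.
GREY_ZONES = {
    ("women", "DSLC-"): (-2.03, -0.49),
    ("women", "DSLC+"): (-2.33, -0.41),
    ("men", "DSLC-"): (-2.00, -0.48),
    ("men", "DSLC+"): (-1.74, -0.14),
}


def triage(sex, model, score):
    """Map a score to one of the three categories described in the text.

    Assumption: scores exactly equal to a limit are treated as grey zone.
    """
    lower, upper = GREY_ZONES[(sex, model)]
    if score < lower:
        return "no operation"
    if score > upper:
        return "operate"
    return "follow-up"


# Example: the same score can fall in different categories by model.
for key in GREY_ZONES:
    print(key, triage(*key, score=-0.45))
```

With a score of −0.45, the two DSLC− rules return "operate" while the female and male DSLC+ rules differ, which shows why the limits have to be read together with the formula that produced the score.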
A right iliac fossa tenderness (RIFT) Study Group in the UK suggested that female patients (272/964, 28.2%) with AAP were more than twice as likely as males (120/993, 12.1%) to undergo surgery that yielded a histologically normal appendix (i.e. negative appendectomy, FP) (RR=2.33, 95% CI=1.92-2.84, p<0.001). Although several AA risk scorings (5, 9-15) demonstrate different predictive factors for AA diagnosis, as far as we know, only the RIPASA scoring provides gender-specific data (19). The present study is the second to provide such data, reporting the diagnostic performance of the DSLC− and DSLC+ models in both genders. The aim was to elaborate the optimal combination of symptoms, signs and tests in the DS formulas with and without LC, using six different combinations (DSLC− and DSLC+) as diagnostic predictors of AA. Our DS is in line with the APPEND score (20) in that LC testing was not an important predictor of AA diagnosis, although the APPEND score does not account for the significance of gender in AA. The Alvarado (9) and Appendicitis Inflammatory Response (21) scores emphasize LC analysis as a significant predictor of AA diagnosis, and the APPEND score identified neutrophil percent and neutrophil/lymphocyte ratio as important predictors of AA. AA diagnosis strategies in the future may include early markers of inflammation, e.g. interleukin 6 (IL-6), blood levels of which were shown to increase as much as 3-fold from the reference levels in perforated AA (22), suggesting that IL-6 analysis may be useful in predicting AA complication risk. Although the IL-6 analysis is promising, the current antigen test method so far precludes its use as a rapid test in AA (23, 24).
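The relative risk quoted above can be recomputed from the reported counts. The sketch below assumes the usual Wald interval on the logarithmic scale; the text does not state how the published interval was derived, and the function name `relative_risk` is hypothetical.

```python
import math


def relative_risk(events_a, total_a, events_b, total_b, z=1.96):
    """Relative risk of group A versus group B with a Wald 95% CI.

    The interval is computed on the log scale, a standard textbook
    approach assumed here for illustration.
    """
    rr = (events_a / total_a) / (events_b / total_b)
    se_log = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Negative appendectomies: 272 of 964 women versus 120 of 993 men.
rr, lower, upper = relative_risk(272, 964, 120, 993)
print(f"RR = {rr:.2f} (95% CI = {lower:.2f}-{upper:.2f})")
```

Under this assumption the result agrees with the reported RR=2.33 (95% CI=1.92-2.84) to two decimals.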
Conclusion
In conclusion, our gender-specific DS reached very high AUC values (0.948-0.956) in both genders, and using the ROC comparison test, there was no statistically significant difference in the AUC values of DSLC− and DSLC+ between women and men. Kularatna et al. (15) reviewed the available DS formulas in AA and reported AUC values between 84% and 96% in AA diagnosis. Although a weakness of that meta-analysis is the heterogeneity and quality of the included studies, it seems that the Tzanakis score (13), with ultrasound (US) and inflammatory markers, reached the highest diagnostic performance, with 96% AUC in AA. This is comparable to the AUC values obtained in the present study. However, the advantage of our DS is that it does not need US or LC analysis to reach a high diagnostic accuracy in AA.
Acknowledgements
The study was funded by the Päivikki ja Sakari Sohlberg Foundation.
Footnotes
Authors' Contributions
All Authors have met all of the following four criteria: 1. Substantial contributions to the conception or design of the work or the acquisition, analysis, or interpretation of data for the work. 2. Drafting the work or revising it critically for important intellectual content. 3. Final approval of the version to be published. 4. Agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
This article is freely accessible online.
Conflicts of Interest
The Authors report no conflicts of interest or financial ties to disclose. The Authors alone are responsible for the content and writing of this article.
- Received September 15, 2020.
- Revision received September 30, 2020.
- Accepted October 2, 2020.
- Copyright© 2020, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved