Original Article

The Effectiveness of Appendicitis Inflammatory Response Score in the Evaluation of Acute Appendicitis: A Meta-analysis

10.4274/hamidiyemedj.galenos.2023.98159

  • İlkay Güler
  • Dilay Satılmış
  • Sinan Ömeroğlu
  • Nurgül Balcı

Received Date: 30.05.2023 Accepted Date: 19.07.2023 Hamidiye Med J 2023;4(1):15-21

Background:

Appendicitis is one of the most frequent causes of urgent abdominal disease. Various methods are used in the diagnosis of appendicitis, and scoring systems are among them. We aimed to meta-analyze the studies evaluating the effectiveness of the appendicitis inflammatory response score (AIR-S). In light of the studies on the topic, we intended to highlight the diagnostic benefits of scoring, which allows for the simultaneous evaluation of clinical, laboratory, and imaging findings as well as the patient’s medical history, and to add to the literature.

Materials and Methods:

The Web of Science Core Collection, PubMed, and Google Scholar databases were searched for all studies published in the last 15 years using the terms “All fields=appendicitis inflammatory response” AND “All fields=receiver operating characteristic”, and a systematic review and meta-analysis were performed.

Results:

Thirteen publications were included in the study according to the inclusion and exclusion criteria. The studies were conducted on 8,052 patients with a mean age of 30.00 years; 48.00% were male and 52.00% were female. The cut-off point of the studies was 5.00, with a sensitivity of 85.00% and a specificity of 59.00%. The studies were homogeneous (I²=19.830; Cochran Q=14.968; p>0.05). The diagnostic discriminative ability of AIR-S was statistically significant (total-fixed effects=0.838; 95% confidence interval 0.800-0.875; p<0.001). There was no statistically significant publication bias (p>0.05).

Conclusion:

In this study, the sum of the sensitivity and specificity values determined for AIR-S was below 170. This finding suggests that using AIR-S alone to diagnose acute appendicitis is insufficient and that it is preferable to use it in conjunction with other diagnostic measures.

Keywords: Appendicitis, appendicitis inflammatory response score, receiver operating characteristic, bias

Introduction

Appendicitis is one of the most frequent causes of urgent abdominal disease. Acute appendicitis (AA) is known to affect 90-100 per 100,000 people in developed countries (1). If appendicitis is suspected in a patient presenting with acute abdominal symptoms, the diagnosis should be confirmed before emergency surgery is performed. Rapid and accurate diagnosis is of great importance to reduce the complications of AA and the mortality that may result from them. For many reasons (pregnancy, hematologic origin, etc.), obtaining a reliable preoperative diagnosis may be difficult even for physicians and/or experienced surgeons (2,3).

Clinicians have developed various scoring tools with prognostic value based on the principle of evaluating many clinical findings together in order to minimize the margin of error in the diagnosis of AA and to confirm the preliminary diagnosis. Among these scores, Alvarado, Modified Alvarado, Lintula, Tzanakis, appendicitis inflammatory response score (AIRS), Ohmann, Fenyo-Lindberg and Raja Isteri Pengiran Anak Saleha Appendicitis (RIPASA) aim to improve diagnostic ability and reduce the rate of negative appendicectomy (4). Recently, scoring models have been reported to predict complicated acute appendicitis. Scoring systems that combine clinical and imaging features and scoring models that can be calculated using predictive equations have been proposed (5,6,7,8).

Important clues for AA can be obtained through tools that assess risk according to a score obtained by combining patients’ symptoms, clinical findings, and laboratory results, weighted according to their severity and level. The appendicitis inflammatory response score (AIR-S), one of these instruments, was created in 2008 and is currently the most widely used pre-operative tool. The World Society for Emergency Surgery’s 2020 clinical practice guidelines suggest using AIR-S for the diagnosis and management of AA (9).

In this study, we aimed to meta-analyze the studies evaluating the effectiveness of the AIR-S. In light of these studies, we sought to draw attention to the diagnostic value of scoring, which enables clinical, laboratory, and imaging findings to be evaluated together with the history obtained from the patient, to emphasize the importance of its use in daily practice, and to contribute to the literature.


Material and Methods

The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were followed when conducting this investigation. Ethical approval was not required for this study.

Keywords and Search Strategy

We analyzed studies that used AIR-S in patients with AA and evaluated the diagnostic value of this score by receiver operating characteristic (ROC) analysis. The Web of Science Core Collection, PubMed, and Google Scholar databases (accessed in March, April, and May 2023) were searched for all studies published in the last 15 years using the terms “All fields=appendicitis inflammatory response” AND “All fields=ROC”, and a systematic review and meta-analysis were performed.

AIRS

The AIR-S is based on vomiting, abdominal pain migrating to the right iliac fossa, rebound tenderness, fever (°C), polymorphonuclear leukocyte proportion (PML, %), white blood cell count (WBC, ×10⁹/L), and C-reactive protein (CRP, mg/L) concentration. Each criterion is scored according to its severity: vomiting (score: 1), abdominal pain that migrates to the right iliac fossa (score: 1), rebound tenderness or muscular defense (light: 1, moderate: 2, strong: 3), fever of 38.5 °C or more (score: 1), PML (70-84%: 1, ≥85%: 2), WBC (10.0-14.9×10⁹/L: 1, ≥15.0×10⁹/L: 2), and CRP (10-49 mg/L: 1, ≥50 mg/L: 2). The assessment results in a final score ranging from 0 to 12. A total score of “0-4” means “low probability/follow-up”, “5-8” means “re-evaluation/outpatient follow-up”, and “9-12” means “high probability/surgical exploration” (9).
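The scoring rules described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a validated clinical tool: the class and function names are hypothetical, and the intermediate band is assumed to be 5-8, so that the three risk groups do not overlap.

```python
from dataclasses import dataclass


@dataclass
class AirFindings:
    """Inputs for the score; field names are illustrative, not from the article."""
    vomiting: bool
    rif_pain: bool        # abdominal pain migrating to the right iliac fossa
    rebound_grade: int    # 0 = none, 1 = light, 2 = moderate, 3 = strong
    temperature_c: float  # body temperature in degrees Celsius
    pmn_percent: float    # polymorphonuclear leukocyte proportion (%)
    wbc: float            # white blood cell count, x10^9/L
    crp: float            # C-reactive protein, mg/L


def banded(value: float, low: float, high: float) -> int:
    """Return 2 if value >= high, 1 if value >= low, otherwise 0."""
    if value >= high:
        return 2
    return 1 if value >= low else 0


def air_score(f: AirFindings) -> int:
    """Sum the seven components; the result lies between 0 and 12."""
    if not 0 <= f.rebound_grade <= 3:
        raise ValueError("rebound_grade must be between 0 and 3")
    return (
        int(f.vomiting)
        + int(f.rif_pain)
        + f.rebound_grade
        + int(f.temperature_c >= 38.5)
        + banded(f.pmn_percent, 70.0, 85.0)
        + banded(f.wbc, 10.0, 15.0)
        + banded(f.crp, 10.0, 50.0)
    )


def risk_group(score: int) -> str:
    """Assumed bands: 0-4 low, 5-8 intermediate, 9-12 high."""
    if score <= 4:
        return "low"
    return "intermediate" if score <= 8 else "high"


# Hypothetical patient: vomiting, migrating pain, moderate rebound tenderness,
# 38.6 °C, PMN 80%, WBC 16.0 x10^9/L, CRP 55 mg/L.
example = AirFindings(True, True, 2, 38.6, 80.0, 16.0, 55.0)
print(air_score(example), risk_group(air_score(example)))  # prints "10 high"
```

The `banded` helper covers the three laboratory items, which share the same two-threshold pattern; values between the printed bands (for example, a PML of 84.5%) are assigned to the lower band here, which is an assumption of this sketch.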

Inclusion and Exclusion Criteria

The inclusion criteria were: studies conducted in adult patients; publication within the last fifteen years; a diagnosis of acute appendicitis; evaluation of AIR-S performance by ROC analysis; and reporting of the area under the curve (AUC) with its standard error or 95% confidence interval.

Studies were excluded if they did not meet one or more of the inclusion criteria, or if they were studies in pediatric patients, case reports, systematic reviews, conference reports, or animal experiments, or had missing data in the ROC analysis results.

Outcome Measures

To assess the effectiveness of AIR-S in clinical decision-making, we used all ROC analysis results calculated and reported for the total number of cases in the studies.

Data Extraction and Assessment of Quality

A total of 315 studies were retrieved using the search strategy. The researchers who planned the study (four independent researchers with academic qualifications in general surgery, emergency medicine, and family medicine) checked the titles and abstracts of the retrieved articles for compliance with the study criteria. At the end of the search and screening process, the researchers recorded all data on purpose-designed data collection forms, and all records were collected in a common file. The findings of the 13 studies that met the inclusion criteria were statistically evaluated. Figure 1A displays the study’s PRISMA flow diagram.

Statistical Analysis

The studies obtained through this systematic review were pooled in a meta-analysis. Heterogeneity between studies was assessed with the I² statistic, and the risk of publication bias was investigated with a funnel plot and Egger’s regression. The Cochrane Risk of Bias tool, which evaluates the presence of bias with seven criteria, was applied to the included studies. With this tool, each study was rated as having “low”, “unclear”, or “high” risk of bias (10). A threshold of 0.25 for the I² value was used to determine whether heterogeneity was present, and 0.05 was accepted as the threshold for statistical significance. Calculations were performed with MedCalc (version 20.218, free trial) and Meta-DiSc (version 1.4).
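As an illustration of how study-level AUC values might be combined, the following sketch applies generic inverse-variance fixed-effect pooling and computes Cochran’s Q. It is not a reproduction of the MedCalc or Meta-DiSc procedures, whose internal details are not described here; the function names are hypothetical, and the two input studies are invented for demonstration only.

```python
import math


def se_from_ci(lower: float, upper: float, z: float = 1.96) -> float:
    """Approximate a standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2.0 * z)


def fixed_effect_pool(estimates, std_errors, z: float = 1.96) -> dict:
    """Inverse-variance fixed-effect pooling with Cochran's Q."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    return {
        "pooled": pooled,
        "se": pooled_se,
        "ci": (pooled - z * pooled_se, pooled + z * pooled_se),
        "q": q,
        "df": len(estimates) - 1,
    }


# Hypothetical AUCs and standard errors, for illustration only.
aucs = [0.80, 0.90]
ses = [0.02, 0.02]
result = fixed_effect_pool(aucs, ses)
print(result["pooled"], result["ci"], result["q"])
```

When a study reports only a 95% confidence interval, `se_from_ci` gives an approximate standard error, assuming the interval is symmetric and based on the normal approximation.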


Results

The database search identified a total of 315 articles containing patient data related to AIR-S. The titles and abstracts of the articles were read to determine whether they were pertinent to the topic. After the inclusion and exclusion criteria were applied, thirteen publications in total were included in the study for evaluation. The PRISMA flow chart of the study is shown in Figure 1A. Seven criteria were used to assess the risk of bias. Each study was assessed as low risk, high risk, or unclear risk of bias. The highest risk was for “blinding of participants and personnel” and the lowest was for “allocation concealment” (Figure 1B).

The 13 studies that used the ROC curve, a statistical technique for measuring performance, to evaluate AIR-S included a total of 8,052 patients; 48% were male, 52% were female, and the mean age of all patients was 30 years. When the common features of the studies were analyzed, the cutoff point was 5, with a sensitivity of 86.90% and a specificity of 53.80% (n=10). The remaining three studies aimed to show the clinical efficacy of the AIR-S with ROC analysis results but did not report the cutoff point or the sensitivity analysis results related to this point (Table 1).

The clinical efficacy of AIR-S was evaluated in the 13 included studies. For the calculation of the standard error, a constant of five was added to all cell counts in all runs to avoid division-by-zero errors. The studies were homogeneous (I²=19.830; Cochran Q=14.968; p=0.243). According to the results, the area under the curve of AIR-S was statistically significant (total-fixed effects=0.838; 95% confidence interval 0.800-0.875; p<0.001; Figure 2A). The summary receiver operating characteristic (sROC) curve, which is shaped according to the common results of the studies and includes the sensitivity analysis values, is presented in Figure 2B. The diagnostic discriminative ability of AIR-S was statistically significant (p<0.001).
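The reported heterogeneity statistics can be cross-checked from Cochran’s Q alone. The sketch below assumes the usual definition of I² as (Q − df)/Q with df = k − 1 for k studies, and uses a closed-form upper-tail chi-square probability that holds only for an even number of degrees of freedom (here df = 12). The function names are hypothetical.

```python
import math


def i_squared(q: float, k: int) -> float:
    """I² (in percent) from Cochran's Q across k studies."""
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail chi-square probability; closed form valid only for even df."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Accumulate sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


q_reported, k_studies = 14.968, 13
print(round(i_squared(q_reported, k_studies), 2))
print(round(chi2_sf_even_df(q_reported, k_studies - 1), 3))
```

With Q=14.968 and 13 studies, this calculation should agree, to rounding, with the reported I² of 19.830 and p-value of 0.243.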

There was no statistically significant publication bias (p=0.191). The funnel plot was symmetrical, suggesting that substantial bias is unlikely (Figure 3).


Discussion

Today, patients can present with symptoms of AA at any time of the day to the emergency department, general surgery outpatient clinic or family medicine centers of healthcare facilities. It is important for physicians to rely on clinical scoring systems during off-hours (nights, weekends, holidays) when access to imaging services (especially ultrasonography) is difficult. In this study, we aimed to examine the diagnostic efficacy of the AIR-S developed for the combined evaluation of clinical and laboratory parameters in the diagnosis of AA by meta-analysis.

The AIR-S has been accepted as one of the best-performing scores for the diagnosis of AA among the various clinical prediction tools available. Used together with other scoring systems, AIR-S can significantly reduce the risk of overdiagnosis of AA and thus provide reliable diagnostic performance, while at the same time enabling treating surgeons to avoid the routine use of computed tomography (24).

The results of validation studies were summarized, and the AIR-S was shown to have an area under the ROC curve (AUC) of 0.84, a sensitivity of 92%, and a specificity of 63%. The AIR score performed best compared with other scoring systems (Alvarado, RIPASA, Ohmann, Eskelinen, Lintula, Modified Alvarado) in terms of sensitivity, specificity, AUC values, and usability (25). However, it is recommended that the sum of the sensitivity and specificity (in percent) of a test used for diagnosis be above 170 (26). In our study, the sum of the sensitivity and specificity determined for the AIR-S was below 170. This result indicates that the AIR-S alone is not a sufficient diagnostic tool for AA and that it is more appropriate to use it together with other parameters that aid diagnosis.
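The comparison against the threshold of 170 is simple arithmetic, sketched below with the sensitivity and specificity reported in the Results section (86.90% and 53.80%). The function name and the default threshold argument are illustrative.

```python
def meets_sum_threshold(sensitivity_pct: float, specificity_pct: float,
                        threshold: float = 170.0) -> bool:
    """Check whether sensitivity + specificity (both in percent) exceeds the threshold."""
    return sensitivity_pct + specificity_pct > threshold


# Pooled values at the cutoff of 5, as reported in the Results section.
sensitivity, specificity = 86.90, 53.80
total = sensitivity + specificity
print(round(total, 1), meets_sum_threshold(sensitivity, specificity))  # prints "140.7 False"
```

The same check with the validation-study values cited above (92% and 63%) gives a sum of 155, which is also below 170.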

The AIR-S is a valid decision support system for clinical diagnosis and has high sensitivity for complicated appendicitis. In addition, it was emphasized that the AIR-S had a high discrimination capacity in children and patients with long-term symptoms and performed equally well in both sexes (13).

The diagnostic accuracy of the RIPASA system, one of the scoring systems used in the literature, has been reported to be better than that of the Alvarado and AIR scores; however, the single-center design of the study is a limitation, because results differ by region and ethnicity (20).

Alvarado and AIR-S, the scores most commonly used in the diagnosis of AA in pregnant women, were found to be effective diagnostic methods. However, the higher accuracy of AIR-S, which includes the CRP value, suggests that this system has an advantage (27).

In line with the results of this meta-analysis, patients whose total AIR-S is below 5 can be discharged with detailed information (if complaints such as pain, nausea, or vomiting persist or worsen, they should return to the emergency department), with the understanding that re-evaluation and careful follow-up are essential if symptoms change and/or worsen. Recent studies have also shown that antibiotics as a non-operative treatment method have a low morbidity and treatment success rate after 30 days of follow-up (28,29).

AIR-S can be particularly useful in environments and situations where imaging methods are limited or unavailable and resources are scarce. Risk stratification of patients with suspected AA according to the AIR-S can guide the decision-making process to optimize the utility of diagnostic imaging and avoid negative and unnecessary investigations.


Conclusion

The AIR-S has significant diagnostic efficacy in the diagnosis of AA. Risk stratification of patients with suspected AA according to the AIR-S can guide decision-making to reduce admissions, optimize the utility of diagnostic imaging, and avoid adverse and unnecessary explorations. Physician confidence in clinical scoring systems is also important.

However, in this study, the sum of the values determined for the diagnostic parameters of AIR-S was below 170. This result indicates to clinicians that the AIR-S alone is not an adequate diagnostic tool in the diagnosis of AA and that it is more appropriate to use it in conjunction with other diagnostic parameters.

Acknowledgement: I would like to thank Özlem Köksal (Public Hospitals Services Presidency-1, İstanbul, Türkiye) for her effort in preparing the meta-analysis statistics of the article.

Ethics

Ethics Committee Approval: Ethics committee approval was not required as it was a meta-analysis study.

Informed Consent: Not required as it was a meta-analysis study.

Peer-review: Externally and internally peer-reviewed.

Authorship Contributions

Surgical and Medical Practices: İ.G., D.S., S.Ö., N.B., Concept: İ.G., D.S., Design: İ.G., D.S., Data Collection or Processing: İ.G., D.S., N.B., Analysis or Interpretation: İ.G., D.S., N.B., Literature Search: İ.G., D.S., N.B., Writing: İ.G., D.S., S.Ö., N.B.

Conflict of Interest: There is no conflict of interest between the authors.

Financial Disclosure: The authors declared that this study received no financial support.


References

  1. Rodriguez A, Barraco RD, Ivatury RR. Geriatric Trauma and Acute Care Surgery. In: Anderson TN, Moore F, Jordan J, editors. Acute Appendicitis 2018.
  2. Ellis H. The story of appendicitis and its treatment. J Perioper Pract. 2019;30:188.
  3. Bhangu A, Søreide K, Di Saverio S, Assarsson JH, Drake FT. Acute appendicitis: modern understanding of pathogenesis, diagnosis, and management. Lancet. 2015;386:1278-1287.
  4. Sharma K, Thomas S, Chopra A, Choudhury M. Evaluation of the Diagnostic Accuracy of Eight Reported Clinical Scoring Systems in the Diagnosis of Acute Appendicitis. Indian J Surg. 2022;84:741-748.
  5. Kobayashi T, Hidaka E, Koganezawa I, Nakagawa M, Yokozuka K, Ochiai S, et al. Development of a scoring model based on objective factors to predict gangrenous/perforated appendicitis. BMC Gastroenterol. 2023;23:198.
  6. Atema JJ, van Rossem CC, Leeuwenburgh MM, Stoker J, Boermeester MA. Scoring system to distinguish uncomplicated from complicated acute appendicitis. Br J Surg. 2015;102:979-990.
  7. Eddama M, Fragkos KC, Renshaw S, Aldridge M, Bough G, Bonthala L, et al. Logistic regression model to predict acute uncomplicated and complicated appendicitis. Ann R Coll Surg Engl. 2019;101:107-118.
  8. Geerdink TH, Augustinus S, Atema JJ, Jensch S, Vrouenraets BC, de Castro SMM. Validation of a Scoring System to Distinguish Uncomplicated From Complicated Appendicitis. J Surg Res. 2021;258:231-238.
  9. Andersson M, Andersson RE. The Appendicitis Inflammatory Response Score: A Tool for the Diagnosis of Acute Appendicitis that Outperforms the Alvarado Score. World J Surg. 2008;32:1843-1849.
  10. Higgins JP, Green S. Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 (updated March 2011). The Cochrane Collaboration, 2011.
  11. Ak R, Doğanay F, Akoğlu EU, Akoğlu H, Uçar AB, Kurt E, et al. Predictive value of scoring systems for the diagnosis of acute appendicitis in emergency department patients: Is there an accurate one? Hong Kong J Emerg Med. 2020;27:262-269.
  12. Andersson M, Rubér M, Ekerfelt C, Hallgren HB, Olaison G, Andersson RE. Can new inflammatory markers improve the diagnosis of acute appendicitis? World J Surg. 2014;38:2777-2783.
  13. Andersson M, Kolodziej B, Andersson RE. Validation of the Appendicitis Inflammatory Response (AIR) Score. World J Surg. 2021;45:2081-2091.
  14. Andrade JRR, Rivas MVU, Cumbe JCO, Abad KMC, Medina PPN. Evaluación de la Escala de Alvarado versus Score de Respuesta Inflamatoria de la Apendicitis, Hospital José Carrasco Arteaga 2018. Revista Médica HJCA. 2020;12:2.
  15. Birben B, Muge Sonmez B, Er S, Ozden S, Tugra Kosa M, Tez M. Comparison of diagnostic scoring systems with imaging methods for the diagnosis of acute appendicitis. Ann Med Res. 2020;27:3166-3170.
  16. Bokade S, Verma A, Kumar S. Comparative Study of Modified Alvarado Score, Appendicitis Inflammatory Response Score and New Adult Appendicitis Score In Predicting The Accuracy Of Diagnosing Acute Appendicitis. IOSR Journal of Dental and Medical Sciences (IOSR-JDMS). 2021;20:16-35.
  17. Chae MS, Hong CK, Ha YR, Chae MK, Kim YS, Shin TY, et al. Can clinical scoring systems improve the diagnostic accuracy in patients with suspected adult appendicitis and equivocal preoperative computed tomography findings? Clin Exp Emerg Med. 2017;20;4:214-221.
  18. Gadahire M, Wadaskar S, Yadav RK. Prospective Study To Evaluate Appendicitis Inflammatory Response Score And Ct-Scan To Diagnose Acute Appendicitis. International Journal of Academic Medicine and Pharmacy. 2022;5:246-251.
  19. Haak F, Kollmar O, Ioannidis A, Slotta JE, Ghadimi MB, Glass T, et al. Predicting complicated appendicitis based on clinical findings: the role of Alvarado and Appendicitis Inflammatory Response scores. Langenbecks Arch Surg. 2022;407:2051-2057.
  20. March B, Leigh L, Brussius-Coelho M, Holmes M, Pockney P, Gani J. Can CRP velocity in right iliac fossa pain identify patients for intervention? A prospective observational cohort study. Surgeon. 2019;17:284-290.
  21. Martín-Del Olmo JC, Concejo-Cutoli P, Vaquero-Puerta C, López-Mestanza C, Gómez-López JR. Clinical prediction rules in acute appendicitis: which combination of variables is more effective at predicting? Cir Cir. 2022;90:42-49.
  22. Sammalkorpi HE, Mentula P, Leppäniemi A. A new adult appendicitis score improves diagnostic accuracy of acute appendicitis-a prospective study. BMC Gastroenterol. 2014;14:114.
  23. Scott AJ, Mason SE, Arunakirinathan M, Reissis Y, Kinross JM, Smith JJ. Risk stratification by the Appendicitis Inflammatory Response score to guide decision-making in patients with suspected appendicitis. Br J Surg. 2015;102:563-572.
  24. Di Saverio S, Podda M, De Simone B, Ceresoli M, Augustin G, Gori A, et al. Diagnosis and treatment of acute appendicitis: 2020 update of the WSES Jerusalem guidelines. World J Emerg Surg. 2020;15:27.
  25. Kularatna M, Lauti M, Haran C, MacFater W, Sheikh L, Huang Y, et al. Clinical prediction rules for appendicitis in adults: which is best? World J Surg. 2017;41:1769-1781.
  26. Wians FH. Clinical laboratory tests: which, why, and what do the results mean? Laboratory Medicine 2009;40:105-113.
  27. Bardakçi O, Bahçecioğlu İB, Tatli F, Özgönül A, Güldür ME, Uzunköy A. Does one of the two most commonly used scoring systems have a decisive advantage over the other in diagnosing acute appendicitis in pregnant women? Medicine (Baltimore). 2023;102:e33596.
  28. Podda M, Gerardi C, Cillara N, Fearnhead N, Gomes CA, Birindelli A, et al. Antibiotic Treatment and Appendectomy for Uncomplicated Acute Appendicitis in Adults and Children: A Systematic Review and Meta-analysis. Ann Surg. 2019;270:1028-1040.
  29. Ceresoli M, Fumagalli C, Fugazzola P, Zanini N, Magnone S, Ravasi M, et al. Outpatient Non-operative Management of Uncomplicated Acute Appendicitis: A Non-inferiority Study. World J Surg. 2023;47:2378-2385.