To assess intra- and interobserver agreement among non-expert pathologists in identifying features of the eosinophilic esophagitis histologic scoring system (EoEHSS) in pediatric patients.
Patients and methods
The authors used 50 slides from patients (aged 1-15 years; 72% male) with EoE. EoEHSS evaluates eosinophilic inflammation and other features including epithelial basal zone hyperplasia, eosinophilic abscesses, eosinophil surface layering, dilated intercellular spaces, surface epithelial alteration, dyskeratotic epithelial cells, and lamina propria fibrosis. Grade and stage of abnormalities are scored using a 4-point scale (0 normal; 3 maximum change). Four pathologists determined EoEHSS findings on two occasions. Intra- and interobserver agreement was assessed using kappa (κ) statistics and intra-class correlation coefficients.
Results
Intra- and interobserver agreement for the identification of eosinophil counts ≥ 15/high-power field (HPF) was excellent but varied when assessing additional features of the EoEHSS. For the more experienced pathologist, agreement for most EoEHSS items and the composite scores was substantial to excellent. For the less experienced pathologists, intraobserver agreement ranged from absent to substantial for individual features and from moderate to substantial for the composite scores.
Conclusion
Most items of the EoEHSS had substantial to excellent reliability when assessed by a pathologist experienced in the diagnosis of EoE but presented lower repeatability among less experienced pathologists. These findings suggest that specific training of pathologists is required for the identification of EoEHSS characteristics beyond eosinophil count, as these features are considered useful in the evaluation of response to treatment and correlation with clinical manifestations and endoscopic findings.
Eosinophilic esophagitis (EoE) is a chronic, inflammatory, immune- and/or antigen-mediated disease clinically characterized by symptoms of esophageal dysfunction and histologically by dense eosinophilic inflammation in mucosal biopsies.1–3 Since its initial description at the end of the 1970s and its identification as a distinct clinical entity in 1993, EoE has been increasingly recognized over the last 20 years.1–5 EoE symptoms vary according to age. Infants and younger children may present with feeding difficulties, vomiting and regurgitation, while older children, adolescents and adults may also present with dysphagia and the sensation of food lodged in the esophagus (food impaction).1,2
The criteria for the diagnostic and therapeutic approach to EoE have been continuously discussed since the initial description of the disease.6,7
Upper gastrointestinal endoscopy with biopsy is essential for diagnosis. Endoscopic findings include edema, furrows or vertical lines, concentric rings, and exudates or white spots. However, the macroscopic appearance may be normal.2,3,8 When EoE is suspected, biopsies should be obtained even when the endoscopic appearance is normal. As the inflammation may be focal, it is recommended that multiple biopsies of at least two esophageal segments be obtained to increase the diagnostic accuracy.2
The diagnosis of EoE involves clinical, endoscopic and histological factors, i.e., there must be clinical manifestations of esophageal dysfunction combined with mucosal changes on endoscopy and/or eosinophilic infiltrate in esophageal biopsies with a count of ≥ 15 eosinophils per high-power field (eos/HPF) in the area of greatest eosinophil density in one or more tissue samples.2,3
Other causes of esophageal eosinophilia should be excluded, especially gastroesophageal reflux disease (GERD), infections, connective tissue diseases, Crohn's disease, and hypersensitivity to medications, through a detailed clinical history and physical examination and diagnostic testing according to the clinical suspicion.1–3
The spectrum of histopathological changes in EoE determined the development of classifications and scores that improve diagnostic quality.9–13 A histological scoring system uses the intensity and extent of eosinophil granule protein deposition; however, the need for immunohistochemistry techniques hinders its use in clinical practice.10
In 2017, Collins et al. developed a histological scoring system (EoEHSS) to assess changes in the mucosa in addition to the peak eosinophil count. Figure 1 illustrates the most common histological features observed in EoE. The EoEHSS evaluates eosinophil infiltration (EI), basal zone hyperplasia (BZH), the presence of eosinophilic abscesses (EA), eosinophil surface layering (SL), dilated intercellular spaces (DIS), surface epithelial alterations (SEA), dyskeratotic epithelial cells (DEC) and lamina propria fibrosis (LPF). The severity (grade) and extent (stage) of the changes are each classified on a scale from 0 (normal) to 3 (maximum change). With eight features scored, the maximum score for grade and for stage is 24 per biopsy. The final score is the sum of the scores assigned to the evaluated items divided by the maximum possible score and ranges from 0 to 1. If a feature is not evaluated, the maximum possible score is reduced by 3 points.14 The EoEHSS allows evaluation of the severity and extent of multiple histological features of EoE and can be applied in routine histopathological analysis. The intraobserver and interobserver agreement of these findings was recently evaluated among pathologists specialized in gastrointestinal diseases with experience in the diagnosis of EoE after specific training in the analysis of items included in the EoEHSS.15
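The composite score described above can be illustrated with a short Python sketch. This is not code from the EoEHSS authors or from the present study; the function name `composite_score` and the use of `None` to mark a feature that was not evaluated are assumptions made for illustration only.

```python
from typing import Mapping, Optional

# The eight EoEHSS features, each scored 0 (normal) to 3 (maximum change).
FEATURES = ("EI", "BZH", "EA", "SL", "DIS", "SEA", "DEC", "LPF")
MAX_PER_FEATURE = 3


def composite_score(scores: Mapping[str, Optional[int]]) -> float:
    """Return the sum of item scores divided by the maximum possible score.

    A feature that is missing or set to None is treated as "not evaluated"
    (an illustrative convention), which lowers the maximum by 3 points.
    """
    total = 0
    max_possible = 0
    for feature in FEATURES:
        value = scores.get(feature)
        if value is None:
            continue  # not evaluated: contributes nothing to either sum
        if not 0 <= value <= MAX_PER_FEATURE:
            raise ValueError(f"{feature} score must be between 0 and 3")
        total += value
        max_possible += MAX_PER_FEATURE
    if max_possible == 0:
        raise ValueError("no feature was evaluated")
    return total / max_possible


# Hypothetical grade scores for one biopsy; LPF could not be evaluated.
example = {"EI": 3, "BZH": 2, "EA": 1, "SL": 1, "DIS": 2,
           "SEA": 0, "DEC": 0, "LPF": None}
print(round(composite_score(example), 3))
```

In this hypothetical example the item scores sum to 9 and the maximum possible score is 21 rather than 24, because one feature was not evaluated.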
However, the repeatability and reproducibility of this system have not yet been evaluated in clinical practice among non-expert pathologists.
The objective of this study was to evaluate the intraobserver and interobserver agreement of the EoEHSS in pediatric patients with EoE.
Materials and methods
This study was conducted at a tertiary pediatric referral center in the south of Brazil where approximately 2,000 endoscopic procedures are performed annually in children aged 0-18 years. Simple random sampling without replacement was performed by assigning a unique number to each pediatric patient diagnosed with EoE (≥ 15 eos/HPF) registered in the Endoscopy Unit database (n = 462) from 2005 to 2018. These numbers were then written on separate cards that were placed in a box, thoroughly mixed and drawn at random. Slides of esophageal biopsies stained with hematoxylin & eosin from 50 patients were used in the study. To assess the quality of the material, all slides were reviewed by the most experienced pathologist participating in the study two months before the research began.
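The card-drawing procedure is a manual form of simple random sampling without replacement. A software equivalent is sketched below for illustration; it was not used in this study, and the function name `draw_sample` and the optional `seed` parameter are hypothetical.

```python
import random
from typing import List, Optional


def draw_sample(n_registered: int = 462, n_sample: int = 50,
                seed: Optional[int] = None) -> List[int]:
    """Draw patient numbers without replacement, like cards from a box."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    # random.sample never returns the same element twice.
    return sorted(rng.sample(range(1, n_registered + 1), n_sample))


selected = draw_sample(seed=2018)  # the seed value is arbitrary
print(len(selected), len(set(selected)))
```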
The study protocol was approved by the Research Ethics Committee of Hospital Pequeno Príncipe - Curitiba, Brazil.
Four pathologists with different levels of experience in the diagnosis of EoE in clinical practice volunteered to participate in the study. Prior to the study, the pathologist considered to be the most experienced (more than 30 years in practice) had evaluated more biopsies of patients with EoE (> 1,200) than the pathologist considered moderately experienced (13 years in practice, approximately 150 biopsies) and the pathologists considered less experienced (less than 2 years in practice, < 15 biopsies). All pathologists were aware of the protocol to determine EI and other histological findings of EoE before evaluating the slides; however, no training on interpreting each feature in order to create histological standards was performed.14
The slide identification records were modified so that they could not be correlated with the patients or identified in two consecutive evaluations as belonging to the same patient.
The medical pathologists analyzed the histopathological findings (EoEHSS) under optical microscopy at two different times with an interval of at least two weeks between the evaluations using their own working binocular optical microscopes (CH30, BX23, CX31 - Olympus, Tokyo, Japan, and Zeiss Axiostar - Zeiss Inc., Göttingen, Germany). The area of greatest eosinophil density was selected, and the total number of eos/HPF (400x magnification; area of 0.238 mm²) was recorded. All individual EoEHSS components were analyzed, as well as the grade (severity) and stage (extent) scores, which were recorded in a standardized data collection form (Appendix 1).
At the end of each analysis, the information obtained from the database was entered in Microsoft Excel® spreadsheets (Microsoft Corporation, Redmond, USA) and imported into Stata/SE v.14.1 (StataCorp LP, USA) for processing.
Statistical analysis
Kappa (κ) is a measure of intraobserver and interobserver agreement that indicates the degree of agreement beyond what would be expected by chance alone and typically ranges from -1 to +1, where a greater value indicates better reliability. Values close to or less than zero suggest that the agreement is attributable to chance. A κ < 0.0 indicates no agreement, from 0 to 0.20 indicates poor agreement, from 0.21 to 0.40 indicates fair agreement, from 0.41 to 0.60 indicates moderate agreement, from 0.61 to 0.80 indicates substantial agreement and from 0.81 to 1.00 indicates excellent agreement.16
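The interpretation bands above can be written as a small lookup, sketched here in Python for illustration. The function name `interpret_kappa` is hypothetical. The bands are reported to two decimal places, so the sketch assumes that a value falling between two bands (for example 0.205) belongs to the higher band.

```python
def interpret_kappa(kappa: float) -> str:
    """Map a kappa value to the agreement label used in this study."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and +1")
    if kappa < 0.0:
        return "no agreement"
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "excellent"


print(interpret_kappa(0.24))  # fair
```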
To calculate the sample size, a scale with four classifications with uniformly distributed marginals was considered for the comparison of two evaluations. For this calculation, a κ value of 0.8 and a 95 percent confidence interval were considered acceptable. A sample of 50 slides was considered adequate to estimate the κ coefficient of agreement between two evaluations, considering a relative margin of error of 20%.
Descriptive statistics were used to characterize the patients under study. Frequencies and percentages were used for categorical variables.
The intraobserver and interobserver agreement for all EoEHSS components were calculated. The interobserver agreement was based on the results of each pathologist's first reading.
Fleiss’ κ coefficient was estimated to assess agreement for ordinal variables, and the intraclass correlation coefficient (ICC) was estimated to evaluate the measurements in the composite score variable.17 Student's t-test was used to evaluate the existence of a systematic difference between the two measurements performed by the same pathologist regarding the composite score. Repeated measures analysis of variance (ANOVA) was used to test the homogeneity of the four pathologists’ evaluations for the score. For the κ coefficients, 95% confidence intervals were constructed. P-values less than 0.05 indicated statistical significance.
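For readers unfamiliar with Fleiss’ κ, the following Python sketch shows the standard unweighted calculation. The analyses in this study were run in Stata, so this is not the authors' code; the function name `fleiss_kappa` and the input layout (one row per slide, one rating per pathologist) are assumptions for illustration.

```python
from collections import Counter
from typing import Sequence


def fleiss_kappa(ratings: Sequence[Sequence[int]]) -> float:
    """Unweighted Fleiss' kappa; each row holds all raters' scores for one slide."""
    n_subjects = len(ratings)
    if n_subjects == 0:
        raise ValueError("at least one subject is required")
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("every subject needs the same number (>= 2) of ratings")

    category_totals: Counter = Counter()
    agreement_sum = 0.0
    for row in ratings:
        counts = Counter(row)
        category_totals.update(counts)
        # Proportion of rater pairs that agree on this subject.
        agreeing = sum(c * c for c in counts.values()) - n_raters
        agreement_sum += agreeing / (n_raters * (n_raters - 1))

    observed = agreement_sum / n_subjects
    total = n_subjects * n_raters
    # Chance agreement from the overall share of each category.
    expected = sum((c / total) ** 2 for c in category_totals.values())
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1.0 - expected)


# Hypothetical grades (0-3) given by four raters to four slides.
print(fleiss_kappa([[0, 0, 0, 0], [1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]]))
```

This form treats the four score levels as unordered categories, so a disagreement of one level counts the same as a disagreement of three.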
Results
Demographics
The mean age was 9.6 years (SD ± 4; range 1-15 years), and the majority of patients were male (72%). Of the 50 patients, 40% (n = 20) had a diagnosis of asthma, 32% (n = 16) of allergic rhinitis, 24% (n = 12) of atopic dermatitis and 30% (n = 15) of food allergy. The most frequent symptoms that led to the indication for endoscopy were vomiting in 58% (n = 29), feeding difficulties in 40% (n = 20), dysphagia in 26% (n = 13) and low weight gain in 26% (n = 13) of patients. The endoscopic findings observed included edema in 96% (n = 48), vertical lines in 86% (n = 43), white exudates in 62% (n = 31) and concentric rings in 4% (n = 2) of patients. Only one patient presented with esophageal stricture, and only one patient had a normal endoscopic examination.
Assessment of eosinophilic infiltration (≥ 15 eos/HPF)
The EI was graded in categories as per the EoEHSS (0: eosinophils not present; 1: <15 eos/HPF; 2: 15-59 eos/HPF; 3: >60 eos/HPF). Intra- and interobserver agreement for the identification of eosinophil counts ≥ 15/high-power field (grades 2 and 3), essential for the diagnosis of EoE, was excellent for all pathologists.
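The EI grade categories can be expressed as a simple mapping from the peak eosinophil count, sketched below in Python for illustration. The function names are hypothetical. The categories as listed leave a count of exactly 60 eos/HPF unassigned; the sketch assumes that such a count is grade 3.

```python
def ei_grade(peak_eos_per_hpf: int) -> int:
    """Return the EI grade (0-3) for a peak eosinophil count per HPF."""
    if peak_eos_per_hpf < 0:
        raise ValueError("eosinophil count cannot be negative")
    if peak_eos_per_hpf == 0:
        return 0
    if peak_eos_per_hpf < 15:
        return 1
    if peak_eos_per_hpf < 60:
        return 2
    return 3  # assumption: a count of exactly 60 is placed in grade 3


def meets_diagnostic_threshold(peak_eos_per_hpf: int) -> bool:
    """Grades 2 and 3 correspond to >= 15 eos/HPF."""
    return ei_grade(peak_eos_per_hpf) >= 2


print(ei_grade(14), ei_grade(15), meets_diagnostic_threshold(15))
```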
Assessment of intraobserver agreement (EoEHSS individual items and composite scores)
Intraobserver agreement in the evaluation of the individual components of the EoEHSS varied among the pathologists. The highest κ values were observed in the evaluation of EA (grade and stage), followed by SL (grade and stage), EI (grade), and BZH (grade).
The most experienced medical pathologist showed excellent (κ > 0.80) or substantial (κ > 0.60) intraobserver agreement for all individual items between the two evaluations, except for BZH (stage), for which agreement was fair (κ = 0.24).
The less experienced pathologists showed intraobserver agreement that was moderate for EI (grade), fair for EI (stage), fair to moderate for BZH (grade), poor to substantial for BZH (stage), moderate to substantial for EA (grade and stage), fair to moderate for SL (grade and stage), poor to fair for DIS (grade), fair to substantial for DIS (stage), poor to moderate for SEA (grade and stage), poor to fair for DEC (grade), poor to moderate for DEC (stage), absent to poor for LPF (grade) and absent to fair for LPF (stage).
The median ICC for the EoEHSS composite scores among the four pathologists was 0.70 (0.52 - 0.94) for grade and 0.75 (0.64 - 0.90) for stage. The most experienced pathologist showed excellent agreement for grade and stage, and the less experienced pathologists showed moderate to substantial agreement for grade and substantial agreement for stage.
Table 1 shows the estimated κ coefficients for the individual items (grade and stage) and the ICCs for the composite scores (grade and stage). Scatter plots (Intraobserver agreement) for the composite scores (grade and stage) are shown in Appendix 2.
Table 1. Intraobserver level of agreement.
PATH, pathologist; CI, confidence interval; EI, eosinophil inflammation; BZH, basal zone hyperplasia; EA, eosinophilic abscesses; SL, eosinophil surface layering; DIS, dilated intercellular spaces; SEA, surface epithelial alteration; DEC, dyskeratotic epithelial cells; LPF, lamina propria fibrosis; ICC, intra-class correlation coefficient.
The κ coefficient was used to evaluate the reproducibility of the results for each variable. For this analysis, the first measurement performed by each pathologist was considered. The null hypothesis of a κ coefficient equal to zero (lack of reproducibility in the results) was tested for each variable against the alternative hypothesis of a non-zero κ coefficient.
The interobserver agreement among all pathologists was moderate for EI (grade), poor for EI (stage), absent for BZH (grade), poor for BZH (stage), fair for EA (grade and stage), poor for SL (grade and stage), DIS (grade and stage), and SEA (grade and stage), and absent for DEC (grade and stage) and LPF (grade and stage).
In Table 2, the estimated κ coefficient, the p-value of the statistical test and the limits of the 95% confidence interval for the κ coefficient are presented for each variable. The best interobserver agreement was found for EI grade (moderate) and EA grade (fair).
Table 2. Interobserver agreement (all pathologists).
EI, eosinophil inflammation; BZH, basal zone hyperplasia; EA, eosinophilic abscesses; SL, eosinophil surface layering; DIS, dilated intercellular spaces; SEA, surface epithelial alteration; DEC, dyskeratotic epithelial cells; LPF, lamina propria fibrosis; CI, confidence interval.
For the variable composite score (grade), the estimated ICC considering the four pathologists was equal to 0.33, indicating fair reproducibility.
For the variable composite score (stage), the estimated ICC was 0.45, indicating moderate reproducibility.
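To make the ICC concrete, the sketch below computes a one-way random-effects ICC for single measurements. The ICC form used in this study is not stated here, so the choice of this form is an assumption made purely for illustration, as are the function name `icc_oneway` and the example values.

```python
from typing import Sequence


def icc_oneway(scores: Sequence[Sequence[float]]) -> float:
    """One-way random-effects ICC; each row holds all raters' scores for one slide."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two subjects are required")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("every subject needs the same number (>= 2) of scores")

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    # Mean squares between subjects and within subjects.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum((x - m) ** 2
                    for row, m in zip(scores, row_means)
                    for x in row) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_between - ms_within) / denominator


# Hypothetical composite scores from four raters on three slides.
print(icc_oneway([[0.25, 0.25, 0.25, 0.25],
                  [0.50, 0.50, 0.50, 0.50],
                  [0.75, 0.75, 0.75, 0.75]]))
```

Other ICC forms (two-way models, average measures) can give different values for the same data, so the form should be stated whenever an ICC is reported.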
Table 3 shows the mean and standard deviation of the evaluations for each pathologist.
Discussion
The prevalence of EoE has increased steadily in the last 20 years.18,19 According to current consensus statements, an eosinophil count ≥ 15 eos/HPF confirms active disease. Histological evaluation of esophageal biopsies is essential to diagnose EoE as well as to monitor response to treatment.1–3
The EoEHSS was developed to assess other histological changes in the mucosa present in EoE, in addition to the peak eosinophil count.14
Most of the studies conducted to date on the histological findings have analyzed only EI by pathologists specialized in the analysis of biopsies from patients with EoE or by less experienced pathologists after specific training in the analysis of the findings in reference centers.20,21 The intraobserver and interobserver agreement of the individual EoEHSS components were evaluated only during the development of the system and subsequently by pathologists specializing in the diagnosis of EoE.14,15
With the increased incidence and prevalence of the disease, many patients are managed outside reference centers where general pathologists are involved in the diagnosis of various diseases. The repeatability and reproducibility of the histological findings among pathologists not specialized in EoE have not been evaluated in clinical practice.
In the present study, the intraobserver and interobserver agreement for the evaluation of EoEHSS individual items as well as the composite score was determined by pathologists with different degrees of experience in the diagnosis of EoE.
Intra- and interobserver agreement for the identification of eosinophil counts ≥ 15/high-power field, essential for the diagnosis of EoE, was excellent for all pathologists; however, it varied per item and among the observers when assessing additional features of the EoEHSS.
For the most experienced pathologist, the intraobserver agreement was excellent or substantial in the evaluation of all individual items, except for BZH (stage), for which the agreement was fair. For the composite score (grade and stage), the intraobserver agreement was also excellent for the most experienced pathologist. These findings are similar to those of a study conducted with specialist pathologists trained in EoE reference centers, in which intraobserver agreement was excellent for the grade and stage composite scores and at least substantial for all the other items except DEC, which showed fair agreement.15
The less experienced pathologists showed intraobserver agreement ranging from absent to substantial in the evaluation of individual items. The highest κ values were observed for the evaluation of EA (grade and stage), followed by SL (grade and stage), EI (grade), and BZH (grade). In this group, for the variable score (grade and stage), the intraobserver agreement ranged from moderate to substantial.
A study comparing specialist and non-specialist pathologists revealed that interobserver agreement was excellent for the determination of EI after a training session to identify the findings.20 That study did not analyze other histological features of EoE, which precludes comparison with the findings of our study but reinforces the need for training in the interpretation of histological findings.
The interobserver agreement was moderate for EI (grade) and the composite score (stage), fair for the composite score (grade), and absent to fair for the other individual EoEHSS components. These findings contrast with a previous study conducted with specialist pathologists in EoE reference centers, in which interobserver agreement was excellent for the grade and stage composite scores and at least moderate for all the other items except SEA, which showed fair agreement.15 This discrepancy may be explained by the difference in reproducibility between the most experienced pathologist and the others, in addition to the absence of specific prior training for the reading and uniform interpretation of the features included in the EoEHSS.
There are some limitations in interpreting the results of this study that need to be discussed.
First, this study was conducted in a reference center for pediatric endoscopy with experience in EoE, where only one of the participating pathologists has worked in the routine reading of slides from patients diagnosed and treated at the unit. The less experienced pathologists had not received formal training on the histological features of EoE and performed the analyses among other routine work activities. Second, the pathologists evaluated a small sample of slides under different microscopes that may differ in illumination configuration, magnification of the objective and ocular lens system, and image quality.
In addition, the slides for the study were selected from biopsy material obtained by routine endoscopy. Fragments of well-oriented biopsies are not always obtained and/or processed consistently, and the slides used in this study did not always have well-oriented sections. The quality of the biopsies and orientation of the fragments may have compromised the analysis of some findings, especially LPF and BZH. Fragmentation of the biopsies and/or mechanical compression, as well as the resolution of the microscopes, may also have affected the reproducibility of the findings.
In conclusion, the authors’ findings reveal that intra- and interobserver agreement for the identification of eosinophil counts ≥ 15/high-power field, required for the diagnosis of EoE, was excellent for all pathologists. When the individual components of the EoEHSS were evaluated by a pathologist experienced in the diagnosis of this disease, intraobserver agreement was comparable to that reported in previous studies. Among non-specialist and less experienced pathologists, the intraobserver and interobserver agreement for the identification of EI grade and EA ranged from fair to moderate, but the additional histological features of EoE presented lower repeatability and reproducibility.
These findings suggest that there is a need for specific training of pathologists to identify EoEHSS features in addition to eosinophil count, as they are considered useful to assess histologic abnormalities and may be important for the evaluation of response to treatment as well as for the correlation with clinical manifestations and endoscopic findings. Moreover, future studies with a larger sample size and more participants are needed to confirm the intra- and interobserver reliability of the EoEHSS in clinical practice.
The authors would like to thank Dr. Eloisa M. P. Guimarães for her encouragement and contribution to data collection and tabulation. The authors also gratefully acknowledge the valuable input and comments provided for the study by Dr. Carlos A. Riedi, Dr. Dante L. Escuissato, Dr. Evaldo Macedo Filho, Dr. Victor Horácio S. Costa Jr., and Dr. Solena Z. Kusma.
Study conducted at Hospital Pequeno Príncipe, Curitiba, PR, Brazil; Pontifícia Universidade Católica do Paraná, Curitiba, PR, Brazil; and Universidade Federal do Paraná, Curitiba, PR, Brazil.