More than 15 million infants globally are born low birthweight (LBW) or preterm every year.1 Mortality and serious morbidity in these infants are common and represent one of the single biggest contributors to childhood deaths. Survival in preterm infants has increased, and whilst some need complex and expensive respiratory support, this is largely restricted to the most preterm infants. However, every preterm infant requires nutritional support, which affects important short-term outcomes, e.g., necrotizing enterocolitis, as well as having long-term metabolic and cognitive impacts.2–7 Nutritional interventions which reduce serious morbidity and mortality are often universally available, cheap, safe, and simple to implement, such as differences in the timing of starting and increasing milk feeds, or the use of breastmilk.4,8
Oropharyngeal colostrum immunotherapy (OCI) is a simple intervention with strong biological plausibility and appears safe. In settings that promote immediate kangaroo mother care (KMC) many infants, some as small as 1000 g, could receive colostrum directly from the nipple.9 However, in many middle- or high-income settings, the deliberate administration of colostrum onto the buccal mucosa is increasingly practiced.10 Historically, a cotton bud was dipped in the colostrum, but there was concern that colostrum may be absorbed and lost into the cotton. Most units now use a 1 mL syringe to collect colostrum directly from the nipple, which can be immediately transferred into the infant's cheek. Supporting mothers to deliver OCI themselves directly involves them in nutrition management from the first day of life, facilitates bonding, helps make positive memories, and emphasizes the importance of family integrated care (FiCare).11 The volume of OCI provided varies but is usually between 0.1 and 0.5 mL, divided between the right and left cheek. Duration differs, but many use breastmilk for ‘mouth care’ for the entire duration of the infants’ stay in the neonatal unit.
In this edition of the journal, Martins et al. report on the effects of OCI as an intervention using a cohort comparison study design in 138 mother-infant dyads born at ≤1500 g (VLBW) and ≤37 weeks.12 The study may therefore have recruited infants who were moderate or late preterm (32–36 weeks gestation) in whom the risk of death is usually <1 %. Triplets were excluded and no twins were apparently enrolled. OCI was started within the first 72 h, although it is important to note that many hospitals provide OCI within the first few hours after birth.
The study of Martins et al. was conducted in a medium-sized maternity unit, so the findings may be generalizable to other settings. The study highlights the support of the Human Milk Bank, but it is not clear how they were involved in obtaining colostrum as this is usually easily achieved by the mothers with support from bedside nurses. The colostrum was dripped slowly, but there is no specific report of whether any adverse effects occurred such as apnea, desaturations, or coughing. The historical cohort was from the same hospital in 2015–2016 but it is not clear if the OCI group assessed every sequential admission. If the more recent OCI cohort does not represent sequentially admitted infants, is it possible that sicker infants were excluded from the OCI cohort, but included in the earlier historical cohort? Cohort comparison studies must be well-matched for baseline characteristics.
The sample size was determined using a power of 80 % and a significance of 5 % which are standard for small, prospective randomized controlled trials (RCTs). However, a critical factor in performing a power calculation is the decision or calculation of the incidence of the outcome (death) and the estimate of the reduction in that outcome. Whilst the baseline (control) incidence and the standard error or deviation are calculated from pre-existing data, the risk reduction (or the incidence of death in the intervention group) must be estimated by the researchers as this is not known. For this study, the authors used an incidence of death of 25 % in the control group and estimated a 50 % risk reduction (RR = 0.5) due to the intervention. Many clinicians might feel that an RR of 0.5 is ambitious given the existing data on the efficacy of OCI which previously failed to show an effect on mortality.10,13–16 There are very few interventions in medicine with an RR of 0.5 or lower for overall mortality.
The recruitment period was 19 months (May 2015–November 2016) for the control group and 23 months (October 2018–August 2020) for the OCI group, yielding 66 and 72 mother-infant dyads respectively who were assessed. These recruitment rates represent three to four VLBW infants recruited per month from a unit with eight intensive care, six intermediate care, and twelve cots for KMC, and suggest there may have been many VLBW infants not meeting the inclusion criteria. It would help to have further detail on total births and admissions during that time so that readers can assess how well the cohorts represent the true population.
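The recruitment rates quoted above follow from simple division of the cohort sizes by the recruitment periods. A minimal sketch of that arithmetic, using only the figures reported in the text (the function name is illustrative, not taken from the study):

```python
def monthly_recruitment_rate(dyads: int, months: int) -> float:
    """Average number of mother-infant dyads recruited per month."""
    if months <= 0:
        raise ValueError("months must be positive")
    return dyads / months


# Figures as reported in the text above.
control_rate = monthly_recruitment_rate(dyads=66, months=19)  # historical control cohort
oci_rate = monthly_recruitment_rate(dyads=72, months=23)      # OCI cohort

print(f"Control cohort: {control_rate:.2f} infants per month")
print(f"OCI cohort:     {oci_rate:.2f} infants per month")
```

Both rates fall between three and four infants per month, which is the basis for asking how many eligible VLBW admissions were not included.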
Table 1 from the article of Martins et al. provides demographic data and shows that more than half of the recruits were born at <28 weeks, i.e., those babies at highest risk of adverse outcomes. However, given that there are many more VLBW babies born at more than 28 weeks, one might have expected to see more of those larger infants represented in the study population. Furthermore, more than 90 % of cases from the retrospective control group were not married or accompanied, compared to only 53 % in the OCI group. An imbalance of this size would be unlikely to occur in a large RCT. Whilst this might not seem relevant to the biological effect of OCI, it suggests there may be important population differences between the two cohorts. This variable and many others were included as confounders in the final analysis model. The authors state there were no adverse effects due to OCI treatment. However, it is not clear what adverse effects were specifically included. For example, it might be difficult to determine if there were episodes of choking, apnea, or aspiration using retrospective notes review.
Statistical analysis also calculated the number needed to treat (NNT) and survival curves. Whilst these provide additional insights, they are not typically presented in cohort comparison studies. This study suggests an NNT of four, i.e., for every four infants treated with OCI, one death is prevented. In the large systematic review of the use of donor human milk (DHM) compared to formula, the authors estimated an NNT of 33 to prevent one case of necrotizing enterocolitis, and in RCTs including more than 1500 infants we did not observe a reduction in all-cause mortality from using DHM (RR 1.1, 95 % confidence intervals 0.8–1.5). It is theoretically possible that OCI is more effective than using DHM, but such a large effect from OCI seems unlikely.
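The NNT is the reciprocal of the absolute risk reduction (ARR), so each NNT quoted above implies a specific absolute difference in event rates. A minimal sketch of that relationship follows; the function names are illustrative, and the 25 % and 12.5 % inputs are the planning assumptions described earlier in this commentary, not observed results.

```python
def number_needed_to_treat(risk_control: float, risk_treated: float) -> float:
    """NNT = 1 / ARR, where ARR = control risk minus treated risk."""
    arr = risk_control - risk_treated
    if arr <= 0:
        raise ValueError("the treated risk must be lower than the control risk")
    return 1.0 / arr


def implied_absolute_risk_reduction(nnt: float) -> float:
    """Invert an NNT to recover the absolute risk reduction it implies."""
    if nnt <= 0:
        raise ValueError("NNT must be positive")
    return 1.0 / nnt


# ARR implied by the NNTs quoted in the text.
for label, nnt in [("OCI and death (this study)", 4), ("DHM and NEC (systematic review)", 33)]:
    arr = implied_absolute_risk_reduction(nnt)
    print(f"{label}: NNT {nnt} implies an ARR of {arr * 100:.1f} percentage points")

# NNT implied by the planning assumptions (25 % control mortality, 50 % relative reduction).
print(number_needed_to_treat(risk_control=0.25, risk_treated=0.125))
```

An NNT of four corresponds to an absolute reduction in mortality of 25 percentage points, which illustrates how large the reported effect is.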
In the discussion, the authors state that ‘….OCI proved to be beneficial in reducing risk of death…’. The precise terminology is important here. There is a general acceptance that in well-designed RCTs the rejection of the null hypothesis with a p < 0.05 is considered proof of effect. However, the terminology used in observational studies is important, and alternate phrasing such as ‘significantly associated’ is preferable. Observational studies cannot generally determine causality. The authors note that a recent meta-analysis suggests a reduction in mortality, although there remain concerns of bias due to the lack of adequate blinding in many studies.13 As well as a reduction in mortality, this and other systematic reviews note a decrease in the incidence of necrotizing enterocolitis, sepsis, and other key outcomes. It would be interesting if the study of Martins et al. provided data on the incidence of these and other key outcomes. It would also be interesting to know the causes of death. There are many neonatal deaths due to diseases such as intraventricular hemorrhage, cystic leukomalacia, congenital infections, and heart disease, where it would seem highly unlikely for OCI to have an effect. The lower death rate in the OCI group with such small group sizes might be due to chance.
The main limitation of the study of Martins et al. is acknowledged by the authors themselves and is the historical cohort design. Whilst robust data collection and clearly defined outcomes are important, it is impossible to prove causality using this study method. Multivariable regression modeling is widely used to adjust for various population characteristics, but the number of variables that can be used is limited. Furthermore, it is impossible to adjust for confounding by indication. In the case of OCI, is it possible that clinicians only offered this to the healthiest patients, but were concerned about offering OCI to sick infants?
RCTs are the best method of determining causality, and large RCTs recruit a high number of participants to balance the study groups for confounders, whether these are known or unknown. Unknown confounders cannot be adjusted for in observational studies. Most RCTs that aim to determine the effect of an intervention in VLBW on key outcomes such as necrotizing enterocolitis or sepsis require study sizes of around 1000–3000 infants.17 It might also be important to note that there has never been a study in VLBW infants with a primary outcome of necrotizing enterocolitis, because it is such a rare outcome. All-cause mortality requires even larger sample sizes. Researchers may overestimate the effect of the intervention so a study can be ‘powered’ whilst still ensuring the population sample size is achievable for a single hospital setting.
Sample size and power calculations for RCTs are relatively simple to estimate using online programs such as www.sealedenvelope.com. Almost all use an accepted significance level (alpha) of 5 %. The next variable needed is power (1 – β). Many researchers use a power of 80 %, which means there is a 20 % chance the researchers will make a type II error, which is a false negative. Using a power of 80 %, on one in every five occasions that the study is conducted it will fail to reject the null hypothesis when in fact the null hypothesis is false. In other words, there is a 1:5 chance you incorrectly conclude that the intervention does not work, when in fact it does work. For this reason, many large government-funded trials require a power of 90 % so that the chance of falsely rejecting an effective intervention is decreased to one in ten.
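The link between sample size, power, and the type II error rate can be sketched with a normal approximation for comparing two proportions. This is a minimal illustration, not the method used by any particular online calculator: the function name and the per-group sizes are hypothetical, and the 25 % and 12.5 % event rates are the planning assumptions described earlier in this commentary.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control: float, p_treated: float,
                          n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided test of two proportions.

    Uses the normal approximation with unpooled variances and ignores the
    negligible chance of rejecting in the wrong direction.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    standard_error = sqrt(variance / n_per_group)
    z_effect = abs(p_control - p_treated) / standard_error
    return NormalDist().cdf(z_effect - z_alpha)


# Hypothetical per-group sizes, chosen only to show how power grows with n.
for n in (50, 100, 150, 200):
    power = power_two_proportions(0.25, 0.125, n)
    print(f"n = {n:>3} per group: power = {power:.2f}, type II error = {1 - power:.2f}")
```

Under these assumptions, power rises with the number of infants per group, and the type II error rate is simply one minus the power.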
In the study of Martins et al. they used a population correction factor (a finite population correction). This is used when the sample represents a large fraction of the population and allows the standard error to be reduced, in turn meaning a smaller study sample is required. However, this is not common in most small RCTs. If the population correction factor were not used, as is most common, a study with a power of 80 % and an alpha of 5 % would require a sample size of approximately 300 infants to detect a decrease in the incidence of death from 25 % in the control group to an estimated 12.5 % with OCI. This is a 50 % relative reduction and would only occur where an intervention exerted a massive effect, which seems unlikely for an intervention such as OCI. A more realistic reduction in death due to OCI (given that previous RCTs show minimal or no effect on death) might be a 20 % relative reduction. A 20 % reduction would still be clinically important for such a safe and cheap intervention. At a power of 80 %, alpha of 5 %, and a reduction of death from 25 % (control) to 20 % (OCI intervention) the sample size would be 2184 infants. If power were increased from 80 % to 90 % (to decrease the chance of a type II error), a sample size of 2922 infants would be needed.
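The sample sizes quoted above can be approximated with the standard normal-approximation formula for comparing two proportions, without a population correction factor. This is a minimal sketch under that assumption; the function name is illustrative and the exact method used by any given online calculator may differ slightly.

```python
from math import ceil
from statistics import NormalDist


def total_sample_size(p_control: float, p_treated: float,
                      power: float = 0.80, alpha: float = 0.05) -> int:
    """Total infants (two equal groups) for a two-sided test of two proportions.

    n per group = (z_alpha + z_beta)^2 * (p1*q1 + p2*q2) / (p1 - p2)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    n_per_group = (z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2
    return 2 * ceil(n_per_group)


# Scenarios discussed in the text: control mortality of 25 % in every case.
scenarios = [
    ("50 % relative reduction, 80 % power", 0.125, 0.80),
    ("20 % relative reduction, 80 % power", 0.20, 0.80),
    ("20 % relative reduction, 90 % power", 0.20, 0.90),
]
for label, p_treated, power in scenarios:
    print(f"{label}: about {total_sample_size(0.25, p_treated, power)} infants")
```

The totals land within a few infants of the figures quoted above; small differences can arise from how a calculator rounds the z-values before rounding up the group size.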
Overall, the study of Martins et al. highlights that OCI is likely to be a beneficial intervention for VLBW infants, and despite the lack of certainty for a definite reduction in necrotizing enterocolitis or death, it will be increasingly widely used as it seems safe, is cheap, and is universally available. However, it might be wrong to conclude that OCI reduces death in VLBW infants. An RCT to prove that OCI reduces death might require >3000 infants and would require a large and expensive collaboration. It would also require researchers to have genuine equipoise, which the authors of the present study do not possess. Given the other potential benefits of OCI on maternal mental health, FiCare, and the provision of breastmilk, we and many others will continue to practice OCI provision in the absence of conclusive evidence of effectiveness from large RCTs.
See paper by Martins et al. in pages 32–39.