Comparing intrapartum CTG monitoring guidelines

Demonstrating whether CTG monitoring does its job

There are (at least) two ways to think about CTG monitoring in labour. One is to think of it as an intervention, aimed at reducing the chance of a poor outcome for the baby. The best way to test this is with randomised controlled trials.

Another way to think about CTG monitoring is as a diagnostic test. Whether something is a good test or not is established through validity and variability testing. Validity testing aims to establish the following:

Positive predictive value – when the test is positive, how likely is it that the condition being tested for is present?
Negative predictive value – when the test is negative, how likely is it that the condition being tested for is absent?
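
To make the two definitions concrete, here is a minimal Python sketch that works them out from a two-by-two table. The class name, field names, and counts are all made up for illustration; none of the numbers come from a study.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from comparing a test result with the true condition."""
    true_pos: int   # test abnormal, condition present
    false_pos: int  # test abnormal, condition absent
    false_neg: int  # test normal, condition present
    true_neg: int   # test normal, condition absent

    def ppv(self) -> float:
        # Of all abnormal (positive) tests, the share where the condition is present.
        return self.true_pos / (self.true_pos + self.false_pos)

    def npv(self) -> float:
        # Of all normal (negative) tests, the share where the condition is absent.
        return self.true_neg / (self.true_neg + self.false_neg)


# Hypothetical counts, for illustration only (not taken from any study).
example = TwoByTwo(true_pos=15, false_pos=35, false_neg=10, true_neg=140)
print(f"PPV: {example.ppv():.1%}")  # PPV: 30.0%
print(f"NPV: {example.npv():.1%}")  # NPV: 93.3%
```

In this invented example, 15 of the 50 abnormal tests are true positives (PPV 30%), and 140 of the 150 normal tests are true negatives (NPV 93.3%).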

With variability testing the question is whether different maternity professionals interpreting the same CTG trace will come to the same conclusion. This is known as inter-observer variability.

In research examining the quality of CTG monitoring as a diagnostic test, the million-dollar question is – what is it trying to diagnose? The goal is not to make a diagnosis of stillbirth (which is fortunately rare) or to predict those babies that will die in the early neonatal period (also fortunately rare). The aim is to prevent these outcomes by making a diagnosis of “fetal compromise” – a moment when damage to the fetus begins to occur but is still reversible. Proving that fetal compromise is, or was, present is difficult. There’s no “gold standard” test. It has become accepted that a low arterial cord pH (meaning a high acid level in the blood) does a reasonable job, so researchers often make use of it.

Many assumptions underlie this choice, and it is important to keep in mind that an abnormal pH test at birth does not always correlate with a health problem in the baby (Johnson, et al., 2021).

So what’s new?

I have previously written about research comparing the validity of various CTG monitoring guidelines. Another research team has recently produced new research along the same lines. del Pozo, et al. (2021) assessed the FIGO (European), ACOG (United States of America), NICE (United Kingdom), and “Physiological” guidelines (used in some places in the UK and Europe), examining the validity and variability of each. To do so, they asked three reviewers, who didn’t know the outcome for the babies, to evaluate the final 30 minutes of 150 CTG records by applying each of the guidelines. The women whose CTG data were used each had one baby, born close to the due date and presenting head first. The reviewers were all described as “experts”, but how this was determined was not defined. Each guideline was tested on its ability to distinguish babies born with a pH above 7.1 from those at or below that level.

The positive predictive value (the percentage of abnormal CTGs where the baby had a low pH) was highest for the ACOG guideline, but still fairly low at 50%. The lowest positive predictive value was seen with the “Physiological” (Chandraharan) guideline (29.5%). A low positive predictive value drives a higher surgical birth rate for babies who had a normal pH and would therefore not have benefitted from early birth.

The guidelines did better in their negative predictive value, with the best being the “Physiological” guideline at 88.7%. It is important to note that this means that 11.3% of babies with a normal CTG as defined by this guideline were born with an abnormally low pH. The remaining guidelines all had negative predictive values of 80% or more.
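
The arithmetic behind these percentages is simply the complement of each predictive value. A small Python sketch, using only the figures quoted above from del Pozo, et al. (2021); the dictionary labels and the function name are my own:

```python
# Predictive values quoted in this post from del Pozo, et al. (2021), in percent.
reported = {
    "ACOG PPV": 50.0,
    "Chandraharan PPV": 29.5,
    "Physiological NPV": 88.7,
}


def complement(percent: float) -> float:
    """Percentage of cases where the CTG classification and the cord pH disagree."""
    return round(100.0 - percent, 1)


for name, value in reported.items():
    print(f"{name}: {value}% agree, {complement(value)}% disagree")
```

For a positive predictive value, the complement is the share of abnormal CTGs where the baby had a normal pH. For a negative predictive value, it is the share of normal CTGs where the baby had a low pH.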

Interobserver variability was also assessed. The highest levels of agreement seen were for the baseline heart rate in the FIGO (Fleiss’ kappa of 0.53) and ACOG (0.55) guidelines, and for the categorisation of the CTG as showing no hypoxia using the “Physiological” (Chandraharan) guideline (0.56). The lowest levels of agreement were seen in the abnormal CTG categories II and III in the ACOG guideline (0.17 and 0.09 respectively), and for “gradually evolving hypoxia – compensated” or “decompensated” (both at 0.11) using the “Physiological” guideline.

These levels of agreement are similar to those seen in previous studies, confirming the ongoing problem that one person’s interpretation that the CTG is normal doesn’t mean that everyone else will see it the same way.
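
For readers curious about how a Fleiss’ kappa value is arrived at, here is a minimal Python sketch of the standard calculation. The function name and the four example traces are invented for illustration. This is not the code or data used in the study.

```python
from collections import Counter


def fleiss_kappa(ratings: list[list[str]]) -> float:
    """Fleiss' kappa for items each rated by the same number of raters.

    ratings[i] holds the category each rater assigned to item i.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(row) != n_raters for row in ratings):
        raise ValueError("every item needs the same number of ratings")

    counts = [Counter(row) for row in ratings]
    categories = {category for row in counts for category in row}

    # Observed agreement: for each item, the proportion of rater pairs that
    # agree, averaged over all items.
    p_observed = sum(
        (sum(n * n for n in row.values()) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items

    # Agreement expected by chance, from the overall share of each category.
    total = n_items * n_raters
    p_expected = sum(
        (sum(row[category] for row in counts) / total) ** 2
        for category in categories
    )

    # Note: undefined (division by zero) if every rating is the same category.
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical classifications of four traces by three reviewers.
traces = [
    ["normal", "normal", "normal"],
    ["normal", "normal", "abnormal"],
    ["abnormal", "abnormal", "abnormal"],
    ["normal", "abnormal", "abnormal"],
]
print(f"Fleiss' kappa: {fleiss_kappa(traces):.2f}")
```

A kappa of 1 means complete agreement and a kappa of 0 means agreement no better than chance, which puts values such as 0.09 and 0.11 in context.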

So which one to use?

No one guideline performed universally well across all measures. The authors of the paper preferred the “Physiological” guideline. Unfortunately, the “Physiological” (Chandraharan) guideline also had the lowest positive predictive value (29.5%), meaning the CTG was more often classified as abnormal when the pH was normal, which favours unnecessary intervention. The decision about which guideline to use will continue, I suspect, to relate to the history of specific settings (“we’ve always done it that way”) and professional allegiances (choosing to use the ACOG guideline in the UK is likely to get you referred to the regulator).

It is important that all changes to intrapartum CTG interpretation guidelines be assessed to determine the validity of the algorithm. From this study, and others that preceded it, we now have good evidence about many guidelines. It is noteworthy that here in Australia the RANZCOG guideline has not been assessed in a similar manner, and it is time that someone got around to doing that.

I need your help…

I’m working on a plan for a workshop on getting the best out of your fetal monitoring guideline. Possibly a series of workshops so I can cover different guidelines one at a time. I want to understand the challenges you face when you are using your local fetal monitoring guideline so I can build something that is practical and problem-solving.

Help me out…

Tell me what you need

References

del Pozo, C. Z., Ezquerro, M. C., Mejía, I., de Terán Martínez-Berganza, E. D., Esteban, L. M., Alonso, A. R., Larraz, B. C., García, M. A., & Cornudella, R. S. (2021). Diagnostic capacity and interobserver variability in FIGO, ACOG, NICE and Chandraharan cardiotocographic guidelines to predict neonatal acidemia. Journal of Maternal-Fetal & Neonatal Medicine. https://doi.org/10.1080/14767058.2021.1986479

Johnson, G. J., Salmanian, B., Denning, S. G., Belfort, M. A., Sundgren, N. C., & Clark, S. L. (2021). Relationship between umbilical cord gas values and neonatal outcomes: Implications for electronic fetal heart rate monitoring. Obstetrics & Gynecology, 138(3), 366-373. https://doi.org/10.1097/AOG.0000000000004515