Are intrapartum fetal monitoring guidelines fit for purpose?

Photo by Mark Fletcher-Brown on Unsplash

For CTG monitoring to reduce perinatal harm from low oxygen levels, one core assumption must hold: there must be recognisable CTG patterns that clinicians can consistently identify, and that are strongly associated with low oxygen levels in the fetus. Measuring oxygenation directly is tricky, so alternative measures are generally used instead. Typically, acidosis is measured, as low oxygen levels switch metabolic processes over to those that generate lactic acid. While this can be measured by fetal blood sampling during labour, it is easier to measure acidosis from cord blood samples collected very soon after birth. This assumes that these samples are representative of what the fetus experienced in the recent past, which may or may not be the case (Al Wattar et al., 2019). In spite of these shortcomings, tests of cord blood acidosis remain common in fetal monitoring research.

All fetal monitoring guidelines define different patterns (such as variability and decelerations) of the fetal heart rate and generate categories on the basis of whether particular patterns are present or not, sometimes in combination with other patterns. The precise details of how the patterns are defined, what the categories are, and what patterns are used to determine which category to apply differ from one guideline to the next. Guidelines produced by the same organisation sometimes change over time. Despite frequent claims that such changes are evidence based, there has been little new physiological research to inform our interpretation of fetal heart rate patterns, and what has been done (see for example the work of Lear et al., 2016; Lear et al., 2018) is yet to be taken up by guidelines.

One way to demonstrate that a fetal monitoring guideline might be fit for purpose is to examine how reliably cord blood acidosis is found in babies whose CTG shortly before birth fell into a guideline-defined category considered to indicate a high risk of hypoxia and acidosis. Ekengård et al. set out to assess how well three different guidelines predicted cord blood acidosis, with separate analyses for the first (Ekengård et al., 2021) and second stages of labour (Ekengård et al., 2020).

There are four measures that provide useful information for this sort of research. Sensitivity measures how many acidotic babies have a CTG classified as significantly abnormal. Specificity measures how many babies with normal acid levels have a CTG that is not classified as significantly abnormal. The false negative rate measures how many babies who were born acidotic had a normal CTG, while the false positive rate measures how many babies who had normal acid levels had a significantly abnormal CTG. Because a CTG can also fall into an intermediate (suspicious) category, sensitivity and the false negative rate need not add up to 100%. A perfect test would have 100% sensitivity and specificity, with false positive and negative rates of zero, but this is rarely achieved. Changing guidelines to reduce the false negative rate means that fewer acidotic babies remain undetected but typically increases the false positive rate, thereby driving up the caesarean section and instrumental birth rate for well babies.
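The four measures can be written out as simple proportions. The sketch below is one reading that is consistent with the tables in this post, not the method used in the papers: it assumes "significantly abnormal" means the pathological category, and that specificity counts every non-acidotic baby whose CTG was not pathological. The class name, function name, and counts are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass
class CtgCounts:
    """Number of CTGs in each guideline category for one group of babies."""
    normal: int
    suspicious: int
    pathological: int

    @property
    def total(self) -> int:
        return self.normal + self.suspicious + self.pathological


def diagnostic_measures(acidotic: CtgCounts, non_acidotic: CtgCounts) -> dict:
    """Return the four measures as proportions between 0 and 1."""
    return {
        # Acidotic babies whose CTG was classified as pathological.
        "sensitivity": acidotic.pathological / acidotic.total,
        # Non-acidotic babies whose CTG was NOT classified as pathological
        # (assumption: normal and suspicious CTGs both count here).
        "specificity": 1 - non_acidotic.pathological / non_acidotic.total,
        # Non-acidotic babies whose CTG was classified as pathological.
        "false_positive_rate": non_acidotic.pathological / non_acidotic.total,
        # Acidotic babies whose CTG was classified as normal.
        "false_negative_rate": acidotic.normal / acidotic.total,
    }


# Hypothetical counts, not taken from either study.
cases = CtgCounts(normal=5, suspicious=15, pathological=80)
controls = CtgCounts(normal=150, suspicious=30, pathological=20)

measures = diagnostic_measures(cases, controls)
for name, value in measures.items():
    print(f"{name}: {value:.0%}")
```

With these made-up counts, 15 of the 100 acidotic babies have a suspicious CTG, so they count neither towards sensitivity nor towards the false negative rate. That is why those two figures do not have to sum to 100%.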

How the research was conducted

All babies in both the case and control groups had been monitored by CTG, and were from singleton pregnancies of at least 34 weeks' gestation. All the cases were babies with cord blood acidosis, defined as a pH of less than 7.1 (in either arterial or venous samples) for the first stage study, and less than 7.05 for the second stage study. All the controls were babies with a pH of at least 7.15 and an Apgar score of nine or ten at both five and ten minutes. For the first stage study, all the cases were born by caesarean section, while the controls included babies born by caesarean section, instrumental and non-instrumental vaginal birth. Control group babies in the second stage study were born by instrumental and non-instrumental vaginal birth, while the case group also included babies born by caesarean birth.

The CTG from the immediate period prior to birth (ranging from 18 to 80 mins in duration) was assessed by three independent reviewers, drawn from a pool of 21 midwives, doctors training in obstetrics, and obstetricians trained in interpreting CTGs. Each CTG was assessed according to three guidelines: the FIGO 2015 guidelines, a Swedish guideline written in 2017, and the previous version of this same guideline written in 2009. Reviewers were not aware of whether the CTG was from a case or a control, and were asked to apply each guideline to the CTG, indicating whether it was normal, suspicious, or pathological.

Validity of three guidelines in the first stage of labour

	Sweden 2009	Sweden 2017	FIGO 2015
Sensitivity	95%	77%	71%
Specificity	90%	97%	97%
False positive rate	10%	3%	3%
False negative rate	0%	10%	1%

73 cases and 291 controls were included in the first stage study. The oldest guideline had the highest sensitivity (ability to detect acidotic babies), coupled with no acidotic babies having CTGs considered to be normal (false negatives). However, ten percent of non-acidotic babies were classified as having pathological CTGs (false positives) using this guideline. These babies and their mothers would have been more likely to experience further intervention such as fetal blood sampling and caesarean section with no potential benefit.

Validity of three guidelines in the second stage of labour

	Sweden 2009	Sweden 2017	FIGO 2015
Sensitivity	87%	62%	50%
Specificity	56%	85%	88%
False positive rate	45%	15%	13%
False negative rate	2%	17%	3%

295 cases and 591 controls were included in the second stage study. Once again, the 2009 Swedish guideline had the highest sensitivity and the lowest false negative rate, ensuring that most acidotic babies would be identified by a CTG considered to be pathological. The false positive rate was very high, however, with almost half of all babies with a normal acid level having a CTG classified as pathological. The false positive rate was lowest for the FIGO 2015 guideline, but its sensitivity was low: only 50% of acidotic babies had a CTG classified as pathological.
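To get a feel for what these percentages mean in numbers of babies, the rounded rates from the second stage table can be applied to the stated group sizes. This is a rough illustration only: it assumes each percentage applies directly to the 591 controls or the 295 cases, which the papers may not have calculated in exactly this way, and the helper name `approx_count` is made up for this sketch.

```python
def approx_count(group_size: int, rate_percent: float) -> int:
    """Approximate number of babies that a rounded percentage represents."""
    return round(group_size * rate_percent / 100)


CONTROLS = 591  # babies with normal acid levels, second stage study
CASES = 295     # babies with cord blood acidosis, second stage study

# Rounded percentages copied from the second stage table.
false_positive_rate = {"Sweden 2009": 45, "Sweden 2017": 15, "FIGO 2015": 13}
sensitivity = {"Sweden 2009": 87, "Sweden 2017": 62, "FIGO 2015": 50}

for guideline in false_positive_rate:
    # Well babies whose CTG was classified as pathological.
    well_flagged = approx_count(CONTROLS, false_positive_rate[guideline])
    # Acidotic babies whose CTG was not classified as pathological.
    not_flagged = approx_count(CASES, 100 - sensitivity[guideline])
    print(f"{guideline}: about {well_flagged} well babies with a pathological CTG, "
          f"about {not_flagged} acidotic babies without one")
```

Because this was a case-control study, these counts describe the study groups only. They do not show how often each outcome would occur in an ordinary birthing population, where acidosis is far less common than in this sample.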

Is the new guideline always the better one?

The findings from these two studies make it clear that updating guidelines doesn’t always translate to improved diagnostic ability when the guideline is applied. The authors draw attention to the importance of first conducting research such as theirs, to demonstrate whether changes to CTG interpretation guidelines improve the detection of fetuses at risk for acidosis while not including a large number of well babies in the group with CTGs considered to be abnormal. This is not typically done. For example, no research has examined the validity of the RANZCOG fetal surveillance guideline, which is in use in Australia and New Zealand. These findings place clinicians in a difficult position, as the use of a particular guideline is typically mandated by health services, limiting clinicians’ ability to revert to a previous guideline with higher sensitivity.

Drawing the line

The other issue these two studies bring to light is the tension between two kinds of guideline. One ensures that all acidotic babies can be detected by CTG monitoring, but results in a high rate of normal babies being included in the group considered to have abnormal CTGs. The other doesn’t over-diagnose healthy babies, but misses some who have acidosis. There is no such thing as a perfect guideline that always correctly distinguishes well babies from those who are not. Someone has to make a decision about where to draw the line. It is therefore important that people who make use of maternity services are included in a meaningful way when guidelines are developed. They are the ones who deal with the daily and ongoing consequences of the application of guidelines in clinical practice. They should have a say about what risks are acceptable and what are not, and where the lines should be drawn.

References

Al Wattar, et al. (2019). Evaluating the value of intrapartum fetal scalp blood sampling to predict adverse neonatal outcomes: A UK multicentre observational study. European Journal of Obstetrics, Gynecology, and Reproductive Biology, 240, 62-67. https://doi.org/10.1016/j.ejogrb.2019.06.012

Ekengård, F., Cardell, M., & Herbst, A. (2020). Low sensitivity of the new FIGO classification system for electronic fetal monitoring to identify fetal acidosis in the second stage of labor. European Journal of Obstetrics & Gynecology and Reproductive Biology, in press. https://doi.org/10.1016/j.eurox.2020.100120

Ekengård, F., Cardell, M., & Herbst, A. (2021). Impaired validity of the new FIGO and Swedish CTG classification templates to identify fetal acidosis in the first stage of labor. Journal of Maternal-Fetal & Neonatal Medicine, in press. https://doi.org/10.1080/14767058.2020.1869931

Lear, C. A., Galinsky, R., Wassink, G., Yamaguchi, K., Davidson, J. O., Westgate, J., Bennet, L., & Gunn, A. J. (2016). The myths and physiology surrounding intrapartum decelerations: the critical role of the peripheral chemoreflex. The Journal of Physiology, 594(17), 4711-4725. https://doi.org/10.1113/JP271205

Lear, C. A., Westgate, J., Ugwumadu, A., Nijhuis, J. G., Stone, P. R., Georgieva, A., Ikeda, T., Wassink, G., Bennet, L., & Gunn, A. J. (2018). Understanding fetal heart rate patterns that may predict antenatal and intrapartum neural injury. Seminars in Pediatric Neurology, 28, 3-16. https://doi.org/10.1016/j.spen.2018.05.002


Susan Bewley
June 11, 2021 • 9:56 pm

How were the cord bloods taken (intact or with clamping – that interferes with the blood distribution & recovery?). Isn’t it problematic that we compare one test with another proxy (and distorted) measure?

- kirstensmall
  June 13, 2021 • 12:49 pm
  
  There’s no information given in either paper. I suspect the cord was clamped given that it is considered standard practice. I agree that we aren’t measuring a physiological process by doing this – it is an iatrogenic process, that is considered “normal” because of the frequency of occurrence.
  