I’ve updated this post to include information from the Annual Report mentioned below as promised. Thanks go to Miranda Davies-Tuck for pointing me in the right direction.
So here’s the thing…
Evidence connecting the actions of individual maternity professionals in relation to CTG monitoring to perinatal outcomes is hard to find. Despite this, it is often argued that if people were just smarter, tried harder, or worked faster, the problem of poor outcomes continuing to happen when the CTG is abnormal would be solved. This idea underpins efforts at educating or “policy-ing” our way out of the problem.
RANZCOG (the Royal Australian and New Zealand College of Obstetricians and Gynaecologists) makes this argument in their Intrapartum Fetal Surveillance Clinical Guideline (2019), saying “review of cases with poor outcomes repeatedly demonstrate that abnormal CTGs were misinterpreted and the resulting management inappropriate. This likely arises, at least in part, because health care professionals have not been supported by comprehensive ongoing education and credentialing programs.” (p. 4). To support this claim, they reference the 1999 Annual Report of the Consultative Council on Obstetric and Paediatric Mortality and Morbidity, from the state of Victoria (Australia). Why that year, that state, and that twenty-year-old report? I have no idea, as I have never been able to track down a copy. Perhaps there was a deliberate choice to cite an inaccessible document so no one can fact-check it? Who knows… If you have a copy – send it to me and I’ll write a follow-up.*
The other citation used to support their claim was Murphy et al. (1990). The Murphy paper has been cited 102 times (according to Scopus), including as justification for RANZCOG’s Fetal Surveillance Education Program (Kroushev et al., 2009; Zoanetti et al., 2009). Given that this paper occupies a central place in arguments that clinicians do the wrong things and that maternity professional education will fix the CTG problem, it is important to critically review the paper and see what it really says.
How to approach the task
Let me start by setting out the framework I used when looking at the paper. I know there is no convincing research evidence showing that CTG monitoring is superior to intermittent auscultation (IA) in preventing poor perinatal outcomes. I’m also familiar with arguments that our current understanding of the relationship between particular fetal heart rate patterns and fetal oxygenation is not as well developed as we might imagine it to be. If you have been reading Birth Small Talk posts for a while, you will know that these are things I write about a lot.
So, with that in mind, how then do we set standards for what is and is not appropriate or inappropriate interpretation and management of an abnormal CTG? If this paper is being used to argue that clinicians do bad things that lead to bad outcomes, then I would expect the authors should be clear about what they think is the right way to interpret a CTG and the right course of action to take. (Bonus points for proving that this course of action improves outcomes, but regulars here know that such evidence doesn’t exist.) So I want to see how they determined what was, and was not, appropriate action.
Given the claims made about the paper in the RANZCOG guideline, I also want to know if the authors of the paper showed whether there was a relationship between clinicians’ competence relating to the CTG and outcomes. Was this the primary purpose of their research or a thing they noticed along the way when trying to show something else? Finally, was there any evidence that education and credentialling programs might be effective on the basis of what was in this paper?
Now that we have some questions in mind – let’s take a look at the paper in detail.
What does the paper say?
Published in 1990, the Murphy et al. paper is clearly not recent. You might argue it isn’t the best choice to use 30+ year-old evidence in a guideline that promotes itself as being up to date and evidence based. The study was designed to examine whether “failure of interpretation” of “very complex and confusing fetal heart rate patterns” might be responsible for the observed inability of CTG monitoring to improve perinatal outcomes seen in randomised controlled trials. The authors used a retrospective case-control comparison, which isn’t a great design choice for trying to answer the question posed. I’ll come back to why it isn’t a good design choice later.
The data came from 17 months of births at one maternity service in Oxford (in the UK). The authors collected information about a cohort of infants with “perinatal asphyxia”. This was defined as a baby admitted to the nursery with at least two of any of the following: an Apgar score of less than seven at one minute of age, a pH of less than 7.2 at birth, the use of intermittent positive pressure ventilation, signs of cerebral irritability, seizures treated with an anticonvulsant medication, or meconium aspiration syndrome. They found these cases by consulting a register kept by the neonatal staff (there is the possibility of bias here – people might have forgotten to write down the details of milder cases for example). 85 babies were listed, but further analysis excluded 21 as not actually having had asphyxia (so much for the accuracy of the register!). Another 26 were excluded as no CTG monitoring was used, and four more as the CTG trace was “unsuitable for analysis” – they don’t say what made them unsuitable. That leaves us with 38 infants.
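For anyone who finds a rule like this easier to read as code, here is a minimal sketch of the case definition as I have described it above. The class and field names are my own hypothetical labels, not anything from the paper, and the example babies are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Baby:
    # Hypothetical record layout; field names are illustrative assumptions.
    admitted_to_nursery: bool
    apgar_1min: int
    ph_at_birth: float
    needed_ippv: bool            # intermittent positive pressure ventilation
    cerebral_irritability: bool
    treated_seizures: bool       # seizures treated with an anticonvulsant
    meconium_aspiration: bool


def criteria_met(baby: Baby) -> int:
    """Count how many of the six listed criteria this baby meets."""
    return sum([
        baby.apgar_1min < 7,
        baby.ph_at_birth < 7.2,
        baby.needed_ippv,
        baby.cerebral_irritability,
        baby.treated_seizures,
        baby.meconium_aspiration,
    ])


def meets_case_definition(baby: Baby) -> bool:
    """Admitted to the nursery with at least two of the six criteria."""
    return baby.admitted_to_nursery and criteria_met(baby) >= 2


# Invented example: low one-minute Apgar plus ventilation, admitted.
example_case = Baby(
    admitted_to_nursery=True, apgar_1min=6, ph_at_birth=7.30,
    needed_ippv=True, cerebral_irritability=False,
    treated_seizures=False, meconium_aspiration=False,
)

# Invented example: only one criterion met, so not a case.
example_non_case = Baby(
    admitted_to_nursery=True, apgar_1min=6, ph_at_birth=7.30,
    needed_ippv=False, cerebral_irritability=False,
    treated_seizures=False, meconium_aspiration=False,
)

print(meets_case_definition(example_case))
print(meets_case_definition(example_non_case))
```

Note that any two criteria are enough, so the definition treats very different clinical pictures as the same thing.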
The control group started as 128 babies: those who had been exposed to CTG monitoring and were the last baby born before, and the first baby born after, each case. There were more than twice as many controls as cases, because the authors kept the controls matched to cases that were later excluded for having no CTG, or no analysable CTG. An odd choice, as they could have removed those controls from the analysis. It’s also odd to end up with 128, as doubling 85 should get you to 170, so something strange went on that wasn’t explained in the methods. Controls were excluded if they had been admitted to the nursery or if the CTG was uninterpretable, leaving 120 babies.
All the CTGs were reviewed independently by three authors who were blinded to the outcome. The traces were grouped into A: “normal”; B: at least two of the observers thought the trace was abnormal enough to justify fetal blood sampling or expedited birth, or the trace was of poor quality and a fetal spiral electrode was indicated; or C: so abnormal that at least two of the observers predicted that there would be metabolic acidosis (defined as pH <7.12) at birth. The criteria for deciding the CTG was abnormal were listed in a table and include tachycardia, bradycardia, reduced variability, and variable or late decelerations. The actions taken by staff and the time to take that action were obtained from case records.
Inter-observer variability in CTG interpretation among the three authors was noted. This is interesting, though not new, with lots of research showing the same thing. It is interesting because, as we will see in a moment, the appropriateness of interventions and the timing of them were decided on by people who couldn’t all agree on whether the CTG warranted intervention in the first place. It seems that one “expert” who differed from the others on any individual trace was still an “expert”, but clinicians who also might have considered that same CTG to not be abnormal were wrong and delaying appropriate action. Hmmm….
Three of the “normal” CTGs belonged to babies with asphyxia. Only 3.2% of babies with abnormal CTGs were admitted to the nursery. Two deaths were reported – one after an acute abruption (with birth happening 17 minutes after the CTG became abnormal), and the other due to difficulty with the vaginal birth of a breech presenting baby. Neither of these was likely to have been prevented through “better” CTG interpretation.
The authors considered that intervention was warranted in 87% of the cases and 29% of the controls, with intervention actually happening 42% and 21% of the time respectively. The “correct” rate of intervention was simply the opinion of the authors, and it doesn’t follow that outcomes would have been better had management happened as they suggested. The 8% of controls who didn’t have what the authors felt was a required intervention all had good outcomes, so it is tricky to argue that intervention would have improved the outcome.
Response times were reported in relation to the degree of abnormality of the CTG. The most abnormal CTGs and the not-so-abnormal abnormal CTGs had similar response times. This was true for both the cases and the controls. To me, this consistency suggests there were system-related issues shaping response times so that individual clinicians couldn’t make things happen faster when the CTG was more abnormal. When the CTG abnormality was prolonged bradycardia, the response time for cases was significantly longer (27 minutes) than for controls (14 minutes). The authors later referred to advice that birth within 30 minutes of recognition of an abnormal CTG pattern constituted appropriate practice, so they seem to suggest that both these time frames were fine.
The authors stated that reporting on response times gave “some indication of the staff’s ability to interpret the intrapartum CTG”. This is, of course, nonsense. Long response times could be due to inadequate staffing resulting in long periods of time when the CTG went unobserved, or to the inability of midwifery staff to locate an available obstetrician when intervention was considered appropriate. Longer response times could also relate to inadequate access to equipment, to operating theatres, or to peri-operative staff. On the other hand, short response times could be due to clinical information other than the CTG, or even due to misinterpretation of the CTG as more abnormal than it was.
Neoliberal healthcare systems prefer to pass responsibility for poor outcomes on to individuals rather than acknowledging the role that managers’ decisions play in determining whether clinicians can work to their best capacity. So, the assumption that it was the not-so-bright staff that were behind the response times fits with that way of thinking. The authors later commented that “it is not possible to judge the appropriateness or otherwise of the response time in an individual case without careful consideration of such factors” (referring to other possible reasons for the time taken). They made no attempt to carefully consider any of these factors in their research, and therefore should not make claims that their research provides evidence about the appropriateness or otherwise of response times.
Getting to the point
This brings us back to the question of research design. The quote above shows the researchers acknowledging that their study design, the one they chose to use, was not able to address the question. The question they posed, then designed the research around. The question about whether individual clinicians were correctly interpreting and managing abnormal CTGs in a timely manner. See the problem?
The authors didn’t use their concluding comments to revisit the question that prompted the research. Instead, they recommended more frequent use of fetal blood sampling, mandatory education in CTG interpretation, computer analysis of the CTG, and installation of central fetal monitoring systems. None of these options has subsequently been shown to improve perinatal outcomes. Perhaps rather than mandatory CTG education, mandatory courses on research design might have been a better investment? We might then have had research that helped to push us in a direction where meaningful change was now possible.
And so we arrive here at the end of this post, and I (unlike the authors of the paper) will come back to the questions I proposed earlier. Did the authors set out to study the relationship between clinicians’ competence and perinatal outcome? Yes, they did. But they designed a study that couldn’t actually show whether there was a relationship or not. So then, does this paper provide compelling evidence that poor outcomes are often due to clinicians’ failure to interpret or manage CTG abnormalities, as RANZCOG claims? Why no, it doesn’t. But let’s not let the facts get in the way of a jolly good story. Particularly if that story will set your organisation up to make a large sum of money from providing education.
When I worked as a university lecturer, my students quickly learned that if they were going to cite someone’s work in their paper, then it better say what they said it did. I was the sort of marker who checked people’s references to ensure they built sound arguments based on research and marked them down if they misused literature. Did it annoy students? Yes, I suspect so. But in doing so, I hope that there will be a generation of maternity professionals out there who know not to pull this sort of nonsense when writing a guideline that professes to be evidence based. If RANZCOG want to live up to their claim that they provide professional leadership, they need to do better.
* As promised – an analysis of the report I had not previously seen
The 1999 Annual Report of the Consultative Council on Obstetric and Paediatric Mortality and Morbidity can be viewed here. I have read it cover to cover and there is absolutely nothing in the report that supports the assertion in the RANZCOG guideline that “review of cases with poor outcomes repeatedly demonstrate that abnormal CTGs were misinterpreted and the resulting management inappropriate.” The closest I could get to something resembling this was in the section headed “Inadequate intrapartum care”, stating:
“Inadequate intrapartum fetal heart rate monitoring was noted in four cases.” (p. 11)
No further information was provided to explain what form of intrapartum fetal heart rate monitoring they were referring to, nor what was deemed to be inadequate about it. It could be that there was no fetal heart rate monitoring at all.
In the section on causes of perinatal death, the following paragraph appeared:
“Intrapartum hypoxia was associated with 6 neonatal deaths, and 19 stillbirths. It is generally accepted in these cases, that if the hypoxia were identified earlier and delivery effected more expeditiously, the outcome would have been improved. This unfortunately is not always the case – ‘fetal distress’ in labour may be the manifestation of previous severe hypoxic insult, with subsequent cerebral injury – and therefore not reversible by prompt delivery.” (p. 20)
Nowhere in the document was there a statement about the misinterpretation of abnormal CTGs, nor any recommendation regarding education about fetal heart rate monitoring during labour. Like the Murphy et al. paper, the Annual Report does not provide evidence that misinterpretation of CTG monitoring was occurring in clinical practice nor that additional education and credentialling might be of use. I stand by the conclusion of the first draft of this post.
Kroushev, A., Beaves, M., Jenkins, V., & Wallace, E. M. (2009). Participant evaluation of the RANZCOG Fetal Surveillance Education Program. Australian & New Zealand Journal of Obstetrics & Gynaecology, 49(3), 268-273. https://doi.org/10.1111/j.1479-828X.2009.00988.x
Murphy, K. W., Johnson, P., Moorcraft, J., Pattinson, R. C., Russell, V., & Turnbull, A. (1990). Birth asphyxia and the intrapartum cardiotocograph. British Journal of Obstetrics & Gynaecology, 97(6), 470-479. https://doi.org/10.1111/j.1471-0528.1990.tb02515.x
Royal Australian and New Zealand College of Obstetricians & Gynaecologists. (2019). Intrapartum fetal surveillance clinical guideline. 4th Edn. https://ranzcog.edu.au/statements-guidelines
Zoanetti, N., Griffin, P., Beaves, M., & Wallace, E. (2009). Rasch scaling procedures for informing development of a valid Fetal Surveillance Education Program multiple-choice assessment. BMC Medical Education, 9, 20. https://doi.org/10.1186/1472-6920-9-20