AI in ED Triage

Segment 1 — Why Triage Attracts AI (Expanded)

Hamish: Before we even talk about evidence, it’s worth asking why triage keeps attracting AI solutions in the first place.

Jeremy: Because it looks clean. It’s time-bounded, data-rich, and happens before most of the mess downstream becomes visible.

Hamish: Exactly. Triage sits at a liminal point — not quite assessment, not quite treatment — but it carries enormous downstream consequences.

Jeremy: And historically, triage has functioned as a kind of shock absorber for system strain. When beds disappear or staffing thins out, triage absorbs that pressure by stretching waiting times and reprioritising risk.

Hamish: Which is precisely why it’s politically attractive. If you can make triage “smarter”, it feels like you’re acting on crowding without having to touch the harder problems.

Jeremy: But this is the first conceptual mistake. Flow is not a property of triage. It’s an emergent property of the entire system.

Hamish: Triage can decide who waits. It cannot decide where patients go once the system is full.

Jeremy: And in Australia and Aotearoa New Zealand, triage is operating in a context of structural constraint — access block, ambulance offload delays, chronic workforce shortages.

Hamish: So when AI enters triage, it’s not entering a neutral space. It’s entering a pressure valve that already exists to protect the rest of the system.

Jeremy: Which raises an uncomfortable question: are we using AI to improve decision-making — or to make rationing look more objective?

Hamish: That’s not an accusation. It’s a design reality. Any tool that operates at triage will inevitably participate in how scarcity is managed.

Jeremy: And that’s why claims about “improving flow” at triage need to be treated carefully. You can reorder risk without changing throughput.

Hamish: In fact, you can sometimes make the system feel calmer while nothing materially improves for patients.

Jeremy: So from the outset, AI-powered triage shouldn’t be evaluated as a technical upgrade.

Hamish: It should be evaluated as a redistribution of responsibility under constraint.

Hamish: Before we criticise anything, we should be clear about where AI-powered triage actually does perform well — because otherwise this just sounds like resistance dressed up as caution.

Jeremy: Agreed. And the strongest evidence base here is in retrospective prediction. There are now multiple high-quality modelling studies showing that AI can predict certain outcomes at triage better than clinicians alone.

Hamish: A commonly cited example is the 2021 paper by Raita et al., published in Annals of Emergency Medicine. That study used data from multiple emergency departments in the United States.

Jeremy: They trained a machine-learning model using triage-time variables — vitals, demographics, presenting complaint — and asked a very specific question: can we predict hospital admission and critical care requirement earlier than clinicians do?

Hamish: And the answer, statistically, was yes. The model showed strong discrimination, with AUCs exceeding those of clinician-only prediction.
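
For readers less familiar with the term, here is a minimal sketch of what "discrimination" and AUC measure. The risk scores and outcomes are invented for illustration and are not taken from any study discussed in this episode; the function name `auc` is likewise hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical triage-time risk estimates: 1 = admitted, 0 = discharged.
admitted = [1, 1, 1, 0, 0, 0, 1, 0]
model_risk = [0.9, 0.8, 0.4, 0.3, 0.2, 0.5, 0.7, 0.1]
clinician_risk = [0.8, 0.5, 0.3, 0.4, 0.2, 0.6, 0.7, 0.1]

model_auc = auc(model_risk, admitted)
clinician_auc = auc(clinician_risk, admitted)
print(f"model AUC={model_auc:.4f}, clinician AUC={clinician_auc:.4f}")
```

A higher AUC only says the ranking of patients is better. It says nothing about whether anything changes for the patient once the ranking is known.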

Jeremy: Similar findings have been reported in European and UK datasets. When the outcome is binary and system-facing — admission versus discharge — these models are genuinely impressive.

Hamish: But here’s where we need to slow down, because this is where interpretation often goes wrong.

Jeremy: Exactly. These studies tell us what AI can predict, not what it can change.

Hamish: A model that predicts admission accurately does not shorten length of stay unless the system is able — and willing — to act on that prediction.

Jeremy: And in most of these retrospective studies, there’s no intervention. No change to workflow. No capacity unlocked downstream.

Hamish: Which means there’s an implicit leap being made: that better foresight will automatically translate into better flow.

Jeremy: And that leap is not supported by evidence.

Hamish: There’s also a deeper issue here that clinicians tend to pick up on instinctively: these models are trained on historical decisions.

Jeremy: They learn from who we admitted, how we documented, how risk was interpreted in that particular system at that particular time.

Hamish: So when we say “AI outperforms clinicians,” what we often mean is that it reproduces historical clinician behaviour very efficiently — and very consistently.

Jeremy: Which may be valuable, but it’s not neutral. It bakes in local practice patterns, biases, and system constraints.

Hamish: And that matters for equity. If your historical system under-triaged certain populations, the model will learn that.

Jeremy: There’s also a cognitive load issue that rarely gets acknowledged in these papers.

Hamish: Right. Because prediction doesn’t arrive in a vacuum. It arrives as another signal, another alert, another probability score in an already crowded cognitive environment.

Jeremy: So even if the model is “right,” clinicians still have to decide when to trust it, when to override it, and how to integrate it with their own judgement.

Hamish: And under pressure, humans adapt. Sometimes in sensible ways. Sometimes in ways that introduce new risks.

Jeremy: Which brings us to the central tension of this segment.

Hamish: AI triage is very good at telling us what will happen.

Jeremy: But emergency departments run on what we are able to do — not what we can foresee.

Hamish: And confusing those two is where a lot of well-intentioned implementations start to wobble.

Hamish: This is the point where the literature usually thins out — and where real departments start to recognise themselves.

Jeremy: Because up until now, we’ve mostly been talking about what models can do in controlled, retrospective environments.

Hamish: And that’s exactly why the study by Akhlaghi and colleagues at St Vincent’s Hospital Melbourne, published in Emergency Medicine Australasia, is such an important inflection point.

Jeremy: The title alone is worth pausing on: “Evaluation of a machine learning model for predicting hospital admission after deployment in an emergency department.”

Hamish: That word — after — is doing a lot of work.

Jeremy: This wasn’t a development paper. It wasn’t an internal validation. It was a post-deployment evaluation of a live AI triage system embedded into routine clinical practice.

Hamish: Over 77,000 consecutive ED presentations. No cherry-picking. No idealised dataset. Real clinicians, real documentation, real noise.

Jeremy: And importantly, the model didn’t rely on structured data alone. It analysed free-text triage notes, which is what we actually use — and which is notoriously variable.

Hamish: That choice matters, because it exposes the model to the same ambiguity and inconsistency clinicians work with every day.

Jeremy: Performance was clinically meaningful but imperfect — sensitivity and specificity in the low-to-mid 70 percent range, with a strong negative predictive value.
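
To see how sensitivity and specificity in the low 70s can sit alongside a strong negative predictive value, here is a small worked sketch. The counts are hypothetical (1,000 presentations with an assumed 30 percent admission rate) and are not the study's data.

```python
def triage_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics for a binary admission prediction."""
    return {
        "sensitivity": tp / (tp + fn),  # admitted patients correctly flagged
        "specificity": tn / (tn + fp),  # discharged patients correctly not flagged
        "ppv": tp / (tp + fp),          # flagged patients who were admitted
        "npv": tn / (tn + fn),          # unflagged patients who were discharged
    }


# Hypothetical counts: 300 admitted (219 flagged), 700 discharged (518 not flagged).
metrics = triage_metrics(tp=219, fp=182, tn=518, fn=81)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed numbers the negative predictive value is about 0.86 while the positive predictive value is only about 0.55. Both predictive values depend on the admission rate, so they would shift in a department with a different case mix.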

Hamish: Which already tells you this wasn’t a marketing paper.

Jeremy: But the most important finding wasn’t the absolute performance. It was the performance drop compared to the model’s development phase.

Hamish: And that’s where this paper earns its credibility. Because once the model went live, reality asserted itself.

Jeremy: Documentation styles shifted. Case mix evolved. Clinicians adapted their behaviour once decision support was visible.

Hamish: Which is exactly what always happens — but is rarely measured.

Jeremy: And here’s the key thing: the authors didn’t try to explain that away. They didn’t reframe it as “acceptable degradation.”

Hamish: They named it as an expected property of live clinical systems — and argued that ongoing monitoring and recalibration are non-negotiable.
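
One way ongoing monitoring might look in practice is sketched below. This is an assumption-laden illustration, not the method used in the study: the monthly windows, the baseline values, the 0.05 tolerance, and the helper names are all hypothetical.

```python
def sens_spec(pairs):
    """pairs: list of (predicted_admit, actually_admitted) booleans.
    Assumes each window contains at least one admission and one discharge."""
    tp = sum(1 for pred, actual in pairs if pred and actual)
    fn = sum(1 for pred, actual in pairs if not pred and actual)
    tn = sum(1 for pred, actual in pairs if not pred and not actual)
    fp = sum(1 for pred, actual in pairs if pred and not actual)
    return tp / (tp + fn), tn / (tn + fp)


def drift_flags(windows, baseline, tolerance=0.05):
    """Return the names of windows where sensitivity or specificity has
    fallen more than `tolerance` below the development-phase baseline."""
    flagged = []
    for name, pairs in windows.items():
        sens, spec = sens_spec(pairs)
        if baseline[0] - sens > tolerance or baseline[1] - spec > tolerance:
            flagged.append(name)
    return flagged


def make_pairs(tp, fn, tn, fp):
    """Build a synthetic window of (prediction, outcome) pairs from counts."""
    return ([(True, True)] * tp + [(False, True)] * fn
            + [(False, False)] * tn + [(True, False)] * fp)


# Hypothetical post-deployment windows.
windows = {
    "month_1": make_pairs(tp=80, fn=20, tn=160, fp=40),  # sens 0.80, spec 0.80
    "month_2": make_pairs(tp=78, fn=22, tn=158, fp=42),  # sens 0.78, spec 0.79
    "month_3": make_pairs(tp=70, fn=30, tn=150, fp=50),  # sens 0.70, spec 0.75
}
baseline = (0.80, 0.80)  # assumed development-phase sensitivity and specificity

flagged = drift_flags(windows, baseline)
print("Windows needing review:", flagged)
```

The code is the easy part. The governance question is who receives the flagged list and who has the authority to recalibrate or withdraw the model.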

Jeremy: What they also didn’t do is just as important.

Hamish: They didn’t claim reduced emergency department length of stay.

Jeremy: They didn’t claim improved throughput, reduced crowding, or faster ambulance offload.

Hamish: Which, frankly, is refreshing.

Jeremy: Instead, the contribution of this paper is more uncomfortable and more useful: it shows that deploying AI at triage is not an endpoint — it’s the beginning of governance work.

Hamish: And that reframes the whole conversation. Because once a model is live, responsibility doesn’t sit with the algorithm.

Jeremy: It sits with the department, the hospital, and the system that chose to deploy it.

Hamish: Which raises a question we don’t ask often enough: when an AI tool flags risk at triage, who owns the obligation to act?

Jeremy: Is it the triage nurse? The senior clinician? The bed manager? The executive who approved the rollout?

Hamish: The St Vincent’s study doesn’t answer that question — but it forces us to confront it.

Jeremy: And in doing so, it exposes why so many AI triage implementations feel disappointing in practice.

Hamish: Not because the models don’t work — but because the system around them hasn’t decided how responsibility is redistributed once risk is made explicit.

Jeremy: That’s the real shift this paper represents. It moves us from “Can we build this?” to “Are we prepared to own what it shows us?”

Jeremy: At this point, it’s worth being fair. There are studies where AI-supported triage appears to improve flow — and we should name them properly.

Hamish: Yes, but we should also be clear about what kind of flow, where, and at what cost.

Jeremy: The clearest example comes from chest pain. The study most people reference is by Than and colleagues, published in JAMA Internal Medicine.

Hamish: This was conducted across multiple hospitals and evaluated an AI-informed, risk-driven chest pain pathway — not just a prediction model sitting in isolation.

Jeremy: Exactly. The system used AI-derived risk stratification to guide early decision-making within an already established chest pain pathway.

Hamish: And the outcomes were meaningful within that pathway: shorter length of stay for chest pain patients, faster disposition decisions, and fewer unnecessary admissions.

Jeremy: But this is where nuance matters. That success depended on several preconditions.

Hamish: The pathway already existed. It was staffed. There were agreed endpoints. Cardiology, ED, and bed management were already aligned.

Jeremy: AI didn’t create capacity — it aligned patients with capacity that was already there.

Hamish: Which is a very different claim from “AI improves ED flow.”

Jeremy: I agree — but I’d argue that’s still a legitimate form of flow improvement.

Hamish: It is, but it’s a local optimisation. And local optimisation in a constrained system can have unintended consequences elsewhere.

Jeremy: Such as?

Hamish: If you accelerate one cohort without increasing overall capacity, you often displace congestion onto another group — usually lower-acuity or socially complex patients.

Jeremy: That’s fair. You improve median times for one pathway, but the tail gets longer for others.
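
The displacement effect can be shown with a toy queue. This is a deliberately simplified sketch, assuming one clinician who sees one patient per time step and two invented cohorts arriving together; it is not a model of any real department.

```python
def simulate(arrivals, prioritise=None):
    """arrivals[t] lists the cohort labels arriving at time step t.
    One patient is seen per step (fixed capacity). If `prioritise` is set,
    the longest-waiting patient from that cohort is seen first.
    Returns {cohort: [wait in steps for each patient]}."""
    queue = []  # (arrival_step, cohort), kept in arrival order
    waits = {}
    t = 0
    while t < len(arrivals) or queue:
        if t < len(arrivals):
            queue.extend((t, cohort) for cohort in arrivals[t])
        if queue:
            idx = 0
            if prioritise is not None:
                idx = next((i for i, (_, c) in enumerate(queue)
                            if c == prioritise), 0)
            arrived, cohort = queue.pop(idx)
            waits.setdefault(cohort, []).append(t - arrived)
        t += 1
    return waits


def mean(values):
    return sum(values) / len(values)


# Hypothetical overload: two patients arrive per step, only one can be seen.
arrivals = [["other", "chest_pain"]] * 6

fifo = simulate(arrivals)
fast = simulate(arrivals, prioritise="chest_pain")

for label, result in (("first come, first served", fifo),
                      ("chest pain prioritised", fast)):
    total = sum(sum(w) for w in result.values())
    print(label,
          "| chest pain mean wait:", mean(result["chest_pain"]),
          "| other mean wait:", mean(result["other"]),
          "| total waiting:", total)
```

In this toy setup the prioritised cohort's mean wait falls from 3.5 steps to zero, the other cohort's rises from 2.5 to 6, and total waiting across all patients is unchanged at 36 steps. Reordering moved the waiting around without removing any of it.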

Hamish: And that’s rarely acknowledged in the headline results.

Jeremy: Another area worth mentioning is consistency rather than speed. The Levin et al. study in Annals of Emergency Medicine looked at AI-supported triage category assignment across multiple EDs.

Hamish: Their primary outcome wasn’t throughput. It was reduction in variability between clinicians and sites.

Jeremy: And they showed that AI support could standardise triage decisions — fewer extreme outliers, more predictable distribution of acuity.

Hamish: Which doesn’t necessarily shorten waiting times.

Jeremy: But it does change the shape of demand.

Hamish: Yes — and that’s where I get uneasy. Predictability is useful, but it can also mask persistent congestion.

Jeremy: Explain that.

Hamish: If your system becomes more predictable but no less congested, leadership can mistake stability for improvement.

Jeremy: So the dashboards look calmer, but the lived experience doesn’t improve.

Hamish: Exactly. That’s a classic moral hazard. We smooth the signal without fixing the problem.

Jeremy: I take that point. But I’d still argue that predictability has operational value — staffing, escalation, senior oversight.

Hamish: I don’t disagree. I just want us to be honest about what kind of value it is — and what it isn’t.

Jeremy: So the synthesis here is that AI-supported triage can improve pathway-level flow and operational predictability.

Hamish: But it doesn’t produce ED-wide throughput gains unless downstream constraints are addressed.

Jeremy: And expecting it to do so sets both clinicians and the technology up to fail.

Jeremy: Let’s pause for a moment, because we’ve covered a lot of ground, and it’s worth being explicit about where the evidence has taken us.

Hamish: Up to this point, nothing we’ve discussed suggests that AI-powered triage is useless.

Jeremy: But equally, nothing suggests it’s a solution to overcrowding.

Hamish: The consistent pattern across studies is this: AI improves how we see risk earlier.

Jeremy: It improves prediction, consistency, and sometimes pathway-level efficiency.

Hamish: But flow — real, department-wide flow — is governed by capacity, staffing, and access block.

Jeremy: And confusing better foresight with better outcomes is where systems get into trouble.

Hamish: Which brings us to the next question: what happens to clinicians when we add more signal to an already overloaded environment?

Jeremy: One of the more compelling arguments for AI at triage isn’t actually about flow — it’s about safety.

Hamish: Specifically, the idea of continuous re-triage. Moving away from triage as a single, static decision.

Jeremy: Several observational studies have looked at AI systems that continuously monitor vitals, labs, and documentation to identify deterioration in patients waiting to be seen.

Hamish: And on paper, that makes a lot of sense. Humans are bad at sustained vigilance — especially in noisy, crowded environments.

Jeremy: Early warning systems have shown benefits in inpatient settings, so it’s not unreasonable to think similar logic could apply in the waiting room.

Hamish: But this is where we need to be careful, because translating that into emergency care is not straightforward.

Jeremy: Why not?

Hamish: Because emergency departments already run at the edge of cognitive saturation. Adding continuous alerts doesn’t automatically improve safety — it can also dilute it.

Jeremy: Right. Alert fatigue isn’t theoretical. It’s something every senior clinician has felt.

Hamish: And clinicians adapt. If every second patient is flagged as “at risk”, the signal loses meaning.
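
A quick piece of arithmetic shows why a flag that fires on every second patient carries little information. The prevalence, sensitivity, and flag rates below are assumptions chosen for illustration only.

```python
def alert_ppv(prevalence, sensitivity, flag_rate):
    """Probability that a flagged patient actually deteriorates, from Bayes'
    rule: P(deteriorates | flagged) = sensitivity * prevalence / flag_rate."""
    return sensitivity * prevalence / flag_rate


# Assume 5% of waiting patients deteriorate (hypothetical).
broad_alert = alert_ppv(prevalence=0.05, sensitivity=0.90, flag_rate=0.50)
narrow_alert = alert_ppv(prevalence=0.05, sensitivity=0.70, flag_rate=0.10)

print(f"Flags half the waiting room: {broad_alert:.0%} of flags are true")
print(f"Flags one patient in ten:    {narrow_alert:.0%} of flags are true")
```

Under these assumptions the broad alert catches more deteriorating patients, but fewer than one flag in ten is real, which is the condition under which clinicians learn to tune it out.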

Jeremy: Which raises a design question that rarely appears in AI papers: what happens to human judgement under sustained alert pressure?

Hamish: Exactly. More data doesn’t necessarily mean more clarity. Sometimes it just means more noise.

Jeremy: There’s also a subtle redistribution of responsibility here.

Hamish: Yes — once a system is continuously monitoring patients, expectations shift. If deterioration occurs, the question becomes: why wasn’t it acted on?

Jeremy: Even if no action was possible.

Hamish: Which is dangerous. We risk turning clinicians into the final common pathway for system failure, expected to absorb risk that has been identified but cannot be resolved.

Jeremy: So while continuous re-triage may improve detection, it can also increase moral and cognitive load.

Hamish: And unless leadership explicitly owns what happens when alerts fire and nothing can be done, that burden falls silently on clinicians.

Jeremy: This is where AI safety narratives can become misleading.

Hamish: Because safety isn’t just about detection. It’s about capacity to respond.

Jeremy: And if response capacity doesn’t exist, detection alone may actually make harm more visible — without making it preventable.

Hamish: This is the point where I think we need to let something go wrong — because in real emergency departments, it does.

Jeremy: Alright. Walk me through it.

Hamish: Picture a large metropolitan ED on a winter evening. The waiting room is full. Ambulance offload delays are already in play. Inpatient beds are tight.

Jeremy: So, a normal night.

Hamish: Exactly. Now add AI-supported triage into that mix. The system is live. It’s been embedded into the electronic medical record.

Jeremy: And it’s doing what it was designed to do.

Hamish: Yes. A middle-aged patient presents with vague but concerning symptoms — abnormal vitals, a concerning triage note. The AI flags them as high risk for admission.

Jeremy: So early risk recognition works.

Hamish: It does. The triage nurse escalates appropriately. The alert is visible. The senior doctor is aware. Everyone knows this patient matters.

Jeremy: But nothing downstream moves.

Hamish: Right. There are no cubicles. No monitored beds. No inpatient beds to pull forward. No staff you can redeploy without creating risk somewhere else.

Jeremy: So the patient waits.

Hamish: And this is the failure mode: the presence of the AI flag creates a false sense of safety.

Jeremy: Because the risk has been identified.

Hamish: Documented. Acknowledged. Visible in the system.

Jeremy: But not resolved.

Hamish: Exactly. Time passes. The waiting room gets louder. Attention is pulled elsewhere.

Jeremy: And eventually, the patient deteriorates.

Hamish: Not because the AI failed. Not because the clinicians ignored the signal.

Jeremy: But because the system had nowhere to put them.

Hamish: Now here’s the uncomfortable part. When this case is reviewed later, the documentation looks good.

Jeremy: The risk was recognised early.

Hamish: Escalation was appropriate.

Jeremy: The AI did its job.

Hamish: Which means the system failure is harder to see — and easier to defend.

Jeremy: That’s the moral hazard.

Hamish: Exactly. AI can turn unresolved risk into well-documented risk, and those are not the same thing.

Jeremy: And this is why ED-wide flow gains remain elusive.

Hamish: Because triage — human or AI-augmented — cannot overcome access block, workforce shortages, or structural congestion.

Jeremy: In fact, safer prioritisation can sometimes concentrate risk, making the consequences of inaction more severe.

Hamish: Which is confronting, but it’s the reality leaders need to design for.

Jeremy: And it reframes the question entirely.

Hamish: Yes. The question isn’t “Did the AI work?”

Jeremy: It’s “Once the AI showed us the risk, who owned what happened next?”

Jeremy: So let’s take this out of the abstract and into reality. You’re back in your department on Monday morning. Someone from executive, digital health, or procurement asks whether AI-powered triage is something your ED should be adopting.

Hamish: And the first question is not “does the model work?”

Jeremy: It’s “what problem are we actually trying to solve?”

Hamish: Because if the dominant problem in your department is access block, inpatient bed scarcity, or workforce exhaustion, AI at triage will not fix your flow metrics.

Jeremy: In fact, it may make those failures more visible without making them more solvable.

Hamish: Which can be politically attractive but clinically dangerous.

Jeremy: I want to push back slightly there. Visibility isn’t inherently bad.

Hamish: I agree — if the system is prepared to act on what it sees. If not, visibility just concentrates moral injury at the front door.

Jeremy: That’s fair. So the second judgement call is where AI triage might genuinely add value.

Hamish: And the evidence is pretty consistent here. AI-supported triage makes most sense in departments that already have functional downstream pathways.

Jeremy: Fast track, short stay, chest pain, ambulatory care, frailty pathways — places where early alignment actually changes what happens next.

Hamish: If no such pathways exist, you’re essentially installing a high-resolution thermometer in a room with no cooling system.

Jeremy: You’ll get more accurate readings — but the fever won’t break.

Hamish: The third issue is how the tool is framed to clinicians.

Jeremy: This matters more than most people realise.

Hamish: If AI triage is sold as an operational solution — “this will reduce crowding” — clinicians will quickly disengage when that promise isn’t met.

Jeremy: Whereas if it’s framed honestly as decision support — something that may improve early risk recognition and consistency — it has a chance of being trusted.

Hamish: Trust is the currency here. Once it’s lost, no amount of recalibration will bring it back.

Jeremy: The fourth judgement call is integration and friction.

Hamish: This is non-negotiable. If the AI tool sits outside the EMR, requires duplicate data entry, or interrupts triage flow, it will fail.

Jeremy: Even if the model is excellent.

Hamish: Especially then — because clinicians will resent it.

Jeremy: The fifth issue, and arguably the most important, is governance.

Hamish: Yes. Once an AI system is live, performance will drift. Case mix will change. Equity effects will emerge over time.

Jeremy: So someone has to own monitoring, recalibration, and review.

Hamish: And we need to be explicit about who that is.

Jeremy: Is it the ED director? The digital health team? The hospital executive who approved the rollout?

Hamish: Because if governance is vague, responsibility defaults downward — to the clinician at the bedside or the triage nurse at the desk.

Jeremy: Which is exactly what we should be trying to avoid.

Hamish: There’s also a liability question here that often goes unspoken.

Jeremy: Say more.

Hamish: If an AI system flags a patient as high risk and no action is taken — not because of negligence, but because of capacity — who carries that risk?

Jeremy: The clinician who couldn’t act?

Hamish: Or the system that created an obligation it couldn’t fulfil?

Jeremy: That question has not been answered in most deployments.

Hamish: And until it is, AI triage risks turning clinicians into shock absorbers for system failure.

Jeremy: So the mature position isn’t enthusiasm or rejection.

Hamish: It’s selectivity. Clarity about purpose. And honesty about limits.

Jeremy: And a willingness to say no — or not yet — if the system can’t support what the tool reveals.

Jeremy: When you step back and look across the studies we’ve discussed, from Raita’s retrospective modelling work in the US, to Than’s pathway-specific chest pain study, to Akhlaghi’s post-deployment evaluation at St Vincent’s Melbourne, a fairly consistent story emerges.

Hamish: AI-powered triage is not a revolution in emergency care. It’s an incremental evolution.

Jeremy: Its most reliable strengths are not speed or throughput. They’re earlier risk recognition, improved consistency, and better situational awareness under pressure.

Hamish: And where flow improvements do occur, they’re narrow, localised, and dependent on downstream readiness. AI can align patients with capacity — but it cannot create capacity where none exists.

Jeremy: The St Vincent’s experience is particularly instructive, because it forces us to confront something uncomfortable: once AI leaves clean datasets and enters real emergency departments, governance becomes the central problem.

Hamish: Not algorithmic performance. Not accuracy. Governance.

Jeremy: Who owns the signal? Who is responsible for acting on it? And what happens when the system simply cannot respond?

Hamish: That question matters because AI doesn’t just surface risk — it redistributes responsibility.

Jeremy: Once risk is identified earlier, more clearly, and more visibly, the tolerance for inaction changes.

Hamish: And if that redistribution isn’t explicit, it flows downhill — onto the triage nurse, the junior doctor, the consultant on the floor.

Jeremy: Which is how clinicians quietly become shock absorbers for system failure.

Hamish: This is why it’s so important to say this clearly: overcrowding is not a triage problem.

Jeremy: It never was.

Hamish: Triage — whether human, AI-augmented, or hybrid — can prioritise risk, improve safety, and stabilise operations.

Jeremy: But it cannot compensate for access block, workforce shortages, or structural congestion.

Hamish: And when we pretend it can, we’re not innovating — we’re relocating accountability.

Jeremy: So the real question for emergency departments in Australia and Aotearoa New Zealand isn’t whether AI will appear at triage.

Hamish: It already has.

Jeremy: The real question is whether we’re prepared to own what it shows us.

Hamish: Whether we’re willing to match better detection with real authority, real capacity, and real leadership decisions.

Jeremy: Because if we’re not, AI won’t make care safer or faster.

Hamish: It will just make system failure clearer — and easier to defend.

Jeremy: The future of triage may well be AI-augmented.

Hamish: But it will remain, at its core, a profoundly human endeavour — one that still demands judgement, courage, and ownership.

Jeremy: Thanks for staying with us through this episode of the TIME Podcast. We know these aren’t always comfortable conversations, but we think they’re important ones to have.

Hamish: If this episode made you pause, disagree, or rethink something you were previously comfortable with, then it’s done its job.

Jeremy: We’d also like to thank Clintix — not just for creating the TIME Podcast, but for hosting the TIME conference itself and making space for this kind of honest, systems-level discussion.

Hamish: TIME exists because people are willing to engage with complexity rather than look for easy answers, and that’s something worth supporting.

Jeremy: Thanks for listening.

Hamish: We’ll see you in the next episode.

AI and Evidence in Emergency and Critical Care: AI in ED Triage — full transcript
