I’ve been arguing in recent posts that provider-facing software has to increase clinician productivity by several-fold while improving quality — otherwise it won’t meaningfully move access, cost, or outcomes. And given MD/DO shortages, it should also enable APPs to safely operate at a much higher level.
But “AI for medicine” isn’t the same as providing ChatGPT for doctors or a vibe-coding tool like Claude Code.
Most time savings must happen during a live human conversation. And because of risk and liability, we can’t plausibly “delegate” critical decisions to a machine. The encounter must still produce:
• provider confidence in the reasoning (even in complex and high-risk scenarios)
• patient confidence in the plan
That means the software can’t be a leisurely prompt → do → evaluate loop. It has to structure and accelerate the clinical conversation while generating and justifying high-stakes recommendations in real time.
So why is this hard? Because decision-making in medicine requires too many inputs: symptoms, context, comorbidities, meds, labs, device data, prior history, often interacting in messy ways.
This is where cognitive load theory (John Sweller’s influential work) clicked for me. Working memory is limited. Learning and decision-making break down when the combination of:
• Intrinsic load (the complexity of the clinical problem) and
• Extraneous load (EHR friction, searching, remembering, interruptions, documentation while talking)
exceeds working memory capacity.
When that happens, we fall back on (much-maligned) heuristics in clinical decision-making: fast and somewhat effective, but hard to teach, hard to reproduce, and brittle when guidelines change.
If software is going to boost productivity and quality while winning over clinicians and patients, I think it must do three things:
1. Reduce working-memory burden by converting raw clinical data into usable “chunks” that are inputs to the machine and are understandable (and learnable) for the provider. Chunking is how experts overcome the limits of working memory: each unit becomes increasingly abstract and complex.
2. Show how the recommendations are derived from chunks and chunks are derived from raw inputs, not just for transparency and explainability, but so the users internalize and master those mappings through repeated software-assisted encounters, thereby becoming better and faster (the opposite of de-skilling).
3. Minimize extraneous load: fewer clicks, fewer searches, fewer context switches — so the clinician can stay present with the patient.
The goal isn’t to replace clinical judgment. It’s to keep intrinsic load manageable, strip away extraneous load, and allow learning and improvement to occur — encounter by encounter.
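To make points 1 and 2 concrete, here is a minimal Python sketch of what "recommendations derived from chunks, chunks derived from raw inputs" could look like as a data structure. Everything in it is an assumption for illustration: the names (RawInput, Chunk, Recommendation, chunk_if, recommend, explain), the two lab values, the cutoffs, and the recommendation text are hypothetical and are not clinical guidance or a description of any real product.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class RawInput:
    """One raw data point, e.g. a lab value (hypothetical structure)."""
    name: str
    value: float
    unit: str


@dataclass(frozen=True)
class Chunk:
    """A clinician-readable abstraction that remembers its raw sources."""
    label: str
    sources: Tuple[RawInput, ...]


@dataclass(frozen=True)
class Recommendation:
    """A suggestion that remembers which chunks produced it."""
    text: str
    chunks: Tuple[Chunk, ...]


def chunk_if(raw: RawInput, label: str,
             condition: Callable[[float], bool]) -> Optional[Chunk]:
    """Turn a raw value into a named chunk when the condition holds."""
    return Chunk(label, (raw,)) if condition(raw.value) else None


def recommend(chunks: Iterable[Chunk]) -> Optional[Recommendation]:
    """Toy rule: fire only when both illustrative chunks are present."""
    needed = ("reduced kidney function", "elevated potassium")
    by_label = {c.label: c for c in chunks}
    if all(label in by_label for label in needed):
        return Recommendation(
            "flag medication list for clinician review",
            tuple(by_label[label] for label in needed),
        )
    return None


def explain(rec: Recommendation) -> List[str]:
    """Trace a recommendation back to its chunks and raw inputs."""
    lines = [f"Recommendation: {rec.text}"]
    for chunk in rec.chunks:
        lines.append(f"  because: {chunk.label}")
        for src in chunk.sources:
            lines.append(f"    from: {src.name} = {src.value} {src.unit}")
    return lines


if __name__ == "__main__":
    # Illustrative values and cutoffs only, not clinical guidance.
    egfr = RawInput("eGFR", 45.0, "mL/min/1.73m2")
    potassium = RawInput("potassium", 5.6, "mmol/L")
    candidates = (
        chunk_if(egfr, "reduced kidney function", lambda v: v < 60),
        chunk_if(potassium, "elevated potassium", lambda v: v > 5.0),
    )
    rec = recommend([c for c in candidates if c is not None])
    if rec is not None:
        print("\n".join(explain(rec)))
```

The design choice worth noticing is that each layer stores references to the layer beneath it, so the explanation is read off the structure rather than generated after the fact. That is one way the mapping from raw data to chunk to recommendation could stay visible to the clinician in every encounter.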
How are we going to scale in healthcare?
Healthcare has a long track record of resisting innovation. For example, we’re all aware of uncontrollable costs, and yet, although