AI in Medicine & Nephrology

The future, explained. From the big picture of medicine, down to the kidney.

AI is not a robot doctor. It is becoming a second layer of clinical attention: a tireless reader of the chart that notices patterns earlier and lets the physician spend judgment where it actually matters.


Chapter 1

The Big Idea

You asked for the best source on AI in medicine and nephrology, because it is the future. This site is that source, and it is built to be read in order: it begins with the big picture, moves through medicine and cancer care, and ends in your own subspecialty. The short version is this. AI is not a robot doctor. It is becoming a second layer of clinical attention, a tireless reader of the chart that notices patterns earlier and lets the physician spend judgment where it actually matters.

It helps to say plainly what AI is good at today, because the honest version is more useful than the marketing version. AI is good at watching many variables at once. It is good at noticing risk earlier than a busy team can. It is good at measuring things on images and pathology slides that are tedious to count by hand. It is good at summarizing a messy hospital course into something a consultant can read in a minute. And it is good at standardizing the repetitive decisions that should be the same every time but often are not.

Notice what is not on that list. AI does not yet replace clinical judgment. It does not weigh tradeoffs the way a chief does at the bedside. And it does not own the consequences of a treatment decision. Those still belong to the physician. So the right picture is not a machine that takes your place. It is a machine that handles the parts of the hospital that currently force a good mind to hunt through fragmented data, delayed signals, and incomplete histories.

Why now, and not ten years ago? Two things changed. Hospitals went fully electronic, so a patient's labs, vitals, medications, notes, scans, and pathology now live as data a computer can actually read. And the methods for finding patterns in that data got dramatically better. Put those together and you get software that can read the whole chart continuously, across time, instead of firing a single alert when one number crosses one line.

This site follows the arc you would expect from a teacher. First, plain definitions, so no term later is mysterious. Then AI across medicine broadly, where the most mature uses are imaging and pathology, not chatbots. Then nephrology. Then onco-nephrology, the heart of it, written for the person who ran a kidney service inside a cancer hospital. The closer we get to your field, the more concrete it becomes, because that is where this technology and your career meet.

One promise about tone. On the medicine, you will be spoken to as a peer, because you know it cold. On the technology, every new idea gets explained in plain English with a clinical example the first time it appears. The goal is not to dazzle you. The goal is to hand you something accurate enough that you could pass it to a colleague.

AI is becoming a second layer of clinical attention. It is good at watching many variables at once, noticing risk earlier, quantifying images and slides, summarizing messy records, and standardizing repetitive decisions. It is not yet good enough to replace judgment, explain tradeoffs, or own the consequences of treatment.

Chapter 2

What "AI" Actually Means

Before we go further, a short and friendly glossary, so that no word later in this site is a mystery. You will meet a handful of terms: AlphaFold, machine learning, large language models, and a few others. None of them is complicated once translated. Each one below comes with a single clinical example so the idea lands in language you already think in.

The simplest way to think about all of this: "AI" is an umbrella word for software that finds patterns in data and uses them to make a prediction, read an image, or produce text. It is not one thing. In medicine it usually means one of five things, and they are easy to keep straight once you see a kidney example for each.

The most important idea to carry forward is a shift in how this software works. The old generation of clinical decision support was rule-based. A human wrote a fixed rule, the computer obeyed it, and that was the whole of its intelligence. The classic example is the creatinine alert: "if creatinine rises by X, alert the team." Useful, but blunt. It looks at one number against one threshold, and it cannot tell a dangerous rise from a trivial one, because it does not understand context.

Modern AI works differently. It reasons across time and across many kinds of data at once. Instead of watching a single lab against a single line, it can notice that a patient's creatinine, magnesium, albumin, cisplatin dose, neutrophil trend, recent contrast exposure, and proton pump inhibitor use, taken together, imply a higher-risk trajectory than any one of those variables would suggest alone. That is the leap: from a rule about one number to a reading of the whole picture, the way a good consultant actually thinks. The rest of this site is essentially that idea, applied to harder and harder kidney problems.
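The contrast between the two generations can be sketched in a few lines of Python. This is a minimal illustration, not a clinical tool. The 0.3 mg/dL default stands in for the "X" in the rule above (it matches the KDIGO creatinine criterion), and every weight in the second function is a made-up number chosen only to show how several weak signals add up. A real model would learn its weights from data.

```python
import math

def rule_based_alert(baseline_cr, current_cr, threshold=0.3):
    """Old style: one number against one line (creatinine in mg/dL)."""
    return (current_cr - baseline_cr) >= threshold

# HYPOTHETICAL weights, for illustration only; not fitted to any data.
WEIGHTS = {
    "creatinine_rise": 2.0,   # per mg/dL above baseline
    "low_magnesium": 0.8,
    "low_albumin": 0.7,
    "recent_cisplatin": 1.0,
    "recent_contrast": 0.6,
    "on_ppi": 0.3,
}
INTERCEPT = -3.0  # hypothetical baseline log-odds

def joint_risk(features):
    """New style: weigh many variables together (logistic form)."""
    log_odds = INTERCEPT + sum(WEIGHTS[k] * float(v) for k, v in features.items())
    return 1.0 / (1.0 + math.exp(-log_odds))

patient = {
    "creatinine_rise": 0.2,   # below the single-number alert threshold
    "low_magnesium": True,
    "low_albumin": True,
    "recent_cisplatin": True,
    "recent_contrast": True,
    "on_ppi": True,
}

print(rule_based_alert(1.0, 1.2))     # False: the threshold rule stays silent
print(round(joint_risk(patient), 2))  # the combined picture scores far above baseline
```

The rule never fires for this patient because the creatinine rise alone is small. The joint score rises because the other exposures are counted alongside it.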

The seven terms worth knowing

Machine learning, or predictive models

Software that learns patterns from past cases instead of following hand-written rules, then forecasts what is likely to happen next from labs, vitals, medications, notes, comorbidities, and prior trajectory.

In the clinic: a model that flags which inpatients are likely to develop acute kidney injury in the next day or two, before it is clinically obvious.

Deep learning

A more powerful kind of machine learning, loosely inspired by layers of neurons, that is especially good at messy inputs like images, waveforms, and free text. It is the engine behind most of the impressive medical AI of the last decade.

In the clinic: the same technology that reads a chest film can be trained to count sclerotic glomeruli on a whole-slide kidney biopsy.

Computer vision

AI that reads images and pathology slides, measuring and classifying what is in them.

In the clinic: quantifying glomerulosclerosis, interstitial fibrosis, and tubular atrophy on a biopsy, or estimating fibrosis from a renal ultrasound, faster and more reproducibly than counting by eye.

Natural language processing (NLP)

AI that extracts meaning from clinical text, the notes, consults, and reports that are written in prose rather than stored as tidy numbers.

In the clinic: scanning a long chart to pull out the likely causes of an AKI, or finding which patients meet a clinical trial's eligibility criteria buried in their notes.

Generative AI, or large language models (LLMs)

AI that produces text: explanations, summaries, draft notes, and plans. A large language model is software trained on an enormous amount of writing so that it can read a prompt and generate fluent, relevant text in return. ChatGPT is simply the best-known example, a chat window where you type a question and it writes back in plain sentences.

In the clinic: drafting a chart summary before a consult, turning discharge instructions into plain language, or producing a first-pass differential for a clinician to check.

Multimodal AI

AI that combines several kinds of data at once: text, images, labs, genomics, waveforms, and pathology, rather than just one.

In the clinic: a tumor board tool or an onco-nephrology risk dashboard that reads the notes, the scans, the labs, and the slides together to assess one patient.

AlphaFold

A landmark AI system from Google DeepMind that predicts the three-dimensional shape of a protein from its amino acid sequence, in minutes and with remarkable accuracy. Protein shape determines protein function, and working it out used to take a laboratory months or years.

In the clinic and the lab: it accelerates drug discovery and our understanding of disease biology by making protein structures available almost on demand. See AlphaFold from DeepMind.

The five kinds of AI, side by side

Type of AI	What it does	Nephrology example
Predictive models	Forecasts an event from labs, vitals, medications, notes, comorbidities, and prior trajectories	AKI prediction, CKD progression, graft failure, dialysis hypotension
Computer vision	Reads images or pathology slides	Quantifying glomerulosclerosis, interstitial fibrosis, tubular atrophy, ultrasound fibrosis prediction
Natural language processing	Extracts meaning from clinical text	Finding AKI causes in notes, summarizing consults, identifying trial eligibility
Generative AI / LLMs	Produces text, explanations, summaries, draft plans	Chart summaries, patient instructions, prior-authorization drafts, literature synthesis
Multimodal AI	Combines text, images, labs, genomics, waveforms, and pathology	Tumor board tools, onco-nephrology risk dashboards, transplant virtual biopsy

If you would rather watch than read, this ten-minute explainer walks through the same ideas, AI versus machine learning versus deep learning versus generative AI, in plain words with no math.

IBM Technology · ~10 min · 2024 · AI, Machine Learning, Deep Learning and Generative AI Explained. A patient, whiteboard-style walk through the nested ideas, in plain language.

The old alert watched one number against one line. Modern AI reads the whole picture: creatinine, magnesium, albumin, cisplatin dose, the neutrophil trend, recent contrast, the PPI, all at once, the way a good consultant actually thinks.

For the clinician

The rule-based versus learned distinction is worth stating in its strongest form, because it predicts where these tools will and will not help. A rule-based alert encodes one expert hypothesis as a threshold; it is transparent but brittle, and it generates noise precisely because it ignores context. A learned model can weigh many features jointly and surface a trajectory, which is why the cisplatin example is the right teaching case: it is not magnesium alone, or albumin alone, or dose alone, but their conjunction against a baseline risk that matters.

The cost of that power is reduced explainability, which is why later chapters lean on interpretable methods that show which features drove a given prediction. Hold this in mind: the value of an AI model in nephrology is rarely the score itself. It is whether the model can tell you the why in terms you can act on.

Chapter 3

AI Across Medicine Today

Most of what people imagine when they hear "AI in medicine" is a chatbot. The reality already inside hospitals is quieter and more impressive. It reads images, contours tumors, sorts patients into trials, and predicts the three-dimensional shape of the proteins that drugs are designed to fit. Almost none of it is a chatbot.

This chapter walks through what is actually deployed today, starting with what regulators have cleared, then building toward the frontier where this genuinely is the future. If you watch one thing on this whole site, make it the talk below: a cardiologist and leading medical-AI voice, speaking to physicians, not technologists.

Eric Topol · TED · ~13 min · 2023 · Can AI Catch What Doctors Miss? A peer-level account of how AI reads scans as well as or better than experts, and can see patterns humans cannot.

Start with what is real, not what is loud

A natural first question from a clinician is: what has actually been approved? The answer is clarifying. A 2025 analysis in npj Digital Medicine catalogued 1,016 FDA authorizations of AI-enabled medical devices, covering 736 unique devices. Of those unique devices, 84.4% work on images. Another 14.5% work on signals such as ECG or other waveforms. Less than 1% touch genomic data, and only 0.4% work on the ordinary tabular data of the electronic record. Radiology is by far the dominant specialty.

The most telling finding: as of the study's December 2024 cutoff, there were no large language model devices in the authorized set at all. A "large language model," or LLM, is the kind of AI behind ChatGPT. It is the technology getting all the public attention. And it is precisely the technology that, so far, regulators have not cleared as a medical device.

This is the useful separation: the AI that is regulated, validated, and running in clinics today is overwhelmingly the AI that looks at pictures. The AI that dominates the headlines is barely in the room yet. Hold that distinction. It makes everything below easier to judge.


Imaging and pathology: where AI already practices

Because regulated medical AI is mostly image-based, the most mature clinical applications are exactly the ones built on images and slides. The National Cancer Institute describes AI as relevant across cancer mechanisms, screening, diagnosis, drug discovery, surveillance, and care delivery. But in real-world oncology, the largest share of deployed AI sits in diagnosis, radiology, and pathology. Concretely, that means:

Mammography, with AI flagging suspicious areas for the radiologist's attention.
Lung nodule detection on chest CT.
Prostate MRI interpretation.
Polyp detection during colonoscopy, where the system marks a polyp the moment it appears on screen.
Dermatology image classification, sorting skin lesions by appearance.
Digital pathology, where the slide is scanned and the software helps quantify what is on it.

The common thread is augmentation, not replacement. The AI offers a read. A physician owns the decision. This is the same posture you would expect from a sharp fellow: a second set of eyes that never tires, presenting findings for your judgment.

Radiotherapy: auto-contouring saves the time, not the responsibility

Radiation oncology offers one of the cleanest, most concrete examples of AI earning its place. Before a patient is treated, someone has to draw, slice by slice on the CT scan, the exact outline of the tumor and the exact outlines of the healthy organs nearby that must be spared: the kidneys, the spinal cord, the salivary glands, the optic nerves. This careful tracing is called contouring, or segmentation. It is skilled, it is repetitive, and it takes a long time.

"Auto-contouring" means an AI, trained on thousands of prior cases, draws those outlines first. The physician then reviews and corrects them rather than starting from a blank scan. A Mayo Clinic study, working with Google Health, trained a deep-learning model to auto-contour head and neck cases covering 42 organs at risk. It cut contouring time by 76%, and randomized review by radiation oncologists confirmed the time was saved without compromising contour accuracy.

76% less time spent contouring head-and-neck radiotherapy plans, with no loss of accuracy on review. (Mayo Clinic, with Google Health)

84.4% of unique FDA-authorized AI medical devices work on images. No LLM devices were authorized as of December 2024. (npj Digital Medicine, 2025)

This is the example to picture clearly. AI takes a defined, exhausting, high-skill task and does the first pass in a fraction of the time. The physician stays fully in charge of the result. That is the shape of nearly all the good news in this chapter.

Clinical trial matching: reading the fine print at scale

Matching a patient to the right clinical trial is harder than it sounds, and the reason is text. Trial eligibility criteria are long, intricate, and written in prose. A trial might exclude patients with a certain prior therapy, a certain lab value, a certain mutation, or a degree of kidney impairment. The patient's own relevant facts are scattered across notes, scans, and reports in equally messy language. Manually checking one patient against one trial's criteria, then repeating for dozens of trials, is slow and error-prone. Good matches get missed simply because no one had time to read everything.

TrialGPT, published in Nature Communications in 2024, used large language models to do this reading. It worked in three stages: retrieving candidate trials, checking the patient against each eligibility criterion, and ranking the trials by fit. Its rankings correlated strongly with the judgments of human experts, and in a user study it reduced screening time by 42.6%. The point is not that AI decides who enters a trial. A research nurse or oncologist still makes that call. The point is that AI can read all the fine print on every patient, consistently, and hand the clinician a short, sensible list to confirm.
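The three-stage shape described above (retrieve, check criterion by criterion, rank) can be sketched as follows. This is not TrialGPT's code. The real system uses large language models at each stage, while this toy version substitutes simple structured checks so that it runs on its own. The trial records, field names, and helper functions are all hypothetical.

```python
# Toy stand-in for a retrieve -> match -> rank pipeline.
# All trials, fields, and thresholds below are invented for illustration.

TRIALS = [
    {"id": "TRIAL-A", "condition": "lung cancer",
     "criteria": [("egfr_ml_min", ">=", 60), ("prior_cisplatin", "==", False)]},
    {"id": "TRIAL-B", "condition": "lung cancer",
     "criteria": [("egfr_ml_min", ">=", 30)]},
    {"id": "TRIAL-C", "condition": "breast cancer",
     "criteria": [("egfr_ml_min", ">=", 30)]},
]

def retrieve(patient, trials):
    """Stage 1: keep only trials for the patient's condition."""
    return [t for t in trials if t["condition"] == patient["condition"]]

def check_criterion(patient, criterion):
    """Stage 2: test one eligibility criterion against the patient's facts."""
    field, op, value = criterion
    actual = patient[field]
    return actual >= value if op == ">=" else actual == value

def rank(patient, trials):
    """Stage 3: order candidate trials by the fraction of criteria met."""
    scored = []
    for t in trials:
        met = sum(check_criterion(patient, c) for c in t["criteria"])
        scored.append((met / len(t["criteria"]), t["id"]))
    return sorted(scored, reverse=True)

patient = {"condition": "lung cancer", "egfr_ml_min": 45, "prior_cisplatin": False}
shortlist = rank(patient, retrieve(patient, TRIALS))
print(shortlist)  # [(1.0, 'TRIAL-B'), (0.5, 'TRIAL-A')]
```

The output is a short ranked list for a person to confirm, which mirrors the division of labor described above: the software reads every criterion, and the clinician makes the call.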

Drug discovery and AlphaFold: this is the part that feels like the future

If one development deserves the word "future," it is the AI prediction of protein structure. Here is why it matters, in plain terms. Proteins are the tiny machines that run the body. Each one folds into a specific three-dimensional shape, and that shape is what determines its job and how a drug can act on it. A drug molecule works by physically fitting against a protein, the way a key fits a lock. So if you want to design a drug, you badly want to know the exact shape of the protein you are targeting.

For decades, working out a single protein's shape in the laboratory could take a scientist months or years. AlphaFold, built by DeepMind, is an AI that predicts those shapes from the protein's sequence in minutes, at an accuracy that stunned the field. In roughly sixty years, scientists had determined about 150,000 protein structures by hand. AlphaFold has now predicted the structures of over 200 million proteins.

For medicine, this collapses one of the slowest steps in drug discovery. Researchers can see the shape of a target, understand how molecules might bind to it, and design candidate drugs far faster than before. The film below tells that story better than any text could.

Veritasium · ~25 min · 2025 · AlphaFold: The Most Useful Thing AI Has Ever Done. A clear, beautifully produced story of how AI learned to predict the shape of proteins, the molecules that run the body. The best single "the future is here" piece.

AI is being applied across the rest of the drug pipeline too: target discovery, screening large libraries of compounds, biomarker selection, and toxicity prediction, which means forecasting whether a candidate drug is likely to harm the body before it is ever given to a person.

That last category reaches directly into your field. Researchers are building AI and machine-learning models to predict drug-induced kidney injury from chemical, biological, and clinical data. In time, this could help flag, before a drug is approved or prescribed, which compounds are most likely to cause tubular injury, electrolyte disturbance, or progressive CKD. The renal toxicity that you have spent a career managing after the fact may increasingly be predicted before it happens.

For the clinician: the AlphaFold companion clip, and the database

If you want the story from the source, DeepMind's own eight-minute film, AlphaFold: The making of a scientific breakthrough, is the authoritative companion to the Veritasium piece. And the predicted structures are public: anyone can look up a protein in the free AlphaFold Protein Structure Database.

Google DeepMind · ~8 min · 2020 · AlphaFold: The making of a scientific breakthrough. The inside story, told by the team that solved a 50-year biology problem.

Generative AI for documentation: useful, but never unsupervised

The last category is the LLMs from Chapter 2, now applied to the part of medicine no clinician misses: paperwork. Large language models are being tested to draft clinical notes, summarize long hospitalizations before a consult, extract a creatinine baseline and an AKI timeline from a tangled chart, list a patient's nephrotoxic exposures, translate instructions into plain language for patients, and prepare cases for tumor board or nephrology conference. A 2025 study compared notes generated by ambient AI, software that listens to a visit and drafts the note, against physician-written notes across five specialties. The technology is moving into real workflows quickly.

But the honest caution belongs here, and a chief will want it stated plainly. A 2026 study in JAMA Network Open tested frontier language models on standardized clinical vignettes. The models did reasonably well at naming a final diagnosis or a management step, with failure rates usually under 0.40. They did poorly at the part that matters most to a diagnostician: building a differential. Failure rates for generating a differential diagnosis exceeded 0.80. The models tended to collapse prematurely onto a single answer instead of holding several possibilities open and refining between them, which is exactly the discipline a clinician uses when facing uncertainty.

The lesson is not that the tools are useless. It is that they are drafting tools and summarizing tools, to be used under a physician's review, not unsupervised reasoners that can be trusted with a differential or with what they do not know.

Chapter 4

AI in Nephrology

This is your field, so here it is plainly. AI is now being used across nephrology, not to make decisions but to watch for them. It predicts acute kidney injury before the creatinine climbs. It flags which CKD patients are headed for trouble. It reads dialysis sessions in real time. It helps weigh a donor kidney and quantify a biopsy slide. None of it replaces the nephrologist. All of it is meant to give the nephrologist a head start.

A note on one word you will see throughout. A "model" here just means a program that has been shown enough past cases that it learns to estimate what is likely to happen next. It does not reason the way you do. It recognizes patterns. When it works, it sees them sooner and across more variables than any person watching a chart.

Predicting acute kidney injury before it shows

The flagship example came in 2019, from DeepMind (Google's AI lab) working with the U.S. Veterans Affairs health system, published in Nature. They trained a model on a very large set of patient records to predict AKI before it was clinically obvious. This was "deep learning," which means the program was not given rules like "alert when creatinine rises by X." Instead it was shown the records of hundreds of thousands of patients and left to find, on its own, the combinations of labs, vitals, medications, and trends over time that tend to precede injury. Because it works across time, it can weigh a trajectory, not just a single value.

The result. The model predicted 55.8% of inpatient AKI episodes and 90.2% of the AKI episodes severe enough to later require dialysis, up to 48 hours in advance. That second number is the striking one. The catch is that it also cried wolf: roughly two false positives for every true positive.
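The false-positive ratio translates directly into a precision figure. The arithmetic below uses only the ratio reported above; the function name is an illustrative choice.

```python
def precision_from_ratio(false_positives_per_true_positive):
    """Precision (positive predictive value) = TP / (TP + FP)."""
    return 1.0 / (1.0 + false_positives_per_true_positive)

# Two false alerts for every true alert means one alert in three is real.
ppv = precision_from_ratio(2)
print(f"{ppv:.1%}")  # 33.3%
```

In other words, at the reported operating point roughly one alert in three corresponds to a real episode.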

90.2% of dialysis-requiring AKI episodes flagged up to 48 hours in advance, alongside two false positives for every true positive. (DeepMind / VA, Nature 2019)

One honest caveat, taken up in full in Chapter 7. Predicting AKI earlier is not the same as changing what happens to the patient. The randomized trials of AKI alerting have not shown that the prediction, by itself, improves progression, dialysis, or survival. The hard part was never the prediction. It was building an action around it that actually reduces harm.

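DeepMind's kidney work also reached the ward. Before the Nature paper, the same group built a bedside app, Streams, used by nurses and doctors at a London hospital to catch AKI. Streams alerted on the NHS's standard rule-based AKI detection algorithm rather than on the deep-learning model, but it remains a good look at what an AKI alert becomes when it reaches real care.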

Google DeepMind · ~3 min · 2017 · How the Streams app is used at the Royal Free. Nurses and doctors show the app alerting them to acute kidney injury at the bedside. Concrete proof the work became a real ward tool, not just a paper.

Spotting CKD progression before the clinic is overwhelmed

The same kind of model is being pointed at chronic kidney disease, where the question is not "will this patient be injured tonight" but "which of these patients is going to decline, and how soon." Fed the usual inputs, labs, demographics, comorbidities, medication history, the model can rank a population by risk.

The value here is organizational. Imagine a health system that can look across every patient with stage 3 or 4 CKD and surface the ones most likely to progress. Those are the patients you would want to move to the front of the line: a nephrology visit sooner, an evaluation for SGLT2 inhibitors, blood-pressure optimization, a medication review for nephrotoxins, transplant education while there is still time, vascular-access planning before it becomes urgent. The model does not decide any of this. It decides who gets your attention first.

A 2026 study in JAMIA on predicting end-stage renal disease outcomes made the point that matters most for this to be trustworthy. The models that hold up pull from many sources of data at once, they can show why they flagged a given patient rather than just producing a number, and they are built to guard against bias. That last point is not a footnote. CKD and ESRD outcomes are shaped heavily by race, access, and referral patterns, and a model trained on biased care will repeat it unless someone designs against that.

Reading the dialysis session in real time

Dialysis may be the most natural fit of all, because it throws off so much data: blood pressure, ultrafiltration rate, blood flow, dialysate, symptoms, interruptions, medication timing, lab trends, session after session. That density is exactly what these models feed on.

The current focus is predicting trouble mid-session. A study in Scientific Reports built an "explainable" deep-learning model to predict both intradialytic hypotension and intradialytic hypertension during the session, in time to act. "Explainable" is the operative word: rather than just sounding an alarm, the model can show which factors drove the warning, which is the difference between something a clinician trusts and something a clinician learns to ignore.
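What "show which factors drove the warning" means can be illustrated with the simplest possible case, a linear score, where each feature's contribution is its weight times how far the value sits from a reference. This is a sketch of the idea only. It is not the method of the study above, which used a deep-learning model, and the feature names, weights, and reference values are all hypothetical.

```python
# HYPOTHETICAL weights and reference values, for illustration only.
WEIGHTS = {"uf_rate_ml_kg_h": 0.15, "pre_sbp_mmhg": -0.02, "age_years": 0.01}
REFERENCE = {"uf_rate_ml_kg_h": 8.0, "pre_sbp_mmhg": 140.0, "age_years": 60.0}

def explain(session):
    """Return per-feature contributions to the score, largest magnitude first."""
    contributions = {
        name: WEIGHTS[name] * (session[name] - REFERENCE[name])
        for name in WEIGHTS
    }
    return sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Invented session: high ultrafiltration rate, low pre-dialysis systolic pressure.
session = {"uf_rate_ml_kg_h": 13.0, "pre_sbp_mmhg": 110.0, "age_years": 70.0}
for name, value in explain(session):
    print(f"{name:>16}: {value:+.2f}")
```

For this invented session the ultrafiltration rate contributes most, followed by the low starting pressure. A clinician can check a list like that against the patient in the chair, which a bare alarm does not allow.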

The broader idea has been called "precision dialysis," using this volume of data to personalize the prescription rather than run everyone on the same settings. The near-term uses are unglamorous and real: safer ultrafiltration targets, better dry-weight estimation, earlier nursing intervention before a patient crashes, detection of access problems, and even the operational side, staffing and chair scheduling.

Transplant: matching, allocation, and a "virtual biopsy"

In transplant, AI is being studied across the full arc: donor-recipient matching, graft survival, waitlist mortality, allocation policy, rejection risk, and biopsy interpretation. The recurring caveat in this literature is fairness and explainability, for the same reason as in CKD: allocation decisions carry equity consequences, so a model that cannot show its reasoning is hard to defend.

The most striking single result is a "virtual biopsy" system published in Nature Communications in 2024. The team assembled 14,032 day-zero biopsies from 17 centers and trained a model to estimate, from just 11 basic donor parameters, what that biopsy would likely show: the Banff lesions and the percentage of sclerotic glomeruli. They reported good discrimination, calibration, and robustness, and released a ready-to-use online tool for clinicians.

This does not retire the biopsy. Day-zero biopsies are invasive, costly, and can delay a transplant, and many centers do not perform them. The promise is a reliable estimate when a real biopsy is not available, and better judgment about which biopsies are most worth doing.

Pathology and imaging, made quantitative

Digital pathology is the clearest case where AI augments the nephrologist's eye. The biopsy is information-rich, but much of the work is laborious and subject to interobserver variability. A model can count and measure at scale, consistently. Deep-learning models have been built to quantify glomerulosclerosis on whole-slide biopsy images, and to estimate interstitial fibrosis and tubular atrophy, in some work even from ultrasound rather than tissue.

The key idea for you is that this is not only about diagnosis. It makes pathology quantitative and reproducible: a percent sclerosis, a fibrosis burden, a lesion map, a score that means the same thing across readers and can be compared to a cohort. Quantitative and reproducible is the door to prognostic, tying histologic pattern to outcome in a way that a qualitative read never quite could.

AI does not replace the nephrologist watching the chart. It is a second set of eyes that never blinks, never tires, and can hold every variable at once. The judgment, and the responsibility, stay with the physician.

For the clinician: reading the AKI model honestly

The DeepMind/VA model was trained on roughly 700,000 anonymized VA patient records. The 90.2% sensitivity for dialysis-requiring AKI at a 48-hour horizon is the headline, but read the precision honestly: about two false positives per true positive at the operating point reported. The architecture is a recurrent deep-learning model that ingests the longitudinal record rather than a snapshot, which is why it captures trajectory. The unresolved question is not discrimination. It is whether an alert this noisy can be wired into a workflow that changes management without alarm fatigue.

On the transplant virtual biopsy: from 11 donor parameters the ensemble predicts the Banff lesions and percent sclerotic glomeruli of the day-zero biopsy, with reported good calibration and an online application. Worth knowing: there is a published critique arguing the tool could increase organ discards at aggressive centers, with a reply from the authors. The point for discussion is whether a virtual estimate should ever influence an accept-or-discard decision, or only inform which real biopsies to prioritize.

Chapter 5 · The heart of it

Onco-Nephrology

This is your field, so this chapter speaks to it directly. Cancer patients accumulate renal risk faster than almost any other population a nephrologist sees. The malignancy itself, cisplatin, the immune checkpoint inhibitors, antimicrobials, contrast, obstruction, tumor lysis, volume depletion, sepsis, surgery, radiation, and a baseline of comorbid CKD can all land on the same kidney in the same week. You have spent a career holding all of that in your head at once.

The reason AI fits here is not that it reasons better than you do. It is that these cases are data dense, time sensitive, and full of competing explanations, and the chart does not assemble itself. The honest near-term promise is narrow and worth stating plainly. The useful output is not a generic alert that creatinine rose. It is a ranked differential, the same list you would build, surfaced earlier and with the supporting evidence already gathered.


AKI in cancer patients

A single oncology patient may carry metastatic disease, CKD, recent cisplatin, an immune checkpoint inhibitor, IV contrast from staging scans, broad antibiotics, a proton pump inhibitor, hypoalbuminemia, magnesium wasting, intermittent sepsis, and a creeping obstruction. Any one of those explains a rising creatinine. Several of them together change the answer.

A 2018 PLOS One study addressed AKI prediction in cancer patients specifically, and the phrase its authors used matters: heterogeneous and irregular clinical data. Cancer care does not generate clean, evenly spaced numbers. Patients cycle through infusion visits, admissions, surgeries, scans, antibiotics, and changing regimens, so the data arrive in bursts and gaps. The interesting part of that work was building something that tolerates the mess your patients actually produce.
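One small example of tolerating irregular data: a creatinine trend can be estimated by least squares using the actual measurement times, rather than assuming evenly spaced draws. This is a generic technique, not the method of the study above, and the measurements are invented.

```python
def slope_per_day(times_days, values):
    """Ordinary least-squares slope for irregularly spaced measurements."""
    n = len(times_days)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean_t = sum(times_days) / n
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(times_days, values))
    den = sum((t - mean_t) ** 2 for t in times_days)
    if den == 0:
        raise ValueError("all measurements share the same timestamp")
    return num / den

# Invented creatinine values (mg/dL): a burst of draws, a gap, then one more.
days = [0.0, 0.5, 1.0, 9.0]
creatinine = [1.0, 1.05, 1.1, 1.9]
print(round(slope_per_day(days, creatinine), 3))  # 0.1
```

Because the fit uses real timestamps, three draws in one day and a single draw eight days later are each weighted by where they fall in time, not by their position in the list.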

The application that follows is a cancer-center AKI risk dashboard. Picture a single screen that watches creatinine slope, eGFR, urine findings, volume proxies, medication exposures, chemotherapy timing, contrast, sepsis markers, and obstruction risk. Its output is not the word "AKI." It is a ranked differential: cisplatin toxicity, ICI nephritis, prerenal azotemia, ATN, obstruction, tumor lysis, myeloma-related disease, drug-induced AIN. The same list you would write, assembled before the consult.

Cisplatin-associated AKI

This is the most concrete example in the chapter, because it is a clean, externally validated risk score rather than a black box. A 2024 BMJ cohort study (open-access full text) derived and externally validated a simple risk score for severe AKI after a first dose of intravenous cisplatin, in adults treated between 2006 and 2022. The derivation cohort held 11,766 patients and the external validation cohort held 12,951. Severe cisplatin-associated AKI occurred in 5.2% of the derivation cohort and 3.3% of the validation cohort. The predictors will look familiar to you: age, hypertension, diabetes, baseline creatinine, hemoglobin, white blood cell count, platelets, albumin, magnesium, and cisplatin dose.

The model's C-statistic was 0.75. The C-statistic measures a model's ability to rank a higher-risk patient above a lower-risk one, where 0.5 is a coin flip and 1.0 is perfect. Earlier models sat around 0.60 to 0.68. The plain way to describe what this does: it turns decades of clinical intuition into a consistent, risk-stratified workflow, then applies it before the injury rather than after. Before the infusion, the system estimates severe AKI risk and lets the team standardize hydration and magnesium, route high-risk patients to nephrology, weigh alternatives where appropriate, and set sensible post-infusion lab timing.
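To make the C-statistic concrete, here is a minimal Python sketch of the pairwise idea described above: take every pair made of one patient who had the event and one who did not, and count how often the model gave the higher risk to the patient with the event, with ties counting as half. The risks and outcomes below are invented for illustration. They are not data from the BMJ study.

```python
from itertools import product

def c_statistic(risks, outcomes):
    """Fraction of (event, non-event) pairs ranked correctly; ties count 0.5."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for e, n in product(events, non_events):
        if e > n:
            score += 1.0   # event patient correctly ranked higher
        elif e == n:
            score += 0.5   # tie: half credit
    return score / (len(events) * len(non_events))

# Hypothetical predicted risks and observed outcomes (1 = severe AKI).
risks = [0.02, 0.04, 0.10, 0.03, 0.15, 0.08]
outcomes = [0, 1, 1, 0, 0, 0]
print(c_statistic(risks, outcomes))
```

A value near 0.5 means the ranking is no better than chance. The statistic says nothing about whether the predicted probabilities themselves are well calibrated, which is a separate question.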

0.75 C-statistic for the cisplatin AKI risk score, up from 0.60 to 0.68 in earlier models. Derived and validated across 24,717 patients. (BMJ cohort study, 2024)

Immune checkpoint inhibitor kidney injury

The checkpoint inhibitors opened a new renal complication space, and it is exactly the kind of problem where ranking beats alerting. The American Society of Onco-Nephrology position statement frames the spectrum: ICIs cause a range of kidney immune-related adverse events, most commonly acute tubulointerstitial nephritis, but also glomerular disease and electrolyte disturbances. The clinical question is never "did creatinine rise." It is whether this is ICI nephritis, ATN, prerenal AKI, obstruction, infection-related AKI, PPI-associated AIN, NSAID toxicity, or something else. That distinction drives everything that matters next: steroids, holding the cancer therapy, rechallenge, biopsy, and prognosis.

A 2024 PLOS One study used interpretable machine learning on this exact question. "Interpretable" means the model shows which factors drove its answer instead of producing a number with no reasoning attached. Working from records on 616 ICI-treated patients, it predicted AKI within seven days. It then surfaced patient-specific drivers with two methods: SHAP, which assigns each factor a contribution to that specific patient's risk, and clustering, which groups similar patients together. That is the feature that earns a nephrologist's trust: not just a probability, but the why behind it.

A 2024 meta-analysis put numbers to the epidemiology. All-cause AKI occurred in 7.4% of ICI-treated patients and ICI-related AKI in 3.2%, with associations between AKI and concomitant PPIs and NSAIDs. The future use case follows directly: help decide which ICI-treated patients need urgent biopsy, which can be managed by holding nephrotoxins and rechecking labs, and which are high risk for true immune-mediated nephritis, and then support safer rechallenge decisions afterward.

Myeloma, monoclonal gammopathy, and MGRS

Plasma-cell and monoclonal gammopathy disorders are another high-value area, because the decision is usually about whether and when to biopsy. The International Myeloma Working Group recommends a structured renal evaluation: creatinine, eGFR, free light chains, 24-hour urine protein, electrophoresis, immunofixation, and biopsy in selected cases such as significant albuminuria or lower free-light-chain levels. The Mayo MGRS prediction tool then estimates the probability that a kidney biopsy will actually show an MGRS lesion. At a probability threshold of 0.10, it was 98.9% sensitive and 50.5% specific. At a threshold of 0.25, it was 88.0% sensitive and 70.2% specific. The lower threshold is the one you would use to avoid missing a lesion. The higher one trims unnecessary procedures.

The decision this supports is the one you have made many times: do the kidney findings reflect a monoclonal process that needs hematologic treatment, or a coincidental MGUS plus unrelated CKD, or diabetic kidney disease, or amyloid, or cast nephropathy, or light-chain deposition disease. A triage tool helps avoid both underdiagnosis and an invasive procedure in a thrombocytopenic patient who did not need it.

The rest of the territory

The same logic extends across the rest of the territory you know:

Tumor lysis syndrome: predicting which patients need aggressive prophylaxis, rasburicase, closer electrolyte monitoring, or admission.
Obstructive nephropathy: combining imaging reports, hydronephrosis history, creatinine trajectory, tumor location, and symptoms to flag who needs ultrasound, CT review, a stent, or nephrostomy.
VEGF inhibitor and TKI toxicity: detecting early proteinuria, hypertension, thrombotic microangiopathy signals, and risk of progressive CKD.
CAR-T and cytokine-release syndromes: predicting AKI risk from inflammatory markers, hemodynamics, nephrotoxins, and ICU-level events.
Pediatric oncology AKI: commonly driven by nephrotoxic drugs, but also by tumor infiltration or compression, chemotherapy, radiotherapy, immunotherapy, surgery, dehydration, and infection.

For the clinician: why cancer data breaks ordinary AKI models

General inpatient AKI models assume a fairly regular stream of inpatient labs and vitals. Oncology violates that assumption. Data arrive in bursts at infusion visits, with gaps between cycles, dense stretches during admissions, and quiet outpatient intervals. The 2018 PLOS One work was notable precisely because it set out to handle heterogeneous and irregular clinical data rather than pretend the timeline is even. For a chief evaluating one of these tools, the first question is not the headline C-statistic. It is whether the model was trained on the kind of irregular, multi-setting data your patients actually generate, or on a cleaner inpatient cohort that will not transfer.
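One small illustration of what "tolerating irregular data" can mean in practice: a creatinine slope computed against real elapsed time rather than against the order of the draws. The sketch below is an ordinary least-squares slope in Python. The function name and the lab values are hypothetical, and this is not the method used in the 2018 PLOS One study.

```python
from datetime import datetime

def creatinine_slope_per_day(measurements):
    """Least-squares slope (mg/dL per day) using real elapsed time between draws."""
    if len(measurements) < 2:
        raise ValueError("need at least two measurements")
    t0 = measurements[0][0]
    days = [(t - t0).total_seconds() / 86400 for t, _ in measurements]
    values = [v for _, v in measurements]
    mean_t = sum(days) / len(days)
    mean_v = sum(values) / len(values)
    sxx = sum((d - mean_t) ** 2 for d in days)
    if sxx == 0:
        raise ValueError("all measurements share one timestamp")
    sxy = sum((d - mean_t) * (v - mean_v) for d, v in zip(days, values))
    return sxy / sxx

# Hypothetical series: sparse outpatient draws, then daily inpatient draws.
labs = [
    (datetime(2024, 3, 1), 1.0),   # infusion visit
    (datetime(2024, 3, 22), 1.1),  # next cycle, three weeks later
    (datetime(2024, 3, 24), 1.3),  # admission: daily draws begin
    (datetime(2024, 3, 25), 1.5),
]
print(creatinine_slope_per_day(labs))
```

A slope computed per measurement instead of per day would treat the three-week gap exactly like the one-day gap, which is the kind of hidden assumption that does not survive contact with oncology data.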

For the clinician: the ICI differential is the whole game

You already know that "AKI on a checkpoint inhibitor" is not a diagnosis. The ASON statement catalogs the breadth: ATIN most commonly, but glomerular disease and electrolyte disturbances too. The value of an interpretable model here is that it argues a position. Consider the reasoning an explainable system can produce: AKI temporally near ICI cycle four, but stronger support for volume depletion plus piperacillin-tazobactam, no eosinophilia, no pyuria, no PPI exposure, no new proteinuria, so repeat labs and sediment before steroids.

That is a defensible nephrology argument, and it is exactly what a SHAP-based model is built to expose, because SHAP attributes the prediction to specific features rather than hiding them. The concomitant-medication signal from the meta-analysis, PPIs and NSAIDs raising the odds, is the kind of cleanable contributor a model will reliably catch and a busy service will reliably miss. The endpoint is a sharper biopsy-versus-steroids-versus-medication-cleanup decision, and safer rechallenge.

For the clinician: using the MGRS tool to defend a biopsy decision either way

The two thresholds are doing two different jobs. At 0.10 the tool is a rule-out: 98.9% sensitivity means a patient below it is very unlikely to be harboring an MGRS lesion, which helps you defer biopsy in someone with bleeding risk or thrombocytopenia. At 0.25 the tool trades some sensitivity (88.0%) for better specificity (70.2%), so a patient above it is a stronger candidate for proceeding, although specificity at that level is modest and supports the decision rather than settling it. The differential it sits inside is the familiar one: a monoclonal process needing hematologic treatment, versus coincidental MGUS with unrelated CKD, versus diabetic kidney disease, amyloid, cast nephropathy, or LCDD. The tool does not read the biopsy. It tells you how likely the biopsy is to change management before you take the risk.
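How much reassurance a result below the threshold gives depends on how common MGRS lesions are among the patients being considered, which is not stated here. The Python sketch below applies Bayes' rule to the sensitivity and specificity reported for the two thresholds. The prevalence of 0.30 is a placeholder assumption chosen only to make the arithmetic visible, and should be replaced with a figure from your own biopsy population.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from Bayes' rule for a binary test."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)

ASSUMED_PREVALENCE = 0.30  # hypothetical, not from the MGRS tool's publication

# (threshold, sensitivity, specificity) as quoted in the text above.
reported = [(0.10, 0.989, 0.505), (0.25, 0.880, 0.702)]
for threshold, sens, spec in reported:
    ppv, npv = predictive_values(sens, spec, ASSUMED_PREVALENCE)
    print(f"threshold {threshold:.2f}: PPV {ppv:.3f}, NPV {npv:.3f}")
```

Rerunning this with a lower or higher assumed prevalence shows how the same threshold can be more or less reassuring in different referral populations.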

Chapter 6

The Vision: A Renal Command Center

Picture a system that watches every cancer patient's kidneys the way a careful chief watches the whole service. Not one number at a time, but all of them together, all the time. It does not wait for the creatinine to jump and then page someone. It notices the trajectory early, names the most likely cause, and tells the team the next useful thing to do. That is the vision: not a robot doctor, but a renal command center sitting quietly inside the cancer center's records.

For every active cancer patient, the command center would continuously estimate:

the risk of acute kidney injury in the next 24 to 72 hours
the risk of severe AKI after cisplatin or another kidney-toxic therapy
the probability that a rise in creatinine is immune checkpoint inhibitor nephritis rather than some other cause
the probability of obstruction
the risk of tumor lysis
the risk of CKD progression after cancer therapy
the risk of needing dialysis during a hospitalization
the renal survivorship risk after cure or remission

The point that matters most to a clinician is the next one. The system would not merely alert. It would suggest the next useful action. Depending on the patient, that might be: repeat the urinalysis, check the urine microscopy, stop the PPI or the NSAID, adjust antimicrobial dosing, order a renal ultrasound, modify the cisplatin hydration, consult nephrology, consider a biopsy, or sit down and discuss modifying the cancer therapy. An alert tells you something happened. This tells you what to do about it.

The differential engine

The most interesting part is not the alert. It is the reasoning. In a cancer patient, a rise in creatinine has a long list of suspects: the cancer itself, cisplatin, a checkpoint inhibitor, antibiotics, contrast, obstruction, tumor lysis, volume depletion, sepsis, a PPI, an NSAID, and underlying CKD. A human consultant can reason through all of that. The records do not make it easy. The differential engine does the hunting first and hands you a ranked list of likely causes with the evidence for each, so you start your thinking where the data already points. The two passages below are exactly the kind of output it would produce.

"This patient's AKI is temporally close to ICI cycle 4, but the model finds stronger support for volume depletion plus vancomycin/piperacillin-tazobactam exposure. No eosinophilia, no pyuria, no PPI exposure, no new proteinuria. Consider repeat labs and urine sediment before steroids."

An example of differential-engine output

"Creatinine rise after cisplatin cycle 2, new hypomagnesemia, low albumin, high baseline risk score, no obstruction on recent imaging. High likelihood of cisplatin-associated tubular injury."

A second example of differential-engine output
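The shape of that output can be sketched as a simple data structure: each candidate cause carries a score plus the findings for and against it, and the list is sorted by score. The Python below is only a sketch of that shape. The class name, the field names, and every score are hypothetical, and how a real system would compute the scores is not specified here.

```python
from dataclasses import dataclass, field

@dataclass
class Candidate:
    cause: str
    score: float  # hypothetical relative support, 0 to 1
    supporting: list = field(default_factory=list)
    against: list = field(default_factory=list)

def rank_differential(candidates):
    """Return candidates ordered from most to least supported."""
    return sorted(candidates, key=lambda c: c.score, reverse=True)

# Findings taken from the first example output; scores are invented.
candidates = [
    Candidate("ICI nephritis", 0.20,
              supporting=["AKI temporally close to ICI cycle 4"],
              against=["no eosinophilia", "no pyuria",
                       "no PPI exposure", "no new proteinuria"]),
    Candidate("Volume depletion plus vancomycin/piperacillin-tazobactam", 0.55,
              supporting=["volume depletion",
                          "vancomycin/piperacillin-tazobactam exposure"]),
    Candidate("Obstruction", 0.05),
]

for c in rank_differential(candidates):
    print(f"{c.score:.2f}  {c.cause}")
    for item in c.supporting:
        print(f"      for:     {item}")
    for item in c.against:
        print(f"      against: {item}")
```

Keeping the evidence attached to each candidate is the design choice that matters: it lets the consultant disagree with a specific line rather than with an opaque number.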

Biopsy decision support

In cancer patients, a biopsy is not a free decision. Bleeding risk, low platelets, prognosis, and the urgency of starting treatment all weigh on it. The command center would help estimate two things that drive the call: how likely the biopsy is to yield a clear diagnosis, and how likely that diagnosis is to change management. That turns biopsy from a reflex into a weighed choice. The questions it would help with are familiar ones: in checkpoint inhibitor AKI, biopsy versus empiric steroids; in suspected monoclonal gammopathy of renal significance, biopsy versus observation; in nephrotic syndrome in a cancer patient, sorting membranous nephropathy from amyloid, minimal change, thrombotic microangiopathy, or diabetic kidney disease; in a transplant recipient who develops cancer, rejection versus BK versus drug toxicity versus recurrent disease.

Cancer-therapy planning with renal tradeoffs built in

Many oncology decisions carry a kidney tradeoff that is currently weighed in someone's head, if at all. The command center would put the patient-specific numbers on the table before the choice is made: cisplatin versus carboplatin, checkpoint inhibitor rechallenge versus discontinuation, contrast-enhanced imaging now versus an alternative or prophylaxis, the choice among kidney-toxic antibiotics, continuing a VEGF inhibitor in the face of new proteinuria, and dose adjustment as the eGFR moves. The goal here is worth stating plainly. This is shared decision-making with quantified risk. It is not the computer deciding. The numbers inform the conversation between oncologist, nephrologist, and patient. The judgment stays with the physicians.

Survivorship nephrology

Cancer survivors now live long enough that CKD, hypertension, proteinuria, and cardiovascular risk catch up with them. The command center would flag survivors who need long-term kidney follow-up after nephrectomy, cisplatin, ifosfamide, radiation, stem-cell transplant, CAR-T complications, or recurrent AKI. Survivorship care today is scattered across oncology, primary care, nephrology, and cardiology, and patients fall through the gaps. A system that keeps watching after remission is one of the most valuable things AI could do here, precisely because no single clinician currently owns that follow-through.

For the clinician: why this is achievable, not science fiction

The architecture here is deliberately unglamorous. It does not require a single breakthrough model. It requires connecting models that already exist or are within reach: AKI prediction, the cisplatin risk score, the explainable checkpoint inhibitor AKI work, the MGRS biopsy tool, obstruction signals from imaging reports, and continuous CKD-progression estimates, all surfaced in one place at the point of care. The value is in the integration and the connection to action, not in any one algorithm.

Chapter 7

The Cautions

Here is the part a chief will respect most, because it is the part the brochures leave out. A prediction that is accurate is not the same as a patient who does better. The honest question for every one of these systems is not "is the model clever," but "what would have to be true for this to actually help patients." This chapter is built around that question.

Prediction is not outcome improvement

This is the strongest caution, so it goes first. We can already predict acute kidney injury earlier than a physician might notice it. That has been shown. The trouble is what happens next. When you actually test the alerts in a controlled way, the patients often do no better.

A randomized trial published in BMJ tested electronic AKI alerts head to head against usual care. The primary outcome, a composite of AKI progression, dialysis, or death, occurred in 21.3% of the alert group and 20.9% of the usual-care group. That difference is not meaningful. The alert did not reduce harm. It is worth sitting with that: the technology worked, the prediction fired, and the patients were no better off.

A 2024 trial in JAMA went a step further. It tested a "kidney action team" that delivered early, individualized recommendations to the treating clinicians. The intervention succeeded at getting recommendations delivered and acted on more often. But it still did not significantly reduce the composite outcome of worsening AKI stage, dialysis, or mortality. Better delivery of advice, no measurable change in what happened to patients.

The lesson is the one any chief already knows from a career of running a service. A signal only helps if it is specific, if it is trusted, if it leads to a clear action, and if someone owns that action. A prediction with no owner and no plan is just more noise on a busy floor.

Cancer patients are distribution-shift machines

"Distribution shift" is a technical phrase, so here it is in plain English. A model learns from the patients it was trained on. If the world then changes, the model can quietly go stale without anyone noticing, because it is still confidently giving answers based on a world that no longer exists.

Cancer care changes faster than almost any field in medicine. New immunotherapies, antibody-drug conjugates, CAR-T products, bispecifics, targeted therapies, supportive-care protocols, and trial regimens arrive constantly. A model trained on last year's therapies may not understand this year's. And a model built at one cancer center may fail at another, because the patient mix, the formulary, the lab timing, the imaging frequency, and the admission practices are all different. A tool that is excellent in one place can be unreliable across the street. This is not a reason to avoid the tools. It is a reason to keep checking them against reality, the same way you would re-credential a clinician.
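One simple way to keep checking a model against reality is to compare, period by period, the number of events the model expected with the number that actually occurred. The Python sketch below computes that observed-to-expected ratio and flags periods where it strays too far from 1. The function names, the tolerance of 0.25, and the data are all assumptions for illustration. A real monitoring program would look at more than this one number.

```python
def observed_to_expected(predicted_risks, outcomes):
    """Ratio of observed events to the events the model expected."""
    expected = sum(predicted_risks)
    if expected == 0:
        raise ValueError("model expected zero events")
    return sum(outcomes) / expected

def flag_drift(periods, tolerance=0.25):
    """Return names of periods whose O/E ratio differs from 1 by more than tolerance."""
    flagged = []
    for name, risks, outcomes in periods:
        if abs(observed_to_expected(risks, outcomes) - 1.0) > tolerance:
            flagged.append(name)
    return flagged

# Hypothetical data: the same predicted risks, but more events in the later period.
periods = [
    ("2023", [0.1] * 20, [1, 1] + [0] * 18),   # about 2 expected, 2 observed
    ("2024", [0.1] * 20, [1] * 5 + [0] * 15),  # about 2 expected, 5 observed
]
print(flag_drift(periods))
```

A ratio well above 1 means the model is underestimating risk in the current population, which is what a stale model looks like after a new therapy or a change in case mix.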

Explainability matters more in nephrology than in most fields

A nephrologist does not just want a score. A nephrologist wants the why. Is the model worried because of the creatinine slope, the albumin, the cisplatin dose, an NSAID, the urine protein, hydronephrosis, hypotension, or a vancomycin level? Without the why, the number is unusable, because you cannot act on a cause you have not been shown.

Some modern models can show their reasoning. There is a technique, which you will see referred to as SHAP, that does one simple thing: for each prediction, it shows which findings pushed the risk up and which pushed it down, for that specific patient. So instead of "high risk," you get "high risk, driven mostly by the cisplatin dose and the falling magnesium." That is a model a clinician can argue with, and a model you can argue with is one you can trust. The explainable checkpoint inhibitor AKI work is built exactly this way.
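For intuition, the idea has a closed form in the simplest case. For a linear risk score with features treated as independent, each feature's contribution is its coefficient times the difference between this patient's value and the average value. The Python sketch below shows only that special case. The coefficients, averages, and patient values are invented and do not come from any published model. Real SHAP analyses on non-linear models need a dedicated library and a trained model.

```python
def linear_contributions(coefficients, baseline_means, patient):
    """Per-feature push on the score relative to the average patient."""
    return {
        name: coefficients[name] * (patient[name] - baseline_means[name])
        for name in coefficients
    }

# All numbers below are hypothetical.
coefficients = {"cisplatin_dose_mg_m2": 0.02, "magnesium_mg_dl": -1.5, "albumin_g_dl": -0.8}
baseline_means = {"cisplatin_dose_mg_m2": 70.0, "magnesium_mg_dl": 2.0, "albumin_g_dl": 4.0}
patient = {"cisplatin_dose_mg_m2": 100.0, "magnesium_mg_dl": 1.5, "albumin_g_dl": 4.0}

contribs = linear_contributions(coefficients, baseline_means, patient)
# List the largest drivers first, whichever direction they push.
for name, value in sorted(contribs.items(), key=lambda kv: abs(kv[1]), reverse=True):
    direction = "raises" if value > 0 else "lowers" if value < 0 else "does not change"
    print(f"{name}: {direction} the score by {abs(value):.2f}")
```

The contributions add up to the gap between this patient's score and the average patient's score, which is the property that lets a clinician read them as a complete account of the prediction.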

Bias and access

Kidney disease and cancer outcomes are not shaped by biology alone. CKD, end-stage renal disease, transplantation, and cancer outcomes are all shaped by race, socioeconomic status, referral patterns, insurance, geography, and access to specialty care. AI sits on top of all of that, and it can go either way. Used well, it can reduce disparities by catching missed risk earlier in patients who would otherwise be overlooked. Used carelessly, it can amplify disparities by faithfully learning the inequities baked into historical care and then repeating them at scale. Which one happens is a design choice, not an accident.
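Making that design choice visible starts with measuring performance separately for each patient group instead of reporting a single overall figure. The Python sketch below computes, per group, the share of true events the model flagged. The function name, the group labels, and the records are hypothetical. Which groups to audit and which metric matters are decisions for the institution.

```python
from collections import defaultdict

def sensitivity_by_group(records):
    """records: iterable of (group, flagged_by_model, had_event).

    Returns {group: share of that group's events the model flagged}.
    Groups with no events are omitted.
    """
    caught = defaultdict(int)
    events = defaultdict(int)
    for group, flagged, had_event in records:
        if had_event:
            events[group] += 1
            if flagged:
                caught[group] += 1
    return {g: caught[g] / events[g] for g in events}

# Hypothetical audit data.
records = [
    ("A", True, True), ("A", True, True), ("A", True, True), ("A", False, True),
    ("A", False, False),
    ("B", True, True), ("B", False, True), ("B", False, True), ("B", False, True),
    ("B", True, False),
]
print(sensitivity_by_group(records))
```

A large gap between groups is a prompt to ask whether the training data under-recorded risk in one of them, before the tool repeats that pattern at scale.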

LLM hallucination

One more term, in plain English. A large language model, the kind of system behind tools like ChatGPT, generates fluent text by predicting likely words. It can sound completely confident and still be wrong. It does not know when it does not know. In clinical medicine, confident-and-wrong is dangerous. So the honest position on this generation of generative AI is narrow and specific: the best current use is drafting and summarizing under physician review, not unsupervised diagnosis or treatment. Let it write the first draft of the consult note or the chart summary. Do not let it have the last word.

The filter to apply to everything on this site: What is the cause it identifies? What action does it recommend? Who owns that action? Will it reduce harm, or just add to the alert burden? The randomized trials above are the cautionary baseline: two well-run studies where the prediction or the recommendation worked and the outcome still did not move. Any new tool has to clear that bar, not just beat a curve on paper.

Chapter 8

Three Examples to Discuss

The best way into this is not a lecture. It is three cases and three questions. Each one is real and each one is chosen to be argued about. The questions are the point.

01 AKI prediction

A landmark 2019 study from DeepMind and the U.S. Veterans Affairs system trained a deep-learning model on large-scale records to predict acute kidney injury before it was clinically obvious. The model reportedly predicted 55.8% of inpatient AKI episodes and 90.2% of the dialysis-requiring episodes up to 48 hours in advance. It also produced false alarms, roughly two false positives for every true positive. So the seeing-ahead is real. But then come the randomized alert trials from Chapter 7: the BMJ trial found 21.3% in the alert group versus 20.9% in usual care, no meaningful benefit, and the 2024 JAMA kidney action team trial improved how often recommendations were delivered but still did not move the composite outcome. Prediction, proven. Better outcomes, not yet.

Ask yourself

What action would have to follow an AKI prediction for this to actually change outcomes?

02 Cisplatin AKI risk

A 2024 BMJ cohort study derived and externally validated a simple risk score for severe AKI after intravenous cisplatin. It used a multicenter cohort of adults receiving their first IV cisplatin, with 11,766 patients in the derivation cohort and 12,951 in external validation. Severe cisplatin-associated AKI occurred in 5.2% of the derivation cohort and 3.3% of the validation cohort. The predictors are all things you already check: age, hypertension, diabetes, creatinine, hemoglobin, white blood cell count, platelets, albumin, magnesium, and cisplatin dose. The model reached a C-statistic of 0.75, better than earlier models in the range of 0.60 to 0.68. This is not exotic. It is decades of clinical intuition turned into a consistent, pre-chemo risk stratification, applied before the injury rather than after.

Ask yourself

Would this have changed which patients you wanted to see before chemotherapy?

03 ICI nephritis versus competing etiologies

Immune checkpoint inhibitors created a new renal complication space. The American Society of Onco-Nephrology position statement notes they can cause a spectrum of kidney immune-related adverse events, most commonly acute tubulointerstitial nephritis, but also glomerular disease and electrolyte disturbances. The hard clinical question is not "did the creatinine rise." It is whether this is checkpoint inhibitor nephritis, ATN, prerenal AKI, obstruction, infection-related AKI, PPI-associated interstitial nephritis, or NSAID toxicity, because the answer changes everything downstream: steroids, holding cancer therapy, rechallenge, biopsy, and prognosis. A 2024 PLOS One study used interpretable machine learning on 616 checkpoint-inhibitor-treated patients to predict AKI within seven days, and used SHAP-based explanations to show the patient-specific drivers behind each prediction. A 2024 meta-analysis found all-cause AKI in 7.4% of treated patients and checkpoint-inhibitor-related AKI in 3.2%, with PPIs and NSAIDs associated with higher odds of both.

Ask yourself

Could AI help decide who needs biopsy versus empiric steroids versus simple medication cleanup?

Why these three, in this order

The three cards are arranged on purpose. Card 1 is the humbling one: prediction without action does not help. Card 2 is the closest to immediate use: a plain risk score that could change a pre-chemo clinic. Card 3 is the intellectually richest: a model that does not just flag a number but helps untangle a high-stakes differential. Together they trace the real arc of where this is useful and where it is not.

Chapter 9

The Bottom Line

So here is where it lands. The honest summary is not "AI is coming for the nephrologist." It is something quieter and, I think, more hopeful.

AI is becoming a second layer of clinical attention. It is good at watching many variables at once, noticing risk earlier, quantifying images and slides, summarizing messy records, and standardizing the repetitive decisions. It is not yet good enough to replace judgment, to explain the tradeoffs, or to own the consequences of treatment. Those still belong to the physician, and the cautions chapter is the reason why.

In onco-nephrology the opportunity is unusually strong, because kidney injury in cancer is exactly the kind of problem AI suits: high stakes, data-rich, and causally tangled. The killer application is almost certainly not a generic AI doctor. It is a cancer-center renal intelligence system that tracks AKI risk, explains the likely causes, anticipates nephrotoxicity, helps triage which patients need a biopsy, supports the therapy decisions, and follows survivors for CKD long after the cancer is treated. Every piece of that is something this site has shown is either already real or within reach. The work is putting them in one place, connected to action, under a clinician's hand.

AI is not replacing the nephrologist's mind. It is replacing the parts of the hospital that currently force that mind to hunt through fragmented data, delayed signals, and incomplete histories.

Chapter 10

Go Deeper: Sources & Library

Everything on this site is built on published work. Here it is, organized so you can read as deeply as you like. Papers marked Open access are free to read in full; many are also saved here as a PDF you can open directly. The videos are gathered at the end.

Tools you can open

FDA AI-Enabled Medical Devices list (Tool). Searchable, downloadable. Radiology-dominant.
AlphaFold Protein Structure Database (Tool). Look up the predicted shape of almost any protein. Free.

A note on the full briefing. This site is drawn from a longer written briefing, AI in nephrology, onco-nephrology, cancer care, and medicine, kept alongside these pages.
