Despite AI Advancements, Human Oversight Remains Essential

The introduction of artificial intelligence-driven technology into health care continues at a brisk pace, with sometimes uneven results.

Researchers at the Icahn School of Medicine at Mount Sinai wanted to know how state-of-the-art artificial intelligence systems known as large language models (LLMs) would perform as medical coders, a precise, laborious and time-consuming task that is mostly still performed by humans.

Their study, published in the April 19 online issue of NEJM AI, emphasizes the need to refine and validate these technologies before they are considered for clinical implementation.

The study cross-checked more than 27,000 unique diagnosis and procedure codes originally produced by human medical coders against the codes generated by LLMs from OpenAI, Google, and Meta when those models were fed the same medical data.

The generated codes were compared with the original codes, and the errors were analyzed for patterns. All of the LLMs studied showed limited accuracy (below 50 percent) in reproducing the original medical codes, highlighting a significant gap in their usefulness for medical coding.

Co-senior author Girish Nadkarni, M.D., said, “This study sheds light on the current capabilities and challenges of AI in health care, emphasizing the need for careful consideration and additional refinement prior to widespread adoption.”
