How well does ChatGPT understand spine clinical guidelines?


Different versions of ChatGPT addressed clinical questions about lumbar disc herniation with radiculopathy differently, according to a study published in the September issue of the North American Spine Society Journal.

Researchers asked ChatGPT 3.5 and ChatGPT 4.0 15 questions from the 2012 NASS Clinical Guidelines for diagnosing and treating lumbar disc herniation with radiculopathy. Two independent authors assessed the outputs for accuracy, over-conclusiveness, supplementary information and incompleteness.

Of the 15 responses from ChatGPT 3.5, 47% were accurate, 47% were over-conclusive and 40% were incomplete. All were supplementary. The study found a "statistically significant difference in supplementary information" between ChatGPT 3.5 and ChatGPT 4.0.

The study concluded that "ChatGPT-4.0 provided less supplementary information and overall higher accuracy in question categories than ChatGPT-3.5. ChatGPT showed reasonable concordance to NASS guidelines, but clinicians should caution use of ChatGPT in its current state as it fails to safeguard against misinformation."
