How well does ChatGPT understand spine clinical guidelines?


ChatGPT 3.5 and ChatGPT 4.0 differed in how they answered clinical questions about lumbar disc herniation with radiculopathy, according to a study published in the September issue of the North American Spine Society Journal.

Researchers posed 15 questions from the 2012 NASS Clinical Guidelines for diagnosing and treating lumbar disc herniation with radiculopathy to both ChatGPT 3.5 and ChatGPT 4.0. Two independent authors assessed each output for accuracy, over-conclusiveness, supplementary information and incompleteness.

Of the 15 responses from ChatGPT 3.5, 47% were accurate, 47% were over-conclusive and 40% were incomplete. All 15 included supplementary information. The study found a "statistically significant difference in supplementary information" between ChatGPT 3.5 and ChatGPT 4.0.

The study concluded that "ChatGPT-4.0 provided less supplementary information and overall higher accuracy in question categories than ChatGPT-3.5. ChatGPT showed reasonable concordance to NASS guidelines, but clinicians should caution use of ChatGPT in its current state as it fails to safeguard against misinformation."

Copyright © 2024 Becker's Healthcare. All Rights Reserved.

 
