EVALUATING CHATGPT-4'S ACCURACY IN IDENTIFYING NEPHROLOGY MEDICATIONS FROM SELF-CAPTURED PILL IMAGES ACROSS MULTIPLE GPT-4 MODELS

7 Feb 2025, 12 a.m.
WCN25-AB-1426, Poster Board: FRI-066

Introduction:

Medication errors are a significant concern in healthcare, particularly among patients with chronic kidney disease (CKD) who often experience polypharmacy due to multiple comorbidities. Accurate identification of medications is crucial to reducing these errors. ChatGPT-4, a large language model developed by OpenAI, has shown promise in various healthcare applications, including image analysis. This study aims to assess the accuracy and reliability of ChatGPT-4 in identifying commonly prescribed nephrology medications using high-quality pill images, with a focus on its ability to recognize drug name, dosage, and imprint.

Methods:

The study was conducted at the Mayo Clinic in Rochester, Minnesota, in 2024. Twenty-five commonly prescribed nephrology medications were selected, and high-resolution images of the pills were captured using an iPhone 13 Pro Max. Each image was presented to ChatGPT-4 with the prompt, "What is this medication?" Each medication was uploaded and tested only once per session, and only ChatGPT-4's first response was recorded. Responses were scored on correct identification of the medication name, dosage, and imprint. The process was repeated after two weeks to assess the consistency of the model's performance across different versions of GPT-4, including GPT-4, GPT-4 Legacy, and GPT-4o. Misidentified medications were re-tested after corrective feedback on the imprints was provided.

Results:

ChatGPT-4 accurately identified 22 of 25 medications (88%) across all tested versions, including GPT-4, GPT-4 Legacy, and GPT-4o (Table 1). However, it consistently misidentified Hydrochlorothiazide, Nifedipine, and Spironolactone because it misread the pill imprints (Table 2). For instance, Nifedipine ER 90 mg was mistaken for Metformin Hydrochloride ER 500 mg because the imprint "NF 06" was read as "NF 05." Similar imprint-related errors were observed with Hydrochlorothiazide and Spironolactone. Despite these errors, the model's results were 100% consistent on re-testing, and it corrected its previous misidentifications after receiving feedback on the correct imprints.
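
The headline accuracy figure can be reproduced from the counts reported above. The short Python sketch below is illustrative only: the constant and function names are assumptions introduced here, not part of the study's tooling, and it uses only the totals and the three misidentified drug names given in the text.

```python
# Counts taken from the Results text; all names here are illustrative assumptions.
TOTAL_MEDICATIONS = 25
MISIDENTIFIED = ["Hydrochlorothiazide", "Nifedipine", "Spironolactone"]


def accuracy_percent(total: int, n_wrong: int) -> float:
    """Return the percentage of correctly identified items."""
    if total <= 0 or not 0 <= n_wrong <= total:
        raise ValueError("counts must satisfy 0 <= n_wrong <= total and total > 0")
    return 100.0 * (total - n_wrong) / total


correct = TOTAL_MEDICATIONS - len(MISIDENTIFIED)
accuracy = accuracy_percent(TOTAL_MEDICATIONS, len(MISIDENTIFIED))
print(f"{correct}/{TOTAL_MEDICATIONS} correct = {accuracy:.0f}%")
```

With 3 of 25 medications misidentified, this gives 22 correct identifications, matching the reported 88%.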

Conclusions:

ChatGPT-4 shows strong potential as a tool for identifying nephrology medications from high-quality pill images, achieving 88% accuracy across different versions of GPT-4, including GPT-4, GPT-4 Legacy, and GPT-4o. However, challenges remain in reading difficult-to-distinguish imprints, which can lead to misidentification. The ability to correct errors with feedback suggests that ChatGPT-4 could be valuable in digital health for medication identification, particularly when used under professional supervision. Future research should focus on enhancing the model's capacity to distinguish between similar imprints and on improving its generalizability across a broader range of drug classes and image conditions, thereby advancing its utility in clinical practice.

I have no potential conflict of interest to disclose.

I did not use generative AI and AI-assisted technologies in the writing process.

 
Table 1
Table 2