Introduction:
Artificial intelligence (AI) has demonstrated significant capabilities in analyzing various types of patient data, offering opportunities to enhance patient care, improve diagnostics, and enable more precise treatment decisions across many medical fields, including nephrology. Clinical trial pre-screening can be time-consuming, particularly when a large patient cohort is involved, and human error is inevitable and may result in screening failures. Typically, nephrologists review patient data against the study inclusion/exclusion criteria before initiating formal screening; however, this pre-screening process may lack optimal efficiency because of these challenges. In this study, we evaluated the accuracy and efficiency of AI in the pre-screening process for a published clinical trial (NefIgArd) and compared its performance with that of nephrologists.
Methods:
A survey comprising four simulated clinical cases was created in Google Forms and distributed through the investigators' professional networks and social media platforms, including LinkedIn and X. Nephrologists were asked to determine the eligibility of each case for pre-screening according to the NefIgArd trial's inclusion and exclusion criteria, providing a "yes" or "no" response. Participants were also instructed to record and report the time taken to complete their assessment of each case. The same cases were evaluated using AI (ChatGPT, version 3.5) to compare its speed and accuracy with those of the nephrologists.
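As an illustration of how such an AI evaluation could be scripted, the sketch below submits one simulated case together with an abbreviated criteria excerpt to a GPT-3.5 model and requests a yes/no pre-screening decision. This is a minimal sketch, not the study's procedure: the abstract describes using ChatGPT (version 3.5) directly, and the prompt wording, criteria excerpt, and case vignette here are illustrative assumptions.

```python
# Hypothetical sketch: asking a GPT-3.5 model for a yes/no pre-screening
# decision on one simulated case. Criteria text and case vignette are
# illustrative placeholders, not the study's actual survey content.
from openai import OpenAI  # official openai Python package (v1+)

client = OpenAI()  # reads OPENAI_API_KEY from the environment

criteria = """NefIgArd inclusion/exclusion criteria (abbreviated, illustrative):
- Biopsy-confirmed primary IgA nephropathy
- Proteinuria >= 1 g/day despite optimized RAS blockade
- eGFR 35-90 mL/min/1.73 m2
"""

case = ("58-year-old with biopsy-proven IgA nephropathy, proteinuria "
        "1.4 g/day on a maximally tolerated ACE inhibitor, eGFR 52.")

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system",
         "content": "You pre-screen patients for a clinical trial. "
                    "Answer only 'yes' or 'no'."},
        {"role": "user",
         "content": f"Criteria:\n{criteria}\nCase:\n{case}\n"
                    "Is this patient eligible for pre-screening?"},
    ],
)
print(response.choices[0].message.content.strip())  # expected: "yes" or "no"
```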
Results:
Figure 1. Statistical Comparison of Physician and AI Accuracy
A total of 33 nephrologists participated in the study, mostly from academic settings (69.7%). Among them, 39.4% were Assistant Professors, 18.2% were Associate Professors, and 9.1% were Professors. The median years of experience was 8 (interquartile range [IQR]: 3.5-15). The AI consistently outperformed the physicians, achieving 100% accuracy across all cases, whereas the physicians' per-case accuracy ranged from 21.9% to 90.6%. The AI's accuracy was significantly higher than the physicians' for each case and overall (p<0.001), with an overall accuracy of 55.9% for physicians versus 99.9% for the AI (Figure 1). Physicians took a mean of 117 seconds per case (SD: 146; median: 60; IQR: 29-120), whereas the AI took a mean of 11 seconds (SD: 1; median: 11; IQR: 11-12). The mean rank was 67.93 for physicians and 4.75 for the AI, and the p-value of 0.001 indicates a statistically significant difference in time taken, with the AI completing the pre-screening process significantly faster than the physicians.
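The sketch below reproduces the shape of the two comparisons reported above with placeholder data. The reported mean ranks suggest a rank-based test such as Mann-Whitney U for the time comparison, and Fisher's exact test is one option for comparing accuracy proportions; the abstract does not name the tests used, so both choices and all counts and times here are assumptions.

```python
# Hypothetical re-creation of the reported comparisons using placeholder
# data; the study's raw responses are not available here.
import numpy as np
from scipy.stats import fisher_exact, mannwhitneyu

# Accuracy: correct/incorrect counts per rater group. Placeholder values
# chosen to roughly match 55.9% of 132 physician responses (33 x 4 cases)
# and 4/4 correct AI responses.
table = [[74, 58],   # physicians: correct, incorrect (placeholder)
         [4, 0]]     # AI: correct, incorrect (placeholder)
odds_ratio, p_accuracy = fisher_exact(table)
print(f"accuracy comparison: p = {p_accuracy:.3f}")

# Time: simulated physician times calibrated so the median is near the
# reported 60 seconds; AI times near the reported 11-second median.
rng = np.random.default_rng(0)
physician_times = rng.lognormal(mean=4.1, sigma=0.9, size=132)  # median ~60 s
ai_times = np.array([11, 11, 12, 11])
u_stat, p_time = mannwhitneyu(physician_times, ai_times,
                              alternative="two-sided")
print(f"time comparison: U = {u_stat:.0f}, p = {p_time:.4f}")
```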
Conclusions:
Integrating AI into nephrology for well-defined tasks with clear instructions, such as clinical trial pre-screening, may improve accuracy and efficiency. Further studies are warranted to explore this potential fully.
I have no potential conflict of interest to disclose.
I did not use generative AI or AI-assisted technologies in the writing process.