Expert review of AI-generated responses to the top ten patient complaints in primary care

Bibliographic Details
Main Authors: Monica Gillie, George Kent
Format: Article
Language: English
Published: Academia.edu Journals, 2024-11-01
Series: Academia Medicine
Online Access: https://www.academia.edu/125308167/Expert_review_of_AI_generated_responses_to_the_top_ten_patient_complaints_in_primary_care
Description
Summary: Background: Artificial intelligence (AI) systems such as ChatGPT are among the fastest-growing applications of all time. Most physicians are familiar with the pitfalls of “Dr. Google” and WebMD; however, the same is not true for the increasingly accessed AI-based applications. This study aims to evaluate ChatGPT’s responses to the top ten complaints seen in primary care, to help clinicians assess the utility and accuracy of a popular AI-based application.

Methods: The top ten patient-reported complaints leading to a visit with a primary care physician were each used to generate two questions, one regarding cause and one regarding treatment. These questions were then posed via the Perplexity AI search engine. Each response was graded by three experienced family medicine clinicians, and the overall score was reported for its utility and appropriateness.

Results: About 95% of responses were rated as useful, and 85% were rated as clinically appropriate. Three responses were deemed inappropriate by the reviewers, indicating possible areas of harmful omission or improper triage. The response on treatment of shortness of breath was unanimously regarded as not useful and inappropriate, owing to its lack of emphasis on seeking medical care and on life-threatening conditions. Fatigue received the highest utility ratings for both etiology and treatment. Responses were overall focused and concise; however, the citations were secondary sources of variable utility and clinical safety.

Conclusion: “Doctor AI” is here to stay and will require ongoing investigation as it inevitably plays an increasing role in the provision of medical information and advice to patients. The rapid pace of AI search engine development limits a study of this design, as results are likely to differ over a short period of time. More research on the safety and utility of medical AI in the primary care setting is paramount.
ISSN: 2994-435X