ChatGPT Responses to Glaucoma Questions Based on Patient Health Literacy Levels
Abstract
BACKGROUND: Glaucoma is a complex, progressive neurodegenerative disease of the optic nerve that is most common in the elderly. Patients often do not understand the complexities of the disease and struggle to find answers across glaucoma sources and websites, which can be difficult to understand. AI chatbots such as ChatGPT® have recently emerged as useful tools for gathering information on medical questions; however, the role of ChatGPT in answering glaucoma treatment questions is not well documented. Health literacy is defined as the basic reading and mathematical skills required to find, understand, and use health-related information. The average reading level among US adults is 7th-8th grade, yet most medical information is written at a higher reading level. The purpose of this study was to determine whether ChatGPT can tailor responses to glaucoma treatment questions to patients' health literacy levels.

HYPOTHESIS: We hypothesize that ChatGPT may satisfactorily tailor answers to glaucoma questions to a patient's health literacy level.

METHODS: We selected 27 common questions about glaucoma medications, laser procedures, and surgical treatments. The questions were first entered into ChatGPT without instructions. ChatGPT was then instructed to tailor its responses to the 4 health literacy levels defined by the US National Assessment of Health Literacy: below basic (BB), basic (B), intermediate (I), and proficient (P). Responses were analyzed using the Flesch-Kincaid (FKC) grade level (0-18+, corresponding to years of education), word count, and syllable count. Kruskal-Wallis rank sum tests were used to analyze the data.

RESULTS: Without any instructions about health literacy level, the mean FKC grade level of ChatGPT responses was 12.83, corresponding to a 12th-grade, "fairly difficult to read" level. When instructed to tailor its responses, the mean FKC grade levels of the BB, B, I, and P responses were 11.50, 12.49, 12.95, and 13.12, respectively (p<0.001). The mean word counts of the BB, B, I, and P answers (82, 117, 163, and 177, respectively) increased correspondingly (p<0.001).

CONCLUSION: ChatGPT in its current form cannot provide responses to glaucoma questions that are easy for the general public to comprehend. Future AI chatbots may need to be trained not only on domain-specific databases (e.g., medical, conversational, computer science, and finance) but also to provide easily understandable answers at all health literacy levels, so that they can serve a wider sector of society.
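For readers unfamiliar with the readability analysis described above, the sketch below shows how such a comparison could be reproduced. It is a minimal illustration, not the study's actual pipeline: the Flesch-Kincaid grade-level formula (0.39 × words per sentence + 11.8 × syllables per word − 15.59) is the standard published one, but the syllable counter is a crude vowel-group heuristic, and the grouped sample responses are hypothetical placeholders standing in for ChatGPT output at each literacy level.

```python
# Minimal sketch, not the study's actual analysis pipeline. Assumes scipy is
# installed; the sample responses below are hypothetical placeholders.
import re
from scipy.stats import kruskal

def count_syllables(word: str) -> int:
    # Crude heuristic: count runs of vowels; every word gets at least one.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text: str) -> float:
    # Flesch-Kincaid grade level:
    # 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    n_syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (n_words / sentences) + 11.8 * (n_syllables / n_words) - 15.59

# Hypothetical responses grouped by the four health literacy levels.
responses = {
    "below_basic": [
        "Eye drops lower the pressure in your eye. Use them every day.",
        "Laser treatment helps fluid drain from the eye. It is quick.",
    ],
    "basic": [
        "Glaucoma drops reduce pressure inside the eye and protect the nerve.",
        "Laser trabeculoplasty improves fluid drainage to lower eye pressure.",
    ],
    "intermediate": [
        "Topical medications lower intraocular pressure, slowing optic nerve damage.",
        "Selective laser trabeculoplasty enhances aqueous outflow through the trabecular meshwork.",
    ],
    "proficient": [
        "Prostaglandin analogues augment uveoscleral outflow, reducing intraocular pressure.",
        "Trabeculectomy creates a guarded fistula permitting aqueous egress into a subconjunctival bleb.",
    ],
}

grades = {level: [fk_grade(t) for t in texts] for level, texts in responses.items()}
for level, values in grades.items():
    print(f"{level}: mean FK grade = {sum(values) / len(values):.2f}")

# Kruskal-Wallis rank sum test across the four groups, as in the abstract.
statistic, p_value = kruskal(*grades.values())
print(f"Kruskal-Wallis H = {statistic:.2f}, p = {p_value:.4f}")
```

In practice, a study like this one would apply the same two steps to the full set of real ChatGPT responses per literacy level; the placeholder texts here merely make the sketch runnable end to end.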