Search Results (1 to 10 of 107)

From E-Patients to AI Patients: The Tidal Wave Empowering Patients, Redefining Clinical Relationships, and Transforming Care

Among LLM users, half reported personal learning as their goal, and 39% sought information about physical or mental health [3]. Patients burdened with life-changing or rare conditions commonly search for the resources that they need to solve problems. As consumer costs of care keep rising and health care is relentlessly hard to navigate, patients and caregivers are gaining skills and intelligence using LLMs across a breadth of topics.

Susan S Woods, Sarah M Greene, Laura Adams, Grace Cordovano, Matthew F Hudson

J Particip Med 2025;17:e75794

Benchmarking the Confidence of Large Language Models in Answering Clinical Questions: Cross-Sectional Evaluation Study

This raises questions about the underlying mechanisms that prompt an LLM to label certain statements as “more factual.” For example, one possible explanation could be that data-rich or frequently discussed topics in training sets may be perceived as more certain [18], even if this does not translate into clinical accuracy. Additionally, retrieval-augmented generation (RAG) has been proposed to ground LLM outputs in external data, which potentially mitigates hallucinations [19].

Mahmud Omar, Reem Agbareia, Benjamin S Glicksberg, Girish N Nadkarni, Eyal Klang

JMIR Med Inform 2025;13:e66917
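
The retrieval-augmented generation (RAG) grounding mentioned in the snippet above can be illustrated with a minimal sketch: retrieve the passages most similar to the query, then constrain the model to answer from them. The toy bag-of-words embedding and the llm_complete stub are assumptions for illustration, not the pipeline evaluated in the study; a production system would use a learned embedding model and a hosted LLM API.

    from collections import Counter
    import math

    def embed(text):
        # Toy bag-of-words vector; a real pipeline would use a learned
        # embedding model.
        return Counter(text.lower().split())

    def cosine(a, b):
        dot = sum(a[t] * b.get(t, 0) for t in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    corpus = [
        "Metformin is first-line therapy for type 2 diabetes.",
        "HbA1c reflects average blood glucose over about three months.",
        "Insulin doses are titrated to fasting glucose targets.",
    ]
    index = [(doc, embed(doc)) for doc in corpus]

    def llm_complete(prompt):
        # Stub standing in for a call to a hosted model.
        return "[grounded answer based on]\n" + prompt

    def answer(query, k=2):
        # Retrieve the k most similar passages, then ground the prompt in them.
        q = embed(query)
        ranked = sorted(index, key=lambda pair: -cosine(q, pair[1]))
        context = "\n".join(doc for doc, _ in ranked[:k])
        prompt = (
            "Answer using ONLY the context; say 'not found' otherwise.\n"
            f"Context:\n{context}\nQuestion: {query}"
        )
        return llm_complete(prompt)

    print(answer("What does HbA1c measure?"))

Because the model is instructed to answer only from the retrieved context, questions about topics absent from the corpus fail visibly rather than inviting a confabulated answer, which is the mitigation the snippet describes.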

Evaluating Generative AI in Mental Health: Systematic Review of Capabilities and Limitations

LLMs such as ChatGPT, Claude, and Bard hold great promise in mitigating this stark situation by reducing clinicians’ burden and increasing clinician efficiency through LLM-assisted clinical note writing, formulating differential diagnoses, drafting personalized treatment plans, drawing insights from patient chart data, providing on-demand coaching and companionship, and, ultimately, providing therapy [10,11].

Liying Wang, Tanmay Bhanushali, Zhuoran Huang, Jingyi Yang, Sukriti Badami, Lisa Hightow-Weidman

JMIR Ment Health 2025;12:e70014

Global Health Care Professionals’ Perceptions of Large Language Model Use in Practice: Cross-Sectional Survey Study

Large language models (LLMs) are advanced artificial intelligence (AI) models designed for natural language processing tasks. LLMs are trained on vast amounts of text data and use deep learning techniques to understand and generate human-like language. They have helped transform various fields, including medicine [1]. Some of the most popular LLMs are LLaMA by Meta, Orca and Phi-1 by Microsoft, BLOOM, PaLM 2 by Google, and GPT by OpenAI.

Ecem Ozkan, Aysun Tekin, Mahmut Can Ozkan, Daniel Cabrera, Alexander Niven, Yue Dong

JMIR Med Educ 2025;11:e58801

The Effectiveness of a Custom AI Chatbot for Type 2 Diabetes Mellitus Health Literacy: Development and Evaluation Study

Retrieval-augmented generation (RAG) offers a solution to the issue of medical credibility by anchoring LLM responses to specific reference documents [24,25]. The overall architecture of the RAG LLM is shown in Figure 1. The user query is combined with the prompt to provide an input to the system. The prompt that guides the LLM should be predesigned for the T2DM health literacy chatbot task. Medical reference documents are indexed to extract the semantic meaning (embeddings) of the text.

Anthony Kelly, Eoin Noctor, Laura Ryan, Pepijn van de Ven

J Med Internet Res 2025;27:e70131
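
The Figure 1 flow described above (a predesigned task prompt, the user query, and indexed reference documents combined into a single model input) can be sketched as follows. The SYSTEM_PROMPT wording, the excerpt text, and the helper function are illustrative assumptions rather than the authors' implementation, and the embedding-based retrieval step is stubbed out with a hand-picked passage.

    # Assembling the single model input described for the Figure 1 flow:
    # predesigned task prompt + retrieved reference excerpts + user query.
    SYSTEM_PROMPT = (
        "You are a health literacy assistant for type 2 diabetes (T2DM). "
        "Answer in plain language and rely only on the excerpts provided."
    )

    def build_input(user_query, passages):
        # Number the retrieved excerpts so the model can cite them.
        refs = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
        return (
            f"{SYSTEM_PROMPT}\n\n"
            f"Reference excerpts:\n{refs}\n\n"
            f"Patient question: {user_query}"
        )

    # Hand-picked excerpt standing in for the embedding-based retrieval step.
    passages = ["HbA1c testing is typically repeated every 3 to 6 months."]
    print(build_input("How often should my HbA1c be checked?", passages))

Keeping the task prompt predesigned and separate from the retrieved content means the chatbot's behavior can be tuned without reindexing the reference documents.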

Accuracy of Large Language Models When Answering Clinical Research Questions: Systematic Review and Network Meta-Analysis

The search subject terms were “LLM,” “generative AI,” “open AI,” “Large language model,” “ChatGPT-3.5,” “ChatGPT-4,” “Google Bard,” and “Bing,” without any language restriction. The complete search strategies for all databases are shown in Multimedia Appendix 2. A combination of EndNote X9 deduplication and manual deduplication was used to screen the literature in accordance with the developed inclusion criteria.

Ling Wang, Jinglin Li, Boyang Zhuang, Shasha Huang, Meilin Fang, Cunze Wang, Wen Li, Mohan Zhang, Shurong Gong

J Med Internet Res 2025;27:e64486
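
The deduplication step described above can be approximated in code: records are first collapsed on DOI and then on a normalized title, mirroring how automated matching (here standing in for EndNote X9) is followed by a manual confirmation pass. The record fields and matching rules are assumptions for illustration.

    import re

    def norm_title(title):
        # Lowercase and collapse punctuation so near-identical titles from
        # different databases compare equal.
        return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

    def deduplicate(records):
        seen_doi, seen_title, unique = set(), set(), []
        for rec in records:
            doi = (rec.get("doi") or "").lower()
            title_key = norm_title(rec["title"])
            if (doi and doi in seen_doi) or title_key in seen_title:
                continue  # flagged as a duplicate; a manual pass would confirm
            if doi:
                seen_doi.add(doi)
            seen_title.add(title_key)
            unique.append(rec)
        return unique

    records = [
        {"title": "ChatGPT in clinical research", "doi": "10.1000/x1"},
        {"title": "ChatGPT in Clinical Research.", "doi": ""},
    ]
    print(len(deduplicate(records)))  # prints 1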

Comparing Diagnostic Accuracy of Clinical Professionals and Large Language Models: Systematic Review and Meta-Analysis

Although there is no official definition of an LLM, based on the literature [2,3], we define an LLM as a model with over a billion parameters, designed for typical artificial intelligence (AI) applications. Accurate clinical diagnosis is essential for patient treatment outcomes and survival rates. However, even when health care professionals gather extensive information and conduct numerous observations and tests, absolute diagnostic accuracy cannot be guaranteed.

Guxue Shan, Xiaonan Chen, Chen Wang, Li Liu, Yuanjing Gu, Huiping Jiang, Tingqi Shi

JMIR Med Inform 2025;13:e64963

Guideline-Incorporated Large Language Model-Driven Evaluation of Medical Records Using MedCheckLLM

Our proposed algorithm, MedCheckLLM, is an LLM-driven, structured reasoning mechanism designed to automate the evaluation of medical records against evidence-based guidelines. The guidelines are deterministically accessed and returned to the LLM as input without further model fine-tuning. This strict separation of LLM and guidelines is expected to increase the validity and interpretability of the evaluations. The approach's step-by-step structure could improve transparency in clinical applications.

Marc Cicero Schubert, Stella Soyka, Wolfgang Wick, Varun Venkataramani

JMIR Form Res 2025;9:e53335
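
The design point in the snippet above, a deterministic guideline lookup kept strictly separate from the model, can be sketched as follows. The checklist content, prompt wording, and llm_complete stub are invented for illustration; the actual MedCheckLLM prompts and guideline sources are described in the paper.

    # Guideline-in-context pattern: the checklist is fetched deterministically
    # (no fine-tuning) and handed to the LLM as part of its input.
    GUIDELINES = {
        "suspected stroke": [
            "Non-contrast head CT documented shortly after arrival",
            "Blood glucose measured to exclude hypoglycemia",
        ],
    }

    def llm_complete(prompt):
        # Stub standing in for a call to a hosted model.
        return "step-by-step evaluation of:\n" + prompt

    def evaluate_record(record_text, condition):
        checklist = GUIDELINES[condition]  # deterministic lookup, outside the model
        steps = "\n".join(f"{i + 1}. {item}" for i, item in enumerate(checklist))
        prompt = (
            "Check the medical record against each guideline item in order, "
            "citing evidence from the record for every pass/fail decision.\n"
            f"Guideline items:\n{steps}\n\nRecord:\n{record_text}"
        )
        return llm_complete(prompt)

    print(evaluate_record("72-year-old; head CT at 08:12; glucose 5.4 mmol/L",
                          "suspected stroke"))

Because the checklist text is injected verbatim rather than learned, an auditor can trace every pass/fail judgment back to a specific guideline line, which is the interpretability benefit the snippet claims.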

Development of a GPT-4–Powered Virtual Simulated Patient and Communication Training Platform for Medical Students to Practice Discussing Abnormal Mammogram Results With Patients: Multiphase Study

Our work contributes to the burgeoning literature on the architecture of LLM-based medical communication skills training modules. Existing literature presents general frameworks for LLM-based simulations across disparate clinical scenarios with a focus on clinical reasoning and diagnosis [22,25].

Dan Weisman, Alanna Sugarman, Yue Ming Huang, Lillian Gelberg, Patricia A Ganz, Warren Scott Comulada

JMIR Form Res 2025;9:e65670