A Randomized Clinical Trial
Published article at the Journal of the American Medical Association Network Open:
Effect of Artificial Intelligence Tutoring vs Expert Instruction on Learning Simulated Surgical Skills Among Medical Students: A Randomized Clinical Trial, by Ali M. Fazlollahi, MSc; Mohamad Bakhaidar, MD, MSc; Ahmad Alsayegh, MD; Recai Yilmaz, MD; Alexander Winkler-Schwartz, MD; Nykan Mirchi, MSc; Ian Langleben; Nicole Ledwos, MSc; Abdulrahman J. Sabbagh, MBChB; Khalid Bajunaid, MD, MSc; Jason M. Harley, PhD; Rolando F. Del Maestro, MD, PhD
Link to the article: https://jamanetwork.com/journals/jama…
Question How does feedback from an artificial intelligence (AI) tutoring system compare with training by remote expert instruction in learning a surgical procedure?
Findings In this randomized clinical trial of 70 medical students learning a simulated operation, those training with an AI tutor achieved significantly higher performance scores than those receiving expert instruction or no feedback (control). Students’ cognitive and affective responses to learning with the AI tutor were similar to those fostered by human instructors.
Meaning These findings suggest that learning surgical skills in the simulation was more effective with metric-based assessment and formative feedback on quantifiable criteria and actionable goals by an AI tutor than remote expert instruction.
Abstract
Importance To better understand the emerging role of artificial intelligence (AI) in surgical training, efficacy of AI tutoring systems, such as the Virtual Operative Assistant (VOA), must be tested and compared with conventional approaches.
Objective To determine how VOA and remote expert instruction compare in learners’ skill acquisition, affective, and cognitive outcomes during surgical simulation training.
Design, Setting, and Participants This instructor-blinded randomized clinical trial included medical students (undergraduate years 0-2) from 4 institutions in Canada during a single simulation training at McGill Neurosurgical Simulation and Artificial Intelligence Learning Centre, Montreal, Canada. Cross-sectional data were collected from January to April 2021. Analysis was conducted based on intention-to-treat. Data were analyzed from April to June 2021.
Interventions The interventions included 5 feedback sessions, 5 minutes each, during a single 75-minute training, including 5 practice sessions followed by 1 realistic virtual reality brain tumor resection. The 3 intervention arms included 2 treatment groups, AI audiovisual metric-based feedback (VOA group) and synchronous verbal scripted debriefing and instruction from a remote expert (instructor group), and a control group that received no feedback.
Main Outcomes and Measures The coprimary outcomes were change in procedural performance, quantified as Expertise Score by a validated assessment algorithm (Intelligent Continuous Expertise Monitoring System [ICEMS]; range, −1.00 to 1.00) for each practice resection, and learning and retention, measured from performance in realistic resections by ICEMS and blinded Objective Structured Assessment of Technical Skills (OSATS; range, 1-7). Secondary outcomes included strength of emotions before, during, and after the intervention and cognitive load after intervention, measured in self-reports.
Results A total of 70 medical students (41 [59%] women and 29 [41%] men; mean [SD] age, 21.8 [2.3] years) from 4 institutions were randomized, including 23 students in the VOA group, 24 students in the instructor group, and 23 students in the control group. All participants were included in the final analysis. ICEMS assessed 350 practice resections, and ICEMS and OSATS evaluated 70 realistic resections. VOA significantly improved practice Expertise Scores by 0.66 (95% CI, 0.55 to 0.77) points compared with the instructor group and by 0.65 (95% CI, 0.54 to 0.77) points compared with the control group (P < .001). Realistic Expertise Scores were significantly higher for the VOA group compared with instructor (mean difference, 0.53 [95% CI, 0.40 to 0.67] points; P < .001) and control (mean difference, 0.49 [95% CI, 0.34 to 0.61] points; P < .001) groups. Mean global OSATS ratings did not differ significantly among the VOA (4.63 [95% CI, 4.06 to 5.20] points), instructor (4.40 [95% CI, 3.88 to 4.91] points), and control (3.86 [95% CI, 3.44 to 4.27] points) groups. However, on the OSATS subscores, VOA significantly enhanced the mean OSATS overall subscore compared with the control group (mean difference, 1.04 [95% CI, 0.13 to 1.96] points; P = .02), whereas expert instruction significantly improved OSATS subscores for instrument handling vs control (mean difference, 1.18 [95% CI, 0.22 to 2.14]; P = .01). No significant differences in cognitive load, positive activating emotions, or negative emotions were found.
Conclusions and Relevance In this randomized clinical trial, VOA feedback demonstrated superior performance outcome and skill transfer, with equivalent OSATS ratings and cognitive and emotional responses compared with remote expert instruction, indicating advantages for its use in simulation training.
Trial Registration ClinicalTrials.gov Identifier: NCT04700384
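The counts reported in the Interventions and Results paragraphs can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration and is not part of the published analysis. It assumes that every randomized student completed all 5 practice resections and 1 realistic resection, which is consistent with the statement that all participants were included in the final analysis. The variable names are hypothetical.

```python
# Figures copied from the abstract; variable names are illustrative only.
group_sizes = {"VOA": 23, "instructor": 24, "control": 23}
women, men = 41, 29
practice_sessions_per_student = 5   # Interventions: 5 practice sessions
feedback_sessions, minutes_per_feedback = 5, 5
training_minutes = 75

# Assumption: each student completed every practice and realistic resection.
total_students = sum(group_sizes.values())
practice_resections = total_students * practice_sessions_per_student
realistic_resections = total_students * 1

# Percentages rounded to whole numbers, as reported in the abstract.
pct_women = round(100 * women / total_students)
pct_men = round(100 * men / total_students)

# Total feedback time within the single training session.
feedback_minutes = feedback_sessions * minutes_per_feedback

print(total_students, practice_resections, realistic_resections)
print(pct_women, pct_men, feedback_minutes, training_minutes)
```

Under these assumptions the totals match the abstract: 70 students, 350 practice resections, 70 realistic resections, and a 59% to 41% split between women and men, with 25 of the 75 training minutes spent on feedback in the two treatment groups.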