If you’ve visited a doctor in Ontario over the past two to three years, there’s a strong chance your visit was documented using an AI medical scribe—software that listens to the conversation, transcribes it, and formats the result into medical notes.

While the concept promises efficiency, significant concerns have emerged. This week, Ontario’s auditor general—an independent accountability officer under the Legislative Assembly of Ontario—released a special report highlighting critical flaws in AI medical scribes. The report warns that these systems were not adequately evaluated and may present fabricated information to healthcare professionals.

First reported by Global News, the audit examined 20 AI scribe platforms from government-approved vendors. It found that all systems showed inaccuracies during procurement testing, including:

  • Hallucinations (fabrication of data)
  • Incorrect information
  • Missing or incomplete details

The report emphasized the risks:

“Inaccuracies in medical notes generated by AI Scribe systems could potentially result in inadequate or harmful treatment plans that may potentially impact patient health outcomes.”

Ontario’s Minister of Public and Business Service Delivery and Procurement, Stephen Crawford, clarified that these inaccuracies were observed only during regulatory testing—not during actual patient visits.

“Let’s be very clear about that, that’s not actually in operational use with doctors, that’s in the optional stage where we’re reviewing the various scribes,”
Crawford told Global News.

Despite this, Auditor General Shelley Spence noted that AI scribes are already in use by approximately 5,000 doctors across Ontario. Spence said that during her own medical visit, she asked her physician to

“please look at the transcript when you’re done with my own visit.”

This scrutiny comes as another medical AI system, OpenEvidence, faces increasing criticism in the United States for similar issues. Doctors interviewed by NBC News reported that OpenEvidence occasionally draws overly strong conclusions from medical studies with small sample sizes.

While many physicians appreciate the efficiency of AI tools, their real-world reliability remains unproven. The medical community is now questioning how these systems will perform under actual conditions—and how they will be judged once the initial AI hype subsides.

Source: Futurism