AI Tools Outperform Clinicians in Rwanda Study
The potential of artificial intelligence (AI) tools to offer affordable health advice to low-income countries has been outlined in a new study. Researchers described the work as the first evaluation of its kind and found that five large language models (LLMs) significantly outperformed local doctors and nurses in Rwanda when responding to hundreds of clinical questions.
The tools, including Google’s Gemini-2 and ChatGPT-4o, delivered responses at a cost 500 times lower per answer than clinicians and still outperformed them when responding in the local language, Kinyarwanda. The research team, which included academics from Rwanda and the U.K., noted a lack of previous research on how LLMs perform in low-income countries.
Study Suggests AI Tools Outperform Clinicians in Rwanda
Community health workers across four Rwandan districts supplied thousands of clinical questions, and researchers randomly selected around 520 for the test. Experts then scored the responses against a rubric of rated metrics. The other three tools measured, o3-mini, DeepSeek R1 and Meditron-70B, also each scored significantly higher than local clinicians.
According to the research team, the study aimed to evaluate the ability of LLMs to generate safe, high-quality and cost-effective responses to real questions posed by frontline health care workers in a low-resource setting. The team concluded that LLMs can provide high-quality, on-demand clinical advice to community health workers that outperforms local experts, even in low-resource, non-English language settings.
The researchers designed the study to simulate a situation in which a community health worker seeks telephone advice from a general practitioner or senior nurse and accepts the first response offered. Despite the headline finding, the authors acknowledged the study does not fully reflect the complexity of day-to-day clinical practice, as real-life situations often involve back-and-forth conversations. They suggested future studies examine how AI tools perform in extended clinical conversations.
Gates Foundation Funds AI Roll-Out
The Gates Foundation funded the Rwanda study and has led efforts to deploy and research large language models in Sub-Saharan Africa. In January 2026, the foundation announced a $50 million joint investment with OpenAI to deploy AI tools supporting primary care workers across 1,000 clinics, starting in Rwanda.
In February 2026, the foundation also launched the Evidence for AI in Health initiative with the Wellcome Trust and the Novo Nordisk Foundation, committing $60 million to projects in low- and middle-income countries.
The three-year initiative will support researchers evaluating LLMs in clinical settings, AI tools that read diagnostic scans and models that predict disease risk or prioritize patients for follow-up based on their medical history. Priority will go to technologies designed for resource-limited settings.
Looking Ahead
The growing interest in these projects reflects the economic challenge of delivering universal health coverage in low-income countries. A recent World Bank analysis suggested that achieving universal health coverage requires about $60 per capita in low-income countries, compared with around $17 per capita in current government and donor funding.
Global aid cuts have increased pressure on health budgets, making the search for affordable approaches to care more urgent. The Rwanda study found that AI tools can outperform clinicians when answering clinical questions, and the investments that followed suggest such tools may offer one pathway toward bridging the funding gap in resource-limited settings.
– Lawrence Dunhill
Lawrence is based in London, UK and focuses on Global Health for The Borgen Project.
Photo: Flickr
