Home Department: Bioengineering
Mentor: Gary Darmstadt (Pediatrics - Neonatology)
"LLM Benchmarking for Improved Maternal and Neonatal Care in Bangladesh and Zimbabwe"
Large language models (LLMs) are increasingly proposed as clinical decision-support and task-automation tools in low- and middle-income countries (LMICs), where shortages of healthcare workers are severe. However, emerging evidence shows that while LLMs perform well on general medical reasoning, they fail on country-specific epidemiology, guideline adherence, and citation accuracy (Maric et al.). Diba’s project investigates whether LLMs can be systematically benchmarked and fine-tuned for maternal and neonatal care in LMIC contexts. Using the MedHELM benchmarking framework, Diba will evaluate the latest OpenAI model on clinically grounded scenarios relevant to maternal and neonatal care practitioners in Bangladesh and Zimbabwe. Semi-structured interviews with these practitioners will identify workflow and information gaps, informing dataset and algorithm design. The project will produce benchmarking results and a model fine-tuned with Low-Rank Adaptation (LoRA) to improve clinical reliability for future prototyping.
