I am currently collaborating with Root Causes, a student-run initiative out of the Duke School of Medicine, to evaluate the Fresh Produce Program. This program provides Durham residents with free, bi-weekly boxes of fresh produce. We are connecting program data to Duke electronic health records to assess potential health effects.
Students:
Katelyn Hucker, MIDS '25
Sizhe Chen, MIDS '26
Fan Xu, MIDS '26
Presented at the International Day of Women in Statistics and Data Science conference, fall 2025
Each year, I mentor at least one MIDS Capstone project. Teams consist of 4-5 MIDS and MSS students.
2025-26:
Does access to school meals improve student outcomes? Evidence from Community Eligibility Provision participation in North Carolina with the World Food Policy Center: Peter de Guzman (MIDS), Meron Gedrago (MIDS), Kayla Haeussler (MIDS), Diego Rodriguez (MIDS), Fan Xu (MIDS)
2024-25:
Developing an LLM-based tool for easy data cleaning and analysis for Duke Anesthesiology: Gunel Aghakishiyeva (MIDS), Katherine Tian (MIDS), Poojitha Balamurugan (MIDS), Revanth Ganga (MIDS), Udyan Sachdev (MIDS)
2023-24:
Examining Disparities in NIH R01 Funding Allocation with Scarlett Bellamy, PhD (Boston University): Lorna Aine (MIDS), Medhavi Darshan (MSS), Zhonglin Wang (MIDS)
Generating Drug Diversion Care Flags with Duke Anesthesiology: Susanna Anil (MIDS), Xiaoquan Liu (MIDS), Song Young Oh (MIDS), Yucan Zang (MSS)
2022-23:
Drug Diversion Intervention with Duke Anesthesiology: Anna Dai (MIDS), Joseph Ekpenyong (MSS), Robert Wan (MIDS), Nansu Wang (MIDS)
Racial Differences in Fever Detection with Duke Critical Care Medicine
TherLiD: A Thermometry Linked Dataset
Abstract:
A recent study showed that infrared (IR) sensors may be prone to calibration discrepancies in patients with darker skin pigmentation. Similar disparities have already been verified in pulse oximetry. This raises the question of whether thermometry measurements may inadvertently miss cases of hypothermia, fever, or sepsis, potentially delaying diagnoses and ultimately worsening outcomes among vulnerable subpopulations.
TherLiD is a dataset derived from three electronic health record databases: MIMIC-IV, eICU-CRD-1, and eICU-CRD-2. It consists of 13,251 temperature pairs, along with comprehensive demographic data and time-synchronized hospital information, offering a detailed profile of each patient. Each pair has one reference value from a contact thermometer (oral, core, or rectal) and one infrared-based (temporal) value, measured within a 1-hour time window, with temperature values between 30°C and 45°C. TherLiD not only provides high-quality, clinically relevant data but also offers a reproducible framework, allowing researchers to tailor the dataset to their specific research needs, including training machine learning models. The dataset was also built to facilitate temperature-related retrospective studies and to promote research on racial and ethnic healthcare disparities.
Student:
Jeremy Tan, MIDS '25
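The pairing rule described in the TherLiD abstract can be illustrated with a minimal sketch. This is not the TherLiD pipeline: the record layout, the function name `pair_temperatures`, the nearest-in-time matching, and the reading of "within a 1-hour time window" as an absolute gap of at most one hour (with inclusive 30°C and 45°C bounds) are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical record layout: (patient_id, timestamp, temperature in °C).
# These fields are illustrative and are not the TherLiD schema.


def pair_temperatures(reference, infrared, window=timedelta(hours=1),
                      low=30.0, high=45.0):
    """Pair each infrared reading with the nearest-in-time reference reading
    for the same patient. A pair is kept only if the two readings are at most
    `window` apart and both values lie in [low, high] (assumed inclusive)."""
    def in_range(temp):
        return low <= temp <= high

    pairs = []
    for pid, ir_time, ir_temp in infrared:
        if not in_range(ir_temp):
            continue
        # Candidate reference readings: same patient, plausible value,
        # and close enough in time to the infrared reading.
        candidates = [
            (abs(ref_time - ir_time), ref_time, ref_temp)
            for ref_pid, ref_time, ref_temp in reference
            if ref_pid == pid
            and in_range(ref_temp)
            and abs(ref_time - ir_time) <= window
        ]
        if candidates:
            # min() picks the smallest time gap first.
            _, _, ref_temp = min(candidates)
            pairs.append((pid, ref_temp, ir_temp))
    return pairs


# Made-up example readings, not real patient data.
reference = [
    ("p1", datetime(2020, 1, 1, 8, 0), 37.2),
    ("p1", datetime(2020, 1, 1, 12, 0), 38.4),
    ("p2", datetime(2020, 1, 1, 9, 0), 36.8),
]
infrared = [
    ("p1", datetime(2020, 1, 1, 8, 30), 36.9),  # 30 minutes from a reference reading
    ("p1", datetime(2020, 1, 1, 15, 0), 37.0),  # no reference reading within 1 hour
    ("p2", datetime(2020, 1, 1, 9, 20), 29.0),  # below the 30°C lower bound
]
pairs = pair_temperatures(reference, infrared)
print(pairs)
```

Only the first infrared reading yields a pair here; the other two are dropped by the time-window and value-range filters, respectively.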