Concepts:
Fitting a logistic regression model in R
Interpreting coefficient estimates on the log-odds and odds scales
Using a logistic regression model to make predictions
Overview: An interactive twist on a classic dataset! Students take on the identity of a real Titanic passenger and are provided with that passenger's age, gender, and ticket class. They fit a logistic regression model to predict survival from these characteristics using other passenger data, use the model to predict their own survival, and compare the predicted probability to their true survival outcome.
Audience: First-semester data science master's students in an applied statistical modeling course (used in IDS 702, Fall 2025)
Prerequisite lectures: Introduction to GLMs, Logistic regression estimation
Materials:
Passenger list with details and brief bio (44 passengers included, but more can be found here)
Quarto (.qmd) file with the exercise and a link to the dataset. The dataset contains the full Titanic passenger data, excluding the passengers on the passenger list above.
Possible extension: Logistic regression assessment
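The prediction step in this exercise comes down to a small amount of arithmetic, sketched below in plain Python so it can be checked by hand (the exercise itself is done in R). The coefficient values, the `coefs` dictionary, and the function names are hypothetical placeholders chosen for illustration; they are not estimates from the Titanic data or from the course materials.

```python
import math

# Hypothetical coefficients on the log-odds scale (illustrative values only,
# NOT fitted to the Titanic data). Reference levels: female, 1st class.
coefs = {
    "intercept": 1.0,
    "age": -0.03,       # change in log-odds per additional year of age
    "male": -2.5,       # male vs. female
    "class_2nd": -1.0,  # 2nd class vs. 1st class
    "class_3rd": -2.0,  # 3rd class vs. 1st class
}


def odds_ratio(beta):
    """Convert a log-odds coefficient to the odds scale."""
    return math.exp(beta)


def predicted_probability(coefs, age, male, ticket_class):
    """Predicted survival probability for one passenger.

    male is 1 for male and 0 for female; ticket_class is 1, 2, or 3.
    """
    # Linear predictor (log-odds of survival)
    eta = coefs["intercept"] + coefs["age"] * age + coefs["male"] * male
    if ticket_class == 2:
        eta += coefs["class_2nd"]
    elif ticket_class == 3:
        eta += coefs["class_3rd"]
    # Inverse logit maps log-odds to a probability in (0, 1)
    return 1.0 / (1.0 + math.exp(-eta))


# A student's assigned passenger: 30-year-old man in 3rd class (made-up profile)
p_student = predicted_probability(coefs, age=30, male=1, ticket_class=3)
print(f"Odds ratio, male vs. female: {odds_ratio(coefs['male']):.3f}")
print(f"Predicted survival probability: {p_student:.3f}")
```

An odds ratio below 1 means lower odds of survival relative to the reference level, holding the other predictors fixed. Students can then compare the predicted probability with the passenger's true outcome.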
Concepts:
Making careful decisions and considering broad implications with real-world data
History of statistics
Statistics and society
Overview: Students choose one item from a provided list and write a reflection on it. The list includes articles, blogs, podcasts, and videos (including TED Talks); optional discussion questions are provided as a starting point.
Audience: I have given this assignment in various forms in graduate- and undergraduate-level statistics courses, with and without classroom discussion components.
Prerequisite lectures: None
Materials:
Concepts:
Fitting, interpreting, and evaluating survival analysis models (materials for alternative model types provided below)
Interpreting technical results for a non-technical audience
Overview: An open-ended task in which students are provided with a customer churn dataset and asked to provide recommendations to a company executive to improve customer retention. Students focus on using layman's terms to describe technical concepts and developing appropriate recommendations from a statistical model.
Audience: Data science and statistics master's students
Prerequisite lectures: Kaplan-Meier curves, log-rank tests, the Cox proportional hazards model, principles of non-technical presentations, and slide design
Materials:
Similar assignments that use different modeling types and a written component for the non-technical portion instead of an oral presentation: Airline customer satisfaction; Airbnb pricing
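For readers unfamiliar with the Kaplan-Meier prerequisite for the churn assignment, the following is a minimal sketch of the product-limit estimator in plain Python. The function name `kaplan_meier` and the toy durations are hypothetical and are not taken from the assignment's dataset; in practice students would use a survival analysis package rather than hand-rolled code.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Kaplan-Meier (product-limit) estimate of the survival function.

    durations: observed time for each customer (e.g. months as a customer)
    events:    1 if the customer churned at that time, 0 if censored
    Returns a list of (time, estimated survival probability) pairs,
    one per distinct time at which at least one churn occurred.
    """
    n_at_risk = len(durations)
    churned = Counter(t for t, e in zip(durations, events) if e == 1)
    leaving = Counter(durations)  # churned or censored at each time

    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = churned.get(t, 0)
        if d > 0:
            # Multiply by the conditional probability of surviving past t
            survival *= 1.0 - d / n_at_risk
            curve.append((t, survival))
        # Everyone observed at time t leaves the risk set afterwards
        n_at_risk -= leaving[t]
    return curve


# Toy data (made up): six customers, two of them censored
toy_durations = [2, 3, 3, 5, 8, 8]
toy_events = [1, 1, 0, 1, 0, 1]
toy_curve = kaplan_meier(toy_durations, toy_events)
for t, s in toy_curve:
    print(f"t = {t}: estimated retention = {s:.3f}")
```

Censored customers still count in the risk set up to the time they are last observed, which is why the estimate differs from a simple proportion of customers who have not yet churned.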