Regression Diagnostics Workshop

John Fox

McMaster University

SORA/TABA May 2022

Abbreviated URL:  https://tinyurl.com/SORA-TABA-diagnostics


[Figure: added-variable plots]
“Regression diagnostics” are methods for determining whether a regression model fit to data adequately represents the data. This workshop will present diagnostics for linear models fit by least squares, for generalized-linear models fit by maximum likelihood, for linear and generalized-linear mixed-effects models, and for linear regression models estimated by instrumental variables. I assume some familiarity with the various regression models covered in the workshop. Primarily to establish notation and basic results, I prepared a brief review of these topics; please read the review prior to the workshop.
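As a small illustration of one diagnostic for a least-squares fit, the sketch below builds an added-variable plot by hand in base R. It assumes R's built-in mtcars data as a stand-in (it is not one of the workshop data sets), and the model mpg ~ wt + hp is chosen only for illustration. The avPlots() function in the car package draws the same kind of plot directly.

```r
# Assumption: built-in mtcars data stand in for the workshop data sets
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Added-variable plot for wt, built by hand:
# residuals of the response regressed on the other predictor ...
e_y <- residuals(lm(mpg ~ hp, data = mtcars))
# ... plotted against residuals of wt regressed on the other predictor
e_x <- residuals(lm(wt ~ hp, data = mtcars))

plot(e_x, e_y, xlab = "wt | others", ylab = "mpg | others")
abline(lm(e_y ~ e_x))

# The least-squares slope in this plot equals the coefficient of wt
# in the full model
av_slope <- unname(coef(lm(e_y ~ e_x))["e_x"])
full_slope <- unname(coef(fit)["wt"])
all.equal(av_slope, full_slope)
```

Points that lie far from the others horizontally in such a plot have high leverage on the wt coefficient, and points that also depart from the trend of the rest may be influential.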

I’ll use the R statistical computing environment for the presentation, and so I also assume some familiarity with R. See my ICPSR R lectures for introductory material on R along with a variety of references and links to resources, including installation instructions for R and RStudio (a free programming editor for R). In addition to the standard R distribution, to follow along with the R scripts for the workshop (see below), you should install several contributed packages:
install.packages(c("car", "effects", "ivreg", "lme4"))
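If some of these packages may already be installed, the installation step can be limited to the missing ones. The following is a minimal sketch in base R; the helper name missing_packages is hypothetical, and the call to install.packages() is guarded so that it only runs in an interactive session, where a CRAN mirror can be chosen.

```r
# Packages named in the workshop instructions
workshop_pkgs <- c("car", "effects", "ivreg", "lme4")

# Hypothetical helper: return the packages in pkgs that are not installed
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(workshop_pkgs)

if (length(to_install) == 0) {
  message("All workshop packages are already installed.")
} else if (interactive()) {
  # Install only what is missing
  install.packages(to_install)
} else {
  message("Not installed: ", paste(to_install, collapse = ", "))
}
```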

The workshop is largely based on two sources: Fox, Regression Diagnostics, Second Edition (Sage, 2020), and Fox and Weisberg, An R Companion to Applied Regression, Third Edition (Sage, 2019). You need not have access to these books to follow the workshop.

Register at Eventbrite: 2022 SORA-TABA Annual Workshop & DLSPH Biostatistics Research Day.

The topics below don't precisely correspond to the eight hours of the workshop; in particular, the earlier topics will likely take more time than the later ones.

The materials on this website may be updated before the workshop, so please (re-)download them the day before the workshop.
I also recommend that you update your R installation to the current version and that you update all of your R packages:
update.packages(ask=FALSE)
 

Downloads

To be read before the workshop: Review of Linear Models, Generalized Linear Models, and Linear Mixed-Effects Models




Lecture Slides and R Scripts


Topic | Slides | R Script
1. Introduction and review of the normal linear model | slides-introduction.pdf | (none)
2. Examining and transforming regression data | slides-examining-transforming-data.pdf (corrected 2022-05-19) | examining-transforming-data.R
3. Unusual data: Outliers, leverage, and influence | slides-unusual-data.pdf | unusual-data.R
4. The response: Non-normality and nonconstant error variance | slides-response-distribution.pdf (corrected 2022-05-29) | response-distribution.R
5. Lack-of-fit: Detecting and correcting nonlinearity | slides-nonlinearity.pdf | nonlinearity.R
6. Diagnostics for generalized linear models | slides-glms.pdf | glms.R
7. Diagnostics for mixed-effects models and instrumental-variables estimation | slides-mixed-models-ivs.pdf | mixed-models-ivs.R
8. Collinearity diagnostics and wrap-up | slides-collinearity.pdf, slides-wrap-up.pdf | collinearity.R





Data Files

CIA.txt CIA World Factbook data
Davis.txt Davis's data on measured and reported height and weight
Duncan.txt Duncan's occupational-prestige data
Mroz.txt Mroz's data on women's labor-force participation
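Once downloaded, the data files can be read into R along the following lines. This is only a sketch: it assumes the files are whitespace-delimited text with a header row and have been saved in the current working directory, and the helper name read_workshop_data is hypothetical. The workshop R scripts may read the data differently.

```r
# Hypothetical helper: read one of the workshop's .txt data files.
# Assumption: whitespace-delimited text with variable names in the first row.
read_workshop_data <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  read.table(path, header = TRUE)
}

# Read Duncan's data if the file is present in the working directory
if (file.exists("Duncan.txt")) {
  Duncan <- read_workshop_data("Duncan.txt")
  str(Duncan)
}
```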