theme_set(theme_minimal())
Variable/Model selection and ANOVA on Whiteside data
We address the following question: was the external temperature distributed in the same way during the two heating seasons? When we raise this question, we silently make modeling assumptions. Spell them out.
What kind of hypothesis are we testing in the next two chunks? Interpret the results.
lm_temp <- lm(Temp ~ Insul, whiteside)
lm_temp |>
tidy()
# A tibble: 2 × 5
  term        estimate std.error statistic  p.value
  <chr>          <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)    5.35      0.537      9.96 7.80e-14
2 InsulAfter    -0.887     0.734     -1.21 2.32e- 1
lm_temp |>
glance()
# A tibble: 1 × 12
  r.squared adj.r.squared sigma statistic p.value    df logLik   AIC   BIC
      <dbl>         <dbl> <dbl>     <dbl>   <dbl> <dbl>  <dbl> <dbl> <dbl>
1    0.0263       0.00830  2.74      1.46   0.232     1  -135.  276.  282.
# ℹ 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>
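As a cross-check on the two outputs above, here is a minimal sketch. With a single two-level factor as the only predictor, the t-test on the InsulAfter coefficient is the pooled-variance two-sample t-test on the group means. The sketch assumes whiteside is the data set shipped with MASS; the object names tt and slope are illustrative.

```r
library(MASS)  # assumption: whiteside is the MASS data set

lm_temp <- lm(Temp ~ Insul, data = whiteside)

# Student's two-sample t-test with pooled variance (var.equal = TRUE)
tt <- t.test(Temp ~ Insul, data = whiteside, var.equal = TRUE)

# Row of the regression table for the group effect
slope <- summary(lm_temp)$coefficients["InsulAfter", ]

# The p-values coincide; the t statistics agree up to sign
c(lm = slope[["Pr(>|t|)"]], t_test = tt$p.value)
c(lm = slope[["t value"]], t_test = unname(tt$statistic))
```

The sign differs only because t.test() reports the Before mean minus the After mean, while the regression coefficient is After minus Before.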
Display parallel boxplots, overlaid cumulative distribution functions and a quantile-quantile plot (QQ-plot) to compare the temperature distributions during the two heating seasons. Comment.
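One possible way to draw the three displays with ggplot2 is sketched below. It assumes whiteside comes from MASS; the names p_box, p_ecdf, p_qq and the probability grid probs are illustrative choices, not part of the exercise.

```r
library(MASS)     # assumption: whiteside is the MASS data set
library(ggplot2)

# Parallel boxplots of Temp, one per heating season
p_box <- ggplot(whiteside, aes(x = Insul, y = Temp)) +
  geom_boxplot()

# Overlaid empirical cumulative distribution functions
p_ecdf <- ggplot(whiteside, aes(x = Temp, colour = Insul)) +
  stat_ecdf() +
  ylab("ECDF")

# Two-sample QQ-plot: the two seasons have different sample sizes,
# so compare quantiles taken on a common grid of probabilities
probs <- seq(0.05, 0.95, by = 0.05)
qq <- data.frame(
  before = quantile(whiteside$Temp[whiteside$Insul == "Before"], probs),
  after  = quantile(whiteside$Temp[whiteside$Insul == "After"], probs)
)
p_qq <- ggplot(qq, aes(x = before, y = after)) +
  geom_point() +
  geom_abline(slope = 1, intercept = 0, linetype = "dashed")

print(p_box)
print(p_ecdf)
print(p_qq)
```

Points close to the dashed diagonal in the QQ-plot indicate similar quantiles in the two seasons.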
Comparing gas consumption before and after insulation is trickier, because we also have to infer how consumption depends on temperature.
Draw a QQ-plot to compare Gas consumption before and after insulation.
Compare ECDFs of Gas consumption before and after insulation.
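A base-R sketch of both displays for Gas follows, again assuming whiteside is the MASS data set. The vectors gas_before and gas_after are illustrative names. Note that this marginal comparison ignores Temp.

```r
library(MASS)  # assumption: whiteside is the MASS data set

gas_before <- whiteside$Gas[whiteside$Insul == "Before"]
gas_after  <- whiteside$Gas[whiteside$Insul == "After"]

# Two-sample QQ-plot; qqplot() interpolates when sample sizes differ
qqplot(gas_before, gas_after,
       xlab = "Gas, before insulation", ylab = "Gas, after insulation")
abline(0, 1, lty = 2)  # reference line y = x

# Overlaid ECDFs on a common x-range
plot(ecdf(gas_before), main = "ECDF of Gas", xlab = "Gas",
     xlim = range(whiteside$Gas), verticals = TRUE, do.points = FALSE)
lines(ecdf(gas_after), col = "red", verticals = TRUE, do.points = FALSE)
legend("bottomright", legend = c("Before", "After"),
       col = c("black", "red"), lty = 1)
```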
This amounts to assessing whether the intercept changes after insulation while the slope is left unchanged. Which models should be compared to assess this hypothesis?
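One way to set this up, as a sketch: fit a single regression line, then a pair of parallel lines with a shifted intercept, and compare the two nested fits with an F-test. The names lm_single and lm_parallel are hypothetical, and whiteside is assumed to come from MASS.

```r
library(MASS)  # assumption: whiteside is the MASS data set

# Hypothetical names: one common line vs. two parallel lines
lm_single   <- lm(Gas ~ Temp, data = whiteside)
lm_parallel <- lm(Gas ~ Temp + Insul, data = whiteside)

# F-test of H0: no intercept shift (the slope is common to both models)
cmp <- anova(lm_single, lm_parallel)
cmp
```

Because the two models differ by a single coefficient, this F-test is equivalent to the t-test on InsulAfter in the larger model.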
Draw the diagnostic plots for this model.
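A minimal sketch with base graphics is shown below. It assumes the model in question is the parallel-lines fit Gas ~ Temp + Insul, which is a guess; the name lm_parallel is illustrative.

```r
library(MASS)  # assumption: whiteside is the MASS data set

# Assumed model: common slope, intercept shifted by insulation
lm_parallel <- lm(Gas ~ Temp + Insul, data = whiteside)

# Default plot.lm() panels: residuals vs fitted, normal QQ-plot of residuals,
# scale-location, residuals vs leverage
op <- par(mfrow = c(2, 2))
plot(lm_parallel)
par(op)  # restore the previous graphics layout
```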
Find the formula and build the model.
Investigate the formulae Gas ~ poly(Temp, 2, raw=T)*Insul, Gas ~ poly(Temp, 2)*Insul, Gas ~ (Temp +I(Temp*2))*Insul, and Gas ~ (Temp +I(Temp*2))| Insul.
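A sketch of how the first three formulae can be compared is given below; the fourth formula is left out of the sketch. Model names such as m_raw are illustrative, and whiteside is assumed to come from MASS. Raw and orthogonal polynomials span the same column space, so only the coefficients differ, not the fit. Separately, I(Temp * 2) is just twice Temp, so it adds no new column.

```r
library(MASS)  # assumption: whiteside is the MASS data set

m_raw  <- lm(Gas ~ poly(Temp, 2, raw = TRUE) * Insul, data = whiteside)
m_orth <- lm(Gas ~ poly(Temp, 2) * Insul, data = whiteside)
m_dbl  <- lm(Gas ~ (Temp + I(Temp * 2)) * Insul, data = whiteside)
m_lin  <- lm(Gas ~ Temp * Insul, data = whiteside)  # reference: straight lines

# Raw vs orthogonal polynomials: different coefficients, same fitted values
all.equal(fitted(m_raw), fitted(m_orth))

# I(Temp * 2) is collinear with Temp: lm() reports NA for the aliased terms
coef(m_dbl)

# ... so the fit collapses to the straight-line interaction model
all.equal(fitted(m_dbl), fitted(m_lin))
```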
Try it with degree-10 polynomials.
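The sketch below fits orthogonal polynomials of degree 1, 2 and 10 and tabulates a few goodness-of-fit summaries. The names degrees, fits and comparison are illustrative; whiteside is assumed to come from MASS.

```r
library(MASS)  # assumption: whiteside is the MASS data set

degrees <- c(1, 2, 10)

# One model per degree; orthogonal polynomials keep degree 10 numerically stable
fits <- lapply(degrees, function(d) {
  lm(Gas ~ poly(Temp, d) * Insul, data = whiteside)
})

comparison <- data.frame(
  degree = degrees,
  n_coef = sapply(fits, function(m) length(coef(m))),
  r2     = sapply(fits, function(m) summary(m)$r.squared),
  adj_r2 = sapply(fits, function(m) summary(m)$adj.r.squared),
  AIC    = sapply(fits, AIC)
)
comparison
```

R-squared can only increase with the degree because the models are nested. Adjusted R-squared and AIC penalize the extra coefficients and are the more informative columns here.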
Make a named list with the models constructed so far
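A minimal sketch of such a list follows. The element names and the particular set of models are assumptions made for illustration, and whiteside is assumed to come from MASS. A named list makes it easy to apply the same summary to every model.

```r
library(MASS)  # assumption: whiteside is the MASS data set

# Hypothetical selection of models and names
models <- list(
  temp_only   = lm(Gas ~ Temp, data = whiteside),
  parallel    = lm(Gas ~ Temp + Insul, data = whiteside),
  interaction = lm(Gas ~ Temp * Insul, data = whiteside),
  quadratic   = lm(Gas ~ poly(Temp, 2) * Insul, data = whiteside)
)

# Apply the same summary to each model; results keep the list names
aic_values <- sapply(models, AIC)
aic_values
lapply(models, formula)
```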
Use stepAIC() to perform stepwise exploration.
Use the function anova() to compare models constructed with formulae:
formula(lm0)
Gas ~ Insul + Temp
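A sketch of both steps follows, starting from the formula of lm0 shown above. The larger models lm_inter and lm_quad are hypothetical choices made for illustration; stepAIC() is the function from MASS, which is also assumed to provide whiteside.

```r
library(MASS)  # provides stepAIC(); assumption: whiteside is the MASS data set

lm0      <- lm(Gas ~ Insul + Temp, data = whiteside)           # formula shown above
lm_inter <- lm(Gas ~ Insul * Temp, data = whiteside)           # hypothetical
lm_quad  <- lm(Gas ~ Insul * poly(Temp, 2), data = whiteside)  # hypothetical

# Sequential F-tests between nested models, smallest model first
tab <- anova(lm0, lm_inter, lm_quad)
tab

# Stepwise search by AIC, starting from the largest model
lm_step <- stepAIC(lm_quad, direction = "both", trace = FALSE)
formula(lm_step)
```

anova() is only meaningful here because each model is nested in the next one. stepAIC() compares models by AIC, so it does not need that ordering.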