March 9, 2017

Meetup Presentations

Type I and II Errors

There are two competing hypotheses: the null and the alternative. In a hypothesis test, we make a decision about which might be true, but our choice might be incorrect.

fail to reject H0 reject H0
H0 true ✔ Type I Error
HA true Type II Error ✔



  • Type I Error: Rejecting the null hypothesis when it is true.
  • Type II Error: Failing to reject the null hypothesis when it is false.

Hypothesis Test

If we again think of a hypothesis test as a criminal trial then it makes sense to frame the verdict in terms of the null and alternative hypotheses:

H0 : Defendant is innocent
HA : Defendant is guilty

Which type of error is being committed in the following circumstances?

  • Declaring the defendant innocent when they are actually guilty
    Type 2 error
  • Declaring the defendant guilty when they are actually innocent
    Type 1 error

Which error do you think is the worse error to make?

Null Distribution

(cv <- qnorm(0.05, mean=0, sd=1, lower.tail=FALSE))
## [1] 1.644854
PlotDist(alpha=0.05, distribution='normal', alternative='greater')
abline(v=cv, col='blue')

Alternative Distribution

cord.x1 <- c(-5, seq(from = -5, to = cv, length.out = 100), cv)
cord.y1 <- c(0, dnorm(mean=cv, x=seq(from=-5, to=cv, length.out = 100)), 0)
curve(dnorm(x, mean=cv), from = -5, to = 5, n = 1000, col = "black",
        lty = 1, lwd = 2, ylab = "Density", xlab = "Values")
polygon(x = cord.x1, y = cord.y1, col = 'lightgreen')
abline(v=cv, col='blue')

pnorm(cv, mean=cv, lower.tail = FALSE)
## [1] 0.5

Another Example (mu = 2.5)

mu <- 2.5
(cv <- qnorm(0.05, mean=0, sd=1, lower.tail=FALSE))
## [1] 1.644854

Numeric Values

Type I Error

pnorm(mu, mean=0, sd=1, lower.tail=FALSE)
## [1] 0.006209665

Type II Error

pnorm(cv, mean=mu, lower.tail = TRUE)
## [1] 0.1962351

Shiny Application

Why p < 0.05?

Statistical vs. Practical Significance

  • Real differences between the point estimate and null value are easier to detect with larger samples.
  • However, very large samples will result in statistical significance even for tiny differences between the sample mean and the null value (effect size), even when the difference is not practically significant.
  • This is especially important to research: if we conduct a study, we want to focus on finding meaningful results (we want observed differences to be real, but also large enough to matter).
  • The role of a statistician is not just in the analysis of data, but also in planning and design of a study.