Bayesian Data Analysis for Speech Sciences

# Bayesian Data <br>Analysis<br> for <br>Speech Sciences
## More on priors
### Timo Roettger, Stefano Coretta and Joseph Casillas
### LabPhon workshop
### 2021/07/06 (updated: 2021-07-03)

---

![](img/informativity.png)

???

Priors can convey different amounts of prior knowledge, or information. This is called "informativity".

---

![](img/informativity-scale.png)

???

Informativity is a scale: from the least to the greatest amount of information.

Note that the scale is relative, not absolute.

---

![](img/informativity-scale-types.png)

???

Based on the scale, we can identify three types of priors:

- Uninformative priors: virtually no prior knowledge/information is added to the model. These are also called "flat" priors, because the distribution is flat (all values have the same probability). They have virtually no influence on the posterior.

- Weakly informative priors: some prior knowledge/information is added to the model, but it is vague or at least less informative than what the actual prior knowledge is.

- Strongly informative priors: most or all of the prior knowledge/information is added to the model. These have the strongest influence on the posterior.

---

# Prior informativity

???

Examples of priors with different degrees of informativity in relation to the mean f0.
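
As a rough illustration, the 95% range implied by a normal prior on mean f0 can be computed with `qnorm()`. The three priors below (their labels, means and standard deviations, in Hz) are hypothetical values chosen for this sketch; they are not taken from the slide.

```r
# Hypothetical normal priors for mean f0 (Hz); values are illustrative only.
priors <- data.frame(
  label = c("very weak", "weakly informative", "strongly informative"),
  mean  = c(200, 200, 200),
  sd    = c(1000, 100, 10)
)

# Central 95% interval implied by each prior.
priors$lower <- qnorm(0.025, mean = priors$mean, sd = priors$sd)
priors$upper <- qnorm(0.975, mean = priors$mean, sd = priors$sd)
priors$width <- priors$upper - priors$lower

print(priors)
```

The narrower the interval, the more informative the prior. Note that the widest prior in this sketch puts a lot of probability on negative f0 values, which are impossible.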

---

![](img/regularising.png)

???

Regularising priors are priors centred on 0 (i.e. with mean = 0). These priors help with model estimation and safeguard against extreme effects.
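
The pull towards 0 can be sketched with the textbook normal-normal conjugate update, which assumes the residual standard deviation is known. All numbers below (observed effect, sample size, `sigma`, prior standard deviations) and the helper `posterior_normal()` are made up for illustration.

```r
# Posterior mean and SD for a normal mean with known sigma (conjugate update).
posterior_normal <- function(y_bar, n, sigma, prior_mean, prior_sd) {
  prior_prec <- 1 / prior_sd^2   # precision of the prior
  data_prec  <- n / sigma^2      # precision of the sample mean
  post_prec  <- prior_prec + data_prec
  post_mean  <- (prior_prec * prior_mean + data_prec * y_bar) / post_prec
  c(mean = post_mean, sd = sqrt(1 / post_prec))
}

# Hypothetical data: a large observed effect from a small, noisy sample.
y_bar <- 8
n     <- 5
sigma <- 10

# Priors centred on 0 with decreasing standard deviation.
for (prior_sd in c(100, 10, 2)) {
  post <- posterior_normal(y_bar, n, sigma, prior_mean = 0, prior_sd = prior_sd)
  cat(sprintf("prior sd = %5.1f -> posterior mean = %.2f\n", prior_sd, post[["mean"]]))
}
```

With the widest prior the posterior mean stays close to the observed effect; as the prior gets narrower, the estimate is pulled further towards 0.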

---

by [Kristoffer Magnusson](https://rpsychologist.com/)

???

Here we can visualise the effect of the prior on the posterior.
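
One way to see the same effect numerically is to compute how much weight the prior carries in the posterior mean as the sample size grows. This sketch assumes a normal prior and a normal likelihood with known standard deviation; the helper `prior_weight()` and all values passed to it are hypothetical.

```r
# Share of the posterior mean contributed by the prior mean
# (normal prior, normal likelihood, known sigma).
prior_weight <- function(n, sigma, prior_sd) {
  prior_prec <- 1 / prior_sd^2
  data_prec  <- n / sigma^2
  prior_prec / (prior_prec + data_prec)
}

n <- c(1, 10, 100, 1000)
weights <- prior_weight(n, sigma = 10, prior_sd = 5)
print(data.frame(n = n, prior_weight = round(weights, 3)))
```

The weight of the prior shrinks as more data come in, so the same prior matters a lot for small samples and very little for large ones.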

---

# Recommendations

.bg-washed-blue.b--black.ba.bw2.br3.shadow-5.ph4.mt2[
Use **regularising priors**.

- Prior mean = 0.
]

.bg-washed-blue.b--black.ba.bw2.br3.shadow-5.ph4.mt2[
Use **weakly informative priors**.

- Prior standard deviation as large as it makes sense.
]

---

# Prior predictive checks

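```r
library(brms)

# Priors on the scale of the outcome (articulation rate)
my_priors <- c(
  prior(normal(0, 15), class = Intercept),
  prior(normal(0, 10), class = b, coef = attitudepol),
  # sigma is bounded at 0, so this acts as a half-Cauchy
  prior(cauchy(0, 1), class = sigma)
)

b_mod_01_pripc <- brm(
  articulation_rate ~ attitude,
  prior = my_priors,
  # draw from the priors only, ignoring the likelihood
  sample_prior = "only",
  data = polite,
  # cache the fitted model on disk
  file = here::here("assets/b_mod_01_pripc")
)
```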

???

When choosing priors, it's important to check that they are weak enough not to dominate the data, but not so weak that they allow implausible values.

You can run prior predictive checks using the same code as you would use when fitting the model, but with the argument `sample_prior = "only"`.

This code runs the model by sampling values from the priors only, ignoring the likelihood of the data.
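
To make concrete what sampling from the priors means, the sketch below draws from the priors in `my_priors` by hand in base R and simulates articulation rates from them. It is a simplified stand-in, not what `brms` does internally: it assumes `attitude` is coded 0/1 (with `pol` as 1), uses `abs(rcauchy())` for the half-Cauchy prior on `sigma`, and ignores that `brms` applies the `Intercept` prior to the intercept of the mean-centred predictors.

```r
set.seed(2021)
n_draws <- 10000

# Draws from the priors used in the model above.
intercept <- rnorm(n_draws, mean = 0, sd = 15)
b_pol     <- rnorm(n_draws, mean = 0, sd = 10)
sigma     <- abs(rcauchy(n_draws, location = 0, scale = 1))  # half-Cauchy

# One simulated articulation rate per draw, for the polite condition (attitude = 1).
mu    <- intercept + b_pol * 1
y_sim <- rnorm(n_draws, mean = mu, sd = sigma)

# How much of the prior predictive mass is impossible (negative rates)?
prop_negative <- mean(y_sim < 0)
print(prop_negative)
print(quantile(y_sim, probs = c(0.025, 0.5, 0.975)))
```

Because every prior here is symmetric around 0, about half of the simulated articulation rates come out negative, which is the kind of problem a prior predictive check is meant to reveal.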

---

# Prior predictive checks

```r
conditional_effects(b_mod_01_pripc)
```

???

You can now plot the model predictions based on the priors with `conditional_effects()`. (You can use this function with a full model too!)

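Remember that yesterday we were wondering about negative predicted values, even though articulation rate cannot be negative? They arise because we modelled articulation rate with a normal/Gaussian distribution, which allows negative values, while articulation rate does not in fact follow that distribution.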

Instead, we can use a log-normal distribution as the family of the outcome variable.
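
A quick base R comparison shows why the log-normal family avoids the problem. The parameter values below are hypothetical and only chosen to contrast the two distributions.

```r
set.seed(2021)

# Hypothetical parameters, chosen only to contrast the two families.
y_gaussian  <- rnorm(10000, mean = 2, sd = 2)
y_lognormal <- rlnorm(10000, meanlog = log(2), sdlog = 0.5)

# Proportion of negative draws under the Gaussian.
print(mean(y_gaussian < 0))

# Log-normal draws are strictly positive.
print(all(y_lognormal > 0))
```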

---

# BRM with non-Gaussian distributions

```r
# Priors are now on the log scale of articulation rate
my_priors <- c(
  prior(normal(0, 3), class = Intercept),
  prior(normal(0, 1), class = b, coef = attitudepol),
  prior(cauchy(0, 0.1), class = sigma)
)

b_mod_02_pripc <- brm(
  articulation_rate ~ attitude,
  prior = my_priors,
  sample_prior = "only",
  data = polite,
  # log-normal outcome: predictions are always positive
  family = lognormal(),
  file = here::here("assets/b_mod_02_pripc")
)
```
```

???

This is just a quick example with `family = lognormal()`. Note that priors have to be specified on the log scale and that the posterior estimates will also be on the log scale.
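
To get a feel for what priors on the log scale imply, you can exponentiate their quantiles. The sketch below does this for the `Intercept` and `attitudepol` priors from the code above; it treats the priors in isolation and, as a simplification, ignores how `brms` centres predictors when applying the `Intercept` prior.

```r
# 95% range implied by the Intercept prior normal(0, 3) on the log scale ...
intercept_log <- qnorm(c(0.025, 0.975), mean = 0, sd = 3)
# ... and back-transformed to the original scale of articulation rate.
intercept_orig <- exp(intercept_log)

# The coefficient prior normal(0, 1) is a prior on a difference in logs,
# so exponentiating gives a multiplicative ratio between attitude conditions.
ratio_range <- exp(qnorm(c(0.025, 0.975), mean = 0, sd = 1))

print(round(intercept_orig, 3))
print(round(ratio_range, 2))
```

A difference of 1 on the log scale corresponds to multiplying the articulation rate by about 2.7 on the original scale, so seemingly small numbers on the log scale can represent large effects.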

Unfortunately we don't have time to go into details (it would need a workshop of its own!), but if you have used logistic regression before, where estimates are also on a transformed (log-odds) scale, you can use that knowledge to set priors and interpret the model output.

The same concepts we covered with priors for `family = gaussian()` apply to other distributions.

---

# BRM with non-Gaussian distributions

```r
conditional_effects(b_mod_02_pripc)
```