## Wednesday, 11 December 2019

### Statistical rethinking: grid approximation for estimating a posterior

My colleague James Cotton recommended that a good way to expand my stats knowledge is to read the book Statistcal Rethinking by Richard McElreath, which so far I'm enjoying a lot!

A nice aspect of the book is that he uses many examples in R.

Grid approximation for estimating a posterior distribution
In chapter 2, he shows how to estimate a posterior distribution using a 'grid approximation' approach. I've wrapped up his lines of code into a little function that lets you try out different number of grid points.

In this case, we are interested in calculating the posterior distribution for p, the probability of success for a binomial variable.

We can specifiy different prior distributions for p.

Let's assume that we have carried out 9 Bernoulli trials and observed 6 successes so far.

Let's try a uniform prior:

plot_posterior_using_uniform_prior <- function(num_points_in_grid)
{
`    p_grid <- seq(from = 0, to=1, length.out=num_points_in_grid)`
`    prior <- rep(1,num_points_in_grid)`
`    likelihood <- dbinom(6, size=9, prob=p_grid)`
`    ustd.posterior <- likelihood * prior`
`    posterior <- ustd.posterior/sum(ustd.posterior)`
`    plot(p_grid, posterior, type="b", xlab="probability of success", ylab="posterior probability for p")`
`}`
```
```
`Do a plot with a grid with 30 values for p:`
`plot_posterior_using_uniform_prior(num_points_in_grid=30) `

```
```
```Let's try 10 values for p:
plot_posterior_using_uniform_prior(num_points_in_grid=10)```
`This doesn't give us such a good estimate of the posterior distribution for p:`

```
```
`What if we had a different prior, that was a step function, assigning 0 probability to all values of p less than 0.5:`
```
```

plot_posterior_using_step_prior <- function(num_points_in_grid)
{
p_grid <- seq(from = 0, to=1, length.out=num_points_in_grid)
prior <- ifelse ( p_grid < 0.5, 0, 1)
`     likelihood <- dbinom(6, size=9, prob=p_grid) `
`     ustd.posterior <- likelihood * prior `
`     posterior <- ustd.posterior/sum(ustd.posterior) `
`     plot(p_grid, posterior, type="b", xlab="probability of success", ylab="posterior probability for p")`
}

`plot_posterior_using_step_prior(num_points_in_grid=30)`
` `
` `
Quadratic approximation for estimating a posterior distribution
Chapter 2 of Statistical Rethinking also explains that the grid approximation approach doesn't scale well and is slow, and that a faster, more scaleable approach is quadratic approximation. McElreath shows how it can be used to estimate the posterior distribution for the same parameter p, as above: (see his book for how to install the 'rethinking' R package):

`library(rethinking)`
```globe.qa <- quap(alist(W ~ dbinom(W+L,p), p ~ dunif(0,1)), data=list(W=6,L=3)
+ )```
`precis(globe.qa)`
```  mean   sd 5.5% 94.5%
p 0.67 0.16 0.42  0.92```
` `
`This tells us that, using a uniform prior and assuming the posterior for p is Gaussian (the key assumption of the quadratic approximation approach), its maximum (peak) is at 0.67 and its standard deviation is 0.16.`
` `
`This is similar to the peak of the posterior distribution for p that we estimated using the grid approximation approach above (when using a large number of grid points such as 30).`
```
```
```
```

```
```