# The Delta-Method and Autograd

Posted by **Cameron Davidson-Pilon** on

One of the reasons I’m really excited about autograd is because it enables me to be able to transform my abstract parameters into business-logic. Let me explain with an example. Suppose I am modeling customer churn, and I have fitted a Weibull survival model using maximum likelihood estimation. I have two parameter estimates: lambda-hat and rho-hat. I also have their covariance matrix, which tells me how much uncertainty is present in the estimates (in lifelines, this is under the `variance_matrix_`

property. From this, I can plot the survival curve, which is fine, but what I really want is a measure of lifetime value. Customers give us $10 each time period for the first 3 time periods, and then $30 a month afterwards. For a single user, the average LTV (up to `timeline`

) calculation might look like:

```
# create a Weibull model with fake data
wf = WeibullFitter().fit(np.arange(1, 100))
from autograd import numpy as np
def LTV(params, timeline):
lambda_, rho_ = params
sf = np.exp(-(timeline/lambda_) ** rho_)
clv = 10 * (timeline <= 3) * sf + 30 * (timeline > 3) * sf
return clv.sum()
timeline = np.arange(10)
params = np.array([wf.lambda_, wf.rho_])
LTV(params, timeline)
# 214.76
```

This gives me a point estimate. But I also want the variance, since there is uncertainty in lambda-hat and rho-hat. There is a technique called the delta-method that allows me to transform variance from one domain to another. But to use that, I need the gradient of LTV w.r.t to lambda and rho. For this problem, I could probably write out the gradient, but it would be messy and error-prone. I can use autograd instead to do this.

```
from autograd import jacobian
gradient_LTV = grad(LTV)
gradient_LTV(params, timeline)
# array([ 0.15527043, 10.78900217])
```

The output of the above is the gradient at the `params`

value. Very easy. Now, if I want the variance of the LTV, the delta methods tells me to multiple the above by the covariance matrix and the gradient’s transpose (check: this should return a scalar and not a vector/matrix):

```
var = gradient_LTV(params, timeline).dot(wf.variance_matrix_).dot(gradient_LTV(params, timeline))
# 3.127
```

So, my best estimate of LTV (at timeline=10) is 214.76 (3.127). From this, we can build confidence intervals, etc.

- Tags: lifelines, survival analysis

## Latest Data Science screencasts available

Comments