
Napkin Folding — lifelines

Non-parametric survival function prediction

Posted by Cameron Davidson-Pilon at

As I was developing lifelines, I kept feeling that I was gradually moving the library towards prediction tasks. lifelines is great for regression models and fitting survival distributions, but as I added more and more flexible parametric models, I realized that I really wanted a model that would predict the survival function, and I didn't care how. This led me to the idea of using a neural net with \(n\) outputs, one output for each parameter...


SaaS churn and piecewise regression survival models

Posted by Cameron Davidson-Pilon at

A software-as-a-service (SaaS) company has a typical customer churn pattern. During periods of no billing, churn is relatively low compared to periods of billing (typically every 30 or 365 days). This results in a distinct survival function for customers. See below:

    from lifelines import KaplanMeierFitter

    kmf = KaplanMeierFitter().fit(df['T'], df['E'])
    kmf.plot(figsize=(11,6));

To borrow a term from finance, we clearly have different regimes that a customer goes through: periods of low churn and periods of high churn, both of which are predictable. This predictability and...
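To make the "regimes" idea concrete, here is a minimal sketch of a survival function built from a piecewise-constant hazard: a low churn rate between billing dates and a high one around them. The function name `piecewise_survival`, the breakpoints, and the hazard rates are all hypothetical values chosen for illustration. They are not taken from the post or from the lifelines API.

```python
import math


def piecewise_survival(t, breakpoints, hazards):
    """Survival S(t) = exp(-H(t)) for a piecewise-constant hazard.

    breakpoints: increasing interval ends, e.g. [28, 32]
    hazards: one rate per interval, so len(breakpoints) + 1 values
    """
    if len(hazards) != len(breakpoints) + 1:
        raise ValueError("need exactly one hazard per interval")
    cumulative = 0.0
    start = 0.0
    # The last interval runs from the final breakpoint to infinity.
    for end, rate in zip(list(breakpoints) + [math.inf], hazards):
        if t <= start:
            break
        # Accumulate hazard over the part of this interval that lies before t.
        cumulative += rate * (min(t, end) - start)
        start = end
    return math.exp(-cumulative)


# Hypothetical regimes (rates per day): low churn before billing,
# high churn in the billing window (days 28 to 32), low churn afterwards.
breakpoints = [28, 32]
hazards = [0.001, 0.02, 0.001]

for day in (0, 28, 32, 40):
    print(day, round(piecewise_survival(day, breakpoints, hazards), 4))
```

With rates like these, the curve is nearly flat between billing dates and drops sharply inside the billing window, which is the step-like shape described above.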


Counting and interval censoring analysis

Posted by Cameron Davidson-Pilon at

Let’s say you have an initial population of (micro-)organisms, and you are curious about their survival rates. A common summary statistic of their survival is the half-life. How might you collect data to measure their survival? Since we are dealing with micro-organisms, we can’t track individual lifetimes. What we might do is periodically count the number of organisms still alive. Suppose our dataset looks like:

    T = [0, 2, 4, 7]  # in hours
    N = [1000, 914, 568,...
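One way to read periodic counts is as interval-censored data: organisms that disappear between two counts died somewhere in that interval, and those still alive at the last count are right-censored. The sketch below shows that conversion. The helper names `counts_to_intervals` and `half_life_bracket` and the counts themselves are hypothetical; they are illustrative values, not the post's dataset.

```python
import math


def counts_to_intervals(times, counts):
    """Turn periodic population counts into interval-censored rows.

    Returns (lower, upper, n) tuples: n deaths occurred in (lower, upper].
    Survivors at the last count get upper = inf (right-censored).
    """
    rows = []
    for i in range(1, len(times)):
        deaths = counts[i - 1] - counts[i]
        if deaths < 0:
            raise ValueError("counts must be non-increasing")
        if deaths > 0:
            rows.append((times[i - 1], times[i], deaths))
    if counts[-1] > 0:
        rows.append((times[-1], math.inf, counts[-1]))
    return rows


def half_life_bracket(times, counts):
    """Return the observation interval in which the population first halves."""
    half = counts[0] / 2
    for i in range(1, len(times)):
        if counts[i] <= half:
            return (times[i - 1], times[i])
    return None  # the population never dropped to half during observation


# Hypothetical counts, for illustration only.
times = [0, 2, 4, 7]  # in hours
counts = [1000, 900, 600, 250]

rows = counts_to_intervals(times, counts)
bracket = half_life_bracket(times, counts)
print(rows)
print(bracket)
```

The bracket only says which interval contains the half-life. Pinning it down further requires fitting a survival model to the interval-censored rows.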


The Delta-Method and Autograd

Posted by Cameron Davidson-Pilon at

One of the reasons I’m really excited about autograd is that it lets me transform my abstract parameters into business logic. Let me explain with an example. Suppose I am modeling customer churn, and I have fitted a Weibull survival model using maximum likelihood estimation. I have two parameter estimates: lambda-hat and rho-hat. I also have their covariance matrix, which tells me how much uncertainty is present in the estimates (in lifelines, this is under the variance_matrix_...
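As a rough illustration of the delta method in this setting, the sketch below propagates parameter uncertainty into the variance of a business quantity, here the median lifetime. It assumes the Weibull parameterization \(S(t) = \exp(-(t/\lambda)^\rho)\), so the median is \(\lambda (\ln 2)^{1/\rho}\). Central finite differences stand in for autograd so that the example needs no extra dependencies. The parameter estimates and covariance matrix are hypothetical numbers, not output from a fitted model.

```python
import math


def weibull_median(lambda_, rho):
    # Median of S(t) = exp(-(t / lambda_) ** rho); assumed parameterization.
    return lambda_ * math.log(2) ** (1.0 / rho)


def numerical_gradient(f, params, eps=1e-6):
    """Central finite-difference gradient (a stand-in for autograd here)."""
    grad = []
    for i in range(len(params)):
        up, down = list(params), list(params)
        up[i] += eps
        down[i] -= eps
        grad.append((f(*up) - f(*down)) / (2 * eps))
    return grad


def delta_method_variance(f, params, cov):
    """Approximate Var[f(theta_hat)] as grad^T @ cov @ grad."""
    g = numerical_gradient(f, params)
    k = len(g)
    return sum(g[i] * cov[i][j] * g[j] for i in range(k) for j in range(k))


# Hypothetical estimates (lambda-hat, rho-hat) and their covariance matrix.
params = [10.0, 1.5]
cov = [[0.25, 0.01],
       [0.01, 0.04]]

median = weibull_median(*params)
se = math.sqrt(delta_method_variance(weibull_median, params, cov))
print(f"median = {median:.3f}, standard error = {se:.3f}")
```

Swapping `numerical_gradient` for an automatic-differentiation gradient would leave the rest of the calculation unchanged, since the delta method only needs the gradient of the transformation at the estimates.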


Evolution of lifelines over the past few months

Posted by Cameron Davidson-Pilon at

TLDR: upgrade lifelines for lots of improvements

    pip install lifelines==0.22.1

During my time off, I’ve spent a lot of time improving my side projects so I’m at least kinda proud of them. I think lifelines, my survival analysis library, is in that spot. I’m actually kinda proud of it now. A lot has changed in lifelines in the past few months, and in this post I want to mention some of the biggest additions and the stories behind them. Performance...
