# Napkin Folding — lifelines

## Highlights from lifelines v0.25.0

Posted by Cameron Davidson-Pilon

Today, version 0.25.0 of lifelines was released. I'm very excited about some changes in this version, and want to highlight a few of them. Be sure to upgrade with `pip install lifelines==0.25.0`. Formulas everywhere! Formulas, which should really be called Wilkinson-style notation but which everyone just calls formulas, are a lightweight grammar for describing additive relationships. If you have used R, you'll likely be familiar with formulas. They are less common in Python, so here's an example: Writing age +...
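As a rough illustration of what the grammar means (a toy, hypothetical parser; lifelines itself delegates to a real formula library that also handles interactions, transforms, and categorical encoding), a formula like "age + salary" describes the columns of a design matrix with an implicit intercept:

```python
# Toy sketch of Wilkinson-style notation: "age + salary" names the
# columns of a design matrix, with an intercept column included by
# default. This is for illustration only, not lifelines' parser.

def design_matrix(formula, rows):
    terms = [t.strip() for t in formula.split("+")]
    # the leading 1.0 is the implicit intercept every formula carries
    return [[1.0] + [float(r[t]) for t in terms] for r in rows]

rows = [{"age": 30, "salary": 50.0}, {"age": 41, "salary": 62.5}]
X = design_matrix("age + salary", rows)
# X == [[1.0, 30.0, 50.0], [1.0, 41.0, 62.5]]
```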

Read more →

## An L½ penalty in Cox Regression

Posted by Cameron Davidson-Pilon

Following up on a previous blog post where we explored how to implement an $$L_1$$ and elastic-net penalty to induce sparsity, a paper by Xu Z. B., Zhang H., Wang Y., et al. explores what an $$L_{1/2}$$ penalty is and how to implement it. First, though: we are familiar with an $$L_1$$ penalty, but what is an $$L_0$$ penalty? If you work out the math, it is a penalty that counts the number of non-zero coefficients, independent of the magnitude of the coefficients: ll^*(\theta, x) =...
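As a sketch of the penalty terms themselves (hypothetical helper functions, not code from the paper or from lifelines), the $$L_0$$ penalty counts non-zero coefficients, while the $$L_{1/2}$$ penalty sums the square roots of their magnitudes, sitting between $$L_0$$'s counting and $$L_1$$'s shrinkage:

```python
# Hypothetical helpers contrasting the three penalties:

def l0_penalty(coefs, eps=1e-10):
    # counts non-zero coefficients, independent of magnitude
    return sum(1 for b in coefs if abs(b) > eps)

def l1_penalty(coefs):
    # sums magnitudes, so larger coefficients are punished more
    return sum(abs(b) for b in coefs)

def lhalf_penalty(coefs):
    # sums sqrt of magnitudes: between L0 counting and L1 shrinkage
    return sum(abs(b) ** 0.5 for b in coefs)

coefs = [0.0, 4.0, 0.25]
# l0_penalty(coefs) == 2
# l1_penalty(coefs) == 4.25
# lhalf_penalty(coefs) == 2.5
```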

Read more →

## An accelerated lifetime spline model

Posted by Cameron Davidson-Pilon

A paper came out recently with a novel accelerated lifetime (AFT) model based on cubic splines. This should pique your interest for a few reasons: 1. It helps dethrone the Proportional Hazards (PH) model as the default survival model. People like the PH model because it doesn't make any distributional assumptions. However, like a Trojan horse, it carries very strong implicit assumptions, which are often too restrictive. Suffice it to say, I am not a big proponent of the PH model. ...

Read more →

## L₁ Penalty in Cox Regression

Posted by Cameron Davidson-Pilon

In the 2000s, L1 penalties were all the rage in statistics and machine learning. Since they induce sparsity in the fitted parameters, they were used as a variable-selection method. Today, with some advanced models having tens of billions of parameters, sparsity isn't as useful, and the L1 penalty has fallen out of fashion. However, most teams aren't using billion-parameter models, and smart data scientists start with simple models. Below is how we implemented an L1 penalty in the...
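One standard mechanism by which an L1 penalty produces exact zeros is the soft-thresholding (proximal) operator used in coordinate-descent solvers. A minimal sketch of that operator, not lifelines' internal implementation:

```python
# Soft-thresholding: the proximal operator of the L1 penalty. Solvers
# that apply it shrink each coefficient toward zero and snap small
# ones to exactly zero, which is where the sparsity comes from.
# (Generic illustration, not lifelines' internal code.)

def soft_threshold(beta, penalty):
    if beta > penalty:
        return beta - penalty
    if beta < -penalty:
        return beta + penalty
    return 0.0

# soft_threshold(3.0, 1.0) == 2.0   (shrunk toward zero)
# soft_threshold(0.4, 1.0) == 0.0   (exactly zero: variable dropped)
```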

Read more →

## Non-parametric survival function prediction

Posted by Cameron Davidson-Pilon

As I was developing lifelines, I kept feeling that I was gradually moving the library towards prediction tasks. lifelines is great for regression models and fitting survival distributions, but as I was adding more and more flexible parametric models, I realized that I really wanted a model that would predict the survival function, and I didn't care how. This led me to the idea of using a neural net with $$n$$ outputs, one output for each parameter...
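As a minimal sketch of that idea, assuming a Weibull output head (my assumption for illustration; the post may use a different parametric family), two unconstrained network outputs can be mapped to positive distribution parameters and then to a full survival curve:

```python
import math

# Sketch: a network's final layer emits unconstrained numbers; softplus
# maps them to the positive parameters of a survival distribution.
# The Weibull head here is an assumed choice for illustration.

def softplus(x):
    # maps any real number to a strictly positive parameter
    return math.log1p(math.exp(x))

def weibull_survival(t, raw_scale, raw_shape):
    # two raw network outputs become the Weibull scale and shape,
    # then S(t) = exp(-(t / lambda)^rho)
    lam, rho = softplus(raw_scale), softplus(raw_shape)
    return math.exp(-((t / lam) ** rho))

# weibull_survival(0.0, 0.3, -1.2) == 1.0 (everyone alive at t = 0)
```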

Read more →

## SaaS churn and piecewise regression survival models

Posted by Cameron Davidson-Pilon

A software-as-a-service (SaaS) company has a typical customer churn pattern. During periods of no billing, churn is relatively low compared to periods of billing (typically every 30 or 365 days). This results in a distinctive survival function for customers. See below:

```python
kmf = KaplanMeierFitter().fit(df['T'], df['E'])
kmf.plot(figsize=(11, 6))
```

To borrow a term from finance, we clearly have different regimes that a customer goes through: periods of low churn and periods of high churn, both of which are predictable. This predictability and...
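The regimes idea can be sketched with a piecewise-constant hazard: one churn rate per regime, separated by billing breakpoints, with the survival function given by the exponentiated negative cumulative hazard (lifelines ships piecewise models such as `PiecewiseExponentialFitter`; the self-contained toy below uses made-up breakpoints and rates):

```python
import math

# Toy piecewise-constant hazard model. len(hazards) == len(breakpoints) + 1,
# and S(t) = exp(-cumulative hazard up to t). Numbers below are made up.

def piecewise_survival(t, breakpoints, hazards):
    cum, prev = 0.0, 0.0
    for bp, h in zip(breakpoints, hazards):
        cum += h * (min(t, bp) - prev)  # hazard accrued in this regime
        if t <= bp:
            return math.exp(-cum)
        prev = bp
    cum += hazards[-1] * (t - prev)     # final, open-ended regime
    return math.exp(-cum)

# low churn before the 30-day bill, a spike afterwards:
S_29 = piecewise_survival(29.0, [30.0], [0.001, 0.05])
S_31 = piecewise_survival(31.0, [30.0], [0.001, 0.05])
```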

Read more →

## Counting and interval censoring analysis

Posted by Cameron Davidson-Pilon

Let’s say you have an initial population of (micro-)organisms, and you are curious about their survival rates. A common summary statistic of their survival is the half-life. How might you collect data to measure their survival? Since we are dealing with micro-organisms, we can’t track individual lifetimes. What we might do is periodically count the number of organisms still alive. Suppose our dataset looks like:

```python
T = [0, 2, 4, 7]   # in hours
N = [1000, 914, 568,...
```
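A naive first pass, which ignores the interval censoring this kind of count data actually exhibits (handling that properly is the post's point), is to turn the counts into survival fractions and linearly interpolate for the half-life. The numbers below are hypothetical, not the post's data:

```python
# Hypothetical counts (NOT the post's data), chosen so the half-life
# is reached inside the observation window:
T_obs = [0, 2, 4, 7]           # hours
N_obs = [1000, 900, 550, 300]  # organisms still alive at each count

S = [n / N_obs[0] for n in N_obs]  # naive survival fractions

def half_life(times, surv):
    # crude estimate: linearly interpolate where S(t) crosses 0.5;
    # ignores the interval censoring a proper analysis would model
    for i in range(len(times) - 1):
        s0, s1 = surv[i], surv[i + 1]
        if s1 <= 0.5 <= s0:
            return times[i] + (s0 - 0.5) * (times[i + 1] - times[i]) / (s0 - s1)
    return None  # never dropped below 50% in the window

# half_life(T_obs, S) ≈ 4.6 hours
```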

Read more →

## The Delta-Method and Autograd

Posted by Cameron Davidson-Pilon

One of the reasons I’m really excited about autograd is that it enables me to transform my abstract parameters into business logic. Let me explain with an example. Suppose I am modeling customer churn, and I have fitted a Weibull survival model using maximum likelihood estimation. I have two parameter estimates: $$\hat{\lambda}$$ and $$\hat{\rho}$$. I also have their covariance matrix, which tells me how much uncertainty is present in the estimates (in lifelines, this is under the variance_matrix_...
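The delta method referenced here approximates $$\text{Var}(g(\hat\theta)) \approx \nabla g^\top \Sigma \nabla g$$, for example to turn $$(\hat\lambda, \hat\rho)$$ and their covariance into a standard error for a business quantity like median survival time. A sketch with made-up numbers, using finite differences in place of autograd's exact gradients:

```python
import math

# Delta-method sketch: Var(g(theta_hat)) ~= grad(g)^T @ Cov @ grad(g).
# All numbers below are hypothetical; the post uses autograd for exact
# gradients, while this sketch uses central finite differences.

def weibull_median(lam, rho):
    # median survival time of a Weibull model: lam * ln(2)^(1/rho)
    return lam * math.log(2) ** (1.0 / rho)

def delta_method_var(g, theta, cov, h=1e-6):
    grad = []
    for i in range(len(theta)):
        up, dn = list(theta), list(theta)
        up[i] += h
        dn[i] -= h
        grad.append((g(*up) - g(*dn)) / (2 * h))
    return sum(grad[i] * cov[i][j] * grad[j]
               for i in range(len(theta))
               for j in range(len(theta)))

theta_hat = [100.0, 2.0]          # hypothetical lambda-hat, rho-hat
cov = [[4.0, 0.1], [0.1, 0.02]]   # hypothetical covariance matrix
var = delta_method_var(weibull_median, theta_hat, cov)
se = math.sqrt(var)               # standard error of the median estimate
```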

Read more →

## Evolution of lifelines over the past few months

Posted by Cameron Davidson-Pilon

TL;DR: upgrade lifelines for lots of improvements with `pip install -U lifelines`. During my time off, I’ve spent a lot of time improving my side projects so that I’m at least kinda proud of them. I think lifelines, my survival analysis library, is in that spot now. I’m actually kinda proud of it. A lot has changed in lifelines in the past few months, and in this post I want to mention some of the biggest additions and the stories behind them....

Read more →