Feature Space in Machine Learning

Posted by Cameron Davidson-Pilon

Feature space refers to the \(n\)-dimensional space in which your variables live (not including the target variable, if one is present). The term appears often in ML literature because a common task in ML is feature extraction, so we view all variables as features. For example, consider a data set with:

  • \(Y \equiv\) thickness of car tires after some testing period
  • \(X_1 \equiv\) distance travelled in test
  • \(X_2 \equiv\) time duration of test
  • \(X_3 \equiv\) amount of chemical \(C\) in tires

The feature space is \(\mathbf{R}^3\), or more accurately, the positive octant of \(\mathbf{R}^3\), since all the \(X\) variables can only take positive values. Domain knowledge about tires might suggest that the *speed* at which the vehicle was moving is important, so we generate another variable, \(X_4\) (this is the feature extraction part):

  • \(X_4 = \frac{X_1}{X_2} \equiv\) the average speed of the vehicle during testing.

This extends our old feature space into a new one, the positive part of \(\mathbf{R}^4\).
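
As a rough illustration of this extraction step, the sketch below appends the speed feature to a handful of records. The numbers are made up for illustration, and the names `rows` and `extended` are hypothetical, not part of any real data set.

```python
# Hypothetical test records: (distance, duration, amount of chemical C).
# The values are invented purely to illustrate the idea.
rows = [
    (120.0, 2.0, 0.5),
    (300.0, 4.0, 0.8),
    (90.0, 3.0, 0.2),
]

# Append the derived feature X4 = X1 / X2 to every record,
# turning each 3-dimensional point into a 4-dimensional one.
extended = [(x1, x2, x3, x1 / x2) for (x1, x2, x3) in rows]

for row in extended:
    print(row)
```

Each record now has four coordinates, so the data lives in the extended feature space. The target \(Y\) is left out on purpose, since it is not part of the feature space.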


Furthermore, the mapping in our example is a function, \(\phi\), from \(\mathbf{R}^3\) to \(\mathbf{R}^4\), which sends each point of the old feature space to its counterpart in the new one:

$$\phi(x_1,x_2,x_3) = (x_1, x_2, x_3, \frac{x_1}{x_2} )$$
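
The mapping can also be written as a small function. This is only a sketch: the name `phi` mirrors the symbol above, and the check on `x2` is an added assumption reflecting that the ratio is only defined when the test duration is positive.

```python
def phi(x1, x2, x3):
    """Map a point (x1, x2, x3) to (x1, x2, x3, x1 / x2)."""
    # Assumption: x2 is a test duration, so it must be strictly positive
    # for the ratio x1 / x2 to make sense.
    if x2 <= 0:
        raise ValueError("x2 (test duration) must be positive")
    return (x1, x2, x3, x1 / x2)


# Hypothetical point in the original 3-dimensional feature space.
point = (120.0, 2.0, 0.5)
image = phi(*point)
print(image)
```

The first three coordinates pass through unchanged, so \(\phi\) adds information derived from the existing features without discarding any of them.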
