One great way to understand how a classifier works is to visualize its decision boundary. In scikit-learn, there are several nice examples of visualizing decision boundaries (plot_iris, plot_voting_decision_region); however, they usually require quite a few lines of code and are not directly reusable. So I wrote the following function, hoping it can serve as a general way to visualize the 2D decision boundary of any classification model. (See Github; the notebook is Here.)

(Note: a few updates since I first published this. In the current version: 1. the API is much simpler; 2. dimension reduction (PCA) has been added to handle higher-dimensional cases; 3. the function is wrapped into a package (pylib).)

The usage of this function is quite simple, here it is:

In the random forest case, we see that the decision boundary is not as continuous as in the previous two models. This is because the decision boundary is computed from the model's predictions: if the predicted class changes at a grid cell, that cell is marked as lying on the decision boundary. As a result, if the model's predictions are highly volatile in some region, that region will be displayed as if a decision boundary ran through it.
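
The grid-based idea described above can be illustrated with a minimal sketch. This is not the function from the package; `boundary_mask` and the stand-in classifier `NearestCentroid2D` are hypothetical names made up for this example, and the sketch assumes only that the model exposes a `predict` method that accepts an array of 2D points (as scikit-learn classifiers do). It predicts a class for every point of a regular grid and flags the cells where the predicted class differs from a neighbouring cell.

```python
import numpy as np


class NearestCentroid2D:
    """Tiny stand-in classifier (hypothetical); any object with .predict works."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        # Distance from every point to every centroid, then pick the closest one.
        d = np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)
        return self.classes_[d.argmin(axis=1)]


def boundary_mask(model, X, resolution=100, margin=0.5):
    """Predict on a regular 2D grid and flag cells where the class changes."""
    x_min, x_max = X[:, 0].min() - margin, X[:, 0].max() + margin
    y_min, y_max = X[:, 1].min() - margin, X[:, 1].max() + margin
    xx, yy = np.meshgrid(
        np.linspace(x_min, x_max, resolution),
        np.linspace(y_min, y_max, resolution),
    )
    grid = np.column_stack([xx.ravel(), yy.ravel()])
    labels = model.predict(grid).reshape(xx.shape)

    # A cell is on the boundary if its label differs from the next cell
    # along either grid axis.
    mask = np.zeros(labels.shape, dtype=bool)
    mask[:, :-1] |= labels[:, :-1] != labels[:, 1:]
    mask[:-1, :] |= labels[:-1, :] != labels[1:, :]
    return xx, yy, labels, mask


# Two synthetic, well-separated clusters (illustrative data only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (50, 2)), rng.normal(2, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

model = NearestCentroid2D().fit(X, y)
xx, yy, labels, mask = boundary_mask(model, X)
print(mask.sum(), "of", mask.size, "grid cells lie on the boundary")
```

With a smooth model like this one, the flagged cells form a single thin line. A model whose predictions flip back and forth within a small region would flag many scattered cells there, which is the effect seen in the random forest case. To draw the result, `xx`, `yy` and `labels` could be passed to matplotlib's `contourf`.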
Happy Thanksgiving!
Author: Data Magician
Archives: October 2017