Magic Analytics

Aries Research Note

RGB Representation for classes

11/26/2016

Initially, the plot_decision_boundary code used a hard-coded color scheme to represent each class. If the number of input classes exceeds the number of hard-coded colors, an error is raised. This may not be the best approach, so I started to think about how to define an RGB representation for classes in this specific use case. Here are the facts and requirements:
Fact:
F-1. Every data point has a probability associated with each class, and these probabilities sum to 1.

Requirement:
R-1. If a point is 100% in class A, it should have a unique color that cannot be constructed as a blend of the other classes' colors.
R-2. Each color represents a unique combination of class probabilities.
     
For R-2, after some thinking, I found that it is impossible to avoid duplication in general. The probabilities of n classes span an (n - 1)-dimensional space, while RGB has only 3 dimensions, so once the number of classes is larger than 4 (3 + 1), different compositions must map to the same color. So I ignore this requirement.
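The dimensionality argument can be checked numerically. The sketch below is only an illustration, not code from plot_decision_boundary: the function name colliding_mixtures and the five example colors are made up here, and it assumes a point's color is the probability-weighted average of the class colors. For more than 4 classes it finds two different probability vectors that blend to exactly the same RGB value.

```python
import numpy as np

def colliding_mixtures(class_colors):
    """Return two different probability vectors that blend to the same color.

    Hypothetical helper for illustration; assumes a point's color is the
    probability-weighted average of the class colors.
    """
    class_colors = np.asarray(class_colors, dtype=float)   # shape (n_classes, 3)
    n_classes = class_colors.shape[0]
    if n_classes <= 4:
        raise ValueError("a collision is only guaranteed for more than 4 classes")
    # Rows are the R, G, B channels plus a row of ones (the sum-to-1 constraint).
    A = np.vstack([class_colors.T, np.ones(n_classes)])    # shape (4, n_classes)
    # A has rank <= 4 < n_classes, so the last right-singular vector is in its null space.
    direction = np.linalg.svd(A)[2][-1]
    direction = direction / np.abs(direction).max()
    p = np.full(n_classes, 1.0 / n_classes)                # uniform mixture
    q = p + (0.5 / n_classes) * direction                  # small step keeps q >= 0
    return p, q

# Five made-up class colors, one more than RGB can keep apart.
five_colors = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1]], dtype=float)
p, q = colliding_mixtures(five_colors)
print("p:", p, "color:", p @ five_colors)
print("q:", q, "color:", q @ five_colors)
```

Moving along the null-space direction changes the class composition without changing the sum of the probabilities or the blended color, which is why R-2 cannot hold in general.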

For R-1, the condition is satisfied as long as the RGB values of all classes are in strictly convex position within the RGB 3D space, that is, every class color is a corner (vertex) of the shape they span. A simple example: in an RG 2D space, if we use (0,0), (1,0), (0,1), (1,1) as the class colors, no class can be represented as a composition of the other classes (with non-negative coefficients that sum to 1). However, if we add a new class like (0.5, 1.0), it can be written as:
    (0.5, 1.0) = 50% * (1, 1) + 50% * (0, 1)

This is because, with the extra point, the class colors no longer form a strictly convex shape: (0.5, 1.0) lies on the edge between (0, 1) and (1, 1).
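The RG example above can be reproduced in a few lines. This is a minimal sketch that assumes a point's color is the probability-weighted average of the class colors; the blend helper and the variable names are made up for illustration and are not taken from plot_decision_boundary.

```python
def blend(probs, class_colors):
    """Probability-weighted average of class colors (hypothetical helper)."""
    n_channels = len(class_colors[0])
    return tuple(
        sum(p * color[k] for p, color in zip(probs, class_colors))
        for k in range(n_channels)
    )

# The four corner classes from the example, plus the extra class (0.5, 1.0).
rg_colors = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 1.0)]

pure_extra = blend([0, 0, 0, 0, 1], rg_colors)       # 100% in the extra class
half_half = blend([0, 0, 0.5, 0.5, 0], rg_colors)    # 50% (0, 1) + 50% (1, 1)

print(pure_extra, half_half)
```

Both calls produce the same RG value, so a point that is 100% in the extra class is indistinguishable from a 50/50 mix of two other classes, which is exactly what R-1 forbids.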

An easier solution is to define a sphere inside the RGB 3D box and take colors on the sphere's surface as the class colors. Since no point on a sphere is a blend of other points on it, R-1 should be satisfied.
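One possible way to pick class colors on such a sphere is sketched below. The post does not say how points on the sphere are chosen, so the Fibonacci (golden-angle) spacing, the function name sphere_class_colors, and the default center and radius (the largest sphere that fits inside the unit RGB box) are all assumptions made for illustration.

```python
import math

def sphere_class_colors(n_classes, center=(0.5, 0.5, 0.5), radius=0.5):
    """Spread n_classes colors over a sphere inside the unit RGB box.

    Hypothetical helper: uses a Fibonacci (golden-angle) lattice, which is
    an assumption here, not necessarily what plot_decision_boundary does.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    colors = []
    for i in range(n_classes):
        z = 1.0 - 2.0 * (i + 0.5) / n_classes      # evenly spaced heights in (-1, 1)
        ring = math.sqrt(1.0 - z * z)              # radius of the circle at height z
        theta = golden_angle * i                   # rotate by the golden angle each step
        unit = (ring * math.cos(theta), ring * math.sin(theta), z)
        colors.append(tuple(c + radius * u for c, u in zip(center, unit)))
    return colors

colors_20 = sphere_class_colors(20)
for rgb in colors_20[:3]:
    print(tuple(round(v, 3) for v in rgb))
```

Each returned tuple has components between 0 and 1, so it can be passed directly as a color to Matplotlib. Because every color sits on the sphere's surface, none of them can be written as a blend of the others.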

However, in practice, I found that using just the sphere gives less contrast between classes than ignoring R-1. Here is a comparison of 20 colors and 100 colors with the sphere surface representation vs. the RGB 3D surface representation. Currently, I set it as the default in the plot_decision_boundary function.
[Pictures: 20 and 100 class colors, sphere surface representation vs. RGB 3D surface representation]

    Author

    Data Magician
