Magic Analytics

Aries Research Note

Pandas: data cleaning functions

10/1/2016

Data cleaning is a key part of working with data in Pandas. It usually involves operations such as:
- dropping columns
- filling or dropping missing data
- dropping duplicate rows
- replacing data values

Doing this in Pandas is quite straightforward (these notes are based on Pandas 0.18.1).

    
Here are the functions used:
drop, dropna, fillna, drop_duplicates, replace
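
As a rough sketch of how these five functions fit together, here is a small example on a made-up DataFrame. The column names and values are hypothetical (they are not the dataset used in this post), and the order of the steps is just one reasonable choice.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented for illustration only
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Bob", "Cat", "Dan"],
        "age": [28.0, 35.0, 35.0, np.nan, 41.0],
        "city": ["NYC", "LA", "LA", "SF", None],
        "notes": ["a", "b", "b", "c", "d"],
    },
    columns=["name", "age", "city", "notes"],
)

# drop: remove a column that is not needed (axis=1 means columns)
cleaned = df.drop("notes", axis=1)

# drop_duplicates: remove repeated rows, keeping the first occurrence
cleaned = cleaned.drop_duplicates()

# dropna: drop rows where "age" is missing
cleaned = cleaned.dropna(subset=["age"])

# fillna: fill missing "city" values with a placeholder
cleaned = cleaned.fillna({"city": "Unknown"})

# replace: swap one data value for another
cleaned = cleaned.replace("LA", "Los Angeles")

print(cleaned)
```

Each call returns a new DataFrame rather than modifying the original, so df itself is left untouched.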

Simple, but powerful  :)
1 Comment
jeevan
2/24/2020 11:24:25 pm

where is dataset ??


    Author

    Data Magician
