Magic Analytics
  • Home
  • Python
    • Pandas
    • Matplotlib
    • Interactive Visualization
    • Folium
  • Spark
    • DataFrame
  • Machine Learning
    • Classification >
      • Logistic Regression
    • Dimension Reduction
    • Model Explanation
  • Blog
  • About

Pandas: Data Analytics in Python

"Pandas is a software library written for the Python programming language for data manipulation and analysis." -- Wikipedia

Pandas offers many powerful features. The posts below cover some practical uses.
Data cleaning functions
Data Frame sub-selection
Group-by-aggregation vs. dplyr
Index-free Group-by
Reshape data frame
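
For a quick taste of the topics listed above, here is a minimal sketch on a made-up toy table. The column names and values are hypothetical and are not taken from the linked posts; it only illustrates one common way to do each operation in pandas.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration.
df = pd.DataFrame({
    "city": ["A", "A", "B", "B"],
    "year": [2020, 2021, 2020, 2021],
    "sales": [10.0, None, 7.0, 9.0],
})

# Data cleaning: replace the missing sales value with 0.
clean = df.fillna({"sales": 0.0})

# Sub-selection: pick rows with a boolean mask and columns by label.
recent = clean.loc[clean["year"] == 2021, ["city", "sales"]]

# Index-free group-by: as_index=False keeps "city" as a regular column
# instead of moving it into the index.
totals = clean.groupby("city", as_index=False)["sales"].sum()

# Reshape: long to wide, one row per city and one column per year.
wide = clean.pivot(index="city", columns="year", values="sales")
```

Passing `as_index=False` is one way to get a flat result that is easy to merge or plot afterwards; calling `reset_index()` on the default output is an equivalent alternative.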



Data operation

Native visualization

Interactive visualization (Plotly)

Comparison with PySpark's DataFrame

Spark DataFrame vs. Pandas (Part 1. select and filter)
Spark DataFrame vs. Pandas (Part 2. join related operation)
Spark DataFrame vs. Pandas (Part 3. group-by related operation)
Spark DataFrame vs. Pandas (Part 4. set related operation)
Spark DataFrame vs. Pandas (Part 5. SQL window function)
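
As a rough orientation for the five parts above, the sketch below shows only the pandas side of each operation, on two made-up tables. The table contents and column names are assumptions for illustration; the Spark counterparts are covered in the linked posts and are not reproduced here.

```python
import pandas as pd

# Hypothetical toy tables, used only for illustration.
left = pd.DataFrame({"id": [1, 2, 3], "score": [50, 80, 65]})
right = pd.DataFrame({"id": [1, 2, 3, 4], "group": ["x", "x", "y", "y"]})

# Part 1. Select and filter: keep two columns for rows with score > 60.
high = left.loc[left["score"] > 60, ["id", "score"]]

# Part 2. Join: inner join on the shared "id" column.
joined = left.merge(right, on="id", how="inner")

# Part 3. Group-by: mean score per group, keeping "group" as a column.
mean_by_group = joined.groupby("group", as_index=False)["score"].mean()

# Part 4. Set operation: union of the ids found in either table.
all_ids = (
    pd.concat([left["id"], right["id"]])
    .drop_duplicates()
    .sort_values()
    .tolist()
)

# Part 5. Window function: rank scores within each group, similar in spirit to
# RANK() OVER (PARTITION BY group ORDER BY score DESC) in SQL.
joined["rank_in_group"] = (
    joined.groupby("group")["score"].rank(method="min", ascending=False)
)
```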