"Pandas is a software library written for the Python programming language for data manipulation and analysis." -- Wikipedia
Pandas has many powerful features; my blog posts below cover some practical uses.
Data cleaning functions
Data Frame sub-selection
Group-by-aggregation vs. dplyr
Index-free Group-by
Reshape data frame
Data operation
Native visualization
Interactive visualization (Plotly)
Comparison with PySpark's DataFrame
Spark DataFrame vs. Pandas (Part 1. select and filter)
Spark DataFrame vs. Pandas (Part 2. join related operation)
Spark DataFrame vs. Pandas (Part 3. group-by related operation)
Spark DataFrame vs. Pandas (Part 4. set related operation)
Spark DataFrame vs. Pandas (Part 5: SQL-windows function)