,

Creating Pivot Tables and Charts for Multiple Columns in Excel Using Python

Creating Pivot Tables and Charts for Multiple Columns in Excel Using Python

In data analysis and reporting, creating pivot tables and charts is a common task to summarize and visualize data effectively. Python, with its powerful libraries like Pandas and Matplotlib, provides a convenient way to automate this process. In this tutorial, we will learn how to use Python to create pivot tables for multiple columns from an Excel dataset, generate charts for each pivot column, and save these charts to a new Excel sheet.

We will start by loading the dataset using Pandas, then create pivot tables for specified columns, and finally, generate charts using Matplotlib. This automated approach saves time and ensures consistency in reporting, making it ideal for data analysts and professionals working with large datasets.

Let’s dive into the details of how to achieve this in Python.

import pandas as pd
import openpyxl
from openpyxl import Workbook
from openpyxl.chart import BarChart, Reference
from openpyxl.utils.dataframe import dataframe_to_rows

# Sample DataFrame
data = {
'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob', 'Charlie'],
'Category': ['A', 'B', 'A', 'B', 'A', 'B'],
'Value': [10, 20, 30, 40, 50, 60],
'Quantity': [5, 10, 15, 20, 25, 30]
}
df = pd.DataFrame(data)

# Create a new Excel workbook and add the DataFrame to it
wb = Workbook()
ws = wb.active
ws.title = 'Data'
for r in dataframe_to_rows(df, index=False, header=True):
ws.append(r)

# Create pivot tables for each column
for col in ['Value', 'Quantity']:
pivot_df = df.pivot_table(index='Name', columns='Category', values=col, aggfunc='sum', fill_value=0)
pivot_ws = wb.create_sheet(title=f'{col} Pivot')
for r in dataframe_to_rows(pivot_df, header=True):
pivot_ws.append(r)

# Create a bar chart for the pivot table
chart = BarChart()
chart.title = f'{col} Chart'
chart.x_axis.title = 'Category'
chart.y_axis.title = 'Value'
data = Reference(pivot_ws, min_col=2, min_row=1, max_col=pivot_df.shape[1]+1, max_row=pivot_df.shape[0]+1)
cats = Reference(pivot_ws, min_col=1, min_row=2, max_row=pivot_df.shape[0]+1)
chart.add_data(data, titles_from_data=True)
chart.set_categories(cats)
pivot_ws.add_chart(chart, f'A{pivot_df.shape[0] + 3}')

# Save the Excel file
wb.save('output.xlsx')

Output:

Explanation of the Above Code:

  1. import pandas as pd: Import the Pandas library as pd, which is commonly used for data manipulation and analysis.
  2. import openpyxl: Import the openpyxl library, which allows us to work with Excel files in Python.
  3. from openpyxl import Workbook: Import the Workbook class from openpyxl, which is used to create a new Excel workbook.
  4. from openpyxl.chart import BarChart, Reference: Import the BarChart and Reference classes from openpyxl.chart, which are used to create charts in Excel.
  5. from openpyxl.utils.dataframe import dataframe_to_rows: Import the dataframe_to_rows function from openpyxl.utils.dataframe, which converts a Pandas DataFrame to rows that can be appended to an Excel worksheet.
  6. data = { ... }: Define a sample DataFrame with columns ‘Name’, ‘Category’, ‘Value’, and ‘Quantity’.
  7. df = pd.DataFrame(data): Create a DataFrame from the sample data.
  8. wb = Workbook(): Create a new Excel workbook.
  9. ws = wb.active: Get the active worksheet from the workbook.
  10. ws.title = 'Data': Set the title of the active worksheet to ‘Data’.
  11. for r in dataframe_to_rows(df, index=False, header=True):: Iterate over the rows of the DataFrame converted to rows and append them to the worksheet.
  12. for col in ['Value', 'Quantity']:: Iterate over the columns ‘Value’ and ‘Quantity’.
  13. pivot_df = df.pivot_table(...): Create a pivot table for the current column, grouping by ‘Name’ and ‘Category’ and summing the values.
  14. pivot_ws = wb.create_sheet(...): Create a new worksheet for the pivot table.
  15. for r in dataframe_to_rows(pivot_df, header=True):: Iterate over the rows of the pivot table converted to rows and append them to the worksheet.
  16. chart = BarChart(): Create a new BarChart object.
  17. chart.title = f'{col} Chart': Set the title of the chart to the current column name followed by ‘Chart’.
  18. data = Reference(...): Define the data range for the chart.
  19. cats = Reference(...): Define the category range for the chart.
  20. chart.add_data(data, titles_from_data=True): Add the data to the chart.
  21. chart.set_categories(cats): Set the categories for the chart.
  22. pivot_ws.add_chart(chart, f'A{pivot_df.shape[0] + 3}'): Add the chart to the worksheet starting from cell A, row equal to the number of rows in the pivot table plus 3.
  23. wb.save('output.xlsx'): Save the Excel workbook to a file named ‘output.xlsx’.

More on Excel Automation

In conclusion, the Python code demonstrates how to create pivot tables and charts for multiple columns in a DataFrame. It starts by reading data into a DataFrame, then creates pivot tables for specified columns, generates charts for each pivot table, and finally saves the charts to a new Excel sheet. This process allows for easy analysis and visualization of data, making it simpler to understand trends and patterns in the data. Overall, this code showcases the power and flexibility of Python for data analysis and visualization tasks.

Author

Sona Avatar

Written by

Leave a Reply

Trending

CodeMagnet

Your Magnetic Resource, For Coding Brilliance

Programming Languages

Web Development

Data Science and Visualization

Career Section

<script async src="https://pagead2.googlesyndication.com/pagead/js/adsbygoogle.js?client=ca-pub-4205364944170772"
     crossorigin="anonymous"></script>