,

Automatically Extract & Rearrange Data Columns in Excel Using Python – Using Nested Loops

Automatically Extract & Rearrange Data Columns in Excel Using Python - Using Nested Loops

Data extraction and manipulation are common tasks in Excel, often requiring repetitive manual effort. Python, with its powerful libraries like Openpyxl, offers a solution to automate these tasks, saving time and reducing errors. In this tutorial, we will explore how to automatically extract data from specific columns in an Excel sheet and rearrange it into a new sheet using Python.

I will demonstrate how to load an Excel file, specify the columns to extract, and rearrange the data into a new sheet. This automation can be useful for various applications, such as data cleaning, transformation, and analysis, where specific data columns need to be extracted and organized efficiently. Let’s dive into the details of how to achieve this automation with Python.

To automatically extract and rearrange data columns using Python and openpyxl, let us check out the Python Code:

import openpyxl

# Load the Excel file
wb = openpyxl.load_workbook('Indian_employee_names.xlsx')

# Select the active sheet
sheet = wb.active

# Create a new sheet for the rearranged data
new_sheet = wb.create_sheet(title='Rearranged Data')

# Define the column mappings for extraction and rearrangement
column_mappings = {
'A': 'A',
'B': 'B',
'C': 'C'
}

# Iterate over the rows and columns to extract and rearrange the data
for row in sheet.iter_rows(values_only=True):
new_row = []
for source_column, target_column in column_mappings.items():
new_row.append(row[ord(source_column) - 65]) # Convert column letter to index (65 is the ASCII value for 'A')
new_sheet.append(new_row)

# Save the modified workbook
wb.save('your_modified_excel_file.xlsx')

Explanation of the Above Code:

  1. import openpyxl: This line imports the openpyxl library, which is a Python library for reading and writing Excel (xlsx) files.
  2. wb = openpyxl.load_workbook('Indian_employee_names.xlsx'): This line loads an existing Excel workbook named ‘Indian_employee_names.xlsx’ into the wb variable. load_workbook() is a function provided by openpyxl to load Excel files.
  3. sheet = wb.active: This line selects the active sheet in the workbook wb and stores it in the variable sheet. The active sheet is the sheet that is currently displayed or selected in the Excel file.
  4. new_sheet = wb.create_sheet(title='Rearranged Data'): This line creates a new sheet in the workbook wb with the title ‘Rearranged Data’ and stores it in the variable new_sheet. This sheet will be used to store the rearranged data.
  5. column_mappings = {'A': 'A', 'B': 'B', 'C': 'C'}: This dictionary defines the mapping of columns from the original sheet (source_column) to the new sheet (target_column). In this case, it maps column A from the original sheet to column A in the new sheet, column B to column B, and so on.
  6. for row in sheet.iter_rows(values_only=True):: This line iterates over each row in the sheet, extracting the values from each cell in the row. The values_only=True argument ensures that only the cell values are returned, not the cell objects.
  7. new_row = []: This line initializes an empty list new_row to store the values of the rearranged row.
  8. for source_column, target_column in column_mappings.items():: This line iterates over each key-value pair in the column_mappings dictionary. source_column represents the column in the original sheet, and target_column represents the corresponding column in the new sheet.
  9. new_row.append(row[ord(source_column) - 65]): This line extracts the value from the cell in the source_column of the current row (row) and appends it to the new_row list. The ord(source_column) - 65 converts the column letter (source_column) to a zero-based index (0 for ‘A’, 1 for ‘B’, etc.) for accessing the row values.
  10. new_sheet.append(new_row): This line appends the new_row list, containing the values of the rearranged row, to the new_sheet, effectively adding a new row to the new sheet.
  11. wb.save('your_modified_excel_file.xlsx'): This line saves the modified workbook (wb) to a new Excel file named ‘your_modified_excel_file.xlsx’. The modifications include the addition of the new sheet (new_sheet) with the rearranged data.

Output:

Original Sheet:

After Automating With Python Code The Orignal Excel Sheet will have another sub sheet named as ‘Rearranged Data’ as below

In this project, we explored how to automate the extraction and rearrangement of data columns in Excel using Python and the openpyxl library. We began by loading an existing Excel file and selecting the active sheet. Next, we created a new sheet to store the rearranged data. We defined a dictionary to map columns from the original sheet to the new sheet.

Using a nested loop, we iterated over each row and column in the original sheet, extracted the values, and rearranged them according to our mapping. We then appended the rearranged row to the new sheet. Finally, we saved the modified workbook to a new Excel file.

This automation process can be extremely useful for tasks that require repetitive data manipulation and organization in Excel. By leveraging Python’s capabilities, we can streamline these tasks, save time, and reduce the risk of manual errors. Overall, this project demonstrates the power and flexibility of using Python for Excel automation.

Author

Sona Avatar

Written by

Leave a Reply

Trending

CodeMagnet

Your Magnetic Resource, For Coding Brilliance

Programming Languages

Web Development

Data Science and Visualization

Career Section

<script async src="https://pagead2.googlesyndication.com/pagead/js/adsbygoogle.js?client=ca-pub-4205364944170772"
     crossorigin="anonymous"></script>