Python Automation – Sort Date in Excel using Pandas

Sorting data in Excel is a common task, but it can become slow and inefficient when dealing with large datasets or complex calculations. Fortunately, Python, along with the powerful Pandas library, provides a faster and more efficient way to sort Excel data.

In this tutorial, we’ll explore how to sort an Excel data table using Python, ensuring speed and accuracy. We’ll cover two main sorting methods in Pandas: .sort_index() and .sort_values(). Let’s dive in!

Prerequisites

Before we begin, make sure you have the following:

An example Excel file with relevant data (e.g., dates, customer names, purchases)

Python installed on your system

  • Import Pandas module
  • Make DataFrame from Excel file
  • sort the date column with DataFrame.sort_value() function
  • Display the Final DataFrame

Step 1: Importing pandas module and making DataFrame form excel.

# import module 
import pandas as pd 

# making data frame from excel file 
df = pd.read_excel('sample_date.xlsx') 

print("Original DataFrame") 
df 

Output:

Step 2: Sorting date with DataFrame.sort_value() function.

# sorting date with sort_value() function 
Final_result = df.sort_values('Joining Date') 

print(" Sorted Date DataFrame") 
Final_result 

Full Code:

# import module
import pandas as pd

# making data frame from excel file
df = pd.read_excel('sample_date.xlsx')

# Displaying original DataFrame
print("Original DataFrame")
print(df)

# saving the original DataFrame to a new Excel file
df.to_excel('original_sample_date.xlsx', index=False)

# sorting date with sort_value() function
Final_result = df.sort_values('Joining Date')

# Displaying sorted DataFrame
print("Sorted Date DataFrame")
print(Final_result)

# saving the sorted DataFrame to a new Excel file
Final_result.to_excel('sorted_sample_date.xlsx', index=False)

Output:

If you want to open the excel sheet directly in the output. Then you can use os in the code, like this

# import module
import pandas as pd
import os

# making data frame from excel file
df = pd.read_excel('sample_date.xlsx')

# Displaying original DataFrame
print("Original DataFrame")
print(df)

# saving the original DataFrame to a new Excel file
df.to_excel('original_sample_date.xlsx', index=False)

# sorting date with sort_value() function
Final_result = df.sort_values('Joining Date')

# Displaying sorted DataFrame
print("Sorted Date DataFrame")
print(Final_result)

# saving the sorted DataFrame to a new Excel file
Final_result.to_excel('sorted_sample_date.xlsx', index=False)

# Open the original and sorted Excel files
os.system('start excel.exe original_sample_date.xlsx')
os.system('start excel.exe sorted_sample_date.xlsx')

Output:

Explanation of the Code:

import pandas as pd
import os
  • The first line imports the Pandas library and assigns it the alias pd.
  • The second line imports the os module, which provides a way to interact with the operating system (e.g., running shell commands).

Reading an Excel File into a DataFrame

df = pd.read_excel('sample_date.xlsx')
  • This line reads an Excel file named 'sample_date.xlsx' and creates a Pandas DataFrame called df.
  • The read_excel() function reads the data from the Excel file and converts it into a tabular structure (DataFrame).

Displaying the Original DataFrame

print("Original DataFrame")
print(df)
  • These lines print the header “Original DataFrame” followed by the content of the DataFrame df.
  • The DataFrame contains the data read from the Excel file.

Saving the Original DataFrame to a New Excel File

df.to_excel('original_sample_date.xlsx', index=False)
  • This line saves the original DataFrame df to a new Excel file named 'original_sample_date.xlsx'.
  • The index=False argument ensures that the row indices are not included in the saved file.

Sorting the DataFrame by Date

Final_result = df.sort_values('Joining Date')
  • This line sorts the DataFrame df based on the values in the column named 'Joining Date'.
  • The resulting sorted DataFrame is assigned to the variable Final_result.

Displaying the Sorted DataFrame

print("Sorted Date DataFrame")
print(Final_result)

These lines print the header “Sorted Date DataFrame” followed by the content of the sorted DataFrame Final_result.

Saving the Sorted DataFrame to a New Excel File

Final_result.to_excel('sorted_sample_date.xlsx', index=False)

This line saves the sorted DataFrame Final_result to a new Excel file named 'sorted_sample_date.xlsx'.

Opening the Original and Sorted Excel Files

os.system('start excel.exe original_sample_date.xlsx')
os.system('start excel.exe sorted_sample_date.xlsx')

These lines use the os.system() function to open the Excel files 'original_sample_date.xlsx' and 'sorted_sample_date.xlsx' using the default system application associated with Excel files (usually Microsoft Excel).

Remember to adjust the file names ('sample_date.xlsx''original_sample_date.xlsx', and 'sorted_sample_date.xlsx') according to your specific use case. This script demonstrates reading, sorting, and saving Excel data using Pandas in Python. Feel free to modify it as needed! 😊🚀

In this tutorial, we explored how to automate the sorting of dates in Excel using Python and the powerful Pandas library. By leveraging Pandas’ .sort_values() method, we achieved faster and more efficient sorting of data. Remember to adjust the column names and file paths according to your specific use case, and happy automating!

Author

Sona Avatar

Written by

Leave a Reply

Trending

CodeMagnet

Your Magnetic Resource, For Coding Brilliance

Programming Languages

Web Development

Data Science and Visualization

Career Section

<script async src="https://pagead2.googlesyndication.com/pagead/js/adsbygoogle.js?client=ca-pub-4205364944170772"
     crossorigin="anonymous"></script>