,

How To Automate VLOOKUP & XLOOKUP in Excel Using Python

How To Automate VLOOKUP & XLOOKUP in Excel Using Python

VLOOKUP vs XLOOKUP: A Comprehensive Comparison

In the world of spreadsheet functions, VLOOKUP and XLOOKUP stand out as powerful tools for searching and retrieving data. While VLOOKUP has long been the go-to function for many Excel users, XLOOKUP represents a newer and more versatile alternative. In this article, we’ll explore the differences between VLOOKUP and XLOOKUP, and then demonstrate how you can perform similar operations using Python, a popular programming language.

Understanding VLOOKUP and XLOOKUP

VLOOKUP (Vertical Lookup) is a function in Excel that searches for a value in the first column of a table and returns a value in the same row from another column. It is commonly used to look up and retrieve data from a table based on a matching criteria.

XLOOKUP, on the other hand, is a more advanced lookup function introduced in Excel 365. It allows you to search a range or an array, and returns an item corresponding to the first match it finds. XLOOKUP offers more flexibility and functionality compared to VLOOKUP, including the ability to search both vertically and horizontally, perform approximate matches, and handle errors more gracefully.

Performing XLOOKUP and VLOOKUP in Python

While Excel is a powerful tool for data analysis, performing similar operations in Python can offer more flexibility and scalability. Python’s pandas library, in particular, provides a robust set of tools for working with tabular data.

Below, I will demonstrate how you can perform a lookup operation similar to VLOOKUP and XLOOKUP using pandas in Python:

import pandas as pd

# Sample data
data = {
'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
'Age': [25, 30, 35, 40, 45],
'Salary': [50000, 60000, 70000, 80000, 90000]
}

df = pd.DataFrame(data)

# Using pandas' merge function to perform a VLOOKUP-like operation
result_vlookup = pd.merge(df, pd.DataFrame({'Name': ['Bob']}), on='Name')

print("VLOOKUP result:")
print(result_vlookup)

# Using pandas' loc function to perform an XLOOKUP-like operation
result_xlookup = df.loc[df['Name'] == 'Bob']

print("\nXLOOKUP result:")
print(result_xlookup)

In order to run this code you can go to the command prompt and write python space file name in the directory where your file is stored you will get the XLOOKUP and VLOOKUP result like this

Now, let me explain the above code in detail

import pandas as pd
  1. import pandas as pd: This line imports the pandas library and gives it the alias pd, which is a common convention in Python for simplifying the library name when used in code.
data = { ... }

2.data = { ... }: This block of code defines a Python dictionary called data containing sample data for ‘Name’, ‘Age’, and ‘Salary’. Each key in the dictionary represents a column name, and the corresponding value is a list of values for that column.

df = pd.DataFrame(data)

3. df = pd.DataFrame(data): This line creates a pandas DataFrame called df from the data dictionary. A DataFrame is a two-dimensional, size-mutable, and heterogeneous tabular data structure with labeled axes (rows and columns).

result_vlookup = pd.merge(...)

4. result_vlookup = pd.merge(...): This line uses the pd.merge() function to perform a VLOOKUP-like operation. It merges the df DataFrame with another DataFrame created from pd.DataFrame({'Name': ['Bob']}) on the ‘Name’ column. The on='Name' argument specifies that the merge should be based on the ‘Name’ column.

print("VLOOKUP result:")

5. print("VLOOKUP result:"): This line prints a message indicating that the following output is the result of the VLOOKUP-like operation.

print(result_vlookup)

6. print(result_vlookup): This line prints the result of the VLOOKUP-like operation, which is a DataFrame containing the information for the name ‘Bob’.

result_xlookup = df.loc[...]

7. result_xlookup = df.loc[...]: This line uses the df.loc[] function to perform an XLOOKUP-like operation. It filters the df DataFrame to retrieve rows where the ‘Name’ column is equal to ‘Bob’.

print("\nXLOOKUP result:")

8.print("\nXLOOKUP result:"): This line prints a message indicating that the following output is the result of the XLOOKUP-like operation. The \n is a newline character, which adds a blank line before the message.

print(result_xlookup)

9. print(result_xlookup): This line prints the result of the XLOOKUP-like operation, which is a DataFrame containing the information for the name ‘Bob’.

    Let’s take another example which will things more clear and simpler

    Write the Python code to create an Excel workbook, store data, and perform XLOOKUP and VLOOKUP-like operations

    import pandas as pd

    # Sample data
    data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, 40, 45],
    'Salary': [50000, 60000, 70000, 80000, 90000]
    }

    # Create a DataFrame
    df = pd.DataFrame(data)

    # Create an Excel writer object
    writer = pd.ExcelWriter('data.xlsx', engine='openpyxl')

    # Write the DataFrame to the Excel sheet
    df.to_excel(writer, sheet_name='Sheet1', index=False)

    # Save the Excel file
    writer.book.save('data.xlsx')
    writer.close()

    # Perform XLOOKUP-like operation
    # Load the Excel file
    data_xlookup = pd.read_excel('data.xlsx', sheet_name='Sheet1')

    # Filter the DataFrame for 'Bob'
    result_xlookup = data_xlookup.loc[data_xlookup['Name'] == 'Bob']
    print("XLOOKUP result:")
    print(result_xlookup)

    # Perform VLOOKUP-like operation
    # Load the Excel file
    data_vlookup = pd.read_excel('data.xlsx', sheet_name='Sheet1')

    # Merge with another DataFrame containing 'Bob'
    result_vlookup = pd.merge(data_vlookup, pd.DataFrame({'Name': ['Bob']}), on='Name')
    print("\nVLOOKUP result:")
    print(result_vlookup)

    Output:

    Explanation of the above code:

    1. import pandas as pd: Imports the pandas library and assigns it the alias pd.
    2. data = {...}: Defines a dictionary data containing sample data with keys 'Name', 'Age', and 'Salary', each corresponding to a list of values.
    3. df = pd.DataFrame(data): Creates a pandas DataFrame df from the data dictionary.
    4. writer = pd.ExcelWriter('data.xlsx', engine='openpyxl'): Creates an Excel writer object writer with the filename 'data.xlsx' and specifies the Excel engine as 'openpyxl'.
    5. df.to_excel(writer, sheet_name='Sheet1', index=False): Writes the DataFrame df to the Excel writer object writer with the sheet name 'Sheet1' and without including the DataFrame index.
    6. writer.book.save('data.xlsx'): Saves the Excel file using the save method of the writer.book attribute (which represents the Excel file).
    7. writer.close(): Closes the Excel writer object writer.
    8. data_xlookup = pd.read_excel('data.xlsx', sheet_name='Sheet1'): Reads the Excel file 'data.xlsx' into a DataFrame data_xlookup with the sheet name 'Sheet1'.
    9. result_xlookup = data_xlookup.loc[data_xlookup['Name'] == 'Bob']: Filters the data_xlookup DataFrame to only include rows where the 'Name' column is equal to 'Bob', storing the result in result_xlookup.
    10. print("XLOOKUP result:"): Prints a message indicating the result is from an XLOOKUP-like operation.
    11. print(result_xlookup): Prints the result_xlookup DataFrame containing the filtered data.
    12. data_vlookup = pd.read_excel('data.xlsx', sheet_name='Sheet1'): Reads the Excel file 'data.xlsx' into a new DataFrame data_vlookup with the sheet name 'Sheet1'.
    13. result_vlookup = pd.merge(data_vlookup, pd.DataFrame({'Name': ['Bob']}), on='Name'): Merges the data_vlookup DataFrame with a new DataFrame containing only the name 'Bob', matching on the 'Name' column.
    14. print("\nVLOOKUP result:"): Prints a message indicating the result is from a VLOOKUP-like operation, with a newline character (\n) to create a line break.
    15. print(result_vlookup): Prints the result_vlookup DataFrame containing the merged data.

    Real Life Example:

    In real life, XLOOKUP and VLOOKUP functions in Excel are commonly used for various tasks. Let’s consider a scenario where a sales manager wants to analyze sales data stored in an Excel sheet. The Excel sheet contains columns for “Product Name”, “Sales Date”, “Quantity Sold”, and “Revenue”. The manager wants to find the revenue generated by a specific product on a particular date using XLOOKUP and VLOOKUP functions.

    Here’s how you can perform these operations using Python with the pandas library:

    import pandas as pd

    # Sample data
    data = {
    'Product Name': ['Product A', 'Product B', 'Product A', 'Product C', 'Product B'],
    'Sales Date': ['2024-01-15', '2024-01-16', '2024-01-15', '2024-01-16', '2024-01-17'],
    'Quantity Sold': [100, 150, 200, 120, 180],
    'Revenue': [50000, 75000, 100000, 60000, 90000]
    }

    # Create a DataFrame
    df = pd.DataFrame(data)

    # Create an Excel writer object
    writer = pd.ExcelWriter('sales_data.xlsx', engine='openpyxl')

    # Write the DataFrame to the Excel sheet
    df.to_excel(writer, sheet_name='Sheet1', index=False)

    # Save the Excel file
    writer.book.save('sales_data.xlsx')
    writer.close()

    # Load the Excel file
    sales_data = pd.read_excel('sales_data.xlsx', sheet_name='Sheet1')

    # Assume 'product_name' and 'sales_date' are variables representing the product name and sales date
    product_name = 'Product A'
    sales_date = '2024-01-15'

    # Using XLOOKUP
    xlookup_result = sales_data.loc[(sales_data['Product Name'] == product_name) & (sales_data['Sales Date'] == sales_date), 'Revenue'].values[0]

    print(f"Revenue for {product_name} on {sales_date}: {xlookup_result}")

    # Using VLOOKUP
    vlookup_result = sales_data.loc[sales_data['Product Name'] == product_name, 'Revenue'].sum()

    print(f"Total Revenue for {product_name}: {vlookup_result}")

    In this example, xlookup_result will contain the revenue generated by “Product A” on “2024-01-15”, while vlookup_result will contain the total revenue generated by “Product A” across all dates.

    breakdown of each line of code:

    1. import pandas as pd: This imports the pandas library and assigns it the alias pd for easier reference in the code.
    2. data = { ... }: This defines a dictionary data containing sample sales data, including product names, sales dates, quantities sold, and revenue.
    3. df = pd.DataFrame(data): This creates a pandas DataFrame df from the data dictionary, which organizes the data into a tabular format.
    4. writer = pd.ExcelWriter('sales_data.xlsx', engine='openpyxl'): This creates an Excel writer object writer that will be used to write the DataFrame to an Excel file named 'sales_data.xlsx'. It uses the ‘openpyxl’ engine for Excel.
    5. df.to_excel(writer, sheet_name='Sheet1', index=False): This writes the DataFrame df to the Excel file specified by writer, with the data placed in a sheet named 'Sheet1'. The index=False parameter ensures that the DataFrame index is not included in the Excel file.
    6. writer.book.save('sales_data.xlsx'): This saves the Excel file using the writer object. The writer.book property accesses the underlying openpyxl workbook object, and save() saves the workbook to a file.
    7. writer.close(): This closes the writer object, releasing any system resources it was using.
    8. sales_data = pd.read_excel('sales_data.xlsx', sheet_name='Sheet1'): This reads the Excel file 'sales_data.xlsx' back into a pandas DataFrame sales_data, allowing you to work with the data further.
    9. product_name = 'Product A': This sets the variable product_name to 'Product A', representing the product for which you want to retrieve revenue information.
    10. sales_date = '2024-01-15': This sets the variable sales_date to '2024-01-15', representing the date for which you want to retrieve revenue information.
    11. xlookup_result = ...: This line uses pandas’ .loc method to perform an XLOOKUP-like operation. It filters sales_data to find rows where both the ‘Product Name’ and ‘Sales Date’ match the specified product_name and sales_date, and then retrieves the ‘Revenue’ value from those rows.
    12. print(f"Revenue for {product_name} on {sales_date}: {xlookup_result}"): This line prints the XLOOKUP-like result, showing the revenue for the specified product_name on the specified sales_date.
    13. vlookup_result = ...: This line uses pandas’ .loc method to perform a VLOOKUP-like operation. It filters sales_data to find rows where the ‘Product Name’ matches the specified product_name, and then sums up the ‘Revenue’ values from those rows.
    14. print(f"Total Revenue for {product_name}: {vlookup_result}"): This line prints the VLOOKUP-like result, showing the total revenue for the specified product_name.

    This code demonstrates how to use pandas to perform XLOOKUP and VLOOKUP-like operations on Excel data in Python.

    Conclusion

    In conclusion, while VLOOKUP and XLOOKUP are powerful functions in Excel, performing similar operations in Python using libraries like pandas can offer more flexibility and scalability. By understanding the differences between these functions and leveraging the capabilities of Python, you can enhance your data analysis workflows and tackle more complex tasks with ease.

    Author

    Sona Avatar

    Written by

    Leave a Reply

    Trending

    CodeMagnet

    Your Magnetic Resource, For Coding Brilliance

    Programming Languages

    Web Development

    Data Science and Visualization

    Career Section

    <script async src="https://pagead2.googlesyndication.com/pagead/js/adsbygoogle.js?client=ca-pub-4205364944170772"
         crossorigin="anonymous"></script>