How to Convert Files Between Formats with a Python Script (e.g., CSV to Excel)

How to Convert Files Between Formats with a Python Script (e.g., CSV to Excel)

Easily Convert CSV to Excel and Back Using Python: A Simple Automation Guide

File format conversion is a common task in data processing and management. One frequently encountered scenario is converting CSV files to Excel format, especially when working with large datasets that need to be shared or further analyzed. In this guide, we'll walk through creating a Python script that converts files from CSV to Excel format using the pandas library.

What You’ll Learn:

  • Reading CSV files using Python.

  • Converting CSV files to Excel format.

  • Handling multiple CSV files and saving them as Excel files.

  • Automating file conversion tasks.

1. Setting Up the Environment

First, make sure you have Python installed on your machine. You’ll also need the pandas and openpyxl libraries, which can be installed via pip:

pip install pandas openpyxl

Library Overview:

  • pandas: Provides data manipulation and analysis tools, and allows for easy file format conversions.

  • openpyxl: Enables writing to Excel files (XLSX format) when using pandas.

2. Basic CSV to Excel Conversion

Step 1: Writing a Script to Convert a Single CSV File to Excel

Here's a simple Python script to convert a CSV file to an Excel file:

import pandas as pd

def csv_to_excel(csv_file, excel_file):
    # Read the CSV file
    df = pd.read_csv(csv_file)

    # Convert and save to Excel
    df.to_excel(excel_file, index=False)

# Example usage
csv_to_excel('data.csv', 'data.xlsx')

3. Handling Multiple CSV Files

Step 2: Converting Multiple CSV Files in a Directory

If you need to convert multiple CSV files in a directory to Excel files, you can automate this process:

import os
import pandas as pd

def convert_all_csv_in_directory(directory):
    # Loop through all files in the directory
    for filename in os.listdir(directory):
        if filename.endswith('.csv'):
            csv_file = os.path.join(directory, filename)
            excel_file = os.path.join(directory, filename.replace('.csv', '.xlsx'))

            # Convert CSV to Excel
            df = pd.read_csv(csv_file)
            df.to_excel(excel_file, index=False)
            print(f'Converted {csv_file} to {excel_file}')

# Example usage
convert_all_csv_in_directory('/path/to/your/csv_files')

4. Enhancing the Script

Step 3: Adding Options for Different Excel Formats

You might want to specify the sheet name or use different Excel formats:

def csv_to_excel(csv_file, excel_file, sheet_name='Sheet1'):
    # Read the CSV file
    df = pd.read_csv(csv_file)

    # Convert and save to Excel with a specified sheet name
    df.to_excel(excel_file, sheet_name=sheet_name, index=False)

# Example usage
csv_to_excel('data.csv', 'data.xlsx', sheet_name='DataSheet')

5. Converting Excel to CSV

Step 4: Converting Excel Files Back to CSV

To complete the file conversion loop, you can also convert Excel files back to CSV:

def excel_to_csv(excel_file, csv_file):
    # Read the Excel file
    df = pd.read_excel(excel_file)

    # Convert and save to CSV
    df.to_csv(csv_file, index=False)

# Example usage
excel_to_csv('data.xlsx', 'data.csv')
Conclusion
In this guide, we've walked through creating a Python script to convert files between CSV and Excel formats. Whether you’re working with a single file or automating the conversion of multiple files, Python and pandas provide powerful tools for handling these tasks. You can easily expand this script to support other formats or additional customizations as needed for your specific workflow.

This script can be a valuable asset in managing and processing data, especially in data-intensive environments where format consistency is crucial.