Skip to content

Instantly share code, notes, and snippets.

@dannguyen
Last active August 3, 2018 20:51
Show Gist options
  • Save dannguyen/d83f27a93b2e6f80edda22cfa0f0a1d6 to your computer and use it in GitHub Desktop.
Save dannguyen/d83f27a93b2e6f80edda22cfa0f0a1d6 to your computer and use it in GitHub Desktop.
example of using Python's xlrd library to do a batch process of opening a Excel workbook, extract the content from the first sheet as txt, and then save as CSV
"""
This code processes each xlsx file in the xlsx/* path,
then writes the data (a list of values) to CSV files stored in the csv/ path
"""
from csv import writer
from glob import glob
from os.path import basename, join
from os import makedirs
from xlrd import open_workbook
import re
CSV_DIRNAME = 'csv'
makedirs(CSV_DIRNAME, exist_ok=True)
for xfname in glob(join('xlsx', '*', '*.xlsx')):
cname = join(CSV_DIRNAME, basename(xfname) + '.csv')
print("writing to", cname)
cf = open(cname, 'w')
cv = writer(cf)
book = open_workbook(xfname)
sheet = book.sheets()[0]
for n in range(sheet.nrows):
rowvals = sheet.row_values(n)
cv.writerow(rowvals)
cf.close()
@elaOnMars
Copy link

Hi, I recently tested it with these Gov open data files: https://open.nrw/de/dataset/bundesverkehrswegezahlung-ac

I created the folder xlsx and copied the xlsx into this subfolder.
The libraries are imported.

But only the csv folder was created.
Have you any idea which error I have made?

Best,
Daniela

@dannguyen
Copy link
Author

Sorry, only the CSV folder was created? Were all the files created too? Actually I had forgotten that I designed this script to read all the XLSX files in a given tree of directories, but save them to a single CSV folder. I can change that.

@dannguyen
Copy link
Author

@elaOnMars
Copy link

Thank you very much. I think I've forgotten a line before. Everything works well!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment