On Jun 15, 9:29 pm, Michael <selmo2...@gmail.com> wrote:
> Recently I needed to quickly convert XLSX workbooks to XLS workbooks
> so I could then interact with them via xlrd. Here it is, hopefully it
> will be useful to someone. :) Note that pywin32 is required to
> interact with Excel 2007, so unfortunately this script will work only
> on Windows with Excel 2007 installed.
> This script is to be executed in the directory of XLSX workbooks
> pending conversion.
> import glob
> import os
> import time
> import win32com.client
>
> xlsx_files = glob.glob('*.xlsx')
> if len(xlsx_files) == 0:
>     raise RuntimeError('No XLSX files to convert.')
>
> xlApp = win32com.client.Dispatch('Excel.Application')
> for file in xlsx_files:
>     xlWb = xlApp.Workbooks.Open(os.path.join(os.getcwd(), file))
>     xlWb.SaveAs(os.path.join(os.getcwd(), file.split('.xlsx')[0] +
>                 '.xls'), FileFormat=1)
> xlApp.Quit()
>
> # Delete or comment out the following lines if you want to preserve the
> # original XLSX files.
> time.sleep(2) # give Excel time to quit, otherwise files may be locked
> for file in xlsx_files:
>     os.unlink(file)
Interesting approach.
For a possibly limited but cross-platform way (i.e. no need to be on
Windows or to use pywin32) to do the same conversion from .XLSX to
.XLS files, it is also possible to use an XML parser, such as a
SAX-capable parser, to read the content (*) of the .XLSX files, and
then write the same content to .XLS files using, I guess, the Python
xlwt library, which is mentioned in other messages in this group. (I
have not used xlwt (yet), which is why I said "I guess", though I have
used its counterpart for reading, xlrd, in my xtopdf toolkit.)
(*) Conditions apply - see below.
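
A minimal sketch of the writing half of this idea, assuming the cell
values have already been extracted into a list of rows. The helper
names xls_name and write_rows_to_xls are made up for illustration, and
xlwt is a third-party package that has to be installed separately. It
writes values only, with no formatting.

```python
import os


def xls_name(xlsx_path):
    """Return the .xls file name that corresponds to an .xlsx path."""
    root, _ext = os.path.splitext(xlsx_path)
    return root + '.xls'


def write_rows_to_xls(rows, xls_path, sheet_name='Sheet1'):
    """Write a list of rows (each a list of cell values) to an .xls file."""
    # xlwt is imported here so that xls_name() stays usable without it.
    import xlwt

    book = xlwt.Workbook()
    sheet = book.add_sheet(sheet_name)
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            sheet.write(r, c, value)  # row index, column index, value
    book.save(xls_path)
```

Usage would be something like
write_rows_to_xls(rows, xls_name('report.xlsx')).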
This alternative method is possible because .XLSX format files are
XML-based (strictly, each one is a ZIP archive of XML parts, so the
worksheet XML has to be unzipped first). There is a recipe for how to
extract the text-only
content (i.e. numbers and strings, no formatting or images or charts -
this is the condition mentioned above) of .XLSX files, using SAX, in
the Python Cookbook 2nd Edition. I had tried out that recipe some time
ago (it worked fine, though I had to tweak it a bit), and used it to
convert the (text-only) content of .XLSX files to PDF, as part of my
xtopdf toolkit. That code is not in the xtopdf release yet, but should
be in a later one. If I can dig up the (standalone) code I wrote for
that conversion, I'll post a link to it here in a few days. But
basically, it's really easy to read .XLSX content with Python using
SAX, since there are clearly defined XML elements for tables, rows and
cells. In fact, that means you can also read the .XLSX content using
any language that has a SAX XML parser, not just Python.
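
As a rough illustration of the SAX approach described above (this is
not the Cookbook recipe and not the xtopdf code), here is a minimal
sketch using only the standard library. It assumes the workbook is a
ZIP package whose first sheet is stored as xl/worksheets/sheet1.xml,
with string cells kept in xl/sharedStrings.xml, and that the XML uses
unprefixed element names (row, c, v, si, t). The function name
read_xlsx_rows is made up for this example. All values come back as
strings, empty cells that are absent from the XML are not padded, and
inline strings are ignored.

```python
import xml.sax
import zipfile


class SharedStringsHandler(xml.sax.ContentHandler):
    """Collect the text of each <si> entry in xl/sharedStrings.xml."""

    def __init__(self):
        super().__init__()
        self.strings = []
        self._parts = []
        self._in_t = False

    def startElement(self, name, attrs):
        if name == 'si':
            self._parts = []
        elif name == 't':
            self._in_t = True

    def characters(self, content):
        if self._in_t:
            self._parts.append(content)

    def endElement(self, name):
        if name == 't':
            self._in_t = False
        elif name == 'si':
            self.strings.append(''.join(self._parts))


class SheetHandler(xml.sax.ContentHandler):
    """Collect cell values, row by row, from one worksheet part."""

    def __init__(self, shared_strings):
        super().__init__()
        self.shared = shared_strings
        self.rows = []
        self._row = []
        self._cell_type = None
        self._text = []
        self._in_v = False

    def startElement(self, name, attrs):
        if name == 'row':
            self._row = []
        elif name == 'c':
            self._cell_type = attrs.get('t')  # 's' means shared string
            self._text = []
        elif name == 'v':
            self._in_v = True

    def characters(self, content):
        if self._in_v:
            self._text.append(content)

    def endElement(self, name):
        if name == 'v':
            self._in_v = False
        elif name == 'c':
            raw = ''.join(self._text)
            if self._cell_type == 's' and raw:
                raw = self.shared[int(raw)]  # index into shared strings
            self._row.append(raw)
        elif name == 'row':
            self.rows.append(self._row)


def read_xlsx_rows(path_or_file, sheet='xl/worksheets/sheet1.xml'):
    """Return the text content of one sheet as a list of rows."""
    with zipfile.ZipFile(path_or_file) as zf:
        shared = SharedStringsHandler()
        if 'xl/sharedStrings.xml' in zf.namelist():
            xml.sax.parseString(zf.read('xl/sharedStrings.xml'), shared)
        handler = SheetHandler(shared.strings)
        xml.sax.parseString(zf.read(sheet), handler)
    return handler.rows
```

Calling read_xlsx_rows('book.xlsx') (the file name is just an example)
returns a list of lists of strings, which could then be handed to
whatever writes the output format.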
- Vasudev Ram
Biz site: www.dancingbison.com
xtopdf: fast and easy PDF creation from other file formats:
www.dancingbison.com/products.html
Blog (on software innovation): jugad2.blogspot.com