Status of XLRD reading .xlsx (Excel 2007)
Date: Thu, 25 Jun 2009 12:29:39 +1000
From: John Machin <sjmac...@lexicon.net>
To: python-excel@googlegroups.com
Subject: Re: [pyxl] Status of XLRD reading .xlsx (Excel 2007)
On 25/06/2009 6:51 AM, Darryl Wallace wrote:
Hi Darryl,
> I know this has been asked in the past, but is support for
> reading .xlsx (Excel 2007) format closer to being complete?
The current intention is this:
Basic support will be in the next release, whenever that is, unless
something happens that causes it not to be. It is intended to support
on_demand=True but not formatting_info=True. Support for *any* version
of Excel is unlikely ever to be "complete".
> The reason I ask is because the included README.html mentions that
> support is scheduled for v0.7.1 which is the current version.
s/is/was/
I apologise for the slackness of the documentation team :-)
> I tried
> to read a simple excel 2007 (under ubuntu linux, python 2.5.4) file
> and was greeted with the following error:
> ---
> >>> book = xlrd.open_workbook("myexcel2007book.xlsx")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "xlrd/__init__.py", line 429, in open_workbook
>     biff_version = bk.getbof(XL_WORKBOOK_GLOBALS)
>   File "xlrd/__init__.py", line 1545, in getbof
>     bof_error('Expected BOF record; found %r' % self.mem[savpos:savpos+8])
>   File "xlrd/__init__.py", line 1539, in bof_error
>     raise XLRDError('Unsupported format, or corrupt file: ' + msg)
> xlrd.biffh.XLRDError: Unsupported format, or corrupt file: Expected BOF record; found 'PK\x03\x04\x14\x00\x06\x00'
> ---
> So my guess is that it's not ready and that's fine. I was just
> interested in the status.
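For what it's worth, the 'PK\x03\x04' at the start of those eight bytes is the ZIP local-file-header signature: an .xlsx file is a ZIP archive of XML parts, so a reader expecting a BIFF BOF record has nothing to work with. A rough sketch of checking the container type before calling open_workbook follows. It uses only the standard library and is written for Python 3; the function and constant names are made up for illustration and are not part of xlrd.

```python
import io
import zipfile

# Well-known leading signatures. Older .xls files may also be bare BIFF
# streams without the OLE2 wrapper; those come out as "unknown" here.
ZIP_MAGIC = b"PK\x03\x04"                          # ZIP local file header (.xlsx)
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"   # OLE2 compound document (.xls)


def sniff_excel_container(data):
    """Guess the container format from the leading bytes of a file."""
    if data.startswith(ZIP_MAGIC):
        return "zip"
    if data.startswith(OLE2_MAGIC):
        return "ole2"
    return "unknown"


def make_sample_zip():
    """Build a tiny in-memory ZIP archive so the sketch needs no file on disk."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
    return buf.getvalue()


if __name__ == "__main__":
    # The eight bytes reported in the traceback above.
    print(sniff_excel_container(b"PK\x03\x04\x14\x00\x06\x00"))
    print(sniff_excel_container(make_sample_zip()))
```

With a real file you would read the first eight bytes in binary mode and pass them in. A "zip" result only says the file is a ZIP archive, not that it is a valid workbook.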
If you have some non-simple XLSX files that you think may test the
capabilities of the development team, please send them. Of particular
interest would be files created by software other than Excel itself. As
with previous Excel versions, Microsoft documentation will say "you must
do X" but Excel will support reading non-X. This has already occurred
with the docs saying you must use the shared string table; C# code
supplied by an MS write-your-own-XLSX workshop doesn't comply but Excel
accepts the resultant file silently.
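To make the shared-string point concrete, here is a minimal sketch of a reader that accepts both styles: cells with t="s" that index into the shared string table, and cells with t="inlineStr" that carry their text directly. It is not xlrd code. The XML snippets are hand-written stand-ins for the sharedStrings and worksheet parts of a workbook, the helper names are hypothetical, and only the standard library (Python 3) is used.

```python
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace, in ElementTree's {uri}tag form.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"

# Hand-written stand-ins for the two XML parts of a workbook.
SHARED_STRINGS_XML = """\
<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <si><t>from shared table</t></si>
</sst>"""

SHEET_XML = """\
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData>
    <row r="1">
      <c r="A1" t="s"><v>0</v></c>
      <c r="B1" t="inlineStr"><is><t>written inline</t></is></c>
      <c r="C1"><v>42</v></c>
    </row>
  </sheetData>
</worksheet>"""


def load_shared_strings(xml_text):
    """Return the shared strings as a list, one entry per <si> element."""
    root = ET.fromstring(xml_text)
    # Simplification: concatenate every <t> under each <si>.
    return ["".join(t.text or "" for t in si.iter(NS + "t"))
            for si in root.findall(NS + "si")]


def cell_value(cell, shared):
    """Decode one <c> element, tolerating shared or inline strings."""
    kind = cell.get("t", "n")            # an absent t attribute means numeric
    if kind == "inlineStr":
        return "".join(t.text or "" for t in cell.iter(NS + "t"))
    v = cell.find(NS + "v")
    if v is None:
        return None                      # empty cell
    if kind == "s":
        return shared[int(v.text)]       # index into the shared string table
    if kind == "n":
        return float(v.text)
    return v.text                        # other cell types: leave as raw text


def read_cells(sheet_xml, shared):
    """Map cell references such as 'A1' to decoded values."""
    root = ET.fromstring(sheet_xml)
    return {c.get("r"): cell_value(c, shared) for c in root.iter(NS + "c")}


if __name__ == "__main__":
    print(read_cells(SHEET_XML, load_shared_strings(SHARED_STRINGS_XML)))
```

The point of the sketch is only that a tolerant reader has to branch on the cell's t attribute rather than assume every string lives in the shared table.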
Cheers,
John