Gmail Calendar Documents Reader Web more »
Recently Visited Groups | Help | Sign in
Google Groups Home
Dumb question about updating xlrd version
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  7 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Marco  
View profile  
 More options Oct 19, 9:07 pm
From: Marco <giun...@googlemail.com>
Date: Mon, 19 Oct 2009 04:07:07 -0700 (PDT)
Local: Mon, Oct 19 2009 9:07 pm
Subject: Dumb question about updating xlrd version
Sorry for the question. I have just started using xlrd and it seems to
be the perfect solution for what I need (read an excel file into
python!! :-). I noticed that version 0.71 is out. Now, what is the
best way for updating the package without messing things up? I am on
Windows XP. I assume the new version is backward compatible.

Thank you John for writing this very useful package.

marco


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
John Machin  
View profile  
 More options Oct 19, 10:02 pm
From: John Machin <sjmac...@lexicon.net>
Date: Mon, 19 Oct 2009 23:02:27 +1100
Local: Mon, Oct 19 2009 10:02 pm
Subject: Re: [pyxl] Dumb question about updating xlrd version
On 19/10/2009 10:07 PM, Marco wrote:

Welcome to the group ...

> Sorry for the question. I have just started using xlrd and it seems to
> be the perfect solution for what I need (read an excel file into
> python!! :-). I noticed that version 0.71 is out.

Version 0.7.1 was released on 2009-06-01, well over 4 months ago. When
exactly did you "just" start using xlrd, and what version are you using
at the moment?

> Now, what is the
> best way for updating the package without messing things up? I am on
> Windows XP.

Presuming you used a distutils method of installing (the Windows
installer (xlrd-VERSION.win32.exe), or running setup.py after extracting
from an archive) it should suffice to install over the top of your
existing version.

> I assume the new version is backward compatible.

Your usage should be backward compatible provided you have stuck with
the documented APIs.

> Thank you John for writing this very useful package.

Thanks for the kind remark.

Cheers,
John


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Marco  
View profile  
 More options Oct 20, 2:12 am
From: Marco <giun...@googlemail.com>
Date: Mon, 19 Oct 2009 09:12:03 -0700 (PDT)
Local: Tues, Oct 20 2009 2:12 am
Subject: Re: Dumb question about updating xlrd version
John, i don't want to sound cheesy, but i have to thank you for your
quick reply.

What you say is very helpful. However, I remembered that I had that
utility easy_install installed and that made the trick.

As far as the version is concerned, I downloaded version 0.6.1 for
windows from www.lexicon.net last week (http://www.lexicon.net/
sjmachin/xlrd.htm). I found out that the new version was out because I
googled "python xlrd xlsx".

I have just one question: I would like it to be a bit quicker (I have
large files): is there a way of speeding it up, possibly by skipping
formatting and just get to the data or is it just the way it is? (not
at all a complaint).

Well, I hope that the answer to my dumb question will help some other
(dumb :-) person as well.

Ciao, marco

On Oct 19, 1:02 pm, John Machin <sjmac...@lexicon.net> wrote:


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
John Machin  
View profile  
 More options Oct 20, 9:07 am
From: John Machin <sjmac...@lexicon.net>
Date: Tue, 20 Oct 2009 10:07:52 +1100
Local: Tues, Oct 20 2009 9:07 am
Subject: Re: [pyxl] Re: Dumb question about updating xlrd version
On 20/10/2009 3:12 AM, Marco wrote:

> I have just one question: I would like it to be a bit quicker (I have
> large files):

What is "large", what are you running it on (including how much free
memory), how long does it take to do what, how long might it take to do
the same job using pywin32/COM, what is "a bit quicker", and how much
time/effort/money do you want to spend?

> is there a way of speeding it up, possibly by skipping
> formatting and just get to the data or is it just the way it is? (not
> at all a complaint).

Here are a couple of minimal-investment ideas:

(1) Try using psyco ... e.g. on my box [see below] with Python 2.6.2,
the time for open_workbook() with default args drops from 7.9 secs to
2.6 secs with a 15 Mb file; from 31 secs to 10.2 secs with a 120 Mb file
-- this is using psyco.full() immediately before the first
open_workbook() call.

(2) "possibly by skipping formatting and just get to the data": Read the
docs about the args of open_workbook(), especially the formatting_info
arg. Eyeball the source code, especially in the neighbourhood of
mentions of "formatting_info".

My box has a 2.0 GHz AMD Turion 64 Mobile (running in 32-bit mode) with
896Mb usable memory. OS is Windows XP SP3.

Cheers,
John


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
John Machin  
View profile  
 More options Oct 20, 9:13 pm
From: John Machin <sjmac...@lexicon.net>
Date: Tue, 20 Oct 2009 22:13:44 +1100
Local: Tues, Oct 20 2009 9:13 pm
Subject: Re: Dumb question about updating xlrd version
On 20/10/2009 7:36 PM, Marco wrote:

=== top-posting corrected ===

=== Also I'm presuming you didn't intend to go off-list so I'm pushing
this back there ===

 > Hi John,
 > I know that big and slow are relative concepts. I haven't measured the
 > time exactly but, for a single worksheet with 1 date and 44 numerical
 > columns and about 100,000 rows, it takes well over a minute to read
 > the values, in my very basic script.

The maximum number of rows in a single worksheet of an XLS file is
65,536 (Excel 97 to Excel 2003). Is 66 about 100?

 > I must be doing something silly
 > because  it's much slower than your script with a file of about the
 > same size. I have a dualcore centrino 1.66Gbx2 and 1GB ram (not
 > shared), on XP SP3, 32 bit, of course.
 >
 > This week I am working on something else but i will try to speed up my
 > xlrd script soon, maybe trying to give it a bit more CPU as well.

Your script may well need speeding up, but you need to read carefully to
ensure that you are comparing like with like. I wrote "the time for
open_workbook() with default args drops from 7.9 secs to 2.6 secs" ...
note that this is like it says *ONLY* the time for the open_workbook()
call. It does *NOT* include any further processing, silly or otherwise.

Dig out the runxlrd.py script (a supported installation method should
have left it in your_python_install_directory\scripts) and do this

\whatever\it\takes\runxlrd.py --help
...\runxlrd.py ov yourfile.xls
...\runxlrd.py show yourfile.xls >stdout.txt

The last line above dumps out a line for each non-empty cell. You might
like to compare the code in runxlrd.py and the time it takes with yours.

 > Psycho? Oh, no, another module!

"Psycho"??? Oh, no, another inattentive reader :-) Reread my previous
message.

As I said, all you need to do is import the module and make one function
call. The result is way up there on the bangs per buck scale. It's not
"another module".

 > I am still using python 2.5, I plan to
 > install python(x,y) 2.6 this week. Then, I will follow your
 > suggestions (Psycho), and look at the documentation!!
 >
 > In any case, i will let you kow if I make progress, maybe we could
 > start a new thread for  this topic.

More importantly, let us know if you /don't/ make progress :-) If all
else fails, publish your code for review.

Cheers,
John


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Marco  
View profile  
 More options Oct 21, 1:44 am
From: Marco <giun...@googlemail.com>
Date: Tue, 20 Oct 2009 08:44:15 -0700 (PDT)
Local: Wed, Oct 21 2009 1:44 am
Subject: Re: Dumb question about updating xlrd version

John,
this week I am working on something else. Just wanted to tell you that
since version 2007 the Excel grid is 1,048,576 rows by 16,384 columns,
even MS products improve sometimes!

Happy to hear that i can report failures in improving the speed. Will
do when i have time.

Thank you very much for your help, John.

Marco

On Oct 20, 12:13 pm, John Machin <sjmac...@lexicon.net> wrote:


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
John Machin  
View profile  
 More options Oct 21, 7:09 am
From: John Machin <sjmac...@lexicon.net>
Date: Wed, 21 Oct 2009 08:09:38 +1100
Local: Wed, Oct 21 2009 7:09 am
Subject: Re: [pyxl] Re: Dumb question about updating xlrd version
On 21/10/2009 2:44 AM, Marco wrote:

> John,
> this week I am working on something else. Just wanted to tell you that
> since version 2007 the Excel grid is 1,048,576 rows by 16,384 columns,
> even MS products improve sometimes!

That large grid is available only in XLS[XMB] files. If you use Excel
2007 to save a workbook containing cells outside the older limits as an
XLS file, it will warn you and ignore offending cells if you persist.

No released version of xlrd supports reading other than XLS files.

Your statements that your script was reading files with about 100,000
rows in a single worksheet and that you were using xlrd are not
compatible except under an extremely charitable interpretation of
"about", or some other "deus ex machina".


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »

Create a group - Google Groups - Google Home - Terms of Service - Privacy Policy
©2009 Google