stripping lines from an excel file
From: Sage <sagear...@gmail.com>
Date: Thu, 22 Oct 2009 11:53:30 -0700 (PDT)
Local: Fri, Oct 23 2009 4:53 am
Subject: Re: stripping lines from an excel file
I am trying to find a string and copy the lines that contain it to
another Excel file.
My issue now is that I can only copy the cells that come after the cell
containing the string I flag.  I've tried to store each line in a
temporary list (not good for speed or memory), but I can't, I think
because each time "def cell" is called everything has to be defined
anew.  So if I try writing the new cell to an array, I get an error that
I am trying to access it before assigning it.  I am the original poster
and have got a closer-to-working script, which is pasted below.  (After
the script I have pasted the small test.xls grid I am testing it on.)

import os
from xlutils.filter import \
    BaseReader,BaseFilter,BaseWriter,process

class Reader(BaseReader):
    def get_filepaths(self):
        return [os.path.abspath('test.xls')]

class Writer(BaseWriter):
    def get_stream(self,filename):
        return open(filename,'wb')

class Filter(BaseFilter):
    pending_row = None
    wtrowxi = 0
    def workbook(self,rdbook,wtbook_name):
        self.next.workbook(rdbook,'filtered-'+wtbook_name)
    def row(self,rdrowx,wtrowx):
        self.pending_row = (rdrowx,wtrowx)
    def cell(self,rdrowx,rdcolx,wtrowx,wtcolx):
        if rdrowx==0 and rdcolx==0:
            self.print_row_num=-1
        value = self.rdsheet.cell(rdrowx,rdcolx).value
        if value == 'x':
            self.print_row_num = rdrowx
            self.print_row = True
            rdrowx, wtrowx = self.pending_row
            self.next.row(rdrowx,wtrowx+self.wtrowxi)
            self.wtrowxi -= 1
        else:
            self.print_row = False
        if (rdrowx == self.print_row_num):
            self.next.cell(rdrowx,rdcolx,wtrowx+self.wtrowxi,wtcolx)

process(Reader(),Filter(),Writer())

---------------------------------------------------------------------------
test.xls grid below

A1      B1      C1      D1      E1      Header
x       B2      C2      D2      E2      0.6
C1      B3      C3      D3      E3      6
D1      B4      x       D4      E4      66

---------------------------------------------------------------------------

results from running python on test.xls

x       B2      C2      D2      E2      0.6
                x       D4      E4      66

From: Chris Withers <ch...@simplistix.co.uk>
Date: Tue, 27 Oct 2009 16:24:31 +0000
Local: Wed, Oct 28 2009 3:24 am
Subject: Re: [pyxl] Re: stripping lines from an excel file

You don't appear to actually use "print_row" at all...

How about the following instead:

class Filter(BaseFilter):

    pending_row = None
    wtrowxi = 0
    filtered = False

    def flush(self):
        # Emit the buffered row unless one of its cells matched 'x'.
        if self.pending_row and not self.filtered:
            rdrowx,wtrowx = self.pending_row
            wtrowx += self.wtrowxi
            self.next.row(rdrowx,wtrowx)
            for cell in self.cells:
                rdcolx,wtcolx = cell
                self.next.cell(rdrowx,rdcolx,wtrowx,wtcolx)
        # Clear the buffer so the same row is never emitted twice.
        self.pending_row = None

    def workbook(self,rdbook,wtbook_name):
        self.flush()
        self.next.workbook(rdbook,'filtered-'+wtbook_name)

    def sheet(self,rdsheet,wtsheet_name):
        self.flush()
        self.rdsheet = rdsheet
        self.wtrowxi = 0  # the row offset starts again on each sheet
        self.next.sheet(rdsheet,wtsheet_name)

    def row(self,rdrowx,wtrowx):
        self.flush()
        self.pending_row = (rdrowx,wtrowx)
        self.filtered = False
        self.cells = []

    def cell(self,rdrowx,rdcolx,wtrowx,wtcolx):
        if self.filtered:
            return
        elif self.rdsheet.cell_value(rdrowx,rdcolx)=='x':
            self.filtered = True  # assignment, not a comparison
            self.wtrowxi -= 1
            return
        else:
            assert (rdrowx,wtrowx)==self.pending_row
            self.cells.append((rdcolx,wtcolx))

    def finish(self):
        self.flush()  # the last row is still buffered at this point
        self.next.finish()
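
The buffer-then-flush idea can also be tried without xlutils or a
workbook. What follows is only a minimal sketch, assuming the cells
arrive as (row, column, value) tuples in row-major order, the same order
in which cell() is called. The names filter_rows and keep_matching are
made up for this illustration and are not part of xlutils. Note that the
Filter above drops rows containing 'x'; keep_matching=True does the
opposite and keeps them, which is what the original post asks for.

```python
from itertools import groupby
from operator import itemgetter


def filter_rows(cell_events, marker='x', keep_matching=True):
    """Yield (new_row, col, value) for the rows that pass the filter.

    cell_events: iterable of (row, col, value) tuples in row-major order.
    keep_matching: True keeps rows containing the marker,
                   False drops them instead.
    """
    out_row = 0
    # groupby collects consecutive cells that share a row index.
    for _, cells in groupby(cell_events, key=itemgetter(0)):
        cells = list(cells)  # buffer the whole row before deciding
        has_marker = any(value == marker for _, _, value in cells)
        if has_marker == keep_matching:
            for _, col, value in cells:
                yield (out_row, col, value)
            out_row += 1  # output rows are renumbered without gaps


# The test.xls grid from the earlier message, as plain Python data.
grid = [
    ['A1', 'B1', 'C1', 'D1', 'E1', 'Header'],
    ['x',  'B2', 'C2', 'D2', 'E2', 0.6],
    ['C1', 'B3', 'C3', 'D3', 'E3', 6],
    ['D1', 'B4', 'x',  'D4', 'E4', 66],
]
events = [(r, c, v) for r, row in enumerate(grid) for c, v in enumerate(row)]

for new_row, col, value in filter_rows(events):
    print(new_row, col, value)
```

Because the decision is made only after the whole row has been
buffered, the cells to the left of the 'x' are no longer lost.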

cheers,

Chris

--
Simplistix - Content Management, Batch Processing & Python Consulting
            - http://www.simplistix.co.uk

