Message from discussion python sitemap_gen.py MemoryError
From: BadXAsh
Date: Wed, 2 Jul 2008 08:15:29 -0700 (PDT)
Local: Thurs, Jul 3 2008 1:15 am
Subject: python sitemap_gen.py MemoryError
Hello all,

I have a problem I was hoping the gods of the web who reside here
could help me with. I'm generating a sitemap for Google, and my site
is rather large (2 million+ pages). When I run the Python script, it
starts off without a hitch and works beautifully until it reaches
sitemap54.xml.gz, and then it crashes without fail. Below is the
message I get. (I cut the file path down to save space, as you don't
need to see the huge path it goes through.)

---
Writing Sitemap file "(file path)/sitemap53.xml.gz" with 50000 URLs
Sorting and normalizing collected URLs.
Writing Sitemap file "(file path)/sitemap54.xml.gz" with 50000 URLs
Traceback (most recent call last):
  File "sitemap_gen.py", line 2208, in ?
    sitemap.Generate()
  File "sitemap_gen.py", line 1780, in Generate
    input.ProduceURLs(self.ConsumeURL)
  File "sitemap_gen.py", line 979, in ProduceURLs
    os.path.walk(self._path, PerDirectory, None)
  File "/usr/lib/python2.4/posixpath.py", line 298, in walk
    walk(name, func, arg)
  File "/usr/lib/python2.4/posixpath.py", line 298, in walk
    walk(name, func, arg)
  File "/usr/lib/python2.4/posixpath.py", line 298, in walk
    walk(name, func, arg)
  File "/usr/lib/python2.4/posixpath.py", line 298, in walk
    walk(name, func, arg)
  File "/usr/lib/python2.4/posixpath.py", line 298, in walk
    walk(name, func, arg)
  File "/usr/lib/python2.4/posixpath.py", line 290, in walk
    func(arg, top, names)
  File "sitemap_gen.py", line 974, in PerDirectory
    PerFile(dirpath, name)
  File "sitemap_gen.py", line 959, in PerFile
    consumer(url, False)
  File "sitemap_gen.py", line 1841, in ConsumeURL
    self._urls[hash] = 1
MemoryError
---
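
Going by the last frame of the traceback, the script seems to keep
one dictionary entry per URL (self._urls[hash] = 1) for the whole run,
so memory use would grow with the number of pages. Here is a minimal
sketch of that pattern as I understand it. It assumes the key is a
hash of the URL string; the class name, method name, and example URLs
are made up for illustration and are not the actual sitemap_gen.py
code.

```python
import hashlib


class UrlCollector:
    """Hypothetical stand-in for the sitemap object (not the real class)."""

    def __init__(self):
        # One entry per distinct URL hash, never released during the run.
        self._urls = {}

    def consume_url(self, url):
        """Record a URL; return True if it is new, False if already seen."""
        digest = hashlib.md5(url.encode("utf-8")).digest()
        if digest in self._urls:
            return False
        self._urls[digest] = 1
        return True


collector = UrlCollector()
for i in range(1000):
    collector.consume_url("http://example.com/page%d.html" % i)

# The dictionary holds one entry per distinct URL seen so far.
print(len(collector._urls))
```

If that reading is right, the dictionary would hold millions of
entries by the time the walk reaches the later sitemap files, on top
of whatever else the script keeps per URL.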

Does anyone have any insight or workarounds for how I can free up the
memory that this process apparently gums up? Any help is greatly
appreciated!

THANK YOU!

