Hello all,
I have something of a problem that I was hoping the gods of the web who
reside here could help me with. I'm generating a sitemap for Google, and
my site is rather large (2 million+ pages). When I run the Python
script, it starts off without a hitch and works beautifully until it
reaches sitemap54.xml.gz, at which point it crashes without fail. Below
is the message I get. (I cut the file path down to save space, as you
don't need to see the huge file path it goes through.)
---
Writing Sitemap file "(file path)/sitemap53.xml.gz" with 50000 URLs
Sorting and normalizing collected URLs.
Writing Sitemap file "(file path)/sitemap54.xml.gz" with 50000 URLs
Traceback (most recent call last):
  File "sitemap_gen.py", line 2208, in ?
    sitemap.Generate()
  File "sitemap_gen.py", line 1780, in Generate
    input.ProduceURLs(self.ConsumeURL)
  File "sitemap_gen.py", line 979, in ProduceURLs
    os.path.walk(self._path, PerDirectory, None)
  File "/usr/lib/python2.4/posixpath.py", line 298, in walk
    walk(name, func, arg)
  File "/usr/lib/python2.4/posixpath.py", line 298, in walk
    walk(name, func, arg)
  File "/usr/lib/python2.4/posixpath.py", line 298, in walk
    walk(name, func, arg)
  File "/usr/lib/python2.4/posixpath.py", line 298, in walk
    walk(name, func, arg)
  File "/usr/lib/python2.4/posixpath.py", line 298, in walk
    walk(name, func, arg)
  File "/usr/lib/python2.4/posixpath.py", line 290, in walk
    func(arg, top, names)
  File "sitemap_gen.py", line 974, in PerDirectory
    PerFile(dirpath, name)
  File "sitemap_gen.py", line 959, in PerFile
    consumer(url, False)
  File "sitemap_gen.py", line 1841, in ConsumeURL
    self._urls[hash] = 1
MemoryError
---
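As far as I can tell from the traceback, the failing line
(self._urls[hash] = 1) adds one dictionary entry per URL, presumably to
keep track of URLs already seen, so the dictionary keeps growing for as
long as the script runs. Below is a rough sketch of that pattern, not
the actual sitemap_gen.py code: consume_url, the seen dictionary, the
example.com URLs, and the use of an MD5 digest as the key are all my own
stand-ins for illustration.

```python
import hashlib
import sys


def consume_url(seen, url):
    """Record a digest of url in seen; return True if the URL is new.

    Hypothetical stand-in for ConsumeURL: the real script may compute
    its hash differently.
    """
    digest = hashlib.md5(url.encode("utf-8")).digest()
    if digest in seen:
        return False
    seen[digest] = 1  # same shape as self._urls[hash] = 1 in the traceback
    return True


# One entry is kept per distinct URL and nothing is ever removed,
# so memory use grows with the total number of pages on the site.
seen = {}
for i in range(100000):
    consume_url(seen, "http://example.com/page%d.html" % i)

print("entries kept in memory:", len(seen))
print("size of the dict itself in bytes (keys not included):", sys.getsizeof(seen))
```

If this is roughly what is going on, then writing out each sitemap file
would not release anything, because the entries from all earlier files
are still held in the dictionary.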
Does anyone have any insight into, or workarounds for, how I can free
up the memory that is apparently gummed up by this process? Any help is
greatly appreciated!
THANK YOU!