Message from discussion Python 3000: Standard API for archives?

From: samwyse <dejan...@email.com>
Newsgroups: comp.lang.python
Subject: Python 3000: Standard API for archives?
Message-ID: <%aT8i.12569$rO7.12072@newssvr25.news.prodigy.net>
Date: Mon, 04 Jun 2007 12:02:03 GMT

I'm a relative newbie to Python, so please bear with me.  There are 
currently two standard modules used to access archived data:  zipfile 
and tarfile.  The interfaces are completely different.  In particular, 
a script that wants to analyze different types of archives must duplicate 
substantial pieces of logic.  The problem is not limited to method 
names; it includes how stat-like information is accessed.
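
To make the duplication concrete, here is a minimal sketch of the same 
task (listing member names and sizes) written once per module.  The 
function names list_zip and list_tar are made up for illustration; only 
the zipfile and tarfile calls come from the standard library.

```python
import tarfile
import zipfile


def list_zip(fileobj):
    """Return (name, size) pairs for every member of a zip archive."""
    with zipfile.ZipFile(fileobj) as zf:
        # zipfile exposes stat-like data as ZipInfo.filename / .file_size
        return [(info.filename, info.file_size) for info in zf.infolist()]


def list_tar(fileobj):
    """Return (name, size) pairs for every regular file in a tar archive."""
    with tarfile.open(fileobj=fileobj) as tf:
        # tarfile exposes the same facts as TarInfo.name / .size
        return [(info.name, info.size)
                for info in tf.getmembers() if info.isfile()]
```

Both functions answer the same question, yet they share no method or 
attribute names.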

I think it would be a good thing if a standardized interface existed, 
similar to PEP 247.  This would make it easier for one script to access 
multiple types of archives, such as RAR, 7-Zip, ISO, etc.  In 
particular, a single factory class could produce PEP 302 import hooks 
for future as well as current archive formats.

I think that an archive module adhering to the standard should adopt a 
least-common-denominator approach, initially supporting read-only access 
without seek, e.g. tar files on actual tape.  For applications that 
require a seek method (such as importers) a standard wrapper class could 
transparently cache archive members in temp files; this would fit in 
well with Python 3000's rewrite of the I/O interface to support 
stackable interfaces.  To this end, we'd need is_seekable and 
is_writable attributes for both the module and instances (the 
module-level attribute would declare whether something is possible, not 
whether it is always true).
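
As a rough sketch of the wrapper idea, assuming the underlying object 
follows the io module's file protocol: if the stream cannot seek, copy 
it into a temp file and hand that back instead.  The name 
ensure_seekable is hypothetical, not an existing API.

```python
import shutil
import tempfile


def ensure_seekable(stream):
    """Return `stream` itself if it can seek, else a temp-file copy.

    Hypothetical helper: the copy is rewound to offset 0 so callers
    can treat both cases the same way.
    """
    if stream.seekable():
        return stream
    cached = tempfile.TemporaryFile()   # deleted automatically on close
    shutil.copyfileobj(stream, cached)  # drain the one-way stream
    cached.seek(0)
    return cached
```

An importer could call this on each archive member before using it, 
while a tape-style reader would skip the wrapper and pay nothing.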

Most importantly, all archive modules should provide a standard API for 
accessing their individual files via a single archive_content class that 
provides a standard 'read' method.  Less important, but nice to have, 
would be a way for archives to be scanned automatically during 
directory walks.
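
Here is one way the pieces might fit together; treat it as a sketch, 
not a proposal for exact names.  ArchiveContent, ZipReader, TarReader 
and open_archive are all invented for this example, and the tar reader 
deliberately uses tarfile's stream mode ("r|*") to mimic a tape, so each 
member has to be read before moving on to the next one.

```python
import tarfile
import zipfile
from dataclasses import dataclass
from typing import Callable


@dataclass
class ArchiveContent:
    """One archive member: common stat-like fields plus read()."""
    name: str
    size: int
    reader: Callable[[], bytes]

    def read(self) -> bytes:
        return self.reader()


class ZipReader:
    is_seekable = True    # zip members can be read in any order
    is_writable = False   # this sketch is read-only

    def __init__(self, fileobj):
        self._zf = zipfile.ZipFile(fileobj)

    def __iter__(self):
        for info in self._zf.infolist():
            if not info.is_dir():
                yield ArchiveContent(info.filename, info.file_size,
                                     lambda i=info: self._zf.read(i))


class TarReader:
    is_seekable = False   # stream mode: forward-only, like a tape
    is_writable = False

    def __init__(self, fileobj):
        self._tf = tarfile.open(fileobj=fileobj, mode="r|*")

    def __iter__(self):
        for info in self._tf:
            if info.isfile():
                # read() must be called before advancing the iterator
                yield ArchiveContent(
                    info.name, info.size,
                    lambda i=info: self._tf.extractfile(i).read())


def open_archive(fileobj):
    """Hypothetical factory; sniffing the format needs a seekable input."""
    is_zip = zipfile.is_zipfile(fileobj)
    fileobj.seek(0)
    return ZipReader(fileobj) if is_zip else TarReader(fileobj)
```

A caller would then write "for member in open_archive(f): 
data = member.read()" without caring which format it was handed.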

Feedback?
