Message from discussion How do you htmlentities in Python

From: "Thomas Jollans" <tho...@jollans.NOSPAM.com>
Newsgroups: comp.lang.python
References: <mailman.8674.1180963921.32031.python-list@python.org> <1180965792.757685.132580@q75g2000hsh.googlegroups.com>
Subject: Re: How do you htmlentities in Python
Date: Mon, 4 Jun 2007 17:14:56 +0100
Message-ID: <46643c3c$0$2892$6e1ede2f@read.cnntp.org>
Organization: CNNTP

"Adam Atlas" <a...@atlas.st> wrote in message 
news:1180965792.757685.132580@q75g2000hsh.googlegroups.com...
> As far as I know, there isn't a standard idiom to do this, but it's
> still a one-liner. Untested, but I think this should work:
>
> import re
> from htmlentitydefs import name2codepoint
> def htmlentitydecode(s):
>    return re.sub('&(%s);' % '|'.join(name2codepoint), lambda m:
>         name2codepoint[m.group(1)], s)
>

'&(%s);' won't quite work: HTML (and, I assume, SGML, but not XHTML, which
is XML) allows you to omit the semicolon after an entity if it's followed by
whitespace (IIRC). If that should be respected, the pattern looks more like
this:
r'&(%s)([;\s]|$)'
Note that the second group can then match a whitespace character, which the
replacement has to put back.
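
To make that concrete, here is a rough sketch of the quoted function using the more lenient pattern. It is written for Python 3, so it assumes html.entities (the Python 3 name of htmlentitydefs) and chr() instead of unichr(); the name decode_named_entities is just made up for illustration.

```python
import re
from html.entities import name2codepoint  # Python 3 name of htmlentitydefs

# Entity name, then a semicolon, a whitespace character, or end of string.
_NAMED = re.compile(
    r'&(%s)([;\s]|$)' % '|'.join(map(re.escape, name2codepoint))
)

def decode_named_entities(s):
    """Replace named entities, tolerating a missing semicolon before whitespace."""
    def repl(m):
        char = chr(name2codepoint[m.group(1)])
        terminator = m.group(2)
        # The semicolon is part of the entity and is dropped;
        # whitespace (or nothing, at end of string) is text and is kept.
        return char if terminator == ';' else char + terminator
    return _NAMED.sub(repl, s)

print(decode_named_entities('a &lt; b &amp c'))
```

Something like '&amplifier' is left alone, because 'amp' is not followed by a semicolon, whitespace, or the end of the string.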

Also, this completely ignores numeric character references, which are also
found in XML (e.g. &#x20; or &#32; for ' '). Maybe some part of the HTMLParser
module is useful; I wouldn't know. IMHO, these particular batteries aren't too
commonly needed.
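
For the numeric references, a minimal sketch (again Python 3; decode_numeric_references is a hypothetical helper name, and it only handles the strict form with a trailing semicolon) could look like this. Later Python versions (3.4 and up) also ship html.unescape, which covers both named and numeric references, so the last line uses it for comparison.

```python
import re
from html import unescape  # standard library, Python 3.4+

# &#x20; (hexadecimal) or &#32; (decimal), semicolon required here.
_NUMERIC = re.compile(r'&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));')

def decode_numeric_references(s):
    """Replace numeric character references with the characters they name."""
    def repl(m):
        hex_digits, dec_digits = m.groups()
        codepoint = int(hex_digits, 16) if hex_digits else int(dec_digits)
        try:
            return chr(codepoint)
        except (ValueError, OverflowError):
            # Not a valid code point: leave the reference untouched.
            return m.group(0)
    return _NUMERIC.sub(repl, s)

print(repr(decode_numeric_references('a&#x20;b&#32;c')))
print(repr(unescape('&lt;p&gt;&#x20;&amp;')))
```

If html.unescape is available, it is probably the better choice than either hand-rolled regex.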

Regards,
Thomas Jollans 
