How do you htmlentities in Python
From: Matimus <mccre...@gmail.com>
Newsgroups: comp.lang.python
Subject: Re: How do you htmlentities in Python
Date: Mon, 04 Jun 2007 17:17:27 -0000
On Jun 4, 6:31 am, "js " <ebgs...@gmail.com> wrote:
> Hi list.
>
> If I'm not mistaken, in python, there's no standard library to convert
> html entities, like &amp; or &gt; into their applicable characters.
>
> htmlentitydefs provides maps that helps this conversion,
> but it's not a function so you have to write your own function
> make use of htmlentitydefs, probably using regex or something.
>
> To me this seemed odd because python is known as
> 'Batteries Included' language.
>
> So my questions are
> 1. Why doesn't python have/need entity encoding/decoding?
> 2. Is there any idiom to do entity encode/decode in python?
>
> Thank you in advance.
I think this is the standard idiom:
>>> import xml.sax.saxutils as saxutils
>>> saxutils.escape("&")
'&amp;'
>>> saxutils.unescape("&gt;")
'>'
>>> saxutils.unescape("A bunch of text with entities: &amp; &gt; &lt;")
'A bunch of text with entities: & > <'
Notice that both escape() and unescape() take an optional second
parameter (a dict) that can be used to define additional entities as well.
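Here is a rough sketch of that optional dict in use. By default saxutils
only handles &amp;, &lt; and &gt;, so anything else has to be passed in.
The two dicts and the sample strings below are made up for illustration;
pick whatever mappings your data actually needs.

```python
from xml.sax.saxutils import escape, unescape

# Hypothetical extra mappings, chosen only for this example.
# For escape(), keys are raw characters and values are entities.
extra_escape = {'"': "&quot;"}
# For unescape(), keys are entities and values are raw characters.
extra_unescape = {"&quot;": '"', "&nbsp;": "\u00a0"}

# &, < and > are handled first, then the extra dict is applied.
escaped = escape('say "hi" & <go>', extra_escape)

# &lt; and &gt; are handled first, then the extra dict, then &amp; last.
unescaped = unescape("&quot;a&nbsp;b&quot; &amp; c", extra_unescape)

print(escaped)
print(repr(unescaped))
```

Note the dicts run in opposite directions, so one dict cannot be reused
for both calls without inverting it.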
Matt