How do you htmlentities in Python
NNTP-Posting-Date: Tue, 05 Jun 2007 14:08:01 -0500
Newsgroups: comp.lang.python
Subject: Re: How do you htmlentities in Python
References: <mailman.8674.1180963921.32031.python-list@python.org> <1180977447.745432.109040@q19g2000prn.googlegroups.com>
X-Newsreader: trn 4.0-test76 (Apr 2, 2001)
From: cla...@lairds.us (Cameron Laird)
Originator: cla...@lairds.us (Cameron Laird)
Date: Tue, 5 Jun 2007 18:36:26 +0000
Message-ID: <a2ngj4-t2i.ln1@lairds.us>
Lines: 41
In article <1180977447.745432.109...@q19g2000prn.googlegroups.com>,
Matimus <mccre...@gmail.com> wrote:
>On Jun 4, 6:31 am, "js " <ebgs...@gmail.com> wrote:
>> Hi list.
>>
>> If I'm not mistaken, in python, there's no standard library to convert
>> html entities, like &amp; or &gt; into their applicable characters.
>>
>> htmlentitydefs provides maps that help with this conversion,
>> but it's not a function, so you have to write your own function
>> that makes use of htmlentitydefs, probably using a regex or something.
>>
>> To me this seemed odd because python is known as a
>> 'Batteries Included' language.
>>
>> So my questions are
>> 1. Why doesn't python have/need entity encoding/decoding?
>> 2. Is there any idiom to do entity encode/decode in python?
>>
>> Thank you in advance.
>
>I think this is the standard idiom:
>
>>>> import xml.sax.saxutils as saxutils
>>>> saxutils.escape("&")
>'&amp;'
>>>> saxutils.unescape("&gt;")
>'>'
>>>> saxutils.unescape("A bunch of text with entities: &amp; &gt; &lt;")
>'A bunch of text with entities: & > <'
>
>Notice there is an optional parameter (a dict) that can be used to
>define additional entities as well.
.
.
.
Good points; I like your mention of the optional entity dictionary.
Your solution may address a different problem than the one the original
poster had in mind, though. <URL: http://wiki.python.org/moin/EscapingHtml >
has details about HTML entities vs. XML entities.
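
To make the HTML-vs-XML distinction concrete, here is a minimal sketch. It
assumes a modern Python 3 (3.4 or later), where the standard library has an
html module; that module did not exist when this thread was written, and the
sample string and variable names are made up for illustration. By default
saxutils.unescape handles only &amp;, &lt; and &gt;, so HTML-only entities
such as &eacute; pass through untouched unless you supply them in the
optional dictionary.

```python
import html
from xml.sax import saxutils

# Hypothetical sample text mixing XML-safe and HTML-only entities.
text = "caf&eacute; &amp; cr&egrave;me &lt;b&gt;"

# saxutils.unescape knows only &amp;, &lt; and &gt; by default,
# so &eacute; and &egrave; are left as they are.
xml_only = saxutils.unescape(text)

# The optional dict maps extra entity strings to their replacements.
extra = {"&eacute;": "\u00e9", "&egrave;": "\u00e8"}
with_extra = saxutils.unescape(text, extra)

# html.unescape decodes the full set of HTML named and numeric references.
decoded = html.unescape(text)

# html.escape goes the other way for the characters that matter in markup.
encoded = html.escape("<a & b>")

print(xml_only)
print(with_extra)
print(decoded)
print(encoded)
```

If you only ever meet the three XML entities, saxutils is enough; for text
scraped from real HTML pages, something that knows the full entity table is
probably what the original poster was after.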