[CPS-users] CPS RSS-1.0.0 | UnicodeEncodeError: 'latin-1' codec can't encode characters ...

Richard MAHONEY r.mahoney at iconz.co.nz
Mon Apr 21 10:34:13 CEST 2008


Dear Readers,

CPS_RSS-1.0.0
CPS-3.4.6
feedparser.py  3.2 (default) & 4.1 (latest)

Refreshing the following Japanese feed in the RSS Tool:

http://blogs.dion.ne.jp/sanskrit/index.rdf

results in:

UnicodeEncodeError: 'latin-1' codec can't encode characters ...

[see the full error log attached as cps-latin-error.log]
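
For reference, the same class of error can be reproduced outside CPS. The following is only a minimal sketch, assuming that somewhere a unicode feed title is being encoded as Latin-1; I have not confirmed where this happens in CPS RSS. The sample title and the helper name encode_for_latin1 are hypothetical and are not taken from the feed or from the CPS code. The second half shows one possible way to avoid the exception, by replacing unencodable characters with numeric character references.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample: a Japanese title similar in kind to what the feed
# might contain (katakana for "Sanskrit"); not copied from the real feed.
title = u"\u30b5\u30f3\u30b9\u30af\u30ea\u30c3\u30c8"

# Strict encoding to Latin-1 fails, because none of these characters
# exist in that character set.
try:
    title.encode("latin-1")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True


def encode_for_latin1(text):
    """Encode text as Latin-1 without raising.

    Characters outside Latin-1 become numeric character references
    (e.g. &#12469;), which a browser can still display in an HTML page.
    This helper is an illustration, not part of CPS or feedparser.
    """
    return text.encode("latin-1", "xmlcharrefreplace")


safe = encode_for_latin1(title)
print(strict_failed)
print(safe)
```

Whether this kind of replacement is appropriate depends on where the encoded string ends up; it suits HTML output but would be wrong for plain-text storage.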

Subsequently the RSS Tool becomes completely unusable. (The only way I
could manage to get the RSS Tool running again was to reinstall the
whole CPS site from backup.)

The problematic feed should render as follows (using SPIP):

Indica et Buddhica - Tabulae :: Kataoka, Kei
http://tabulae.indica-et-buddhica.org/rubrique.php3?id_rubrique=261

I'm not sure whether this issue with Japanese characters is related to
the incorrect rendering of Latin diacritics (many of them commonly used
in Romanised Sanskrit transliteration, e.g., a, u and i macron, S acute,
n under-dot, &c.) in the following feed:

http://www.informaworld.com/ampp/rss~content=t713405669

Incorrect (using CPS RSS):

Indica et Buddhica - Recently Published issues of Asian Philosophy
http://indica-et-buddhica.org/sections/tabulae/periodica/a/asian-philosophy/asp-recently-published

Correct (using SPIP):

Indica et Buddhica - Tabulae :: Asian Philosophy - Recently Published
http://tabulae.indica-et-buddhica.org/rubrique.php3?id_rubrique=238
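
One thing the two feeds do have in common is that the transliteration diacritics listed above lie outside Latin-1, just as the Japanese characters do. The short check below is a sketch of that fact only; it does not show that Latin-1 is the actual cause of the incorrect rendering, which remains an assumption on my part. The function name fits_latin1 is hypothetical, and e acute is included purely as a contrasting character that Latin-1 does contain.

```python
# -*- coding: utf-8 -*-
import unicodedata


def fits_latin1(ch):
    """Return True if the character can be encoded as Latin-1."""
    try:
        ch.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# a, u and i macron, S acute, n under-dot, plus e acute for contrast.
samples = [u"\u0101", u"\u016b", u"\u012b", u"\u015a", u"\u1e47", u"\u00e9"]

report = [(unicodedata.name(ch), fits_latin1(ch)) for ch in samples]
for name, ok in report:
    print("%-40s %s" % (name, ok))
```

Only the last character passes, so any step that forces these titles into Latin-1 would have to drop, replace or mangle the others.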


I'd be very happy to receive any thoughts on how these issues might be
resolved.


Kind regards,

 Richard MAHONEY



-- 
Richard MAHONEY | internet: http://indica-et-buddhica.org/
Littledene      | telephone/telefax (man.): +64 3 312 1699
Bay Road        | cellular: +64 27 482 9986
OXFORD, NZ      | email: r.mahoney at indica-et-buddhica.org
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Indica et Buddhica: Materials for Indology and Buddhology
Scholia: http://scholia.indica-et-buddhica.org/
Tabulae: http://tabulae.indica-et-buddhica.org/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cps-latin-error.log
Type: text/x-log
Size: 5116 bytes
Desc: not available
Url : http://lists.nuxeo.com/pipermail/cps-users/attachments/20080421/408dc970/cps-latin-error.bin