00570: Illegal XML code generated by RSS

Summary: Illegal XML code generated by RSS
Created: 2005-10-24 11:23
Status: Closed - fixed for 2.1.beta14
Category: Bug
From: Jean-Dom
Assigned:
Priority: 3
Version: 2.0.10
OS: Linux, Apache, PHP5 or Windows, Apache, PHP4

Description: This bug seems to be a variation of 00235.

When the page text contains accented characters, they are not converted from ISO-8859-1 to UTF-8.

As a result, the generated RSS feed is reported as invalid.

For instance, when I run the Feed Validator on my feed, I get the following warning:

*************************
Warning
This feed is valid, but may cause problems for some users. We recommend fixing these problems.
    * line 94, column 1: description contains bad characters (6 occurrences) [help]
*************************
93. #           <description>Bienvenue&#160;!
94. # L&#146;&#233;conomie de march&#233;, ce n&#146;est pas la jungle&#133; mais parfois, &#231;
*************************

The actual text is: L'économie de marché, ce n'est pas la jungle… mais parfois
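
One possible reading of the warning, offered as an assumption rather than a confirmed diagnosis: the references &#146; and &#133; fall in the range 128-159, which holds control characters in ISO-8859-1 and Unicode but printable punctuation (curly quote, ellipsis) in Windows-1252. A minimal Python sketch of remapping such references to their Unicode code points follows. The function name fix_c1_references is hypothetical and this is not code from RSS.php.

```python
import re


def fix_c1_references(text: str) -> str:
    """Rewrite numeric references 128-159 as the Unicode code points of the
    Windows-1252 characters they most likely stand for (an assumption)."""

    def repl(match: re.Match) -> str:
        code = int(match.group(1))
        if 128 <= code <= 159:
            try:
                ch = bytes([code]).decode("cp1252")
            except UnicodeDecodeError:
                # A few byte values are undefined in Windows-1252; leave them alone.
                return match.group(0)
            return f"&#{ord(ch)};"
        return match.group(0)

    return re.sub(r"&#(\d+);", repl, text)


sample = "L&#146;&#233;conomie de march&#233;, ce n&#146;est pas la jungle&#133;"
print(fix_c1_references(sample))
# prints: L&#8217;&#233;conomie de march&#233;, ce n&#8217;est pas la jungle&#8230;
```

References outside 128-159, such as &#233; for é, are left untouched because they already name valid Unicode characters.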

The HTML pages are encoded in ISO-8859-1, as declared in the HTML headers. Because XML parsers assume UTF-8 unless another encoding is declared, the characters should be converted to UTF-8 during RSS generation.
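
The conversion asked for here could look roughly like the following Python sketch. It only illustrates the transcoding step. The function name to_utf8 and its source_encoding parameter are hypothetical, and the real fix would live in the PHP feed script.

```python
def to_utf8(raw: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Decode page bytes from the wiki's encoding and re-encode them as UTF-8."""
    text = raw.decode(source_encoding)
    return text.encode("utf-8")


# Page text as the wiki would store it in ISO-8859-1: é is the single byte 0xE9.
page_bytes = "L'économie de marché".encode("iso-8859-1")
feed_bytes = to_utf8(page_bytes)

# In UTF-8 the same character becomes the two bytes 0xC3 0xA9.
print(feed_bytes)
```

An alternative that avoids transcoding is to keep the ISO-8859-1 bytes and state that encoding in the feed's XML declaration. Which approach suits the feed script better is not something this report settles.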

Thanks in advance,

Jean-Dominique, Annecy, France.


For my wiki, I made a Unicode RSS 2.0 feed with the full HTML of pages embedded. (Yes, it's valid.) Email boxcarl at my domain if you want to see my modifications to RSS.php.

--Boxcarl