Thought I'd mention that the preferred approach to this in other software systems seems to be to map the illegal characters to/from the Unicode Private Use Area.
So illegal XML character 11 (0xB) becomes code point U+E00B, i.e. the original value offset by 0xE000.
Apparently this approach is used by some pieces of commercial software, notably Microsoft Visio.
http://msdn.microsoft.com/en-us/library/office/aa218415%28v=office.10%29.aspx
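
For what it's worth, here is a rough sketch of that mapping in Python. The 0xE000 offset is only inferred from the single 0xB example above, and the names (PUA_BASE, to_xml_safe, from_xml_safe) are made up for illustration. It covers only the C0 control characters that XML 1.0 forbids, not surrogates or U+FFFE/U+FFFF, so treat it as a starting point rather than how any particular product does it.

```python
# Assumption: illegal characters are shifted into the Private Use Area by a
# fixed offset, inferred from the single example 0xB -> 0xE00B.
PUA_BASE = 0xE000

# C0 controls that XML 1.0 forbids: everything below 0x20 except
# tab (0x9), line feed (0xA) and carriage return (0xD).
ILLEGAL_C0 = [cp for cp in range(0x20) if cp not in (0x9, 0xA, 0xD)]

# Translation tables keyed by code point, as str.translate expects.
TO_PUA = {cp: PUA_BASE + cp for cp in ILLEGAL_C0}
FROM_PUA = {pua: cp for cp, pua in TO_PUA.items()}


def to_xml_safe(text: str) -> str:
    """Replace XML-illegal C0 controls with their PUA stand-ins."""
    return text.translate(TO_PUA)


def from_xml_safe(text: str) -> str:
    """Map the PUA stand-ins back to the original control characters."""
    return text.translate(FROM_PUA)


if __name__ == "__main__":
    original = "a\x0bb\tc"
    encoded = to_xml_safe(original)
    print([hex(ord(ch)) for ch in encoded])
    print(from_xml_safe(encoded) == original)
```

One caveat with this sketch: if the data already contains code points in U+E000..U+E01F, the reverse mapping will turn them into control characters, so the round trip is only lossless when the original data makes no other use of that range.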
...mikeb
--
Mike Beckerle | OGF DFDL WG Co-Chair
Tel: 781-330-0412