Author: 戴晓珊_340 | Source: Internet | 2023-05-20 23:52
Does anyone know how to correctly output extended characters (non-BMP, i.e. taking more than one char) with Java's XMLStreamWriter? For example, trying to output Unicode U+10480:
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class XmlStreamWriterExtendedCharactersFail {
    public static void main(String[] args) throws XMLStreamException {
        // U+10480 typed directly in the source file (same prefix as sbStr so the assert can hold):
        String inlineStr = "sbStr = 𐒀";
        // create string using StringBuilder to avoid Java file encoding confusion:
        String sbStr = new StringBuilder("sbStr = ").appendCodePoint(0x10480).toString();
        assert sbStr.equals(inlineStr);
        System.out.println(sbStr);

        // wrap System.out in a writer that encodes to UTF-8
        OutputStreamWriter outWriter = new OutputStreamWriter(System.out,
                StandardCharsets.UTF_8.newEncoder());
        XMLStreamWriter writer = XMLOutputFactory.newFactory()
                .createXMLStreamWriter(outWriter);
        writer.writeStartDocument("UTF-8", "1.1");
        writer.writeStartElement("el");
        writer.writeCharacters(sbStr);
        writer.writeEndElement();
        writer.writeEndDocument();
        writer.close();
    }
}
Results in:
    sbStr = 𐒀
    <?xml version="1.1" encoding="UTF-8"?><el>sbStr = &#55297;&#56448;</el>
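Note that &#55297;&#56448; are invalid code points (they are the two UTF-16 surrogate halves of U+10480, each escaped as its own character reference), and they cause an error when the document is parsed with SAX.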
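To make the origin of the two invalid values concrete, here is a minimal sketch. It only illustrates the arithmetic; the class and method names are hypothetical, and it is not a claim about how XMLStreamWriter is implemented internally. A Java String stores U+10480 as two UTF-16 chars, so escaping char by char produces two surrogate references, while escaping by code point produces a single valid one.

```java
public class SurrogateSplit {
    /** Escapes every UTF-16 char on its own, which splits a non-BMP character in two. */
    public static String escapePerChar(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            sb.append("&#").append((int) s.charAt(i)).append(';');
        }
        return sb.toString();
    }

    /** Escapes every Unicode code point, giving one reference per character. */
    public static String escapePerCodePoint(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp ->
                sb.append("&#x").append(Integer.toHexString(cp).toUpperCase()).append(';'));
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = new StringBuilder().appendCodePoint(0x10480).toString();
        System.out.println(s.length());            // 2 (two UTF-16 code units)
        System.out.println(escapePerChar(s));      // &#55297;&#56448;
        System.out.println(escapePerCodePoint(s)); // &#x10480;
    }
}
```

55297 is 0xD801 (the high surrogate) and 56448 is 0xDC80 (the low surrogate). Neither is a legal XML character on its own, which is why a parser rejects them.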
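Expected output:
    sbStr = 𐒀
    <?xml version="1.1" encoding="UTF-8"?><el>sbStr = 𐒀</el>
Output such as
    <el>sbStr = &#x10480;</el>
would also do in a pinch, but the first is better.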
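One possible workaround, offered as an untested-in-the-question sketch rather than a confirmed answer: write the XML into a StringWriter and encode the resulting string to UTF-8 yourself. This assumes the unwanted escaping is tied to the charset encoder of the OutputStreamWriter, which the question does not confirm, and the behaviour may differ between StAX implementations. The class name XmlViaStringWriter and the helper toXml are hypothetical.

```java
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class XmlViaStringWriter {
    /** Serializes <el>text</el> into a String; the XML writer never sees a charset encoder. */
    public static String toXml(String text) {
        try {
            StringWriter buffer = new StringWriter();
            XMLStreamWriter writer = XMLOutputFactory.newFactory()
                    .createXMLStreamWriter(buffer);
            writer.writeStartDocument("UTF-8", "1.1");
            writer.writeStartElement("el");
            writer.writeCharacters(text);
            writer.writeEndElement();
            writer.writeEndDocument();
            writer.close();
            return buffer.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("XML serialization failed", e);
        }
    }

    public static void main(String[] args) {
        String sbStr = new StringBuilder("sbStr = ").appendCodePoint(0x10480).toString();
        // Encode the finished document to UTF-8 in one step, so the surrogate pair stays together.
        byte[] utf8 = toXml(sbStr).getBytes(StandardCharsets.UTF_8);
        System.out.write(utf8, 0, utf8.length);
        System.out.flush();
    }
}
```

The trade-off is that the whole document is buffered in memory before it is written, so this suits small documents better than large streams.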