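XML Chinese encoding problem
When a *.xml file is loaded into the MSXML DOM, the DOM object strips the encoding setting from the declaration,
that is, my encoding="gb2312" disappears.
What do I have to do to set a Chinese encoding in the DOM?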
Question points: 0, replies: 7
#1 losebaby(可乐罐) replied on 2003-09-03 09:35:07, score 0
I'm in a hurry here. Could someone please help answer this? Even a hint would be great.
UP
#2 surfw3() replied on 2003-09-03 10:52:17, score 0
Try this:
<script language="VBscript">
set xmldoc=createobject("microsoft.xmldom")
xmldoc.async=False
xmldoc.load("1.xml")
msgbox(xmldoc.xml)
</script>
Here the message box does not show encoding="gb2312". The same happens with big5, while other encodings do show up.
Then try this:
<script language="VBscript">
set xmldoc=createobject("microsoft.xmldom")
xmldoc.async=False
xmldoc.load("1.xml")
set docinstruction=xmldoc.childnodes.item(0)
msgbox(docinstruction.nodevalue)
</script>
This time the declaration shown in the message box carries the encoding set in the XML file, so the XML encoding held in the DOM has not changed.
#3 losebaby(可乐罐) replied on 2003-09-03 14:06:08, score 0
If the <?xml version="1.0" encoding="gb2312"?> declaration does not set a Chinese character set, IE cannot display an XML file containing Chinese correctly.
#4 surfw3() replied on 2003-09-03 17:21:37, score 0
try:
set PI=xmldoc.createProcessingInstruction("xml","version=""1.0"" encoding=""gb2312""")
xmldoc.insertBefore PI,xmldoc.childNodes.item(0)
If that does not work, delete the original declaration first and then run the code above.
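#5 jfly() replied on 2003-09-03 17:59:06, score 0
Why hand the XML to IE directly? If you use XSL to transform it into HTML and send that to IE, you only need to set the encoding in the HTML code inside the XSL.
Please be sure to save the following passage: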
We really need a KB article on encodings, because these same questions
come up repeatedly.
Here is one principle that can't be overstated: If you want encodings
preserved, don't use strings. Strings are always encoded in UTF-16 on
the Win32 platform. You can't ask MSXML to output to a GB2312 string.
It's a contradiction. Either MSXML has to output GB2312 bytes, or it
has to output a UTF-16 encoded string. It can't do both. Use stream
methods (like load, transformNodeToObject) to output bytes, and string
methods (like loadXML, transformNode) to output strings.
In your example code, you were dealing almost totally in strings,
causing major encoding headaches:
1. loadXML() takes a UTF-16 string as an argument
2. responseText returns the response converted to a UTF-16 string
3. transformNode() returns a UTF-16 string
4. Response.Write() takes a UTF-16 string as an argument. It then
magically converts the string to a byte stream encoded using the current
session codepage.
The solution is to rewrite your code to avoid caching intermediate
output in string form:
// Get response object
// responseXML returns a document created by parsing the response
// stream, so no need to call load
var xmlResponseDoc = xmlPostObject.responseXML;
// Create stylesheet object
var xslFile = "nihao.xsl";
var xslDoc = new ActiveXObject("MSXML2.DOMDocument");
xslDoc.async = 0;
xslDoc.load(Server.MapPath(xslFile));
// Apply stylesheet to xml
// Do not allow intermediate result to be cached in a string.
// Instead, output directly to the ASP response stream in order to
// preserve the requested encoding.
xmlResponseDoc.transformNodeToObject(xslDoc, Response);
I also recommend setting Response.Charset = "GB2312" as well as
Response.ContentType = "text/html" so that the browser can be quickly
informed of the content and encoding of the incoming page (this avoids
auto-detection logic).
This solution is faster and cleaner than the string solution. Use
strings when you want to display output, not when you're just shuttling
it to the next piece of a processing pipeline.
~Andy Kimball
MSXSL Dev
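The string-versus-bytes principle in the quoted message can be seen outside MSXML as well. The following is a minimal Java sketch, not MSXML code: Java strings are also sequences of UTF-16 code units, so an encoding such as GB2312 only comes into play when the string is turned into bytes. The class name StringVersusBytes and the helper encode are made up for this illustration, and the sketch assumes the Java runtime ships the GB2312 charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StringVersusBytes {
    // Hypothetical helper: encodes the same in-memory text into bytes
    // of the named charset.
    static byte[] encode(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is the two-character word for "Chinese"; the escapes
        // keep this source file independent of any particular file encoding.
        String text = "\u4e2d\u6587";

        // The string itself has no GB2312 or UTF-8 form, only UTF-16 units.
        System.out.println(text.length());

        // The encoding matters only at the moment bytes are produced.
        System.out.println(encode(text, "GB2312").length);
        System.out.println(encode(text, "UTF-8").length);

        // Decoding GB2312 bytes with the wrong charset does not restore the text.
        String wrong = new String(encode(text, "GB2312"), StandardCharsets.UTF_8);
        System.out.println(wrong.equals(text));
    }
}
```

The same text yields different byte sequences depending on the charset, which is why the quoted advice is to write to a byte stream when a specific encoding has to survive.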
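#6 binzhi() replied on 2003-09-04 17:52:02, score 0
The only way is to add this line at the top before writing to disk: <?xml version='1.0' encoding='GB2312'?>
That is what I have always done in Java, and the books do the same. I don't know whether there is another way.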
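A minimal Java sketch of the approach described in this reply might look as follows. It is only an illustration: the class name Gb2312XmlWriter, the method write and the sample body are invented here, and the sketch assumes the Java runtime provides the GB2312 charset. The point it shows is that the declaration and the actual byte encoding have to agree, so the body is written through a writer bound to the same charset that the declaration names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;

class Gb2312XmlWriter {
    // The declaration from the reply above, written by hand.
    static final String DECLARATION = "<?xml version='1.0' encoding='GB2312'?>";

    // Writes the declaration and then the body, encoding both as GB2312 bytes.
    static void write(OutputStream out, String body) throws IOException {
        Writer writer = new OutputStreamWriter(out, Charset.forName("GB2312"));
        writer.write(DECLARATION);
        writer.write("\n");
        writer.write(body);
        writer.flush(); // the caller owns and closes the underlying stream
    }

    public static void main(String[] args) throws IOException {
        // An in-memory buffer stands in for the file on disk.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        write(buffer, "<root>\u4e2d\u6587</root>");
        System.out.println(buffer.size() + " bytes written");
    }
}
```

To write to disk, a java.io.FileOutputStream can be passed in place of the buffer. Note that OutputStreamWriter substitutes a replacement for characters the charset cannot represent instead of failing, so text outside GB2312 would be altered silently.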
#7 towzy(晚枫) replied on 2003-09-04 20:55:07, score 0
The JDOM parser has a class that can be used to set the encoding: org.jdom.output.XMLOutputter;
XMLOutputter outputter = new XMLOutputter(" ", true, "gb2312");
String xml = outputter.outputString(doc);
Here doc stands for your own XML document instance (an org.jdom.Document).
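For readers without JDOM, roughly the same idea can be sketched with the XML APIs bundled with the JDK. This is not the JDOM API from the reply above. It swaps in javax.xml.transform, where OutputKeys.ENCODING selects the output encoding and the result goes to a byte stream. The class name DomEncodingSketch and its two helpers are hypothetical, and the sketch assumes the runtime's serializer supports GB2312.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomEncodingSketch {
    // Builds <root>text</root> and serializes it as bytes in the given encoding.
    static byte[] serialize(String text, String encoding) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.newDocument();
        Element root = doc.createElement("root");
        root.setTextContent(text);
        doc.appendChild(root);

        // An identity transformer copies the DOM to the output unchanged.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, encoding);

        // The target is a byte stream, not a string, so the encoding is applied.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toByteArray();
    }

    // Parses the bytes back; the parser takes the encoding from the declaration.
    static Document parse(byte[] xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml));
    }

    public static void main(String[] args) throws Exception {
        Document back = parse(serialize("\u4e2d\u6587", "GB2312"));
        System.out.println(back.getXmlEncoding());
        System.out.println(back.getDocumentElement().getTextContent().length());
    }
}
```

Reading the bytes back with a parser is a simple way to check that the declaration and the byte encoding agree.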




