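In "Java XML Parsing - Merging output of xi:include", the poster wants to use XInclude but has used the wrong namespace on the include elements. I thought that putting an XMLFilter in front of an XInclude-aware parser, with the XMLFilter correcting the namespace, could solve this without manually editing the files and without a separate processing step that first creates intermediate files with the corrected namespace.
So I wrote the following XMLFilter, which extends the XMLFilterImpl provided by SAX: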
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;

public class XIncludeNsFixup extends XMLFilterImpl {
    static String correctURI = "http://www.w3.org/2001/XInclude";
    static String oldURI = "http://www.w3.org/2003/XInclude";

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {
        if (uri.equals(oldURI)) {
            super.startElement(correctURI, localName, qName, atts);
        }
        else {
            super.startElement(uri, localName, qName, atts);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (uri.equals(oldURI)) {
            super.endElement(correctURI, localName, qName);
        }
        else {
            super.endElement(uri, localName, qName);
        }
    }

    @Override
    public void startPrefixMapping(String prefix, String uri) throws SAXException {
        if (uri.equals(oldURI)) {
            super.startPrefixMapping(prefix, correctURI);
        }
        else {
            super.startPrefixMapping(prefix, uri);
        }
    }
}
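To confirm that the filter does what it is meant to do on its own, independent of XInclude, a check along these lines can be run. This is only a sketch: FilterCheck, NsFixup and elementNamespaces are names made up here, NsFixup is a compact stand-in for the XIncludeNsFixup filter above, and the input is an in-memory string rather than a file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

class FilterCheck {
    static final String CORRECT_URI = "http://www.w3.org/2001/XInclude";
    static final String OLD_URI = "http://www.w3.org/2003/XInclude";

    /** Hypothetical compact stand-in for XIncludeNsFixup: maps the old URI to the correct one. */
    static class NsFixup extends XMLFilterImpl {
        private static String fix(String uri) {
            return OLD_URI.equals(uri) ? CORRECT_URI : uri;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {
            super.startElement(fix(uri), localName, qName, atts);
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            super.endElement(fix(uri), localName, qName);
        }

        @Override
        public void startPrefixMapping(String prefix, String uri) throws SAXException {
            super.startPrefixMapping(prefix, fix(uri));
        }
    }

    /** Parses the XML string through the filter and records each element as "localName {namespaceURI}". */
    static List<String> elementNamespaces(String xml) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true); // XInclude awareness is deliberately left off for this check
        XMLReader reader = spf.newSAXParser().getXMLReader();
        NsFixup filter = new NsFixup();
        filter.setParent(reader);
        final List<String> seen = new ArrayList<>();
        // The handler sits behind the filter, so it sees the events the filter passes on.
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                seen.add(localName + " {" + uri + "}");
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return seen;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<context><xi:include xmlns:xi=\"" + OLD_URI + "\" href=\"reuse.xml\"/></context>";
        // prints [context {}, include {http://www.w3.org/2001/XInclude}]
        System.out.println(elementNamespaces(xml));
    }
}
```

The handler registered on the filter only ever sees the corrected URI, which shows the renaming itself works for whatever consumes the filter's output.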
Then I created an XInclude-aware SAXParser and its XMLReader, chained the filter to it, and loaded the sample document as a SAXSource from that filter into a default Transformer to build a DOMResult:
SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setNamespaceAware(true);
spf.setXIncludeAware(true);
XMLReader inputReader = spf.newSAXParser().getXMLReader();
XMLFilter fixNs = new XIncludeNsFixup();
fixNs.setParent(inputReader);
TransformerFactory tf = TransformerFactory.newInstance();
Transformer builder = tf.newTransformer();
DOMResult fixedInput = new DOMResult();
builder.transform(new SAXSource(fixNs, new InputSource("file3.xml")), fixedInput);
Document doc = (Document) fixedInput.getNode();
Transformer serializer = tf.newTransformer();
serializer.transform(new DOMSource(doc), new StreamResult(System.out));
The sample document file3.xml that I use has one xi:include element in the correct XInclude namespace and one in the old, unsupported namespace:
<?xml version="1.0" encoding="UTF-8"?>
<contexts>
    <context name="a">
        <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="reuse.xml"/>
        <foo>Original text 1.</foo>
    </context>
    <context name="b">
        <xi:include xmlns:xi="http://www.w3.org/2003/XInclude" href="reuse.xml"/>
        <foo>Original text 2.</foo>
    </context>
</contexts>
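My expectation was that the filter would fix the namespace first and that the XMLReader would then perform XInclude on both elements. However, running the code with Java 1.8 gives the following output: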
<?xml version="1.0" encoding="UTF-8" standalone="no"?><contexts>
    <context name="a">
        <text xml:base="reuse.xml">I am XIncluded text.</text>
        <foo>Original text 1.</foo>
    </context>
    <context name="b">
        <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="reuse.xml"/>
        <foo>Original text 2.</foo>
    </context>
</contexts>
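So the filter has fixed the namespace on the second include element, but the XMLReader has applied XInclude inclusion only to the first include element.
Where did I go wrong? How can I chain the filter and the XInclude-aware XMLReader so that the namespace is fixed and XInclude inclusion is then performed on the elements with the corrected namespace?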
For completeness, here is reuse.xml:
<?xml version="1.0" encoding="UTF-8"?>
<text>I am XIncluded text.</text>
And here is the complete code of the Java program, for easy testing:
import java.io.IOException;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLFilter;
import org.xml.sax.XMLReader;

public class XIncludeTest1 {
    public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException, TransformerConfigurationException, TransformerException {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        spf.setXIncludeAware(true);
        XMLReader inputReader = spf.newSAXParser().getXMLReader();
        XMLFilter fixNs = new XIncludeNsFixup();
        fixNs.setParent(inputReader);
        TransformerFactory tf = TransformerFactory.newInstance();
        Transformer builder = tf.newTransformer();
        DOMResult fixedInput = new DOMResult();
        builder.transform(new SAXSource(fixNs, new InputSource("file3.xml")), fixedInput);
        Document doc = (Document) fixedInput.getNode();
        Transformer serializer = tf.newTransformer();
        serializer.transform(new DOMSource(doc), new StreamResult(System.out));
    }
}
I also tried putting the latest Xerces Java from Apache on the classpath to see whether that fixes the problem, but the output remained the same.
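A possible explanation, offered as an assumption rather than a verified diagnosis: an XMLFilter obtains its events from its parent XMLReader, so in the chain above the XInclude-aware parser has already dealt with each element before the filter renames its namespace. If that is the cause, one arrangement that might work is two passes in memory, with no intermediate file on disk: first a plain parser plus the filter, serialized to a byte array, then an XInclude-aware parser over those bytes. The sketch below assumes exactly that. TwoPassXInclude, NsFixup, fixNamespaces, includeAll and parseWithFixup are hypothetical names, and NsFixup is a compact stand-in for XIncludeNsFixup.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

class TwoPassXInclude {
    static final String CORRECT_URI = "http://www.w3.org/2001/XInclude";
    static final String OLD_URI = "http://www.w3.org/2003/XInclude";

    /** Hypothetical compact stand-in for XIncludeNsFixup. */
    static class NsFixup extends XMLFilterImpl {
        private static String fix(String uri) {
            return OLD_URI.equals(uri) ? CORRECT_URI : uri;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {
            super.startElement(fix(uri), localName, qName, atts);
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            super.endElement(fix(uri), localName, qName);
        }

        @Override
        public void startPrefixMapping(String prefix, String uri) throws SAXException {
            super.startPrefixMapping(prefix, fix(uri));
        }
    }

    /** Pass 1: parse without XInclude, fix the namespace, serialize the result to memory. */
    static byte[] fixNamespaces(String systemId) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true); // XInclude awareness is deliberately left off in this pass
        XMLReader reader = spf.newSAXParser().getXMLReader();
        NsFixup filter = new NsFixup();
        filter.setParent(reader);
        TransformerFactory tf = TransformerFactory.newInstance();
        // Same SAX-to-DOM-to-stream route as in the program above, but into a byte array.
        DOMResult fixed = new DOMResult();
        tf.newTransformer().transform(new SAXSource(filter, new InputSource(systemId)), fixed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        tf.newTransformer().transform(new DOMSource(fixed.getNode()), new StreamResult(out));
        return out.toByteArray();
    }

    /** Pass 2: parse the fixed bytes with an XInclude-aware parser. */
    static Document includeAll(byte[] fixedXml, String systemId) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        dbf.setXIncludeAware(true);
        InputSource in = new InputSource(new ByteArrayInputStream(fixedXml));
        in.setSystemId(systemId); // lets relative hrefs resolve against the original document location
        return dbf.newDocumentBuilder().parse(in);
    }

    static Document parseWithFixup(String systemId) throws Exception {
        return includeAll(fixNamespaces(systemId), systemId);
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: TwoPassXInclude <file or URI>");
            return;
        }
        Document doc = parseWithFixup(args[0]);
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(System.out));
    }
}
```

Calling parseWithFixup("file3.xml") would be the counterpart of the single chain in XIncludeTest1. The cost of this design is that the document is parsed twice and held in memory in between, which may matter for large inputs.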