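In "Java XML Parsing - Merged output of xi:include", the poster wanted to use XInclude but had the wrong namespace on the include elements. I thought that putting an XMLFilter in front of the XInclude-aware parser, with the XMLFilter taking care of correcting the namespace, could solve that (without manually editing the files, and without a separate processing step that first creates an intermediate file with the corrected namespace).

So I wrote the following XMLFilter, extending the XMLFilterImpl that SAX provides: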

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;


public class XIncludeNsFixup extends XMLFilterImpl {

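    // Namespace that XInclude-aware parsers recognize, and the outdated one to be replaced.
    static final String correctURI = "http://www.w3.org/2001/XInclude";
    static final String oldURI = "http://www.w3.org/2003/XInclude";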

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {
        if (uri.equals(oldURI)) {
            super.startElement(correctURI, localName, qName, atts);
        }
        else {
            super.startElement(uri, localName, qName, atts);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (uri.equals(oldURI)) {
            super.endElement(correctURI, localName, qName);
        }
        else {
            super.endElement(uri, localName, qName);
        }
    }

    @Override
    public void startPrefixMapping(String prefix, String uri) throws SAXException {
        if (uri.equals(oldURI)) {
            super.startPrefixMapping(prefix, correctURI);
        }
        else {
            super.startPrefixMapping(prefix, uri);
        }
    }

}

Then I created an XInclude-aware SAXParser, took its XMLReader, chained the filter to it, and loaded the sample document as a SAXSource from that filter into a default Transformer to build a DOMResult:

    SAXParserFactory spf = SAXParserFactory.newInstance();
    spf.setNamespaceAware(true);
    spf.setXIncludeAware(true);

    XMLReader inputReader = spf.newSAXParser().getXMLReader();

    XMLFilter fixNs = new XIncludeNsFixup();
    fixNs.setParent(inputReader);

    TransformerFactory tf = TransformerFactory.newInstance();

    Transformer builder = tf.newTransformer();

    DOMResult fixedInput = new DOMResult();

    builder.transform(new SAXSource(fixNs, new InputSource("file3.xml")), fixedInput);

    Document doc = (Document) fixedInput.getNode();

    Transformer serializer = tf.newTransformer();

    serializer.transform(new DOMSource(doc), new StreamResult(System.out));

The sample document file3.xml that I used has one xi:include element in the correct XInclude namespace and one in the old, unsupported namespace:

<?xml version="1.0" encoding="UTF-8"?>
<contexts>
    <context name="a">
        <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="reuse.xml"/>
        <foo>Original text 1.</foo>
    </context>
    <context name="b">
        <xi:include xmlns:xi="http://www.w3.org/2003/XInclude" href="reuse.xml"/>
        <foo>Original text 2.</foo>
    </context>
</contexts>

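My expectation was that the filter would fix the namespace first and that the XMLReader would then perform XInclude on both elements. However, when I run the code with Java 1.8, the output is as follows: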

<?xml version="1.0" encoding="UTF-8" standalone="no"?><contexts>
    <context name="a">
        <text xml:base="reuse.xml">I am XIncluded text.</text>
        <foo>Original text 1.</foo>
    </context>
    <context name="b">
        <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="reuse.xml"/>
        <foo>Original text 2.</foo>
    </context>
</contexts>

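So the filter has fixed the namespace on the second include element, but the XMLReader has performed XInclude inclusion only on the first include element.

Where did I go wrong? How can I chain the filter and the XInclude-aware XMLReader so that the namespace is fixed and XInclude inclusion is then performed on the namespace-corrected elements?

For completeness, here is reuse.xml: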

<?xml version="1.0" encoding="UTF-8"?>
<text>I am XIncluded text.</text>

And here is the complete code of the Java program, for easy testing:

import java.io.IOException;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLFilter;
import org.xml.sax.XMLReader;


public class XIncludeTest1 {


    public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException, TransformerConfigurationException, TransformerException {

        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        spf.setXIncludeAware(true);

        XMLReader inputReader = spf.newSAXParser().getXMLReader();

        XMLFilter fixNs = new XIncludeNsFixup();
        fixNs.setParent(inputReader);

        TransformerFactory tf = TransformerFactory.newInstance();

        Transformer builder = tf.newTransformer();

        DOMResult fixedInput = new DOMResult();

        builder.transform(new SAXSource(fixNs, new InputSource("file3.xml")), fixedInput);

        Document doc = (Document) fixedInput.getNode();

        Transformer serializer = tf.newTransformer();

        serializer.transform(new DOMSource(doc), new StreamResult(System.out));
    }

}

I also tried putting the latest Xerces Java from Apache on the classpath to see whether that fixes the problem, but the output remains the same.
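
One observation that may help narrow this down, offered as a sketch rather than a confirmed diagnosis: in SAX, calling setParent makes the filter a consumer of the parent's events, so the filter only sees what the parser has already produced. If XInclude processing happens inside the parser, it would run before the filter gets a chance to correct the namespace. The small program below only illustrates that event direction. The class name FilterDirectionDemo and the trace helper are made up for this illustration, XInclude is deliberately left switched off, and the document is parsed from a string instead of file3.xml.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical demo class, not part of the original program.
public class FilterDirectionDemo {

    static final String OLD_URI = "http://www.w3.org/2003/XInclude";
    static final String CORRECT_URI = "http://www.w3.org/2001/XInclude";

    // Parses the XML string and records, in order, what the filter receives
    // from the parser and what the final handler receives from the filter.
    static List<String> trace(String xml) throws Exception {
        final List<String> log = new ArrayList<>();

        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        // XInclude is left off on purpose: this sketch only shows event order.
        XMLReader parser = spf.newSAXParser().getXMLReader();

        // The filter sits downstream of the parser: it gets events after parsing.
        XMLFilterImpl filter = new XMLFilterImpl(parser) {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes atts) throws SAXException {
                log.add("filter received " + localName + " in " + uri);
                super.startElement(OLD_URI.equals(uri) ? CORRECT_URI : uri,
                                   localName, qName, atts);
            }
        };

        // The handler sits downstream of the filter and sees the corrected URI.
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes atts) {
                log.add("handler received " + localName + " in " + uri);
            }
        });

        filter.parse(new InputSource(new StringReader(xml)));
        return log;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<xi:include xmlns:xi=\"" + OLD_URI + "\" href=\"reuse.xml\"/>";
        for (String line : trace(xml)) {
            System.out.println(line);
        }
    }
}
```

The filter is handed the element with the 2003 namespace by the parser, and only the handler after the filter sees the 2001 namespace. That would be consistent with the output above, where the second element ends up with the corrected namespace but is not expanded.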
