26

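I have an object that is marshalled to XML using JAXB. One element contains a String that includes quotes ("). The resulting XML has &quot; where the " was.

Although this is normally preferred, I need my output to match a legacy system. How do I force JAXB not to convert these characters into entities?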

--

Thanks for the reply. However, I never see the handler's escape() being called. Could you take a look and see what I'm doing wrong? Thanks!

package org.dc.model;

import java.io.IOException;
import java.io.Writer;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Marshaller;

import org.dc.generated.Shiporder;

import com.sun.xml.internal.bind.marshaller.CharacterEscapeHandler;

public class PleaseWork {
    public void prettyPlease() throws JAXBException {
        Shiporder shipOrder = new Shiporder();
        shipOrder.setOrderid("Order's ID");
        shipOrder.setOrderperson("The woman said, \"How ya doin & stuff?\"");

        JAXBContext context = JAXBContext.newInstance("org.dc.generated");
        Marshaller marshaller = context.createMarshaller();
        marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
        marshaller.setProperty(CharacterEscapeHandler.class.getName(),
                new CharacterEscapeHandler() {
                    @Override
                    public void escape(char[] ch, int start, int length,
                            boolean isAttVal, Writer out) throws IOException {
                        // char[].toString() would print the array identity, not its contents
                        out.write("Called escape for characters = " + new String(ch, start, length));
                    }
                });
        marshaller.marshal(shipOrder, System.out);
    }

    public static void main(String[] args) throws Exception {
        new PleaseWork().prettyPlease();
    }
}

--

The output is this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<shiporder orderid="Order's ID">
    <orderperson>The woman said, &quot;How ya doin &amp; stuff?&quot;</orderperson>
</shiporder>

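As you can see, the callback output never appears. (Once I get the callback to be called, I'll worry about making it actually do what I want.)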
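
A side note on printing debug output from inside an escape() callback: in Java, calling toString() on a char[] returns the array's identity string rather than its contents, so the (ch, start, length) slice has to be converted explicitly. A minimal sketch using only the JDK (CharSliceDemo and sliceToString are illustrative names, not part of JAXB):

```java
public class CharSliceDemo {
    // Builds a String from the slice of the buffer that an escape() callback receives.
    static String sliceToString(char[] ch, int start, int length) {
        return new String(ch, start, length);
    }

    public static void main(String[] args) {
        char[] buffer = "xx\"How ya doin & stuff?\"yy".toCharArray();
        // Prints the array identity (type tag plus hash); the exact value varies per run.
        System.out.println(buffer.toString());
        // Prints only the requested slice, skipping two characters at each end.
        System.out.println(sliceToString(buffer, 2, buffer.length - 4));
    }
}
```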

--

14 Answers

13

The solution my teammate found:

PrintWriter printWriter = new PrintWriter(new FileWriter(xmlFile));
DataWriter dataWriter = new DataWriter(printWriter, "UTF-8", DumbEscapeHandler.theInstance);
marshaller.marshal(request, dataWriter);

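Instead of passing xmlFile to marshal(), pass the DataWriter, which knows both the encoding and the appropriate escape handler, if any.

Note: since DataWriter and DumbEscapeHandler are both in the com.sun.xml.internal.bind.marshaller package, you must bootstrap javac.

answered 2009-10-05T21:01:57.993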
10

I just wrote my custom handler as a class like this:

import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

import com.sun.xml.bind.marshaller.CharacterEscapeHandler;

public class XmlCharacterHandler implements CharacterEscapeHandler {

    public void escape(char[] buf, int start, int len, boolean isAttValue,
            Writer out) throws IOException {
        StringWriter buffer = new StringWriter();

        for (int i = start; i < start + len; i++) {
            buffer.write(buf[i]);
        }

        String st = buffer.toString();

        if (!st.contains("CDATA")) {
            st = buffer.toString().replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("'", "&apos;")
                .replace("\"", "&quot;");

        }
        out.write(st);
        System.out.println(st);
    }

}

Then, in the marshalling method, simply call:

marshaller.setProperty(CharacterEscapeHandler.class.getName(),
                new XmlCharacterHandler());

It works fine.
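
The order of the replace() calls in the handler above matters: & has to be handled first, otherwise the ampersands introduced by the later replacements are escaped a second time. A small sketch with plain JDK classes (EscapeOrderDemo and its method names are made up for illustration):

```java
public class EscapeOrderDemo {
    // Same order as the handler above: '&' first, then the other characters.
    static String ampersandFirst(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
                .replace("'", "&apos;").replace("\"", "&quot;");
    }

    // Deliberately wrong order: '&' last, so earlier entities get escaped again.
    static String ampersandLast(String s) {
        return s.replace("<", "&lt;").replace(">", "&gt;")
                .replace("'", "&apos;").replace("\"", "&quot;").replace("&", "&amp;");
    }

    public static void main(String[] args) {
        String input = "a < b & \"c\"";
        System.out.println(ampersandFirst(input)); // a &lt; b &amp; &quot;c&quot;
        System.out.println(ampersandLast(input));  // a &amp;lt; b &amp; &amp;quot;c&amp;quot;
    }
}
```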

answered 2013-08-02T12:41:20.583
4

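I have been playing with your example and debugging the JAXB code. It seems to be related to the UTF-8 encoding being used. The escapeHandler property of MarshallerImpl appears to be set correctly. However, it is not used in every case. Searching for calls to MarshallerImpl.createEscapeHandler(), I found: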

public XmlOutput createWriter( OutputStream os, String encoding ) throws JAXBException {
    // UTF8XmlOutput does buffering on its own, and
    // otherwise createWriter(Writer) inserts a buffering,
    // so no point in doing a buffering here.

    if(encoding.equals("UTF-8")) {
        Encoded[] table = context.getUTF8NameTable();
        final UTF8XmlOutput out;
        if(isFormattedOutput())
            out = new IndentingUTF8XmlOutput(os,indent,table);
        else {
            if(c14nSupport)
                out = new C14nXmlOutput(os,table,context.c14nSupport);
            else
                out = new UTF8XmlOutput(os,table);
        }
        if(header!=null)
            out.setHeader(header);
        return out;
    }

    try {
        return createWriter(
            new OutputStreamWriter(os,getJavaEncoding(encoding)),
            encoding );
    } catch( UnsupportedEncodingException e ) {
        throw new MarshalException(
            Messages.UNSUPPORTED_ENCODING.format(encoding),
            e );
    }
}

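Note that in your setup the top section (...equals("UTF-8")...) is the one that runs, and it does not take the escapeHandler. If you set the encoding to anything else, the bottom part of this method runs instead; it delegates to createWriter(Writer, String), which does use the escapeHandler, so the handler comes into play. So, adding...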

    marshaller.setProperty(Marshaller.JAXB_ENCODING, "ASCII");

makes your custom CharacterEscapeHandler get called. I'm not really sure, but I would guess this is a kind of bug in JAXB.

answered 2009-10-06T08:19:39.303
4

I would say the easiest way is to override CharacterEscapeHandler:

marshaller.setProperty("com.sun.xml.bind.characterEscapeHandler", new CharacterEscapeHandler() {
    @Override
    public void escape(char[] ch, int start, int length, boolean isAttVal,
                       Writer out) throws IOException {
        out.write(ch, start, length);
    }
});
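
To make the slice semantics of such a pass-through handler concrete, here is a sketch that runs without JAXB on the classpath. EscapeHandlerLike is a hypothetical stand-in that only copies the shape of the escape() method used above; it is not the real interface.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

public class PassThroughDemo {
    // Hypothetical stand-in with the same method shape as CharacterEscapeHandler.
    interface EscapeHandlerLike {
        void escape(char[] ch, int start, int length, boolean isAttVal, Writer out)
                throws IOException;
    }

    // Writes exactly the requested slice, unchanged.
    static final EscapeHandlerLike PASS_THROUGH =
            (ch, start, length, isAttVal, out) -> out.write(ch, start, length);

    // Runs the pass-through handler over a slice of the buffer and returns what it wrote.
    static String passThrough(String buffer, int start, int length) {
        StringWriter out = new StringWriter();
        try {
            PASS_THROUGH.escape(buffer.toCharArray(), start, length, false, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Only the 5 characters starting at index 2 are written: "a&b" with its quotes.
        System.out.println(passThrough("xx\"a&b\"yy", 2, 5));
    }
}
```

Writing the whole array instead of the (start, length) slice would also emit whatever else happens to be in the buffer.
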
answered 2017-03-23T17:10:24.553
3

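@Elliot You can use the following to make the marshaller go through the character-escape function. It is weird, but it works if you set "Unicode" instead of "UTF-8". Add this right before or after setting the CharacterEscapeHandler property.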

marshaller.setProperty(Marshaller.JAXB_ENCODING, "Unicode");

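However, don't rely only on checking the console in your IDE, because what it displays depends on the workspace encoding. It is better to also check the result in a file, like this: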

marshaller.marshal(shipOrder, new File("C:\\shipOrder.txt"));
answered 2012-02-10T12:12:45.747
2

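I ran into the same issue and fixed it using XMLWriter. The XMLWriter class has the methods isEscapeText() and setEscapeText(), and escaping defaults to true. If you don't want < converted to &lt;, you need to call setEscapeText(false) when marshalling: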

JAXBContext jaxbContext = JAXBContext.newInstance(your class);
Marshaller marshaller = jaxbContext.createMarshaller();

marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);

// Create a filter that will remove the xmlns attribute
NamespaceFilter outFilter = new NamespaceFilter(null, false);

// Do some formatting, this is obviously optional and may effect
// performance
OutputFormat format = new OutputFormat();
format.setIndent(true);
format.setNewlines(true);

// Create a new org.dom4j.io.XMLWriter that will serve as the
// ContentHandler for our filter.
XMLWriter writer = new XMLWriter(new FileOutputStream(file), format);
writer.setEscapeText(false); // <----------------- this line
// Attach the writer to the filter
outFilter.setContentHandler(writer);
// marshalling
marshaller.marshal(piaDto, outFilter);
marshaller.marshal(piaDto, System.out);

This change, writer.setEscapeText(false);, fixed my issue. Hope this helps.

answered 2018-01-22T07:13:59.363
1

It seems to be possible with Sun's JAXB implementation, although I haven't done it myself.

answered 2009-10-01T23:29:37.167
1

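I checked the XML specification. http://www.w3.org/TR/REC-xml/#sec-references says "well-formed documents need not declare any of the following entities: amp, lt, gt, apos, quot." So it looks like the XML parser used by the legacy system is not conformant.

(I know this does not solve your problem, but it is at least nice to be able to say which component is broken.)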
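
As an illustration of that point, a conformant parser yields the same text whether the quotes arrive as &quot; or as literal characters. A small sketch using the JDK's built-in DOM parser (EntityEquivalenceDemo and textOf are illustrative names):

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

public class EntityEquivalenceDemo {
    // Parses the XML string and returns the text content of the root element.
    static String textOf(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        String escaped = "<orderperson>The woman said, &quot;hi&quot;</orderperson>";
        String literal = "<orderperson>The woman said, \"hi\"</orderperson>";
        // Both documents carry the same character data once parsed.
        System.out.println(textOf(escaped).equals(textOf(literal)));
    }
}
```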

answered 2010-04-05T07:32:15.237
1

After reading the other posts, this worked for me:

javax.xml.bind.JAXBContext jc = javax.xml.bind.JAXBContext.newInstance(object);
marshaller = jc.createMarshaller();
marshaller.setProperty(javax.xml.bind.Marshaller.JAXB_FORMATTED_OUTPUT, true);
marshaller.setProperty(javax.xml.bind.Marshaller.JAXB_ENCODING, "UTF-8");
marshaller.setProperty(CharacterEscapeHandler.class.getName(), new CustomCharacterEscapeHandler());


public static class CustomCharacterEscapeHandler implements CharacterEscapeHandler {
        /**
         * Escape characters inside the buffer and send the output to the Writer.
         * (prevent <b> to be converted &lt;b&gt; but still ok for a<5.)
         */
        public void escape(char[] buf, int start, int len, boolean isAttValue, Writer out) throws IOException {
            if (buf != null){
                StringBuilder sb = new StringBuilder();
                for (int i = start; i < start + len; i++) {
                    char ch = buf[i];

                    //by adding these, it prevent the problem happened when unmarshalling
                    if (ch == '&') {
                        sb.append("&amp;");
                        continue;
                    }

                    if (ch == '"' && isAttValue) {
                        sb.append("&quot;");
                        continue;
                    }

                    if (ch == '\'' && isAttValue) {
                        sb.append("&apos;");
                        continue;
                    }


                    // otherwise print normally
                    sb.append(ch);
                }

                //Make corrections of unintended changes
                String st = sb.toString();

                st = st.replace("&amp;quot;", "&quot;")
                       .replace("&amp;lt;", "&lt;")
                       .replace("&amp;gt;", "&gt;")
                       .replace("&amp;apos;", "&apos;")
                       .replace("&amp;amp;", "&amp;");

                out.write(st);
            }
        }
    }
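
To see what this handler does to different inputs, the same logic can be exercised as a plain static method. This is a sketch for experimentation only (SelectiveEscapeDemo is a made-up name, and the method takes a String instead of a buffer slice). Note that < is written unchanged in every case, including a lone < as in a<5.

```java
public class SelectiveEscapeDemo {
    // Same rules as the handler above: always escape '&', escape quotes only in
    // attribute values, then undo double-escaping of entities that were already present.
    static String escape(String input, boolean isAttValue) {
        StringBuilder sb = new StringBuilder();
        for (char ch : input.toCharArray()) {
            if (ch == '&') {
                sb.append("&amp;");
            } else if (ch == '"' && isAttValue) {
                sb.append("&quot;");
            } else if (ch == '\'' && isAttValue) {
                sb.append("&apos;");
            } else {
                sb.append(ch);
            }
        }
        return sb.toString()
                .replace("&amp;quot;", "&quot;")
                .replace("&amp;lt;", "&lt;")
                .replace("&amp;gt;", "&gt;")
                .replace("&amp;apos;", "&apos;")
                .replace("&amp;amp;", "&amp;");
    }

    public static void main(String[] args) {
        System.out.println(escape("<b>bold</b> & \"q\"", false)); // markup and quotes kept
        System.out.println(escape("&lt;b&gt;", false));           // existing entities kept
        System.out.println(escape("say \"hi\"", true));           // quotes escaped in attributes
        System.out.println(escape("a<5", false));                 // '<' is emitted as-is
    }
}
```
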
answered 2014-01-27T14:38:05.250
0

Interesting, but with strings you can try this:

Marshaller marshaller = jaxbContext.createMarshaller();
StringWriter sw = new StringWriter();
marshaller.marshal(data, sw);
sw.toString();

At least for me, this does not escape quotes.

answered 2011-03-07T15:14:54.870
0

When using Sun's Marshaller implementation, the simplest way is to provide your own CharacterEscapeHandler implementation that does not escape anything.

Marshaller m = jcb.createMarshaller();
m.setProperty(
    "com.sun.xml.bind.marshaller.CharacterEscapeHandler",
    new NullCharacterEscapeHandler());

public class NullCharacterEscapeHandler implements CharacterEscapeHandler {

    public NullCharacterEscapeHandler() {
        super();
    }


    public void escape(char[] ch, int start, int length, boolean isAttVal, Writer writer) throws IOException {
        writer.write( ch, start, length );
    }
}
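answered 2011-08-03T22:10:53.487
0

For some reason that I don't have time to investigate, it worked for me when setting

marshaller.setProperty(Marshaller.JAXB_ENCODING, "utf-8");

as opposed to using "UTF-8" or "Unicode".

I suggest you try each of them and, as @Javatar said, check the result by dumping it to a file using:

marshaller.marshal(shipOrder, new File("<test_file_path>"));

and opening it with a decent text editor such as Notepad++.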
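
A possible explanation, going only by the createWriter snippet quoted in an earlier answer: the check there is encoding.equals("UTF-8"), which is case-sensitive. "utf-8" would therefore skip the UTF-8-specific output path and fall through to the writer that honors the escape handler, while Java still resolves the name to the same charset. This is an inference from that snippet, not something verified against a particular JAXB version. EncodingNameDemo and its methods are illustrative names.

```java
import java.nio.charset.Charset;

public class EncodingNameDemo {
    // Mirrors only the string comparison from the quoted createWriter snippet.
    static boolean takesUtf8FastPath(String encoding) {
        return encoding.equals("UTF-8");
    }

    // Charset lookup ignores case, so differently cased names can denote one charset.
    static boolean sameCharset(String a, String b) {
        return Charset.forName(a).equals(Charset.forName(b));
    }

    public static void main(String[] args) {
        for (String name : new String[] {"UTF-8", "utf-8", "ASCII"}) {
            System.out.println(name + " -> UTF-8 fast path: " + takesUtf8FastPath(name)
                    + ", resolves to: " + Charset.forName(name));
        }
        System.out.println("same charset: " + sameCharset("utf-8", "UTF-8"));
    }
}
```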

answered 2014-03-31T07:50:17.860
0

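I would advise against using CharacterEscapeHandler, for the reason mentioned above (it is an internal class). Instead, you can use Woodstox and supply your own EscapingWriterFactory to the XMLStreamWriter. Something like: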

XMLOutputFactory2 xmlOutputFactory = (XMLOutputFactory2)XMLOutputFactory.newFactory();
xmlOutputFactory.setProperty(XMLOutputFactory2.P_TEXT_ESCAPER, new EscapingWriterFactory() {

    @Override
    public Writer createEscapingWriterFor(Writer w, String enc) {
        return new EscapingWriter(w);
    }

    @Override
    public Writer createEscapingWriterFor(OutputStream out, String enc) throws UnsupportedEncodingException {
        return new EscapingWriter(new OutputStreamWriter(out, enc));
    }

});

marshaller.marshal(model, xmlOutputFactory.createXMLStreamWriter(out));

An example of how to write an EscapingWriter can be seen in CharacterEscapingTest.
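
The EscapingWriter class itself is not shown in this answer. As a rough idea of what such a class could look like, here is a hypothetical sketch built on the JDK's FilterWriter that escapes only & and < and leaves quotes untouched. It does not depend on Woodstox and is not taken from the linked test, so the details there may differ.

```java
import java.io.FilterWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

public class EscapingWriterDemo {
    // Hypothetical EscapingWriter: escapes '&' and '<' only; quotes pass through.
    static class EscapingWriter extends FilterWriter {
        EscapingWriter(Writer out) {
            super(out);
        }

        @Override
        public void write(int c) throws IOException {
            if (c == '&') {
                out.write("&amp;");
            } else if (c == '<') {
                out.write("&lt;");
            } else {
                out.write(c);
            }
        }

        @Override
        public void write(char[] cbuf, int off, int len) throws IOException {
            for (int i = off; i < off + len; i++) {
                write(cbuf[i]);
            }
        }

        @Override
        public void write(String str, int off, int len) throws IOException {
            for (int i = off; i < off + len; i++) {
                write(str.charAt(i));
            }
        }
    }

    // Convenience wrapper: runs text through the writer and returns the result.
    static String escape(String text) {
        StringWriter sink = new StringWriter();
        try (Writer w = new EscapingWriter(sink)) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("said \"hi\" & a<b")); // said "hi" &amp; a&lt;b
    }
}
```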

answered 2018-09-26T17:29:59.473
0

After trying all of the solutions above, I finally settled on this.

The marshalling logic, using a custom escape handler:

final StringWriter sw = new StringWriter();
final Class classType = fixml.getClass();
final JAXBContext jaxbContext = JAXBContext.newInstance(classType);
final Marshaller marshaller = jaxbContext.createMarshaller();
final JAXBElement<T> fixmsg = new JAXBElement<T>(new QName(namespaceURI, localPart), classType, fixml);
marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
marshaller.setProperty(CharacterEscapeHandler.class.getName(), new JaxbCharacterEscapeHandler());
marshaller.marshal(fixmsg, sw);
return sw.toString();

The custom escape handler is as follows:

import java.io.IOException;
import java.io.Writer;

// Assumes the JAXB reference implementation is on the classpath.
import com.sun.xml.bind.marshaller.CharacterEscapeHandler;

public class JaxbCharacterEscapeHandler implements CharacterEscapeHandler {

    public void escape(char[] buf, int start, int len, boolean isAttValue,
                    Writer out) throws IOException {

            for (int i = start; i < start + len; i++) {
                    char ch = buf[i];
                    out.write(ch);
            }
    }
}
answered 2018-10-11T15:54:30.793