I have a file with the following XML structure:

<?xml version="1.0" encoding="utf-8"?>
<ScheduleMessage DtdVersion="3" DtdRelease="0">
  <MessageIdentification v="ETSOVista-DMinus1TotalLoadForecast-DE-2012-1" />
  <MessageVersion v="1" />
  <MessageType v="A11" />
    <ScheduleTimeSeries>
    <SendersTimeSeriesIdentification v="10YCB-GERMANY--8" />
    <SendersTimeSeriesVersion v="1" />
    <BusinessType v="A05" />
    <Period>
      <TimeInterval v="2012-11-15T23:00Z/2012-11-16T23:00Z" />
      <Resolution v="PT60M" />
      <Interval>
        <Pos v="1" />
        <Qty v="52452" />
      </Interval>
      <Interval>
        <Pos v="2" />
        <Qty v="50527" />
      </Interval>
      <Interval>
       <Pos v="3" />
       <Qty v="49221" />
      </Interval>
      <Interval>
       <Pos v="4" />
       <Qty v="49344" />
      </Interval>
    </Period>
   </ScheduleTimeSeries>
   <ScheduleTimeSeries>
    <SendersTimeSeriesIdentification v="10YCB-GERMANY--8" />
    <SendersTimeSeriesVersion v="1" />
    <BusinessType v="A05" />
    <Period>
     <TimeInterval v="2012-11-16T23:00Z/2012-11-17T23:00Z" />
     <Resolution v="PT60M" />
     <Interval>
      <Pos v="1" />
      <Qty v="50935" />
     </Interval>
     <Interval>
      <Pos v="2" />
      <Qty v="48918" />
     </Interval>
     <Interval>
      <Pos v="3" />
      <Qty v="47347" />
     </Interval>
     <Interval>
      <Pos v="4" />
      <Qty v="46382" />
  </Interval>
 </Period>
</ScheduleTimeSeries>
</ScheduleMessage>

I only need the Qty values. So far, my code looks like this:

xml <- xmlInternalTreeParse(file = "test.xml")
xml_top <- xmlRoot(xml)
xml_children <- xmlChildren(x = xml_top)

However, when I try to go one level deeper into the file with:

xml_children2 <- xmlChildren(x = xml_children)

I get the following error:

Error in UseMethod("xmlChildren") : 
no applicable method for 'xmlChildren' applied to an object of class "c('XMLInternalNodeList', 'XMLNodeList')"

I also tried subsetting the result with [] or [[]], but that always leads to the same error.

2 Answers

This is much simpler with an XQuery processor such as xqilla:

$ echo 'for $v in //Qty/@v return xs:string($v)' | xqilla -i test.xml /dev/stdin
52452
50527
49221
49344
50935
48918
47347
46382

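The output can then be read easily with read.table. You may also be able to run the query from within R using the RXQuery package, or as shown in this answer.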
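As a minimal sketch of the read.table step, the lines printed by xqilla can be parsed into a one-column data frame. The values are the first four from the output above; the variable names and the column name "Qty" are assumptions chosen for illustration.

```r
# First four lines of the xqilla output shown above, one value per line.
xqilla_output <- c("52452", "50527", "49221", "49344")

# read.table() reads the lines through its `text` argument;
# `col.names` labels the single column ("Qty" is a name chosen here).
qty <- read.table(text = xqilla_output, col.names = "Qty")

# The column is parsed as numbers, not strings.
qty$Qty
```

Where xqilla is installed, passing a pipe() connection that runs the command to read.table should work in place of the `text` argument, although that variant has not been tested here.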

Credit: the answer to "Extract value of attribute node via XPath".

Answered 2013-07-22T14:54:41.090

I solved my problem by using:

xpathSApply(doc = xml_top, 
            path = "//ScheduleMessage/ScheduleTimeSeries/Period/Interval/Qty", 
            fun = xmlAttrs)
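A self-contained sketch of the same idea follows, assuming the XML package is installed. It parses a shortened copy of the document from a string (the real file would be read with xmlParse("test.xml")), uses the shorter XPath "//Qty", and swaps xmlAttrs for xmlGetAttr so that the result is a plain vector that converts directly to numbers. These substitutions are illustrative choices, not part of the original solution.

```r
library(XML)

# Shortened copy of the document from the question (two intervals only).
xml_text <- '<?xml version="1.0" encoding="utf-8"?>
<ScheduleMessage DtdVersion="3" DtdRelease="0">
  <ScheduleTimeSeries>
    <Period>
      <Interval><Pos v="1" /><Qty v="52452" /></Interval>
      <Interval><Pos v="2" /><Qty v="50527" /></Interval>
    </Period>
  </ScheduleTimeSeries>
</ScheduleMessage>'

# asText = TRUE tells xmlParse() that the argument is XML content, not a file name.
doc <- xmlParse(xml_text, asText = TRUE)

# Select every Qty (or Pos) element and read its "v" attribute.
# xmlGetAttr() returns the attribute as a character string, so convert afterwards.
qty <- as.numeric(xpathSApply(doc, "//Qty", xmlGetAttr, "v"))
pos <- as.integer(xpathSApply(doc, "//Pos", xmlGetAttr, "v"))

data.frame(pos = pos, qty = qty)
```

The XPath query reaches the Qty nodes directly, so there is no need to descend level by level with xmlChildren. The error in the question arose because xmlChildren returns a list of nodes, and no xmlChildren method exists for such a list.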
Answered 2013-07-23T09:45:44.607