0

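I'm very new to ElementTree, and I'm trying to parse an XML file for a TV add-on for XBMC. Below is the code I'm having trouble with. I think my XPath is incorrect, and the placeholder doesn't work for the attribute!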

Here is the XML file I'm using - http://services.tvrage.com/myfeeds/episode_list.php?key=ag6txjP0RH4m0c8sZk2j&sid=2930

    seasonnum = root2.findall("/Show/Episodelist/Season[@no='%s']/episode/seasonnum" % (season))


    import xml.etree.ElementTree as ET
    import urllib
    tree2 = ET.parse(urllib.urlopen(url))
    root2 = tree2.getroot()
    seasonnum = tree2.findall("./Episodelist/Season[@no='%s']/episode/seasonnum" % '1')
    print seasonnum

The error I get is: SyntaxError: expected path separator ([)

4 Answers

2

Using ElementTree:

>>> from xml.etree import ElementTree
>>> import urllib2
>>> url = 'http://services.tvrage.com/myfeeds/episode_list.php?key=ag6txjP0RH4m0c8sZk2j&sid=2930'
>>> request = urllib2.Request(url, headers={"Accept" : "application/xml"})
>>> u = urllib2.urlopen(request)
>>> tree = ElementTree.parse(u)
>>> rootElem = tree.getroot()
>>> [s.text for s in rootElem.findall('.//Season[@no="2"]/episode/seasonnum')]
['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12', '13', '14', 
 '15', '16', '17', '18', '19', '20', '21', '22']
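
The same query can be tried offline. This is a minimal sketch, assuming Python 3 (the thread itself uses Python 2) and a small hand-written XML sample that only imitates the feed's Show/Episodelist/Season/episode layout. The sample data and the `episode_numbers` helper are illustrative and not part of the feed or of the answer above:

```python
import xml.etree.ElementTree as ET

# Hand-written sample that imitates the feed's structure (assumed, not real data).
SAMPLE = """
<Show>
  <name>Example</name>
  <Episodelist>
    <Season no="1">
      <episode><seasonnum>01</seasonnum></episode>
      <episode><seasonnum>02</seasonnum></episode>
    </Season>
    <Season no="2">
      <episode><seasonnum>01</seasonnum></episode>
      <episode><seasonnum>02</seasonnum></episode>
      <episode><seasonnum>03</seasonnum></episode>
    </Season>
  </Episodelist>
</Show>
"""


def episode_numbers(root, season):
    """Return the seasonnum text of every episode in the given season."""
    # './/' searches all descendants, so the path works from any starting element.
    path = './/Season[@no="%s"]/episode/seasonnum' % season
    return [e.text for e in root.findall(path)]


root = ET.fromstring(SAMPLE.strip())
print(episode_numbers(root, 2))  # ['01', '02', '03']
```

Because the season is substituted as text, passing either `2` or `'2'` selects the same `Season` element.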
answered 2014-02-20T04:12:34.510
1

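According to the xml.etree.ElementTree documentation - XPath support:

This module provides limited support for XPath expressions for locating elements in a tree. The goal is to support a small subset of the abbreviated syntax; a full XPath engine is outside the scope of the module.

You may need to use a third-party library such as lxml to get full XPath support.

Example: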

>>> import lxml.etree
>>>
>>> url = 'http://services.tvrage.com/myfeeds/episode_list.php?key=ag6txjP0RH4m0c8sZk2j&sid=2930'
>>> tree = lxml.etree.parse(url)
>>> tree.xpath("/Show/Episodelist/Season[@no='%s']/episode/seasonnum/text()" % 1)
['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12']

UPDATE

To use xml.etree.ElementTree, the XPath should be slightly modified:

>>> import urllib
>>> import xml.etree.ElementTree as ET
>>>
>>> f = urllib.urlopen(url)
>>> tree = ET.parse(f)
>>> [e.text for e in tree.findall("./Episodelist/Season[@no='%s']/episode/seasonnum" % 1)]
['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12']
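
The reason for dropping the leading `/Show` can be sketched as follows: the root element is already `Show`, so the path is evaluated relative to it. This sketch assumes Python 3 and a hand-written sample document rather than the real feed. There, calling `findall` on an element with an absolute path raises `SyntaxError`, while the relative form matches:

```python
import xml.etree.ElementTree as ET

# Hand-written sample imitating the feed's layout (assumed, not real data).
SAMPLE = (
    '<Show><Episodelist>'
    '<Season no="1">'
    '<episode><seasonnum>01</seasonnum></episode>'
    '<episode><seasonnum>02</seasonnum></episode>'
    '</Season>'
    '</Episodelist></Show>'
)

root = ET.fromstring(SAMPLE)  # root is the <Show> element itself

season = 1
absolute = "/Show/Episodelist/Season[@no='%s']/episode/seasonnum" % season
relative = "./Episodelist/Season[@no='%s']/episode/seasonnum" % season

try:
    root.findall(absolute)
    absolute_ok = True
except SyntaxError as exc:
    # ElementTree does not accept an absolute path when searching from an element.
    absolute_ok = False
    print("absolute path rejected:", exc)

# The relative path starts at <Show>, so it names only the children below it.
numbers = [e.text for e in root.findall(relative)]
print(numbers)
```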
answered 2014-02-20T04:03:53.287
0
    import xml.etree.ElementTree as ET
    import urllib
    content = urllib.urlopen(url).read()
    tree2 = ET.fromstring(content)
    tvrage_seasons = tree2.findall('.//Season' )

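I had to do it this way, because for some reason the ElementTree that ships with XBMC must have a bug or something else that keeps the attribute query from working. But this works for me!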
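
A possible explanation, offered as an assumption rather than something this answer confirms: as far as I know, attribute predicates such as `[@no='1']` were added in ElementTree 1.3, so an interpreter bundling an older ElementTree would reject them. In that case the season can be selected in plain Python after `findall('.//Season')`. This is a sketch in Python 3, with a hand-written sample and a hypothetical helper named `season_episode_numbers`:

```python
import xml.etree.ElementTree as ET

# Hand-written sample imitating the feed's layout (assumed, not real data).
SAMPLE = (
    '<Show><Episodelist>'
    '<Season no="1">'
    '<episode><seasonnum>01</seasonnum></episode>'
    '<episode><seasonnum>02</seasonnum></episode>'
    '</Season>'
    '<Season no="2">'
    '<episode><seasonnum>01</seasonnum></episode>'
    '</Season>'
    '</Episodelist></Show>'
)


def season_episode_numbers(root, season):
    """Select a season by its 'no' attribute without using an XPath predicate."""
    wanted = str(season)
    numbers = []
    for season_elem in root.findall('.//Season'):
        # Compare the attribute in Python instead of inside the path expression.
        if season_elem.get('no') != wanted:
            continue
        for episode in season_elem.findall('episode'):
            numbers.append(episode.findtext('seasonnum'))
    return numbers


root = ET.fromstring(SAMPLE)
print(season_episode_numbers(root, 1))  # ['01', '02']
```

The path expressions used here contain only tag names and `.//`, which keeps them within the simplest part of the supported syntax.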

answered 2014-02-20T19:36:54.203
0

I tried your example and it works. Here is a trimmed-down, complete version:

import urllib
import xml.etree.ElementTree as ET

url = 'http://services.tvrage.com/myfeeds/episode_list.php?key=ag6txjP0RH4m0c8sZk2j&sid=2930'
tree = ET.parse(urllib.urlopen(url))
seasons = tree.findall("./Episodelist/Season[@no='%s']/episode/seasonnum" % '1')

for s in seasons:
    print s.text

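The only problem I can think of is that you somehow downloaded a partial XML document. That is unlikely, but I don't know of any other explanation. Note that the script above is taken from your question; I only added the for loop.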
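
One way to check for a truncated download is to parse the content and catch the parse error. This is a minimal sketch, assuming Python 3, where the exception is `xml.etree.ElementTree.ParseError`. The sample strings and the `is_complete_xml` helper are illustrative only:

```python
import xml.etree.ElementTree as ET


def is_complete_xml(content):
    """Return True if content parses as a well-formed XML document."""
    try:
        ET.fromstring(content)
    except ET.ParseError:
        return False
    return True


# Hand-written sample (assumed, not real feed data).
complete = '<Show><Episodelist><Season no="1"/></Episodelist></Show>'
truncated = complete[:30]  # simulate a download that was cut off mid-tag

print(is_complete_xml(complete))   # True
print(is_complete_xml(truncated))  # False
```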

answered 2014-02-20T16:11:25.960