Why doesn't this work:

$url = "http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20xpath%3D%22%2F%2Fmeta%22%20and%20url%3D%22http://www.cnn.com%22&format=xml&diagnostics=false";

$xml = simplexml_load_file($url);

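I get multiple errors telling me that the HTTP request failed. Ultimately I want to put the results from this file into an array, e.g.

Description = CNN.com delivers the latest breaking news, etc.

Keywords = CNN, CNN news, CNN.com, CNN TV, etc.

But this initial stage isn't working. Any help, please?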

EDIT: Additional info:

Errors:

Warning: simplexml_load_file(http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20xpath%3D%22//meta%22%20and%20url%3D%22http://www.cnn.com%22&format=xml&diagnostics=false) [function.simplexml-load-file]: failed to open stream: HTTP request failed!
Warning: simplexml_load_file() [function.simplexml-load-file]: I/O warning : failed to load external entity "http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20xpath%3D%22//meta%22%20and%20url%3D%22http://www.cnn.com%22&format=xml&diagnostics=false"
  • From my phpinfo(): allow_url_fopen On On
  • PHP Version 5.2.11
  • I believe the URL is valid (http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20xpath%3D%22//meta%22%20and%20url%3D%22http://www.cnn.com%22&format=xml&diagnostics=false)

2 Answers

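(Note: this answer will probably be moot once a real answer is found...)

While you work out the XML problem (keep at it!), know that you can also get the YQL response back as JSON. Here is a quick example: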

$url = "http://query.yahooapis.com/v1/public/yql?q=select+%2A+"
     . "from+html+where+xpath%3D%22%2F%2Fmeta%5B%40name%3D%27"
     . "Keywords%27+or+%40name%3D%27Description%27%5D%22+and+"
     . "url%3D%22http%3A%2F%2Fwww.cnn.com%22&format=json&diagnostics=false";

// Grab YQL response and parse JSON
$json   = file_get_contents($url);
$result = json_decode($json, TRUE);

// Loop over meta results looking for what we want
$items = $result['query']['results']['meta'];
$metas = array();
foreach ($items as $item) {
    $metas[$item['name']] = $item['content'];
}
print_r($metas);

This gives an array like the following (text truncated to fit the screen):

Array
(
    [Description] => CNN.com delivers the latest breaking news and …
    [Keywords] => CNN, CNN news, CNN.com, CNN TV, news, news online …
)

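Note that the YQL query (try it in the console) differs slightly from yours: it selects only the Keywords and Description meta tags, which keeps the PHP simpler.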
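If hand-encoding the query string is error-prone, a minimal sketch along these lines lets PHP do the encoding. The helper name build_yql_url() is hypothetical (it is not part of YQL or of the code above); http_build_query() is the standard PHP function, and the YQL statement is the same one used in the JSON example.

```php
<?php
// Hypothetical helper, for illustration only: builds a YQL request URL
// from an unencoded YQL statement.
function build_yql_url($yql, $format = 'json')
{
    $params = array(
        'q'           => $yql,
        'format'      => $format,
        'diagnostics' => 'false', // a string, so it is sent as "false"
    );

    // http_build_query() URL-encodes every value (spaces become "+").
    // The separator is passed explicitly so it does not depend on php.ini.
    return 'http://query.yahooapis.com/v1/public/yql?'
         . http_build_query($params, '', '&');
}

// The same statement as in the example above, written unencoded.
$yql = 'select * from html where '
     . 'xpath="//meta[@name=\'Keywords\' or @name=\'Description\']" '
     . 'and url="http://www.cnn.com"';

echo build_yql_url($yql), "\n";
```

This should produce the same query string as the hand-encoded URL in the example, and changing the second argument to 'xml' gives the XML variant.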

answered 2010-03-09T22:43:24.017

Well, the XML is GETable. As for validity, it lacks the <?xml version="1.0"?> declaration, but I don't think that is required.

<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:count="5" yahoo:created="2010-03-09T05:09:03Z" yahoo:lang="en-US" yahoo:updated="2010-03-09T05:09:03Z" yahoo:uri="http://query.yahooapis.com/v1/yql?q=select+*+from+html+where+xpath%3D%22%2F%2Fmeta%22+and+url%3D%22http%3A%2F%2Fwww.cnn.com%22"><results><meta content="HTML Tidy for Java (vers. 26 Sep 2004), see www.w3.org" name="generator"/><meta content="1800;url=?refresh=1" http-equiv="refresh"/><meta content="CNN.com delivers the latest breaking news and information on the latest top stories, weather, business, entertainment, politics, and more. For in-depth coverage, CNN.com provides special reports, video, audio, photo galleries, and interactive guides." name="Description"/><meta content="CNN, CNN news, CNN.com, CNN TV, news, news online, breaking news, U.S. news, world news, weather, business, CNN Money, sports, politics, law, technology, entertainment, education, travel, health, special reports, autos, developing story, news video, CNN Intl" name="Keywords"/><meta content="text/html; charset=iso-8859-1" http-equiv="content-type"/></results></query><!-- total: 250 --> 

I tested it on my local server (PHP 5.3) and no errors were reported. I used your source code as-is and it works. Here is the print_r() output:


SimpleXMLElement Object
(
    [results] => SimpleXMLElement Object
        (
            [meta] => Array
                (
                    [0] => SimpleXMLElement Object
                        (
                            [@attributes] => Array
                                (
                                    [content] => HTML Tidy for Java (vers. 26 Sep 2004), see www.w3.org
                                    [name] => generator
                                )

                        )

                    [1] => SimpleXMLElement Object
                        (
                            [@attributes] => Array
                                (
                                    [content] => 1800;url=?refresh=1
                                    [http-equiv] => refresh
                                )

                        )

                    [2] => SimpleXMLElement Object
                        (
                            [@attributes] => Array
                                (
                                    [content] => CNN.com delivers the latest breaking news and information on the latest top stories, weather, business, entertainment, politics, and more. For in-depth coverage, CNN.com provides special reports, video, audio, photo galleries, and interactive guides.
                                    [name] => Description
                                )

                        )

                    [3] => SimpleXMLElement Object
                        (
                            [@attributes] => Array
                                (
                                    [content] => CNN, CNN news, CNN.com, CNN TV, news, news online, breaking news, U.S. news, world news, weather, business, CNN Money, sports, politics, law, technology, entertainment, education, travel, health, special reports, autos, developing story, news video, CNN Intl
                                    [name] => Keywords
                                )

                        )

                    [4] => SimpleXMLElement Object
                        (
                            [@attributes] => Array
                                (
                                    [content] => text/html; charset=iso-8859-1
                                    [http-equiv] => content-type
                                )

                        )

                )

        )

)

I would have suggested URL-encoding the URL, but that is already done. You could try performing the query with cURL instead.
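A rough sketch of the cURL route, assuming the cURL extension is installed. The function names fetch_url() and meta_tags_from_xml() are hypothetical, the timeout value is arbitrary, and the sample XML is an abridged copy of the response shown above so that the parsing step can be tried without a network connection.

```php
<?php
// Hypothetical helper: fetch a URL with cURL instead of the
// allow_url_fopen stream wrapper used by simplexml_load_file().
function fetch_url($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return body as a string
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // arbitrary timeout (seconds)
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if ($status !== 200) {
        throw new RuntimeException('Unexpected HTTP status: ' . $status);
    }
    return $body;
}

// Hypothetical helper: turn the <meta> elements of a YQL XML response
// into a name => content array.
function meta_tags_from_xml($xmlString)
{
    $xml = simplexml_load_string($xmlString);
    if ($xml === false) {
        throw new RuntimeException('Response is not well-formed XML');
    }
    $metas = array();
    foreach ($xml->results->meta as $meta) {
        // http-equiv entries have no name attribute, so skip them.
        if (isset($meta['name'])) {
            $metas[(string) $meta['name']] = (string) $meta['content'];
        }
    }
    return $metas;
}

// Abridged sample of the response above, used here in place of a live request.
$sample = '<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:count="3">'
        . '<results>'
        . '<meta content="1800;url=?refresh=1" http-equiv="refresh"/>'
        . '<meta content="CNN.com delivers the latest breaking news" name="Description"/>'
        . '<meta content="CNN, CNN news, CNN.com, CNN TV" name="Keywords"/>'
        . '</results></query>';

print_r(meta_tags_from_xml($sample));
```

With a live connection, meta_tags_from_xml(fetch_url($url)) would replace the sample. If cURL succeeds where simplexml_load_file() fails, the problem is more likely in the stream wrapper setup than in the URL or the XML.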

answered 2010-03-09T17:16:30.140