I have this HTML:

   <table width="100%" cellpadding="0" cellspacing="0" border="0">
    <tr>
    <td width="27%" align="left" valign="top">
    <span class="param">Text0</span> 23<br />
    <span class="param">Text1</span> 173<br />
    <span class="param">Text2</span> 54<br />
    <span class="param">Text3</span> 2<br /><br />
    </td>
    <td width="27%" align="left" valign="top">
    <span class="param">Text4</span><br />
    one <br />
    two <br />
    three <br />
    </td>
    <td width="46%" align="left" valign="top">
    <span class="param">Text5</span><br /> 
    one -<br />
    two -<br />
    three -<br />
    </td>
    </tr>
    </table>

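I can get the values for Text0 through Text3 by changing the index in the parsing code from get(0) to get(3), but I can't get the values for Text4 and Text5: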

Document doc = Jsoup.connect("text.html").get();

Element param = doc.select("span[class=param]").get(0);

Node node = param.nextSibling();

System.out.println(node.toString());

How can I get the values for Text4 and Text5? get(4) or get(5) currently returns br, but I need to get "one, two, three".

Now I am using this code:

Document doc = Jsoup.connect("text.html").get();

        Elements params = doc.select("span[class=param]");
        int i;
        for (i=0; i<6; i++) {
        Element param = params.get(i);

        Node node = param.nextSibling();

        System.out.println(node.toString());

        }

This prints:

 23
 173
 54
 2
<br>
<br>

I need:

 23
 173
 54
 2
 one two three
 one two three

A crazy code answer:

Document doc = Jsoup.connect("text.html").get();

        Elements params = doc.select("span[class=param]");
        int i;
        for (i=0; i<4; i++) {
        Element param = params.get(i);

        Node node = param.nextSibling();

        System.out.println(node.toString());
        }

        for (i=4; i<5; i++){

            Element apar = params.get(i);

            Node apan = apar.nextSibling();

            System.out.println("apar: "+apan.nextSibling().toString());
            System.out.println("apar: "+apan.nextSibling().nextSibling().nextSibling().toString());
            System.out.println("apar: "+apan.nextSibling().nextSibling().nextSibling().nextSibling().nextSibling().toString());
            //System.out.println(apan.nextSibling().toString());


        }
        for (i=5; i<6; i++){

            Element vih = params.get(i);

            Node vihn = vih.nextSibling();

            System.out.println("vih: "+vihn.nextSibling().toString());
            System.out.println("vih: "+vihn.nextSibling().nextSibling().nextSibling().toString());
            System.out.println("vih: "+vihn.nextSibling().nextSibling().nextSibling().nextSibling().nextSibling().toString());
            //System.out.println(apan.nextSibling().toString());


        }

    }

This crazy (?) code prints what I want.

1 Answer

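When you call doc.select("span[class=param]"), you get back a list of elements (an Elements object). You need to iterate over that list to process each <span> element. In your code, you only ever look at one of them, because you pick a single element with Element param = doc.select("span[class=param]").get(0);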

Document doc = Jsoup.connect("text.html").get();
Elements params = doc.select("span[class=param]");
for(Element element: params){
    //Will print out the text contained within the <span>...</span>
    System.out.println(element.ownText());
}

params = doc.select("td");
for(Element element: params){
    //Will print out the text contained in all children nodes of <td> nodes, that are text nodes 
    System.out.println(element.ownText());
    //System.out.println(element.text());
}

The code above prints:

Text0
Text1
Text2
Text3
Text4
Text5
23 173 54 2
one two three
one - two - three -
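
Note that this output groups the values per <td>, not per label. If you want one line per <span class="param">, as in the question, one option is to walk the siblings that follow each span and collect only the text nodes until the next label or the end of the cell. The following is a minimal sketch of that idea, not a definitive implementation. The class name ParamValues and the method valuesAfterLabels are hypothetical names introduced for illustration, and the HTML is parsed from a string with Jsoup.parse so the example does not depend on a URL being reachable.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

import java.util.ArrayList;
import java.util.List;

class ParamValues {

    // Hypothetical helper: returns, for each <span class="param">,
    // the text that follows it inside the same parent element.
    static List<String> valuesAfterLabels(String html) {
        Document doc = Jsoup.parse(html);
        List<String> values = new ArrayList<>();

        for (Element label : doc.select("span.param")) {
            StringBuilder sb = new StringBuilder();

            // Walk forward through the siblings of the label.
            for (Node n = label.nextSibling(); n != null; n = n.nextSibling()) {
                // Stop at the next label so its value is not mixed in.
                if (n instanceof Element && ((Element) n).hasClass("param")) {
                    break;
                }
                // Keep text nodes only; <br> elements are skipped.
                if (n instanceof TextNode) {
                    String text = ((TextNode) n).text().trim();
                    if (!text.isEmpty()) {
                        if (sb.length() > 0) {
                            sb.append(' ');
                        }
                        sb.append(text);
                    }
                }
            }
            values.add(sb.toString());
        }
        return values;
    }

    public static void main(String[] args) {
        // Same structure as the HTML in the question, shortened to one string.
        String html =
              "<table><tr>"
            + "<td><span class=\"param\">Text0</span> 23<br />"
            + "<span class=\"param\">Text1</span> 173<br />"
            + "<span class=\"param\">Text2</span> 54<br />"
            + "<span class=\"param\">Text3</span> 2<br /><br /></td>"
            + "<td><span class=\"param\">Text4</span><br />"
            + "one <br />two <br />three <br /></td>"
            + "<td><span class=\"param\">Text5</span><br />"
            + "one -<br />two -<br />three -<br /></td>"
            + "</tr></table>";

        for (String value : valuesAfterLabels(html)) {
            System.out.println(value);
        }
    }
}
```

With the HTML from the question this should print 23, 173, 54, 2, "one two three" and "one - two - three -", one per line. The trailing dashes in the last line come from the source HTML itself, so they would have to be stripped separately if they are not wanted.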

This should be enough to get you where you're going. Good luck!

answered 2016-03-25T16:02:15.637