I am trying to use Beautiful Soup to find all the hashtag (#) strings on a particular web page.

import requests
from bs4 import BeautifulSoup as Soup


source = "https://www.runinrabbit.com/"


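def getPageContents(source):
    # Fetch the page and print the response object for debugging
    req = requests.get(source)
    print("req : ", req, type(req))
    print("***************************")
    # Parse the returned HTML with Python's built-in parser
    content = Soup(req.text, 'html.parser')
    print("content data", type(content), content)
    return content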

In the parsed content I get everything except the hashtag values.

For example, hashtag strings like the ones below are not printed by my function getPageContents.

#marathoner,#winner,#runinrabbit,#topoathletic,#hartfordmarathon,#rabbitpro,#marathon,#olympictrials,#runnergirl,#winning,#finisher,#run,#running,#runner,#runnersofinstagram,#runnersworld,#runnerscommunity , #breezyback, #lightweight, #simple, #runinrabbit, #borntorunfree, #breezyback, #lightweight, #simple, #runinrabbit, #borntorunfree”, #racerollcall, #racetime, #runfast, #goodluck, #RADrabbit, #rabbitELITE, #rabbitELITEtrail, #rabbitPRO, #runinrabbit, #borntorunfree”
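
To make the goal concrete, here is a minimal sketch of pulling hashtag strings out of page text with a regular expression. The helper name find_hashtags, its unique parameter, and the pattern are my own assumptions, not part of any library. It only helps if the hashtags are actually present in the HTML that requests receives. requests does not execute JavaScript, so if the page injects them after load (I have not confirmed that for this site), they will not appear in req.text at all.

```python
import re

# Assumption: a hashtag is '#' followed by a letter, then any number of
# letters, digits or underscores. Adjust the pattern if your tags differ.
HASHTAG_RE = re.compile(r"#[A-Za-z]\w*")


def find_hashtags(text, unique=False):
    """Return the hashtags found in text, in order of appearance.

    find_hashtags is a hypothetical helper written for this question.
    """
    tags = HASHTAG_RE.findall(text)
    if unique:
        # dict.fromkeys drops duplicates while keeping first-seen order
        tags = list(dict.fromkeys(tags))
    return tags
```

With the function from the question, this could be called as find_hashtags(getPageContents(source).get_text()). An empty result would suggest the hashtags are not in the static HTML.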
