I am trying to use Beautiful Soup to find all the hashtag (#) elements on a particular web page.
import requests
from bs4 import BeautifulSoup as Soup

source = "https://www.runinrabbit.com/"

def getPageContents(source):
    req = requests.get(source)
    print("req : ", req, type(req))
    print("***************************")
    content = Soup(req.text, 'html.parser')
    print("content data", type(content), content)
    return content
In the content I get back, everything is there except the hashtag values.
For example, the hashtag strings shown below are not printed by my function getPageContents:
#marathoner, #winner, #runinrabbit, #topoathletic, #hartfordmarathon, #rabbitpro, #marathon, #olympictrials, #runnergirl, #winning, #finisher, #run, #running, #runner, #runnersofinstagram, #runnersworld, #runnerscommunity, #breezyback, #lightweight, #simple, #runinrabbit, #borntorunfree, #breezyback, #lightweight, #simple, #runinrabbit, #borntorunfree, #racerollcall, #racetime, #runfast, #goodluck, #RADrabbit, #rabbitELITE, #rabbitELITEtrail, #rabbitPRO, #runinrabbit, #borntorunfree
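
One way to check whether any hashtags are present in the downloaded text at all is to scan it with a regular expression. The following is only a minimal sketch under stated assumptions: find_hashtags and HASHTAG_RE are hypothetical names that are not part of the original code, the sample string is made up, and the pattern is a rough heuristic (it would also match things like CSS hex colors that start with a letter).

```python
import re

# Hypothetical helper, not part of the original post.
# Matches "#" followed by a letter or underscore and then word characters.
# The lookbehind skips "#" that follows a word character or "&",
# so HTML entities such as "&#39;" are ignored.
HASHTAG_RE = re.compile(r"(?<![\w&])#([A-Za-z_]\w*)")


def find_hashtags(text):
    """Return hashtag-like tokens in first-seen order, without repeats."""
    seen = set()
    tags = []
    for match in HASHTAG_RE.finditer(text):
        tag = "#" + match.group(1)
        key = tag.lower()  # treat #Run and #run as the same tag
        if key not in seen:
            seen.add(key)
            tags.append(tag)
    return tags


# Made-up sample text, used only to illustrate the helper.
sample = "Race day! #marathoner,#winner , #runinrabbit and &#39;quoted&#39; #runinrabbit"
print(find_hashtags(sample))
```

With the function from the question, this could be called as find_hashtags(getPageContents(source).get_text()). If the result is empty, the hashtags are probably not in the HTML that requests received, for example because the page adds them with JavaScript after loading; that is an assumption and has not been verified for this site.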