python - Twitter no longer works with the requests library

Question

I have a Python function that uses the requests library and BeautifulSoup to scrape a particular user's tweets.

import requests
from bs4 import BeautifulSoup

contents = requests.get("https://twitter.com/user")
soup = BeautifulSoup(contents.text, "html.parser")

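When requests fetches the page, Twitter serves it the legacy version of the site. Twitter recently dropped support for that legacy version, so the call no longer works: it returns HTML saying that this version of Twitter is out of date.

Is there a way to make requests access the newer version of Twitter?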

score 0 · Accepted Answer

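I ran into this problem too. The root cause is that Twitter rejects "legacy" browsers, which unfortunately includes Python's requests library.

Twitter determines which browser you are using by looking at the User-Agent header sent as part of the request. So my solution was simply to spoof that header.

In your specific case, try the following: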

import requests
from bs4 import BeautifulSoup

contents = requests.get(
    "https://twitter.com/user",
    headers={"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36"}
)
soup = BeautifulSoup(contents.text, "html.parser")
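
If you make more than one call, it may be easier to set the header once on a requests.Session. The following is a minimal sketch under that assumption; make_session and outgoing_user_agent are hypothetical helper names, not part of requests. It only builds the request without sending it, so you can compare the header requests sends by default with the spoofed one. Whether Twitter returns usable HTML for a given User-Agent string is not something this sketch checks.

```python
import requests

# Browser string copied from the snippet above; any current desktop
# browser User-Agent could be substituted here.
BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/88.0.4324.150 Safari/537.36"
)


def make_session(user_agent=BROWSER_UA):
    """Return a Session that sends `user_agent` on every request (hypothetical helper)."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


def outgoing_user_agent(session, url="https://twitter.com/user"):
    """Prepare (but do not send) a GET request and return its User-Agent header."""
    prepared = session.prepare_request(requests.Request("GET", url))
    return prepared.headers["User-Agent"]


if __name__ == "__main__":
    # A plain Session identifies itself as python-requests/<version>.
    print(outgoing_user_agent(requests.Session()))
    # The customised Session sends the browser string instead.
    print(outgoing_user_agent(make_session()))
```

With the session in place, make_session().get("https://twitter.com/user") sends the same header as the one-off call above.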

score 0 · Accepted Answer

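I can't reply directly (I don't have enough reputation to comment), but I hit the same problem and did find some newer tools. https://github.com/bisguzar/twitter-scraper uses requests_html to fetch tweets (see their tweets.py module). https://github.com/Mottl/GetOldTweets3/ is another powerful Python tool for scraping tweets.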

score 0 · Accepted Answer

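The requests library will access whatever URL you pass to it. I suggest checking the Twitter API documentation and updating your code to match the latest version.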
