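I am trying to scrape all the followers of some particular Instagram accounts. I am using Python 3.8.3 and the latest version of the Instaloader library. The code I have written is given below: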

# Import the required libraries:
import instaloader
import time
from random import randint

# Start time:
start = time.time()

# Create an instance of instaloader:
loader = instaloader.Instaloader()

# Credentials & target account:
user_id = 'USERID'      # placeholder: the login user name
password = 'PASSWORD'   # placeholder: the login password
target = 'TARGET'       # placeholder: the account whose followers are to be scraped

# Login or load the session:
loader.login(user_id, password)

# Obtain the profile metadata of the target:
profile = instaloader.Profile.from_username(loader.context, target)

# Print the list of followers and save it in a text file:
try:

    # The list to store the collected user handles of the followers:
    followers_list = []

    # Variables used to apply pauses to slow down scraping:
    count = 0 
    short_counter = 1
    short_pauser = randint(19, 24)
    long_counter = 1
    long_pauser = randint(4900, 5000)

    # Fetch the followers one by one:
    for follower in profile.get_followers():

        # Short pause for the process:
        if short_counter % short_pauser == 0:
            short_counter = 0
            short_pauser = randint(19, 24)
            print('\nShort Pause.\n')
            time.sleep(1)

        # Long pause for the process (14 to 17 minutes, drawn only when needed):
        if long_counter % long_pauser == 0:
            long_counter = 0
            long_pauser = randint(4900, 5000)
            print('\nLong pause.\n')
            time.sleep(randint(840, 1020))
        
        # Append the list and print the follower's user handle:
        followers_list.append(follower.username)
        print(count,'', followers_list[count])
    
        # Increment the counters accordingly:
        count = count + 1
        short_counter = short_counter + 1
        long_counter = long_counter + 1

except Exception as e:
    print(e)

finally:
    # Store whatever was collected in a txt file, even if scraping was interrupted:
    txt_file = target + '.txt'
    with open(txt_file, 'a+') as f:
        for the_follower in followers_list:
            f.write(the_follower)
            f.write('\n')

# End time:
end = time.time()

total_time = end - start

# Print the time taken for execution:
print('Time taken for complete execution:', total_time,'s.')

The following error occurs after scraping some data:

HTTP Error 400 (Bad Request) on GraphQL Query. Retrying with shorter page length.
HTTP Error 400 (Bad Request) on GraphQL Query. Retrying with shorter page length.
400 Bad Request

Actually, the error occurs when Instagram detects unusual activity, disables the account for a while, and prompts the user to change the password.

I have tried:

(1) Slowing down the scraping process.

(2) Adding pauses in between to make the program behave more like a human.

Still, no progress.
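For reference, the two kinds of pauses I describe above can be pulled out of the loop into one small generator. This is only a sketch of my pacing logic, not anything Instaloader provides: `paced` and all of its parameters are names I made up, the defaults mirror the numbers in my script, and the `sleep` and `rng` arguments exist only so the helper can be tried without actually waiting.

```python
import random
import time
from typing import Callable, Iterable, Iterator, Optional, Tuple, TypeVar

T = TypeVar("T")


def paced(
    items: Iterable[T],
    short_every: Tuple[int, int] = (19, 24),     # items between short pauses
    short_pause: float = 1.0,                    # seconds
    long_every: Tuple[int, int] = (4900, 5000),  # items between long pauses
    long_pause: Tuple[int, int] = (840, 1020),   # seconds
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> Iterator[T]:
    """Yield items unchanged, sleeping at randomized intervals (hypothetical helper)."""
    rng = rng or random.Random()
    next_short = rng.randint(*short_every)
    next_long = rng.randint(*long_every)
    since_short = 0
    since_long = 0
    for item in items:
        yield item
        since_short += 1
        since_long += 1
        # Short pause after a randomly chosen number of items.
        if since_short >= next_short:
            sleep(short_pause)
            since_short = 0
            next_short = rng.randint(*short_every)
        # Long pause after a much larger, randomly chosen number of items.
        if since_long >= next_long:
            sleep(rng.randint(*long_pause))
            since_long = 0
            next_long = rng.randint(*long_every)
```

With this, the main loop would shrink to `for follower in paced(profile.get_followers()):`. It only reorganizes the pausing I already do, so I do not expect it to change how Instagram reacts.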

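How can I bypass such restrictions and get the complete list of all the followers? If getting the entire list is not possible, what is the best way to get a list of at least 20,000 followers (from multiple accounts) without getting banned, having the account disabled, or facing similar inconveniences?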
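To make the fallback concrete, here is a rough sketch of what I mean by a capped list drawn from several accounts. It takes a bounded number of handles from each account and stops once a total of unique handles is reached. `collect_handles`, `per_account`, and `total` are hypothetical names; the function works on any iterables of strings, and I am assuming the follower iterator is lazy, so that stopping early also stops further requests.

```python
from itertools import islice
from typing import Dict, Iterable, List


def collect_handles(
    sources: Dict[str, Iterable[str]],
    per_account: int,
    total: int,
) -> List[str]:
    """Take up to per_account handles per source until total unique handles are collected."""
    seen = set()
    ordered: List[str] = []
    for account, handles in sources.items():
        # islice stops pulling from the iterator after per_account items.
        for handle in islice(handles, per_account):
            if handle not in seen:
                seen.add(handle)
                ordered.append(handle)
            if len(ordered) >= total:
                return ordered
    return ordered
```

In my script, each value in `sources` would be something like `(f.username for f in profile.get_followers())` for one target account. This only bounds how much is requested per account; it is not a way around whatever limit triggers the 400 error.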
