Step 1: Log in to your account on the target site


Step 2: Open the developer tools

On a desktop, press F12; on a laptop, press Fn+F12.

Step 3: Refresh the page and copy the cookie
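
The value copied from the `Cookie` request header is normally one long string of `name=value` pairs separated by semicolons. As a minimal sketch, assuming you would rather pass the cookies to `requests` as a dictionary than as a raw header, the string can be split like this. The function name `parse_cookie_header` and the sample cookie names in the comment are hypothetical, not taken from the site.

```python
def parse_cookie_header(raw: str) -> dict:
    """Split a copied Cookie header value into a {name: value} dict."""
    cookies = {}
    for pair in raw.split(";"):
        pair = pair.strip()
        # Skip empty fragments and anything that is not a name=value pair.
        if not pair or "=" not in pair:
            continue
        # Split on the first "=" only, because values may contain "=" themselves.
        name, value = pair.split("=", 1)
        cookies[name.strip()] = value.strip()
    return cookies


# Hypothetical example input: 'session=abc123; token=QwEr'
```

The resulting dictionary can be passed through the `cookies=` argument of `requests.get`; pasting the raw string into a `Cookie` header works as well.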


Step 4: Paste the cookie into the program

import requests
import re
import random
user_agent_list = [
    "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36",
    # "Mozilla/5.0 (Windows NT 10.0; …) Gecko/20100101 Firefox/61.0",  disabled: this value was truncated with an ellipsis when copied, so do not use it
    "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.62 Safari/537.36",
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36",
    'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.84 Safari/537.36',
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36",
    "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)",
    "Mozilla/5.0 (Macintosh; U; PPC Mac OS X 10.5; en-US; rv:1.9.2.15) Gecko/20110303 Firefox/3.6.15",
    ]
head={
    "Cookie" : """""",  # paste the cookie string copied in step 3 between the quotes
    'User-Agent' : random.choice(user_agent_list)  # pick a random User-Agent
}
#https://movie.douban.com/top250?start=0&filter=
def job () :
    with open("dp250.txt",'w',encoding='utf-8') as fp:  # output file for the results
        idx=0
        while idx<250:
            url="https://movie.douban.com/top250?start="+str(idx)+"&filter="
            res=requests.get(url,headers = head)
            response=res.text
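            res_url=re.findall("<div class=\"hd\">.*?<a href=\"(.*?)\" class=\"\">.*?<span",response,re.S)  # detail-page URLs
            res_name=re.findall("<span class=\"title\">(.*?)</span>",response,re.S)  # movie titles and, where present, original titles
            if not res_url:  # nothing parsed (request blocked or page layout changed): stop instead of looping forever
                break
            i=0  # index into res_url
            j=0  # index into res_name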
            while i<len(res_url) :
                w="Top"+str(idx+1)+": "+res_url[i]+" "+res_name[j]
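                # An original title for the same movie starts with "&nbsp;/&nbsp;"; append it to the line
                if j+1<len(res_name) and res_name[j+1].startswith('&') :
                    res_name[j+1]=re.sub("&nbsp;/&nbsp;"," ",res_name[j+1])
                    res_name[j+1]=re.sub("&#39;","",res_name[j+1])  # drop the apostrophe entity
                    w=w+res_name[j+1]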
                    j+=1
                print(w)
                fp.write(w+'\n')
                i+=1
                j+=1
                idx+=1
job()
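
The script above pairs URLs and titles by walking two index counters and strips HTML entities with `re.sub`. As an alternative sketch (not the method used above), each entry can be matched as one unit and decoded with the standard-library `html.unescape`, which keeps apostrophes rather than dropping them. The `SAMPLE` fragment, the `example.com` URL, and the names `ENTRY_RE`, `TITLE_RE` and `parse_entries` are assumptions made for illustration; the fragment only imitates the markup that the regular expressions above expect.

```python
import html
import re

# Hypothetical fragment shaped like one list entry; not real site output.
SAMPLE = '''
<div class="hd">
  <a href="https://example.com/subject/1/" class="">
    <span class="title">Example Title</span>
    <span class="title">&nbsp;/&nbsp;L&#39;Exemple</span>
    <span class="other">&nbsp;/&nbsp;Another Name</span>
  </a>
</div>
'''

# One match per entry: the link URL and everything inside the <a> element.
ENTRY_RE = re.compile(r'<div class="hd">.*?<a href="(.*?)" class="">(.*?)</a>', re.S)
TITLE_RE = re.compile(r'<span class="title">(.*?)</span>', re.S)


def parse_entries(page):
    """Return a list of (url, [titles]) tuples found in the page text."""
    entries = []
    for url, body in ENTRY_RE.findall(page):
        titles = []
        for raw in TITLE_RE.findall(body):
            # Decode entities, turn non-breaking spaces into plain spaces,
            # then trim the " / " separator that precedes a second title.
            text = html.unescape(raw).replace("\xa0", " ").strip(" /")
            titles.append(text)
        entries.append((url, titles))
    return entries
```

Because the titles are collected per entry, a movie with one title and a movie with two titles are handled by the same code path, with no need to peek at the next list element.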