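1. This walkthrough simulates a login to gushiwen.cn (古诗文网), which requires fetching a dynamic captcha

Captcha recognition itself is not covered here; the code only calls a helper for it.

Method 1: use a session from the requests library, which automatically picks up the Set-Cookie returned by the server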

from fake_useragent import UserAgent
from Common.Get_Localhost_Time import Get_Localhost_Time
from Common.Get_Project_Path import Get_Project_Path
from Common.get_code import get_code_text
from Common.Get_Ip池 import Get_IP_data
import requests
import os

class Login_Gushiwen():
    def login_gushiwen1(self):
        img_code_url = "https://so.gushiwen.cn/RandCode.ashx"  # URL where gushiwen.cn generates the captcha image
        user_agent = UserAgent().random  # random User-Agent to disguise the client
        headers = {"User-Agent": user_agent, "Connection": "close"}  # request headers
        ip_list = Get_IP_data().random_ip_data()  # pick a proxy from the pool
        proxies = {"{}".format(ip_list[0]): "{}".format(ip_list[1])}  # ip_list[0] is the key, ip_list[1] the proxy address
        self.session = requests.Session()  # the session stores cookies from Set-Cookie automatically
        response = self.session.get(url=img_code_url, headers=headers, proxies=proxies)  # fetch the captcha; its cookie stays in the session
        project_path = Get_Project_Path().get_project_path()
        timestamp = Get_Localhost_Time().get_localhost_time()  # renamed so it does not shadow the time module
        file_path = os.path.join(project_path, "验证码", "img_code_picture")
        os.makedirs(file_path, exist_ok=True)  # make sure the target directory exists
        file_name = os.path.join(file_path, "{}.jpg".format(timestamp))  # portable path instead of a hard-coded backslash
        with open(file=file_name, mode="wb") as f:  # save the captcha image
            f.write(response.content)
        img_code = get_code_text(file_path=file_name)  # recognize the text in the captcha image
        # "email" and "password" are placeholders for real credentials
        data = {"from": "http://so.gushiwen.cn/user/collect.aspx", "email": "email", "pwd": "password",
                "code": img_code, "denglu": "登录"}  # login form data
        login_url = "https://so.gushiwen.cn/user/login.aspx?from=http%3a%2f%2fso.gushiwen.cn%2fuser%2fcollect.aspx"
        login_response = self.session.post(url=login_url, headers=headers, proxies=proxies, data=data)
        login_response.encoding = "utf-8"
        collect_url = "https://so.gushiwen.cn/user/collect.aspx?type=m&id=3111151&sort=t"  # a page that requires login
        collect_response = self.session.get(url=collect_url, headers=headers, proxies=proxies)
        print(collect_response.text)
        response.close()
        login_response.close()
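
To see why Method 1 needs no explicit cookie handling, the following sketch shows, without touching the network, that a `requests.Session` attaches whatever is in its cookie jar to later requests. The cookie name `sessionid`, its value, and the form field value are made up for illustration; the real site sets its own cookie when the captcha is fetched. The request is only prepared, never sent.

```python
import requests

session = requests.Session()

# Simulate the state after fetching the captcha: the server's Set-Cookie
# has landed in the session's cookie jar. Name and value are hypothetical.
session.cookies.set("sessionid", "demo-session")

# Build the login request and only prepare it, so no network call is made.
login_request = requests.Request(
    "POST",
    "https://so.gushiwen.cn/user/login.aspx",
    data={"code": "abcd"},  # placeholder captcha text
)
prepared = session.prepare_request(login_request)

# The session added the stored cookie without any cookies= argument.
print(prepared.headers["Cookie"])
```

This is the step Method 2 has to perform by hand with the `cookies=` argument.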

Method 2: use the requests library, but capture the Set-Cookie manually and send it with the login request.

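    def login_gushiwen2(self):
        img_code_url = "https://so.gushiwen.cn/RandCode.ashx"  # URL where gushiwen.cn generates the captcha image
        user_agent = UserAgent().random  # random User-Agent to disguise the client
        headers = {"User-Agent": user_agent, "Connection": "close"}  # request headers
        ip_list = Get_IP_data().random_ip_data()  # pick a proxy from the pool
        proxies = {"{}".format(ip_list[0]): "{}".format(ip_list[1])}  # ip_list[0] is the key, ip_list[1] the proxy address
        response = requests.get(url=img_code_url, headers=headers, proxies=proxies)  # plain request, no session
        cookies = response.cookies.get_dict()  # convert the returned cookies to a dict
        project_path = Get_Project_Path().get_project_path()
        timestamp = Get_Localhost_Time().get_localhost_time()  # renamed so it does not shadow the time module
        file_path = os.path.join(project_path, "验证码", "img_code_picture")
        os.makedirs(file_path, exist_ok=True)  # make sure the target directory exists
        file_name = os.path.join(file_path, "{}.jpg".format(timestamp))  # portable path instead of a hard-coded backslash
        with open(file=file_name, mode="wb") as f:  # save the captcha image
            f.write(response.content)
        img_code = get_code_text(file_path=file_name)  # recognize the text in the captcha image
        headers_login = {"User-Agent": user_agent, "Connection": "close"}  # the cookie is not written into the headers
        # "email" and "password" are placeholders for real credentials
        data = {"from": "http://so.gushiwen.cn/user/collect.aspx", "email": "email", "pwd": "password", "code": img_code, "denglu": "登录"}  # login form data
        login_url = "https://so.gushiwen.cn/user/login.aspx?from=http%3a%2f%2fso.gushiwen.cn%2fuser%2fcollect.aspx"
        login_response = requests.post(url=login_url, headers=headers_login, proxies=proxies, data=data, cookies=cookies)  # send the captcha's cookie with the login
        print(login_response.text)  # viewing this text as an HTML file showed the expected page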

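Notes on Method 2:

Store the cookie as a dict. Do not write it into headers; pass it through the cookies argument of the get or post call.

requests can convert the response cookies to a dict itself, so you can read them directly.

For example: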

cookies=response.cookies.get_dict()
login_response = requests.post(url=login_url, headers=headers_login, proxies=proxies, data=data,cookies=cookies)
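
To make the "cookie as a dict" form concrete, here is a small sketch using only the standard library's `http.cookies`. It is not how requests works internally; it only illustrates the mapping between a raw `Set-Cookie` header, the dict form, and the `Cookie` header that eventually goes out. The header string and the helper names `set_cookie_to_dict` and `dict_to_cookie_header` are hypothetical.

```python
from http.cookies import SimpleCookie


def set_cookie_to_dict(set_cookie_header):
    """Parse one Set-Cookie header value into a {name: value} dict."""
    jar = SimpleCookie()
    jar.load(set_cookie_header)
    # Attributes such as Path or HttpOnly are dropped; only name/value pairs remain.
    return {name: morsel.value for name, morsel in jar.items()}


def dict_to_cookie_header(cookies):
    """Join a cookie dict into the value of a Cookie request header."""
    return "; ".join("{}={}".format(name, value) for name, value in cookies.items())


# Hypothetical header, shaped like what a captcha endpoint might return.
raw_header = "sessionid=abc123; Path=/; HttpOnly"

cookies = set_cookie_to_dict(raw_header)
print(cookies)
print(dict_to_cookie_header(cookies))
```

Passing the dict through `cookies=` lets requests build the `Cookie` header, which is why it does not need to be written into `headers` by hand.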

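Note: the packages under Common are my own wrappers.

Idea behind Method 2: fetch the dynamic captcha from gushiwen.cn together with the cookie returned alongside it, then submit the captcha text and that same cookie when logging in. (If the site reports a wrong captcha, the likely cause is that the captcha and the cookie do not belong to each other.)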
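
The pairing described above can be illustrated with a toy, in-memory model. This is only a sketch under assumptions: `FakeCaptchaSite`, the `sessionid` cookie name, and the counter-based captcha text are all invented here and say nothing about how gushiwen.cn is actually implemented. The model shows why a captcha submitted with a cookie from a different request is rejected.

```python
import itertools


class FakeCaptchaSite:
    """Toy stand-in for a site that ties each captcha to a session cookie."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._captcha_by_session = {}

    def get_captcha(self):
        """Return (cookies, captcha_text), like fetching the captcha image."""
        n = next(self._counter)
        session_id = "session-{}".format(n)
        captcha_text = "{:04d}".format(n)  # stands in for the recognized text
        self._captcha_by_session[session_id] = captcha_text
        return {"sessionid": session_id}, captcha_text

    def login(self, cookies, code):
        """Accept only a code that matches the captcha issued to this session."""
        expected = self._captcha_by_session.get(cookies.get("sessionid"))
        if expected is not None and code == expected:
            return "ok"
        return "captcha error"


site = FakeCaptchaSite()
cookies_a, code_a = site.get_captcha()
cookies_b, code_b = site.get_captcha()

print(site.login(cookies_a, code_a))  # captcha and cookie from the same request
print(site.login(cookies_b, code_a))  # captcha from one request, cookie from another
```

In Method 2 this means the `cookies` dict passed to the login `post` has to come from the same response that delivered the captcha image.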
