python - Is my Scrapy.FormRequest written correctly? Please take a look

 思念某女人_959 posted on 2022-10-29 15:02
  • I want to collect all the topics listed on m.zhihu.com/topics. Clicking "More" sends a request to 'https://m.zhihu.com/node/TopicsPlazzaListV2', and the FormData sent is:

    def get_topic_url(self, response):
        topics = response.css('.item .blk > a[target=_blank]::attr(href)').extract()
        _xsrf = response.css('input[name="_xsrf"]::attr(value)').extract()[0]
        for topic in topics:
            print topic
        data = response.css('.zh-general-list::attr(data-init)').extract()
        import json
        param = json.loads(data[0])
        topic_id = param['params']['topic_id']
        hash_id = param['params']['hash_id']
        offset = param['params']['offset']
        yield scrapy.FormRequest(
                url="https://m.zhihu.com/node/TopicsPlazzaListV2",
                headers=headers,
                formdata={
                    "method":"next",
                    "params":{
                        "topic_id":topic_id,
                        "offset":offset,
                        "hash_id":hash_id,
                    },
                    "_xsrf":_xsrf,
                },
                meta={
                    "proxy": proxy,
                    "cookiejar": response.meta["cookiejar"],
                },
                callback=self.get_topic_url,
        )

But the response is a 400 status code. Is something wrong in my code? Any advice would be appreciated.

2016-05-08 10:43:52 [scrapy] DEBUG: Retrying <POST https://m.zhihu.com/node/TopicsPlazzaListV2> (failed 1 times): 400 Bad Request
2016-05-08 10:43:53 [scrapy] DEBUG: Retrying <POST https://m.zhihu.com/node/TopicsPlazzaListV2> (failed 2 times): 400 Bad Request
2016-05-08 10:43:53 [scrapy] DEBUG: Gave up retrying <POST https://m.zhihu.com/node/TopicsPlazzaListV2> (failed 3 times): 400 Bad Request
2016-05-08 10:43:53 [scrapy] DEBUG: Crawled (400) <POST https://m.zhihu.com/node/TopicsPlazzaListV2> (referer: https://m.zhihu.com/topics)
2016-05-08 10:43:53 [scrapy] DEBUG: Ignoring response <400 https://m.zhihu.com/node/TopicsPlazzaListV2>: HTTP status code is not handled or not allowed
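One thing worth checking in the request above: `formdata` is sent as a URL-encoded form body, so its values should be strings, and the nested `params` dict is unlikely to be serialized the way the server expects. A minimal sketch of one possible fix follows, written for Python 3. It assumes (this is not confirmed by the thread) that the endpoint wants `params` as a single JSON-encoded string. The helper name `build_formdata` and the sample values are hypothetical.

```python
import json


def build_formdata(xsrf, topic_id, offset, hash_id):
    """Return a flat dict of strings for a form-encoded POST body."""
    # Assumption: the endpoint expects `params` as one JSON string
    # rather than as a nested mapping.
    params = json.dumps({
        "topic_id": topic_id,
        "offset": offset,
        "hash_id": hash_id,
    })
    return {
        "method": "next",
        "params": params,
        "_xsrf": xsrf,
    }


# Placeholder values for illustration only; the real ones come from
# the page's data-init attribute and the _xsrf input field.
formdata = build_formdata("xsrf-token", 253, 20, "hash-id")
```

The resulting dict could then be passed as the `formdata=` argument of `scrapy.FormRequest` in place of the nested version.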
1 answer
  • Try setting the headers to those of a mobile browser.

    Answered 2022-11-12 01:48
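As a rough illustration of the suggestion above, the `headers` passed to the request might look something like the sketch below. The User-Agent is just one example of a mobile Safari string. The `X-Requested-With` header is an assumption about what an AJAX endpoint may check, and the helper name `mobile_headers` is hypothetical. Only the referer URL comes from the log in the question.

```python
# Example mobile Safari User-Agent; any mobile browser string could be used.
MOBILE_USER_AGENT = (
    "Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) "
    "AppleWebKit/601.1.46 (KHTML, like Gecko) "
    "Version/9.0 Mobile/13B143 Safari/601.1"
)


def mobile_headers(referer="https://m.zhihu.com/topics"):
    """Build request headers that resemble a mobile browser's AJAX call."""
    return {
        "User-Agent": MOBILE_USER_AGENT,
        "Referer": referer,
        # Assumption: the endpoint may expect the usual AJAX marker header.
        "X-Requested-With": "XMLHttpRequest",
    }


headers = mobile_headers()
```

This dict would take the place of the `headers` variable used in the `scrapy.FormRequest` call in the question.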