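Requests is an HTTP library written in Python, built on top of urllib3 and released under the Apache 2.0 license. If you read the previous article on urllib, you will have noticed that urllib is rather inconvenient to work with. Requests is far more convenient, saves a great deal of work, and fully covers HTTP testing needs. (Once you have used requests, you will rarely want to go back to urllib.) In short, requests is one of the simplest and easiest-to-use HTTP libraries for Python, and it is the recommended choice for writing crawlers.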
A default Python installation does not include the requests module, so it has to be installed separately with pip:

    pip install requests

    import requests

    # Call requests.get to send a GET request
    response = requests.get('http://www.baidu.com/')
    print(type(response))
    print(response.status_code)
    print(type(response.text))
    print(response.text)  # already a str, so it can be printed without decoding
    print(response.cookies)

    import requests

    # The other HTTP methods are called in the same way
    requests.post('http://httpbin.org/post')
    requests.put('http://httpbin.org/put')
    requests.delete('http://httpbin.org/delete')
    requests.head('http://httpbin.org/get')
    requests.options('http://httpbin.org/get')

    import requests

    response = requests.get('http://httpbin.org/get')
    print(response.text)  # unlike urllib, no decode() call is needed
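
The GET examples above assume that every request succeeds. Below is a minimal sketch of a more defensive version. The helper name `fetch_text` and the 10-second timeout are illustrative assumptions, not something from the original article.

```python
import requests


def fetch_text(url, timeout=10):
    """Return the response body as text, or None if the request fails.

    fetch_text and the 10-second default are illustrative choices.
    """
    try:
        response = requests.get(url, timeout=timeout)
        # Turn 4xx/5xx status codes into an exception instead of
        # checking status_code by hand
        response.raise_for_status()
    except requests.RequestException as exc:
        # RequestException is the base class for connection errors, timeouts,
        # invalid URLs and the HTTPError raised by raise_for_status()
        print(f"request failed: {exc}")
        return None
    return response.text
```

Calling `fetch_text('http://httpbin.org/get')` should return the same text that `response.text` printed earlier, provided the site is reachable.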

    import requests

    # Query parameters written directly into the URL
    response = requests.get('http://httpbin.org/get?name=Arise&age=22')
    print(response.text)

Another way to write it:

    import requests

    data = {
        'name': 'Arise',
        'age': 22
    }
    # Passing the dict through the params argument of requests.get
    # produces the same query string as above
    response = requests.get('http://httpbin.org/get', params=data)
    print(response.text)
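
To see that the two forms really produce the same URL, the request can be built without being sent. This is a small sketch using `requests.Request(...).prepare()`. The variable name `prepared` is just an illustrative choice.

```python
import requests

data = {'name': 'Arise', 'age': 22}

# Build the request object without sending it over the network
prepared = requests.Request('GET', 'http://httpbin.org/get', params=data).prepare()

# The params dict has been encoded into the query string
print(prepared.url)  # http://httpbin.org/get?name=Arise&age=22
```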

    import requests

    response = requests.get('http://httpbin.org/get')
    print(type(response.text))
    # response.json() parses the body, so the result can be compared
    # with converting response.text through the json module
    print(response.json())
    print(type(response.json()))

The same comparison, with the json module used explicitly:

    import json
    import requests

    response = requests.get('http://httpbin.org/get')
    print(type(response.text))
    print(response.json())
    print(json.loads(response.text))  # same result as response.json()
    print(type(response.json()))
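
httpbin always answers with JSON, but many pages return HTML, in which case `response.json()` raises an error. Below is a minimal sketch of a guard around it. The helper name `json_or_none` is hypothetical. It relies on the fact that the decode errors raised by `response.json()` are subclasses of `ValueError`.

```python
def json_or_none(response):
    """Return the parsed JSON body, or None if the body is not valid JSON.

    json_or_none is a hypothetical helper; it accepts any object that has
    a json() method, such as a requests Response.
    """
    try:
        return response.json()
    except ValueError:
        # The decode errors raised by response.json() are ValueError subclasses
        return None
```

For example, `json_or_none(requests.get('http://httpbin.org/get'))` should give the same dict as `response.json()` above, while a response carrying an HTML page would give `None`.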

    import requests

    response2 = requests.get('https://github.com/favicon.ico')
    # .text is a str, while .content holds the raw bytes of the response body
    print(type(response2.text), type(response2.content))
    print(response2.text)
    print(response2.content)

Saving the image to disk:

    import requests

    response2 = requests.get('https://github.com/favicon.ico')
    # The with block closes the file automatically, so f.close() is not needed
    with open('E://favicon.ico', 'wb') as f:
        f.write(response2.content)
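
Reading `response2.content` loads the whole file into memory, which is fine for a favicon but less so for large downloads. Below is a sketch of writing the body in chunks instead. The helper name `save_body` and the 8192-byte chunk size are illustrative assumptions. The response is expected to come from `requests.get(url, stream=True)`.

```python
def save_body(response, path, chunk_size=8192):
    """Write a response body to path in chunks and return the bytes written.

    save_body and the 8192-byte chunk size are illustrative choices.
    The response should come from requests.get(url, stream=True).
    """
    # Do not save an error page as if it were the requested file
    response.raise_for_status()
    written = 0
    with open(path, 'wb') as f:
        for chunk in response.iter_content(chunk_size=chunk_size):
            f.write(chunk)
            written += len(chunk)
    return written
```

A call might look like `save_body(requests.get('https://github.com/favicon.ico', stream=True), 'favicon.ico')`.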

    import requests

    # Without headers the request may be rejected, causing the crawl to fail
    response = requests.get('http://www.zhihu.com/explore')
    print(response.text)

Result:

    <html>
    <head><title>400 Bad Request</title></head>
    <body bgcolor="white">
    <center><h1>400 Bad Request</h1></center>
    <hr><center>openresty</center>
    </body>
    </html>
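
A common response to this kind of rejection is to send a browser-like User-Agent through the `headers` argument. The sketch below shows how that is wired up. The User-Agent string and the helper name `fetch_with_headers` are assumptions for illustration, and whether a given site accepts the request is not guaranteed.

```python
import requests

# Illustrative browser-like User-Agent; the exact string is an assumption
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/120.0 Safari/537.36'
}


def fetch_with_headers(url, timeout=10):
    """Send a GET request carrying the custom headers (hypothetical helper)."""
    return requests.get(url, headers=headers, timeout=timeout)


# Inspect the header that would be sent, without touching the network
prepared = requests.Request('GET', 'http://www.zhihu.com/explore', headers=headers).prepare()
print(prepared.headers['User-Agent'])
```

Calling `fetch_with_headers('http://www.zhihu.com/explore')` then sends the same request as before, with the extra header attached.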