Last time we successfully scraped a page's source code. This time we'll continue with how to extract the specific elements we want.

First, let's review the basic structure of our BeautifulSoup script:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36',
}

url = "URL of the page to scrape"
web_data = requests.get(url, headers=headers)
soup = BeautifulSoup(web_data.text, "lxml")
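
Once `soup` exists, pulling out specific elements usually comes down to a CSS selector. The sketch below is only an illustration under a few assumptions. The HTML snippet, the `item` class, and the variable names are made up for the example and stand in for `web_data.text`. It also uses the built-in `html.parser` instead of `lxml` so that it runs without any extra parser installed.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for web_data.text from a real request
html = """
<html><body>
  <div class="item"><a href="/a">First</a></div>
  <div class="item"><a href="/b">Second</a></div>
</body></html>
"""

# "html.parser" ships with Python; "lxml" would work the same way if installed
soup = BeautifulSoup(html, "html.parser")

# select() takes a CSS selector and returns a list of matching tags
links = soup.select("div.item > a")

# get_text() returns the text inside a tag; get() reads an attribute
titles = [a.get_text() for a in links]
hrefs = [a.get("href") for a in links]

print(titles)
print(hrefs)
```

On a real page, the selector would come from inspecting the element in the browser's developer tools rather than from the made-up `div.item > a` used here.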