Creating Web Pages with Django in Python_Requesting a Web Page in Python and Getting Page Information

激活谷笔记 • 2025-05-25 18:32 • 101 views

In Python, the links on a web page are usually obtained in one of the following ways:

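1. Using the `requests` library:

```python
import re

import requests

# Fetch the page content
url = 'http://www.example.com'
response = requests.get(url)
html_content = response.text

# Extract all links with a regular expression.
# Note: the range $-_ in the pattern is broad and also matches characters
# such as ' < and >, so the results may need trimming.
link_list = re.findall(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', html_content)

# Print the extracted links
for link in link_list:
    print(link)
```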

2. Using the `urllib2` library (Python 2.x only):

```python
import re
import urllib2

# Connect to the URL
url = 'http://www.example.com'
website = urllib2.urlopen(url)
html_content = website.read()

# Extract all links with a regular expression
link_list = re.findall(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', html_content)

# Print the extracted links
for link in link_list:
    print(link)
```
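`urllib2` exists only in Python 2; in Python 3 its functionality lives in `urllib.request`. A minimal sketch of the same approach for Python 3 might look like the following. The helper names `fetch_html` and `extract_links` and the simplified URL pattern are assumptions made for illustration, not part of the original example, and the sample HTML string stands in for a downloaded page so the block runs without network access.

```python
import re
import urllib.request

# Simplified URL pattern (an assumption for this sketch): match http/https
# up to the next whitespace, quote, or angle bracket.
URL_PATTERN = re.compile(r'https?://[^\s"\'<>]+')


def fetch_html(url, timeout=10):
    """Download a page and decode it to text (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset
        charset = response.headers.get_content_charset() or 'utf-8'
        return response.read().decode(charset, errors='replace')


def extract_links(html_content):
    """Return every absolute http(s) URL found in the text."""
    return URL_PATTERN.findall(html_content)


# Sample HTML standing in for fetch_html('http://www.example.com')
sample_html = '<a href="http://www.example.com/a">A</a> <a href="https://example.org/b?x=1">B</a>'
links = extract_links(sample_html)
for link in links:
    print(link)
```

To run it against a live page, pass the result of `fetch_html(url)` to `extract_links` instead of the sample string.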

3. Using the `BeautifulSoup` library to parse the HTML:

```python
import requests
from bs4 import BeautifulSoup

# Fetch the page content
url = 'http://www.example.com'
response = requests.get(url)
html_content = response.text

# Parse the HTML with BeautifulSoup
soup = BeautifulSoup(html_content, 'html.parser')

# Extract all links
for link in soup.find_all('a'):
    href = link.get('href')
    # Keep only absolute links; tags without an href return None
    if href and href.startswith('http'):
        print(href)
```
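The loop above keeps only `href` values that start with `http`, so relative links such as `/about` are skipped. One way to keep them is to resolve each `href` against the page URL with `urllib.parse.urljoin`. The sketch below uses the standard library's `html.parser.HTMLParser` in place of BeautifulSoup so that it runs without third-party packages; the class name `LinkCollector`, the function `collect_links`, and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag as an absolute URL (illustrative)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        href = dict(attrs).get('href')
        if not href:
            return
        # urljoin leaves absolute URLs unchanged and resolves relative ones
        absolute = urljoin(self.base_url, href)
        # Drop non-web schemes such as mailto:
        if absolute.startswith(('http://', 'https://')):
            self.links.append(absolute)


def collect_links(html_content, base_url):
    """Parse the HTML and return the absolute http(s) links it contains."""
    parser = LinkCollector(base_url)
    parser.feed(html_content)
    return parser.links


# Sample HTML standing in for a downloaded page
sample_html = (
    '<a href="/about">About</a>'
    '<a href="http://other.example/x">X</a>'
    '<a href="mailto:a@example.com">Mail</a>'
    '<a>no href</a>'
)
links = collect_links(sample_html, 'http://www.example.com/index.html')
for link in links:
    print(link)
```

With BeautifulSoup the same idea fits inside the existing loop: call `urljoin(url, href)` on each non-empty `href` before filtering.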

The examples above show how to get the links on a web page with `requests`, `urllib2`, and `BeautifulSoup`. Choose the method that fits your needs.
