Python Code for Getting Data: Scraping Web Page Data with Python

激活谷笔记 • 2026-03-19 20:14 • 50 views

Python offers several ways to get data. The most common ones are listed below.

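File reading and writing

Use the built-in `open` function to read files on the local file system. `open` does not fetch URLs, so a remote file has to be downloaded first (for example with `requests`).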

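Example code:

```python
file_path = 'text_files/file.txt'  # replace with your file path

# The with statement closes the file automatically, even if an error occurs
with open(file_path, 'r', encoding='utf-8') as file:
    content = file.read()  # read the whole file into a string

print(content)  # print the file contents
```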
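For large files, reading everything with `read()` may use more memory than necessary. The sketch below shows one common alternative, iterating over the file object line by line. The helper name `count_nonempty_lines` and the temporary sample file are assumptions made for illustration, not part of the original example.

```python
import os
import tempfile


def count_nonempty_lines(path, encoding="utf-8"):
    """Count non-blank lines without loading the whole file into memory."""
    count = 0
    with open(path, "r", encoding=encoding) as file:
        for line in file:  # the file object yields one line at a time
            if line.strip():
                count += 1
    return count


# Build a small throwaway file so the sketch runs on its own (hypothetical data)
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("first line\n\nsecond line\n")
    sample_path = tmp.name

nonempty = count_nonempty_lines(sample_path)
print(nonempty)

os.remove(sample_path)  # clean up the temporary file
```

The same loop works for any per-line processing, such as filtering or parsing, since only one line is held in memory at a time.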

Reading data with pandas

pandas reads and processes structured (tabular) data such as CSV and Excel files.

Example code:

```python
import pandas as pd

df = pd.read_csv('data.csv')  # read the CSV file into a DataFrame

print(df.head())  # print the first five rows
```
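`read_csv` also accepts options that control which columns are loaded and how they are typed. The following is a minimal sketch, assuming a small in-memory CSV (the column names `name`, `age`, and `city` are made up for illustration) so that it runs without a file on disk.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a real CSV file
csv_text = "name,age,city\nAlice,30,Paris\nBob,25,Berlin\n"

df = pd.read_csv(
    io.StringIO(csv_text),    # any file-like object works in place of a path
    usecols=["name", "age"],  # load only the columns that are needed
    dtype={"age": "int64"},   # set the column type explicitly
)

# Filter rows with a boolean mask
adults = df[df["age"] >= 30]
print(adults)
```

Limiting the columns with `usecols` can reduce memory use on wide files, and an explicit `dtype` avoids surprises from type inference.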

Reading data with NumPy

NumPy reads and processes numeric data.

Example code:

```python
import numpy as np

# Read whitespace-separated numeric data from a text file into an array
data = np.loadtxt('data.txt')

print(data)  # print the array
```
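By default `np.loadtxt` expects whitespace-separated numbers and no header row. For comma-separated data with a header, the `delimiter` and `skiprows` parameters can be used. This is a sketch under the assumption of a tiny in-memory sample (the `x,y` columns are hypothetical).

```python
import io

import numpy as np

# Hypothetical comma-separated data with one header line
text = "x,y\n1.0,2.0\n3.0,4.0\n"

data = np.loadtxt(
    io.StringIO(text),  # a file-like object; a file path works the same way
    delimiter=",",      # values are separated by commas
    skiprows=1,         # skip the header line
)

col_means = data.mean(axis=0)  # mean of each column
print(data.shape)
print(col_means)
```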

Web scraping

Use libraries such as `requests` and `BeautifulSoup` to extract data from web pages.

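Example code:

```python
import requests
from bs4 import BeautifulSoup

url = 'https://example.com'  # replace with your target URL

response = requests.get(url, timeout=10)  # a timeout avoids hanging on a slow server
response.raise_for_status()  # raise an exception on 4xx/5xx responses

soup = BeautifulSoup(response.text, 'html.parser')

# Find every <div> tag whose class is "content"
data = soup.find_all('div', class_='content')

for div in data:
    print(div.text)  # print the text content
```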
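Parsing can be tried out without any network access by feeding BeautifulSoup an HTML string directly. The sketch below assumes a small hand-written page (the markup and link targets are invented for illustration) and uses a CSS selector to collect links inside the `content` blocks.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page
html = """
<html><body>
  <div class="content"><a href="/a">First</a></div>
  <div class="content"><a href="/b">Second</a></div>
  <div class="sidebar"><a href="/c">Ignored</a></div>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# select() takes a CSS selector: <a> tags nested inside div.content
links = [
    (a.get_text(strip=True), a.get("href"))
    for a in soup.select("div.content a")
]
print(links)
```

Separating the download step from the parsing step this way also makes the parsing logic easier to test.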

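Using Selenium

Selenium automates a real browser and reads data from the rendered page, which helps when content is generated by JavaScript. It requires the Selenium library and a browser driver that matches your browser.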

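Example code:

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()  # requires Chrome and a matching driver
try:
    driver.get('https://example.com')  # replace with your target URL

    # Selenium 4 removed find_elements_by_class_name; use find_elements with By
    data = driver.find_elements(By.CLASS_NAME, 'content')

    for element in data:
        print(element.text)  # print the text content
finally:
    driver.quit()  # close the browser even if an error occurs
```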

Regular expressions

Use the `re` module to extract specific patterns from strings.

Example code:

```python
import re

text = 'Hello 123 World 456'  # target string

pattern = re.compile(r'\d+')  # match one or more consecutive digits

numbers = pattern.findall(text)  # extract all numbers as strings

print(numbers)  # prints ['123', '456']
```
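When a match contains several fields, named groups make the extracted values easier to work with than a flat list. The following sketch assumes a made-up `order=... total=...` text format purely for illustration.

```python
import re

# Hypothetical input text with two records
log_text = "order=1001 total=25.50; order=1002 total=7.00"

# Named groups label each captured field
pattern = re.compile(r"order=(?P<order_id>\d+)\s+total=(?P<total>\d+\.\d+)")

# finditer yields match objects, so each group can be read by name
orders = [
    (match.group("order_id"), float(match.group("total")))
    for match in pattern.finditer(log_text)
]
print(orders)
```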

The methods above cover the most common ways to get different kinds of data in Python. Choose the one that fits your specific needs.
