
Scraping Lianjia Rental Listings for a Given City

Date: 2025-01-17 18:06:07

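Rental listings for a given city can be scraped from Lianjia with a short Python script that uses the `requests` library to send HTTP requests and `BeautifulSoup` to parse the returned HTML. A simple example:

```python
import csv

import requests
from bs4 import BeautifulSoup


def get_page(url):
    """Fetch a page and return its HTML text, or None if the request fails."""
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
    }
    try:
        # A timeout keeps the script from hanging on an unresponsive server.
        response = requests.get(url, headers=headers, timeout=10)
    except requests.RequestException:
        return None
    if response.status_code == 200:
        return response.text
    return None


def get_text(parent, tag, class_name):
    """Return the stripped text of a child element, or '' if it is missing."""
    element = parent.find(tag, class_=class_name)
    return element.get_text(strip=True) if element else ''


def parse_page(html):
    """Extract [title, price, location] rows from a listing page."""
    soup = BeautifulSoup(html, 'html.parser')
    house_list = soup.find_all('div', class_='content__list--item')
    data = []
    for house in house_list:
        title = get_text(house, 'p', 'content__list--item--title')
        price = get_text(house, 'span', 'content__list--item-price')
        location = get_text(house, 'p', 'content__list--item--des')
        data.append([title, price, location])
    return data


def save_to_csv(data, filename):
    """Write the rows to a UTF-8 CSV file with a header row."""
    with open(filename, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(['Title', 'Price', 'Location'])
        writer.writerows(data)


def main():
    url = 'https://bj.lianjia.com/zufang/'
    html = get_page(url)
    if html:
        data = parse_page(html)
        save_to_csv(data, 'lianjia_rent.csv')
        print('Data saved to lianjia_rent.csv')
    else:
        print('Request failed')


if __name__ == '__main__':
    main()
```

The script works in three steps:

1. **Send the HTTP request**: `requests.get` fetches the page content.
2. **Parse the HTML**: `BeautifulSoup` parses the HTML and extracts the rental listing details.
3. **Save the data**: the extracted rows are written to a CSV file.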
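The save step can be tried on its own, without any network access. The following is a minimal sketch, not part of the original script: `rows_to_csv_text` is a hypothetical helper, and the sample rows are made up for illustration. It writes the same three-column layout to an in-memory buffer, which makes it easy to see how the `csv` module quotes fields that contain commas.

```python
import csv
import io


def rows_to_csv_text(rows, header=("Title", "Price", "Location")):
    """Serialize rows to CSV text in memory (hypothetical helper)."""
    # newline="" leaves the line endings written by csv.writer untouched.
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample rows in the [title, price, location] shape used by the script.
sample_rows = [
    ["Sample listing A", "5000 yuan/month", "Chaoyang, 2 rooms"],
    ["Sample listing B, near subway", "3200 yuan/month", "Haidian, 1 room"],
]

csv_text = rows_to_csv_text(sample_rows)
print(csv_text)

# Reading the text back recovers the original rows, commas included.
parsed_rows = list(csv.reader(io.StringIO(csv_text, newline="")))
```

Fields such as the location often contain commas, so building the CSV line by hand with string joins would corrupt the columns; letting `csv.writer` handle the quoting avoids that.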
