Python Data Visualization Analysis (Visualizing Lianjia Rental Listing Data with Python)

1: Foreword

Today I would like to share a rental-listing data visualization project built with Django, the Python web framework.

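2: Project overview

(1) Project name: Visual analysis of Lianjia rental listing data, based on Django.

(2) Features: 1. user login and registration; 2. editing personal information and changing the password; 3. a paginated overview of all listings, with the option to bookmark listings of interest; 4. a dashboard home page showing user registration data and basic attributes of all listings in the database; 5. charts analysing each field of the scraped rental listing data.

(3) Technologies: Python, Django, MySQL, ECharts, web scraping…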

3: Code walkthrough

3.1: Web crawler

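import requests  # HTTP client used to fetch the page
from lxml import etree  # HTML parsing with XPath support
import time  # timing control (for example, pausing between requests)
import pymysql  # MySQL driver

# Connect to the local MySQL database that stores the scraped listings
cnx = pymysql.connect(
    host="localhost",
    user="root",
    password="123456",
    database="kunming"
)
cursor = cnx.cursor()
# Send a browser-like User-Agent with every request
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.64 Safari/537.36'}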

# Page to scrape: "km" selects the city (Kunming) and "pg8" the page number
url = 'https://km.lianjia.com/zufang/pg8/#contentList'
res = requests.get(url, headers=headers)
# Parse the response text into an element tree
html = etree.HTML(res.text)
# Listing titles, selected with XPath
name = html.xpath('//*[@id="content"]/div[1]/div[1]/div/div/p[1]/a/text()')
# Strip whitespace and drop entries that are empty after stripping
name = [x.strip() for x in name if x.strip() != '']
# District
district = html.xpath('//*[@id="content"]/div[1]/div[1]/div/div/p[2]/a[1]/text()')
# Street
street = html.xpath('//*[@id="content"]/div[1]/div[1]/div/div/p[2]/a[2]/text()')
# Floor area (raw text nodes)
floor_space1 = html.xpath('//*[@id="content"]/div[1]/div[1]/div/div/p[2]/text()[5]')
# Remove the unit sign, line breaks and spaces from each raw value
floor_space = [str(i).replace('㎡', '').replace("\n", '').replace(' ', '') for i in floor_space1]
# Orientation
orientation = html.xpath('//*[@id="content"]/div[1]/div[1]/div/div/p[2]/text()[6]')
# Layout (rooms)
house_type = html.xpath('//*[@id="content"]/div[1]/div[1]/div/div/p[2]/text()[7]')
# Strip whitespace and drop empty entries
house_type = [x.strip() for x in house_type if x.strip() != '']
# Price
price = html.xpath('//*[@id="content"]/div[1]/div[1]/div/div/span/em/text()')
insert_query = "INSERT INTO House(title,type,building,city,street,area,direct,price,link) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)"
for i in range(len(name)):
    print('Scraping record ' + str(i + 1))
    # The code assumes a title of the form "<rental type>·<listing name> <layout> ..."
    rent_type = name[i].split('·')[0]
    title = name[i].split('·')[1].split(' ')[0]
    building = name[i].split('·')[1].split(' ')[1]
    city = district[i]
    street1 = street[i]
    # Use 80.00 as a placeholder when the area text is not a decimal number
    if '.' not in floor_space[i].strip():
        area = 80.00
    else:
        area = float(floor_space[i].strip())
    direct = orientation[i].strip().replace('㎡', '')
    price1 = float(price[i])
    link = url
    data = (title, rent_type, building, city, street1, area, direct, price1, link)
    # Parameterized insert; commit after every row
    cursor.execute(insert_query, data)
    cnx.commit()

cursor.close()
cnx.close()
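
The loop above indexes straight into the result of split('·'), so one title without the separator stops the whole run with an IndexError. Here is a minimal sketch of a more defensive parser. The helper name parse_title and the sample titles are hypothetical, and the title layout is only what the crawler code implies, not something confirmed against the live site. Unlike the original, it splits on any run of whitespace.

```python
def parse_title(raw):
    """Split a listing title into (rent_type, title, building).

    Returns None when the title does not follow the assumed
    "<rental type>·<listing name> <layout> ..." layout.
    """
    if '·' not in raw:
        return None
    rent_type, rest = raw.split('·', 1)
    parts = rest.split()  # tolerate repeated spaces between fields
    if len(parts) < 2:
        return None
    return rent_type.strip(), parts[0], parts[1]


# Hypothetical sample titles, for illustration only
print(parse_title('整租·世纪城 2室1厅 南'))
print(parse_title('独栋公寓'))
```

In the crawler, a row whose title parses to None could simply be skipped with continue instead of aborting the run.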

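PS: The line url = 'https://km.lianjia.com/zufang/pg8/#contentList' fixes both the city and the page number: km is Kunming, bj is Beijing, and so on. The page number, however, has to be changed by hand after each run. I tried putting the page number in a loop variable and wrapping the scraping step in try-except, but Lianjia appeared to block the requests dynamically. I was on a tight deadline, because the classmate I built this for needed the project and her thesis the next day, so I did not pursue it further. Driving a browser with Selenium would probably get around this; I may revisit it later.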
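
For the page loop mentioned in the PS, here is a minimal sketch of the two pieces that can be written without touching the network: building the page URLs and choosing a randomized pause between requests. The helper names build_page_urls and polite_delay are hypothetical, the URL pattern is simply the one used in the crawler above, and nothing here is claimed to get past Lianjia's blocking.

```python
import random


def build_page_urls(city, first_page, last_page):
    """Return the listing-page URLs for one city, pages first_page..last_page."""
    template = 'https://{city}.lianjia.com/zufang/pg{page}/#contentList'
    return [template.format(city=city, page=p)
            for p in range(first_page, last_page + 1)]


def polite_delay(base=2.0, jitter=1.5):
    """Seconds to wait before the next request: base plus a random extra."""
    return base + random.uniform(0, jitter)


urls = build_page_urls('km', 1, 3)
print(urls)
```

Each URL would then go through the same requests.get and XPath steps as before, with time.sleep(polite_delay()) between pages and a try-except around each page so that one failure does not lose the pages already stored.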

3.2: Database design


from django.db import models
from django.utils.safestring import mark_safe

# User table
class User(models.Model):
    id = models.AutoField('id', primary_key=True)
    username = models.CharField(verbose_name="Name", max_length=22, default='')
    password = models.CharField(verbose_name="Password", max_length=32, default='')
    phone = models.CharField(verbose_name="Phone number", max_length=11, default='')
    email = models.CharField(verbose_name="Email", max_length=22, default='')
    time = models.DateField(verbose_name="Created on", auto_now_add=True)
    avatar = models.FileField(verbose_name="Avatar", default='user/avatar/default.gif', upload_to="user/avatar/")

    def admin_sample(self):
        # Avatar thumbnail shown in the admin site
        return mark_safe('<img src="/media/%s" height="60" width="60" />' % (self.avatar,))
    admin_sample.short_description = 'Avatar'

    def __str__(self):
        return self.username

    class Meta:
        db_table = 'User'
        verbose_name_plural = 'Users'

# Rental listing table
class House(models.Model):
    title = models.CharField(max_length=100, verbose_name='Listing name')
    type = models.CharField(max_length=100, verbose_name='Listing type')
    building = models.CharField(max_length=100, verbose_name='Layout')
    city = models.CharField(max_length=100, verbose_name='District')
    street = models.CharField(max_length=300, verbose_name='Street')
    area = models.IntegerField(verbose_name='Floor area')
    direct = models.CharField(max_length=100, verbose_name='Orientation')
    price = models.IntegerField(verbose_name='Price')
    link = models.CharField(max_length=100, verbose_name='Detail link')

    class Meta:
        db_table = 'House'
        verbose_name_plural = 'Listings'

# Bookmarked listings per user
class Histroy(models.Model):
    id = models.AutoField(primary_key=True, verbose_name='ID')
    house = models.ForeignKey(House, on_delete=models.CASCADE)
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    count = models.IntegerField("Click count", default=1)

    class Meta:
        db_table = 'History'
        verbose_name_plural = 'Bookmarks'
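
The User model keeps the password in a plain CharField, and the login view in section 3.3 compares it as plain text. As one possible hardening, here is a minimal sketch of salted password hashing using only the standard library. The helper names hash_password and verify_password and the "iterations$salt$digest" storage format are assumptions made for this illustration, not part of the project. In a real Django project the built-in make_password and check_password from django.contrib.auth.hashers would be the more natural choice.

```python
import hashlib
import hmac
import os


def hash_password(password, iterations=100_000):
    """Return 'iterations$salt_hex$digest_hex' for the given password."""
    salt = os.urandom(16)  # a fresh random salt for every password
    digest = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'), salt, iterations)
    return f'{iterations}${salt.hex()}${digest.hex()}'


def verify_password(password, stored):
    """Check a candidate password against a value made by hash_password."""
    iterations, salt_hex, digest_hex = stored.split('$')
    digest = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'),
                                 bytes.fromhex(salt_hex), int(iterations))
    # Constant-time comparison of the two hex strings
    return hmac.compare_digest(digest.hex(), digest_hex)


stored = hash_password('123456')
print(len(stored), verify_password('123456', stored))
```

A value in this format is longer than 32 characters, so the password column would need a larger max_length (for example 128) before storing it.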

3.3: Back-end business logic

# -*- coding: utf-8 -*-
import time
from collections import defaultdict
from django.core.paginator import Paginator
from django.db.models import F, Avg, Count
from django.shortcuts import render, redirect
from app.models import House, User, Histroy
from app.backends import wouldCloud, getHistoryTableData

# 01 User login
def login(request):
    if request.method == 'GET':
        return render(request, 'login.html')
    if request.method == 'POST':
        name = request.POST.get('name')
        password = request.POST.get('password')
        # Look the user up once; first() returns None when nothing matches
        user = User.objects.filter(username=name, password=password).first()
        if user:
            # Keep the name and avatar path in the session for the other views
            request.session['username'] = {'username': user.username, 'avatar': str(user.avatar)}
            return redirect('index')
        else:
            msg = 'Incorrect login information!'
            return render(request, 'login.html', {"msg": msg})

# 02 User registration
def register(request):
    if request.method == 'POST':
        name = request.POST.get('name')
        password = request.POST.get('password')
        phone = request.POST.get('phone')
        email = request.POST.get('email')
        avatar = request.FILES.get('avatar')
        # Refuse to register a username that is already taken
        existing = User.objects.filter(username=name)
        if existing:
            msg = 'User already exists!'
            return render(request, 'register.html', {"msg": msg})
        else:
            User.objects.create(username=name, password=password, phone=phone, email=email, avatar=avatar)
            msg = "Registration successful!"
            return render(request, 'login.html', {"msg": msg})
    if request.method == 'GET':
        return render(request, 'register.html')
# Log out
def logOut(request):
    request.session.clear()
    return redirect('login')

# Home page dashboard
def index(request):
    # Registrations per day, in the name/value form used by the charts
    data = {}
    for u in User.objects.all():
        data[str(u.time)] = data.get(str(u.time), 0) + 1
    result = [{'name': k, 'value': v} for k, v in data.items()]
    timeFormat = time.localtime()
    year = timeFormat.tm_year
    month = timeFormat.tm_mon
    day = timeFormat.tm_mday
    monthList = ["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"]
    username = request.session['username'].get('username')
    useravatar = request.session['username'].get('avatar')
    newuserlist = User.objects.all()
    houses = House.objects.all().distinct()
    # Total number of listings
    houseslength = len(houses)
    # Total number of users
    userlength = User.objects.count()
    # Despite its name, averageprice holds the price of the most expensive listing
    top_priced = House.objects.order_by('-price')[0]
    averageprice = top_priced.price
    buildingtype = top_priced.building
    area_max = House.objects.order_by('-area')[0].area
    # Three most common listing types, joined with "~"
    dict0 = {}
    for i in houses:
        dict0[i.type] = dict0.get(i.type, 0) + 1
    sorted_items = sorted(dict0.items(), key=lambda x: x[1], reverse=True)
    str0 = "~".join(item[0] for item in sorted_items[:3])
    # Three districts with the most listings, joined with "~"
    dict1 = {}
    for i in houses:
        dict1[i.city] = dict1.get(i.city, 0) + 1
    sorted_items = sorted(dict1.items(), key=lambda x: x[1], reverse=True)
    str1 = "~".join(item[0] for item in sorted_items[:3])
    context = {'username': username, 'useravatar': useravatar, 'houses': houses, 'userTime': result,
               'newuserlist': newuserlist, 'year': year, 'month': monthList[month - 1], 'day': day,
               'houseslength': houseslength, 'userlength': userlength, 'averageprice': averageprice,
               'str1': str1, 'str0': str0, 'buildingtype': buildingtype, 'area_max': area_max}
    return render(request, 'index.html', context)

# Personal information page: view and update phone, email and password
def selfInfo(request):
    username = request.session['username'].get('username')
    useravatar = request.session['username'].get('avatar')
    userInfo = User.objects.get(username=username)
    if request.method == 'POST':
        userInfo.phone = request.POST.get("phone")
        userInfo.email = request.POST.get("email")
        userInfo.password = request.POST.get("password")
        # userInfo.avatar = request.FILES['avatar']
        userInfo.save()
    context = {'username': username, 'useravatar': useravatar, 'userInfo': userInfo}
    return render(request, 'selfInfo.html', context)

def tableData(request):
    username = request.session['username'].get('username')
    useravatar = request.session['username'].get('avatar')
    houses=House.objects.all().distinct()
    context={'username':username,'useravatar':useravatar,'houses':houses,}
    return render(request,'tableData.html',context)

def historyTableData(request):
    username = request.session['username'].get('username')
    userInfo = User.objects.get(username=username)
    historyData = getHistoryTableData.getHistoryData(userInfo)
    return render(request, 'collectTableData.html', {
        'username':username,
        'userInfo': userInfo,
        'historyData':historyData
    })
# Bookmark a listing
def addHistory(request,houseID):
    username = request.session.get("username").get('username')
    userInfo = User.objects.get(username=username)
    getHistoryTableData.addHistory(userInfo,houseID)
    return redirect('historyTableData')
# Listing distribution by district and by street
def houseDistribute(request):
    username = request.session['username'].get('username')
    useravatar = request.session['username'].get('avatar')
    house = House.objects.all().distinct()
    dict1 = {}
    dict2 = {}
    for i in house:
        dict1[i.city] = dict1.get(i.city, 0) + 1
        dict2[i.street] = dict2.get(i.street, 0) + 1
    # name/value pairs for the charts; each district and street appears once
    result1 = [{'value': v, 'name': k} for k, v in dict1.items()]
    result2 = [{'value': v, 'name': k} for k, v in dict2.items()]
    context = {'result1': result1, 'result2': result2, 'username': username, 'useravatar': useravatar}
    return render(request, 'houseDistribute.html', context)

# Top 10 listings by price for each of the three most common listing types
def housetyperank(request):
    username = request.session['username'].get('username')
    useravatar = request.session['username'].get('avatar')
    # All distinct districts in the database
    citylist = list(House.objects.values_list('city', flat=True).distinct())
    cityname = request.GET.get("cityname")
    # The three most common listing types across the whole table
    top_3_types = House.objects.values('type').annotate(type_count=Count('type')).order_by('-type_count')[:3]
    list_top_three = [item['type'] for item in top_3_types]
    # A known district narrows the ranking; '不限' ("no limit") or a missing value means all districts
    filter_by_city = cityname in citylist
    base = House.objects.filter(city=cityname) if filter_by_city else House.objects.all()
    legends, series = [], []
    for house_type in list_top_three:
        legend, data = [], []
        for p in base.filter(type=house_type).order_by('-price')[:10]:
            # Within a single district, list each title only once
            if filter_by_city and p.title in legend:
                continue
            legend.append(p.title)
            data.append({'value': p.price, 'name': p.title})
        legends.append(legend)
        series.append(data)
    # Pad with empty lists when fewer than three types exist
    while len(legends) < 3:
        legends.append([])
        series.append([])
    context = {'username': username, 'useravatar': useravatar, 'citylist': citylist,
               'list1_legend': legends[0], 'list1': series[0],
               'list2_legend': legends[1], 'list2': series[1],
               'list3_legend': legends[2], 'list3': series[2],
               'list_top_three': list_top_three}
    return render(request, 'housetyperank.html', context)

def housewordcloud(request):
    username = request.session['username'].get('username')
    useravatar = request.session['username'].get('avatar')
    # wouldCloud.wouldCloudMain_street()
    # wouldCloud.wouldCloudMain_building()
    context = {'username':username,'useravatar':useravatar}
    return render(request,'housewordcloud.html',context)

# Number of listings per type in each district
def typeincity(request):
    username = request.session['username'].get('username')
    useravatar = request.session['username'].get('avatar')

    # Distinct listing types, sorted so that 'types' and 'result' share one order
    types = sorted(House.objects.values_list('type', flat=True).distinct())
    # Distinct districts
    cities = list(House.objects.values_list('city', flat=True).distinct())
    # For every type, one counter per district, initialised to 0
    house_counts = {type_: [0] * len(cities) for type_ in types}
    # Listing counts grouped by type and district
    annotated_houses = House.objects.values('type', 'city').annotate(count=Count('id'))
    for ah in annotated_houses:
        # Write the count at the district's position in the cities list
        city_index = cities.index(ah['city'])
        house_counts[ah['type']][city_index] = ah['count']
    # Nested list: one row per type, in the same order as 'types'
    result = [house_counts[type_] for type_ in types]

    context = {'house_counts': house_counts, 'username': username, 'useravatar': useravatar, 'types': types, 'cities': cities, 'result': result}
    return render(request, 'typeincity.html', context)

# Average price per district and listing type
def servicemoney(request):
    username = request.session['username'].get('username')
    useravatar = request.session['username'].get('avatar')

    # All listings and all distinct districts
    houses = House.objects.all().distinct()
    cities = list(House.objects.values_list('city', flat=True).distinct())

    # Collect the prices of every (district, type) pair
    avg_prices = defaultdict(lambda: defaultdict(list))
    for house in houses:
        avg_prices[house.city][house.type].append(house.price)

    # Replace each price list with its mean, rounded to two decimals
    for city, types in avg_prices.items():
        for house_type, prices in types.items():
            avg_prices[city][house_type] = round(sum(prices) / len(prices), 2)

    # One stacked bar series per listing type; the data follows the order of
    # 'cities', with 0 where a district has no listing of that type
    series = []
    for house_type in sorted(set(house.type for house in houses)):
        series_data = [avg_prices[city].get(house_type, 0) for city in cities]
        series.append({
            'name': house_type,
            'type': 'bar',
            'stack': 'total',
            'label': {
                'show': 'true'
            },
            'emphasis': {
                'focus': 'series'
            },
            'data': series_data
        })

    context = {'username': username, 'useravatar': useravatar, 'series': series, 'cities': cities}
    return render(request, 'servicemoney.html', context)
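
Several of the views above count listings per key with a hand-written dictionary loop and then reshape the counts into name/value pairs for ECharts. The standard library's collections.Counter covers the same pattern in a few lines. This is only a sketch: the helper names to_echarts_data and top_keys are hypothetical, and the district list is made-up sample data.

```python
from collections import Counter


def to_echarts_data(values):
    """Count each value and return ECharts-style name/value dicts."""
    counts = Counter(values)
    return [{'name': key, 'value': count} for key, count in counts.items()]


def top_keys(values, n=3, sep='~'):
    """Join the n most frequent values with sep, most frequent first."""
    return sep.join(key for key, _ in Counter(values).most_common(n))


# Made-up sample data: the district of six listings
districts = ['五华区', '官渡区', '五华区', '盘龙区', '五华区', '官渡区']
print(to_echarts_data(districts))
print(top_keys(districts, n=2))
```

In a view, the input could be a flat queryset such as House.objects.values_list('city', flat=True), which avoids loading whole model instances just to count one field.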

3.4: URL routing

"""
URL configuration for this project.

The `urlpatterns` list routes URLs to views. For more information please see:
    https://docs.djangoproject.com/en/5.0/topics/http/urls/
Examples:
Function views
    1. Add an import:  from my_app import views
    2. Add a URL to urlpatterns:  path('', views.home, name='home')
Class-based views
    1. Add an import:  from other_app.views import Home
    2. Add a URL to urlpatterns:  path('', Home.as_view(), name='home')
Including another URLconf
    1. Import the include() function: from django.urls import include, path
    2. Add a URL to urlpatterns:  path('blog/', include('blog.urls'))
"""

from django.urls import path
from . import views
urlpatterns = [
    path('login/', views.login, name='login'),
    path('register/', views.register, name='register'),
    path('index/', views.index, name='index'),
    path('logOut/', views.logOut, name='logOut'),
    path('selfInfo/', views.selfInfo, name='selfInfo'),
    path('tableData/', views.tableData, name='tableData'),
    path('addHistory/<int:houseID>', views.addHistory, name='addHistory'),
    path('historyTableData/', views.historyTableData, name='historyTableData'),
    path('houseDistribute/', views.houseDistribute, name='houseDistribute'),
    path('housetyperank/', views.housetyperank, name='housetyperank'),
    path('typeincity/', views.typeincity, name='typeincity'),
    path('housewordcloud/', views.housewordcloud, name='housewordcloud'),
    path('servicemoney/', views.servicemoney, name='servicemoney'),
]
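
Every route except login and register leads to a view that reads request.session['username'], which raises KeyError for a visitor who is not logged in. Here is a minimal sketch of a guard decorator for that case. The name session_required is hypothetical, and the redirect action is passed in as a callable so the sketch runs without Django. In the project it might be used as session_required(lambda: redirect('login')).

```python
from functools import wraps


def session_required(redirect_to_login):
    """Build a decorator that sends visitors without a session to the login page.

    redirect_to_login is a zero-argument callable returning the response
    to use when request.session has no 'username' entry.
    """
    def decorator(view):
        @wraps(view)
        def wrapper(request, *args, **kwargs):
            if not request.session.get('username'):
                return redirect_to_login()
            return view(request, *args, **kwargs)
        return wrapper
    return decorator


class FakeRequest:
    """Stand-in for an HTTP request; only the session attribute is modelled."""
    def __init__(self, session):
        self.session = session


@session_required(lambda: 'redirect to login')
def table_data(request):
    return 'table for ' + request.session['username']['username']


print(table_data(FakeRequest({})))
print(table_data(FakeRequest({'username': {'username': 'demo', 'avatar': 'a.gif'}})))
```

Django's own login_required decorator does a similar job, but it relies on django.contrib.auth rather than the hand-rolled session entry used in this project.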

4: Project screenshots

[Project screenshots]

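5: Closing

If you would like this project, please like, follow, and send me a private message, or reach me through the contact card below or the details in my profile. My Xiaohongshu account covers many similar data visualization projects, each with accompanying technical documentation and a thesis.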
