[Py] A sample script for aiohttp

micromicrofat

Article link: https://blog.csdn.net/MacwinWin/article/details/122458627

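This post compares fetching web pages synchronously with the Requests library and asynchronously with the aiohttp library in Python. The script below fetches the same five URLs both ways and times each request; in the sample run, the sequential version takes 6.60 seconds in total, while the asynchronous version finishes in 1.93 seconds.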

"""Comparison of fetching web pages sequentially vs. asynchronously
Requirements: Python 3.7+, Requests, aiohttp, cchardet
For a walkthrough see this blog post:
http://mahugh.com/2017/05/23/http-requests-asyncio-aiohttp-vs-requests/
"""
import asyncio
from timeit import default_timer

from aiohttp import ClientSession
import requests

def demo_sequential(urls):
    """Fetch list of web pages sequentially."""
    start_time = default_timer()
    for url in urls:
        start_time_url = default_timer()
        _ = requests.get(url)
        elapsed = default_timer() - start_time_url
        print('{0:30}{1:5.2f} {2}'.format(url, elapsed, asterisks(elapsed)))
    tot_elapsed = default_timer() - start_time
    print(' TOTAL SECONDS: '.rjust(30, '-')
          + '{0:5.2f} {1}'.format(tot_elapsed, asterisks(tot_elapsed)) + '\n')

def demo_async(urls):
    """Fetch list of web pages asynchronously."""
    start_time = default_timer()

    # asyncio.run() (Python 3.7+) creates the event loop, runs the coroutine
    # to completion, and closes the loop afterwards. It replaces the older
    # get_event_loop() / run_until_complete() pattern.
    asyncio.run(fetch_all(urls))

    tot_elapsed = default_timer() - start_time
    print(' WITH ASYNCIO: '.rjust(30, '-')
          + '{0:5.2f} {1}'.format(tot_elapsed, asterisks(tot_elapsed)))

async def fetch_all(urls):
    """Launch requests for all web pages."""
    tasks = []
    fetch.start_time = dict() # dictionary of start times for each url
    async with ClientSession() as session:
        for url in urls:
            task = asyncio.ensure_future(fetch(url, session))
            tasks.append(task) # create list of tasks
        _ = await asyncio.gather(*tasks) # gather task responses

async def fetch(url, session):
    """Fetch a url, using specified ClientSession."""
    fetch.start_time[url] = default_timer()
    async with session.get(url) as response:
        resp = await response.read()
        elapsed = default_timer() - fetch.start_time[url]
        print('{0:30}{1:5.2f} {2}'.format(url, elapsed, asterisks(elapsed)))
        return resp

def asterisks(num):
    """Returns a string of asterisks reflecting the magnitude of a number."""
    return int(num*10)*'*'

if __name__ == '__main__':
    URL_LIST = ['https://facebook.com',
                'https://github.com',
                'https://google.com',
                'https://microsoft.com',
                'https://yahoo.com']
    demo_sequential(URL_LIST)
    demo_async(URL_LIST)

https://facebook.com           1.45 **************
https://github.com             0.51 *****
https://google.com             1.14 ***********
https://microsoft.com          1.43 **************
https://yahoo.com              2.07 ********************
-------------- TOTAL SECONDS:  6.60 ******************************************************************

https://github.com             0.53 *****
https://microsoft.com          1.05 **********
https://facebook.com           1.07 **********
https://google.com             1.15 ***********
https://yahoo.com              1.92 *******************
--------------- WITH ASYNCIO:  1.93 *******************
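
The two totals above follow a simple pattern. The sequential total is the sum of the individual request times (1.45 + 0.51 + 1.14 + 1.43 + 2.07 = 6.60 s), while the asynchronous total is roughly the time of the slowest request (1.92 s for yahoo.com, 1.93 s overall), a speedup of about 3.4x in this run. The following is a minimal sketch that reproduces this pattern without touching the network. It assumes `asyncio.sleep` is a fair stand-in for waiting on a response; the names `fake_fetch`, `run_sequential`, `run_concurrent`, `compare`, and the delays in `SIMULATED_DELAYS` are hypothetical and not part of the original script.

```python
import asyncio
from timeit import default_timer

# Hypothetical latencies in seconds; they stand in for real network requests.
SIMULATED_DELAYS = {"site-a": 0.05, "site-b": 0.10, "site-c": 0.20}


async def fake_fetch(name, delay):
    """Pretend to fetch a page by sleeping without blocking the event loop."""
    await asyncio.sleep(delay)
    return name


async def run_sequential():
    """Await each simulated request before starting the next one."""
    start = default_timer()
    for name, delay in SIMULATED_DELAYS.items():
        await fake_fetch(name, delay)
    return default_timer() - start


async def run_concurrent():
    """Start all simulated requests together and wait for all of them."""
    start = default_timer()
    await asyncio.gather(
        *(fake_fetch(name, delay) for name, delay in SIMULATED_DELAYS.items())
    )
    return default_timer() - start


def compare():
    """Return (sequential_seconds, concurrent_seconds) for the simulated requests."""
    sequential = asyncio.run(run_sequential())
    concurrent = asyncio.run(run_concurrent())
    return sequential, concurrent


if __name__ == "__main__":
    sequential, concurrent = compare()
    print("sequential: {0:5.2f} s (about the sum of the delays)".format(sequential))
    print("concurrent: {0:5.2f} s (about the longest delay)".format(concurrent))
```

Real requests will not match this model exactly, since DNS lookups, TLS handshakes, and bandwidth limits all add variation, but the sum-versus-maximum relationship is why the gap widens as the URL list grows.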

Reference: https://gist.github.com/dmahugh/b043ecbc4c61920aa685e0febbabb959