Why Is The Asyncio Library Slower Than Threads For This I/O-bound Operation?
Solution 1:
First, I can't reproduce a performance difference nearly as large as the one you're seeing on my Linux machine. I'm consistently seeing about 20-25 seconds for the threaded version, and between 24 and 34 seconds for the asyncio version.
Now, why is asyncio slower? There are a few things that contribute to this. First, the asyncio version has to print sequentially, but the threaded version doesn't. Printing is I/O, so the GIL can be released while it's happening. That means two or more threads can potentially print at the same time, though in practice this may not happen often and probably doesn't make much difference in performance.
Second, and much more importantly, the asyncio version of getaddrinfo is actually just calling socket.getaddrinfo in a ThreadPoolExecutor:
def getaddrinfo(self, host, port, *,
                family=0, type=0, proto=0, flags=0):
    if self._debug:
        return self.run_in_executor(None, self._getaddrinfo_debug,
                                    host, port, family, type, proto, flags)
    else:
        return self.run_in_executor(None, socket.getaddrinfo,
                                    host, port, family, type, proto, flags)
It's using the default ThreadPoolExecutor for this, which only has five threads:
# Argument for default thread pool executor creation.
_MAX_WORKERS = 5
That's not nearly as much parallelism as you want for this use case. To make it behave more like the threading version, you'd need to use a ThreadPoolExecutor with 1000 threads, by setting it as the default executor via loop.set_default_executor:
loop = asyncio.get_event_loop()
loop.set_default_executor(ThreadPoolExecutor(1000))
coroutines = asyncio.wait([getaddr(loop, i+site) for i in create_host(char)])
loop.run_until_complete(coroutines)
Now, this will make the behavior more equivalent to threading, but the reality here is you're really not using asynchronous I/O - you're just using threading with a different API. So the best you can do here is identical performance to the threading example.
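The effect of the pool size can be seen without touching the network. The following is a minimal sketch, not code from the question: blocking_call is a hypothetical stand-in for socket.getaddrinfo that just sleeps, run_batch and compare are names made up for this illustration, and the pool sizes of 5 and 20 are arbitrary. It uses async/await and asyncio.run, so it assumes Python 3.7 or newer.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_call(delay):
    # Hypothetical stand-in for socket.getaddrinfo: blocks the calling thread.
    time.sleep(delay)
    return delay


async def run_batch(max_workers, n_calls=20, delay=0.1):
    # Push n_calls blocking calls through an executor of the given size
    # and return the elapsed wall-clock time in seconds.
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        start = time.perf_counter()
        await asyncio.gather(
            *(loop.run_in_executor(pool, blocking_call, delay)
              for _ in range(n_calls))
        )
        return time.perf_counter() - start


def compare():
    # Returns (elapsed with 5 workers, elapsed with 20 workers).
    small = asyncio.run(run_batch(5))
    large = asyncio.run(run_batch(20))
    return small, large


if __name__ == "__main__":
    small, large = compare()
    print("5 workers: %.2fs, 20 workers: %.2fs" % (small, large))
```

With 20 calls of 0.1 seconds each, five workers need roughly four rounds while 20 workers need roughly one. DNS lookups queued behind a five-thread executor wait in the same way, no matter how many coroutines are pending.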
Finally, you're not really running equivalent code in each example - the threading version is using a pool of workers that share a queue.Queue, while the asyncio version is spawning a coroutine for every single item in the url list. If I make the asyncio version use an asyncio.Queue and a pool of coroutines, in addition to removing the print statements and making a larger default executor, I get essentially identical performance with both versions. Here's the new asyncio code:
import asyncio
import string
import time
from concurrent.futures import ThreadPoolExecutor

start = time.time()

def create_host(char):
    for i in char:
        yield i
    for i in create_host(char):
        if len(i) > 1:
            return False
        for c in char:
            yield c + i

char = string.digits + string.ascii_lowercase
site = '.google.com'

@asyncio.coroutine
def getaddr(loop, q):
    while True:
        url = yield from q.get()
        if not url:
            break
        try:
            res = yield from loop.getaddrinfo(url, 80)
        except:
            pass

@asyncio.coroutine
def load_q(loop, q):
    for host in create_host(char):
        yield from q.put(host+site)
    for _ in range(NUM):
        yield from q.put(None)

NUM = 1000
q = asyncio.Queue()
loop = asyncio.get_event_loop()
loop.set_default_executor(ThreadPoolExecutor(NUM))
coros = [asyncio.async(getaddr(loop, q)) for i in range(NUM)]
loop.run_until_complete(load_q(loop, q))
loop.run_until_complete(asyncio.wait(coros))
end = time.time()
print(end-start)
And the output of each:
dan@dandesk:~$ python3 threaded_example.py
20.409344911575317
dan@dandesk:~$ python3 asyncio_example.py
20.39924192428589
Note that there is some variability due to the network, though. Both of them will sometimes be a few seconds slower than this.
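The asyncio listing above is written in the generator-based style (@asyncio.coroutine, yield from, asyncio.async) of the Python version this answer targeted. Since async became a reserved word in Python 3.7 and @asyncio.coroutine has been removed from recent releases, here is a rough equivalent using async/await. Treat it as a sketch rather than a drop-in replacement: worker, resolve_all, the num_workers parameter, and the results dictionary are names introduced here for illustration, and the bare except is narrowed to OSError.

```python
import asyncio
import string
import time
from concurrent.futures import ThreadPoolExecutor

CHARS = string.digits + string.ascii_lowercase
SITE = ".google.com"


def create_host(chars):
    # Same sequence as the original generator: every 1-character label,
    # then every 2-character label.
    for a in chars:
        yield a
    for a in chars:
        for b in chars:
            yield b + a


async def worker(loop, q, results):
    # Pull hostnames from the queue until a None sentinel arrives.
    while True:
        host = await q.get()
        if host is None:
            break
        try:
            results[host] = await loop.getaddrinfo(host, 80)
        except OSError:
            # Failed lookups (socket.gaierror is an OSError) are skipped.
            pass


async def resolve_all(hosts, num_workers):
    loop = asyncio.get_running_loop()
    # loop.getaddrinfo runs in the default executor, so size it to match
    # the number of worker coroutines.
    loop.set_default_executor(ThreadPoolExecutor(num_workers))
    q = asyncio.Queue()
    results = {}
    workers = [asyncio.create_task(worker(loop, q, results))
               for _ in range(num_workers)]
    for host in hosts:
        await q.put(host)
    for _ in range(num_workers):
        await q.put(None)  # one sentinel per worker
    await asyncio.gather(*workers)
    return results


if __name__ == "__main__":
    start = time.time()
    asyncio.run(resolve_all((h + SITE for h in create_host(CHARS)), 1000))
    print(time.time() - start)
```

The structure is unchanged from the listing above: a fixed pool of coroutines shares one asyncio.Queue, and the lookups themselves still run on executor threads, so this should perform about the same as the threaded version rather than better.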