PyCon 2024 showcased a number of ways to speed up the pokey Python programming language, including sub-interpreters, immortal objects, just-in-time compilation, and more.
In all the stuff I do in Python, runtime is not a consideration at all. Developer productivity is far more of a bottleneck. Having said that, I do of course see the value in these endeavours.
If everyone had a magic lamp that told them whether performance was going to be an issue when they started a project then maybe it wouldn’t matter. But in my experience people start Python projects with “performance doesn’t matter”, write 100k lines of code and then ask “ok it’s too slow now, what do we do”. To which the answer is “you fucked up, you shouldn’t have used Python”.
No, it’s usually “microservices” or “better queries” or something like that. Python performance shouldn’t be an issue in a well-architected application. Source: I work on a project with hundreds of thousands of lines of Python code.
Well yeah if by “well architected” you mean “doesn’t use Python”.
Not everything is a web service. Most of the slow Python code I encounter is doing real work.
We also do “real work,” and that uses libraries that use C(++) under the hood, like scipy, numpy, and tensorflow. We do simulations of seismic waves, particle physics simulations, etc. Most of our app is business logic in a webapp, but there’s heavy lifting as well. All of “our” code is Python. I even pitched using Rust for a project, but we were able to get the Python code “fast enough” with numba.
We separate expensive logic that can take longer into background tasks from requests that need to finish quickly. We auto-scale horizontally as needed so everything remains responsive.
That’s what I mean by “architected well”: everything stays responsive and we just increase our hosting costs instead of development costs. If we need to, we could always rewrite parts in a faster language, provided that costs less than the development costs. We really don’t spend much time at all optimizing Python code, so we’re not at that point yet.
That being said, I do appreciate faster-running code. I use Rust for most of my personal projects, but that’s because I don’t have to pay a team to maintain my projects.
Matrix code is the very best case for offloading work from Python to something else though.
Think about something like a build system (e.g. scons) or a package installer (pip). There is no part of them that you can point to and say “that’s the slow bit, write it in C” because the slowness is distributed through the entire thing.
Both of those are largely bound by I/O, but with some processing in between, so the best way to speed things up is probably an async I/O loop that feeds a worker pool. In Python, you’d use processes, which can be expensive and a little complicated, but workable.
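A rough sketch of that shape using only the standard library. The `fetch` and `digest` functions are hypothetical placeholders for the real I/O and processing steps, and the pool is passed in so the same code runs with threads or processes.

```python
import asyncio
import hashlib
from concurrent.futures import Executor


def digest(data: bytes) -> str:
    """CPU-bound step: stands in for parsing, resolving, or compiling."""
    return hashlib.sha256(data).hexdigest()


async def fetch(name: str) -> bytes:
    """I/O-bound step: a placeholder for a download or a disk read."""
    await asyncio.sleep(0.01)  # simulated latency, not real I/O
    return name.encode()


async def process_one(name: str, pool: Executor) -> tuple[str, str]:
    data = await fetch(name)  # the event loop overlaps these waits
    loop = asyncio.get_running_loop()
    # Hand the CPU-bound part to the pool so the loop stays free for I/O.
    result = await loop.run_in_executor(pool, digest, data)
    return name, result


async def process_all(names: list[str], pool: Executor) -> dict[str, str]:
    pairs = await asyncio.gather(*(process_one(n, pool) for n in names))
    return dict(pairs)
```

Passing a `concurrent.futures.ProcessPoolExecutor` as `pool` gives the process-based version described above; the work function then has to be picklable (defined at module level), which is part of the "a little complicated".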
And as you pointed out, scons and pip exist, and they’re fast enough. I actually use poetry, and it’s completely fine.
You could go all out and build something like cargo, but it’s the architecture decisions that matter most in something i/o bound like that.
Strong disagree. I switched from pip to uv and it sped my install time up from 58 seconds to 7. Yeah really. If pip is I/O bound, where is all that speedup coming from?
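The arithmetic behind that question can be sketched with Amdahl's law, under the assumption that the I/O time itself did not change between the two runs. That assumption may not hold if the faster tool also downloads or caches differently. The 80% and 100x figures below are made up for illustration.

```python
def overall_speedup(io_fraction: float, cpu_speedup: float) -> float:
    """Amdahl's law: only the non-I/O share (1 - io_fraction) gets faster."""
    return 1.0 / (io_fraction + (1.0 - io_fraction) / cpu_speedup)


before_s, after_s = 58.0, 7.0          # timings quoted in the comment above
observed = before_s / after_s          # about 8.3x
max_io_fraction = after_s / before_s   # about 0.12

# Hypothetical case: 80% of the run is fixed I/O and the rest becomes 100x faster.
io_bound_case = overall_speedup(io_fraction=0.8, cpu_speedup=100.0)  # stays below 1.25x
```

Under that assumption, an 8.3x gain is only possible if fixed I/O was at most about 12% of the original 58 seconds.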
That’s pretty impressive! We have a bunch of compiled stuff (numpy, tensorflow, etc.), so I’m guessing we wouldn’t see as dramatic of an improvement.
Then again, <1 min is “good enough” for me, certainly good enough to not warrant a rewrite. But I’ll have to try uv out, maybe we’ll switch to it. We switched from requirements.txt -> pyproject.toml using poetry, so maybe it’s worth trying out the improved pyproject.toml support. Our microservices each take ~30s to install (I think w/o cache?), which isn’t terrible and it’s a relatively insignificant part of our build pipelines, but rebuilding everything from scratch when we upgrade Python is a pain.