Link to the thread: https://programming.dev/post/8969747
Hello everyone, I followed this thread yesterday and noticed a few very negative reactions to the choice of Java. I follow Java’s evolution from a distance, but it seems to have been evolving in a good direction over the last few years, and performance-wise it would make sense for the back-end of a Lemmy-like platform.
Is that indeed the case? I was just surprised to see that much negativity towards one of the most popular languages.
Well, Lemmy is written in Rust. Plus, Lemmy is already an alternative to Reddit, so all the “normies” are still on Reddit, and Lemmy itself is already a bit of a niche sample.
Rust developers are already known (/memed) to be elitist about Rust - and “Java is Bad” is also just the general consensus among developers, especially ones using more niche languages
Rust developers are already known (/memed) to be elitist about Rust
They’re also extremely toxic. An example from 4 months ago when they vandalized cppreference.com :
The meme is that most Rust devs merely shout slogans like “memory-safety” without knowing what they mean, precisely because many of them come from web dev backgrounds (this video by Prime Time proves why that’s problematic: https://www.youtube.com/watch?v=Wz0H8HFkI9U , the guy has no clue what std::unique_ptr is) and have never touched a pointer in their lives. Easy and “appealing to hobbyists” languages are always an issue as the community usually ends up becoming toxic and full of wrong practices being normalized, and a prime example of that is PHP.
Another example is how Lemmy initially struggled to handle 10k~20k users during the Reddit exodus despite the backend being written in the “ultra-fast memory-safe totally-will-replace-C++” Rust. Why? See this: https://github.com/LemmyNet/lemmy/issues/2877 and they were doing stuff like joining huge-ass tables before the filtering. If phiresky didn’t save them with his SQL prowess Lemmy would have literally died and its backend being written in Rust would not have changed a single thing.
Rust gives hobbyists the illusion that their projects will suddenly become fast and bug-free if they write them in Rust, and they don’t even hide that mentality as you can see that on almost every single project that’s written in Rust they list “written in Rust” as the main selling argument. This is probably the only language I’ve seen where this happens.
Now as for the “Java bad”, I’m kind of guilty of it too. I very much dislike how academia is obsessed with UML diagrams and the “Java way” of seeing OOP and interfaces everywhere. CPUs and GPUs do not think in OOP. They do not see “objects”. They see data, registers, caches, branches but certainly not your “beautiful abstract class”. When you think you did a good job of crafting a “clean” UML diagram with lots of “nice interfaces” which you then implement using virtual polymorphism in C++ and abuse dynamic_cast, you’re torturing the CPU with indirections, cache misses and branch mispredictions. Dynamic polymorphism and virtual inheritance in particular should not be the standard way to solve problems, yet that’s exactly what academia teaches, and most of those who push those ideas coincidentally also happen to be from Java backgrounds, which is why the “Java bad” meme is still alive.
That said, beyond academia, I think it’s obviously stupid to religiously shit on Java. Lots of advanced features are coming out, Android is a thing thanks to Java and lots of web services are working with high reliability thanks to it. Also obviously, one has a much better chance at landing a high-paid software engineering job if one knows Java than if one knows only Rust.
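Coming back to the indirection point about virtual polymorphism, here is a minimal C++ sketch of the two styles side by side. All the names (Shape, Square, Rect, RectData, total_area_virtual, total_area_flat) are made up for illustration and do not come from any real codebase. It only shows where the pointer chasing and indirect calls come from; whether the difference matters in a given program is something to measure, not assume.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical "UML-style" hierarchy: each call goes through a vtable,
// and each object lives in its own heap allocation.
struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

struct Square final : Shape {
    double side;
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
};

struct Rect final : Shape {
    double w, h;
    Rect(double w_, double h_) : w(w_), h(h_) {}
    double area() const override { return w * h; }
};

// Sums areas through base-class pointers: one pointer dereference plus
// one indirect (virtual) call per element.
double total_area_virtual(const std::vector<std::unique_ptr<Shape>>& shapes) {
    double sum = 0.0;
    for (const auto& s : shapes) sum += s->area();
    return sum;
}

// Data-oriented alternative (hypothetical): plain structs stored
// contiguously, no vtable and no per-object allocation.
struct RectData {
    double w, h;
};

// Sums areas with a linear walk over contiguous memory.
double total_area_flat(const std::vector<RectData>& rects) {
    double sum = 0.0;
    for (const RectData& r : rects) sum += r.w * r.h;
    return sum;
}
```

Both functions compute the same total for the same shapes; the difference is only in how the data is laid out and how each area is reached.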
They’re also extremely toxic. An example from 4 months ago when they vandalized cppreference.com
What does this even mean? One dopey teenager defaces a website, so now everyone associated with Rust is toxic?
This whole argument is just young edgelords bickering with old edgelords, in an eternal and pointless cycle.
Just curious, what other languages have had “one dopey teenager” of their community go and deface cppreference.com ? (which by the way happened multiple times with Rust kiddies, not just 4 months ago)
Re: “the guy has no clue what std::unique_ptr is”, are you saying that because of his assertion that unique_ptr has a non-zero cost, whereas Rust’s Box does not?
He’s actually correct about that, although the difference is fairly minimal, and I believe the difference is outweighed by the unwinding (i.e. panic/exception handling) code that needs to be generated in both cases. But with unwinding disabled, you can see clearly that Rust generates exactly the same code for a Box as for a raw pointer, whereas C++ does not:
The reason I looked into this is because of a Chandler Carruth talk primarily about unique_ptr called “There Are No Zero-Cost Abstractions”, which explains in detail why C++ fundamentally can’t optimize unique_ptr to generate the same code as a raw pointer.
That’s a bad apples-to-oranges comparison, unique_ptr frees memory upon destruction, which with the raw pointer version you don’t do. The least you could do is use rvalue references. The class layout of unique_ptr is also hard to optimize away (unless via LTO) because consume isn’t in the same translation unit and the compiler has to let your binary be ABI compatible with the rest of your binaries. (Also, you’re using Clang 9 by the way, we are at version 17 now)
This is much fairer: https://godbolt.org/z/v4PYcd8hf
Then, if you additionally make the functions’ bodies accessible to the compiler and add a free to the raw pointer version (for fairness if you insist on having consume or foo destroy the resource), you should get almost identical assembly (with still an extra indirection that you’ll see in an extra mov due to the fact that the C++ compiler still doesn’t see how you use them, but IMO that should still be a textbook case for LTO), and the non-zero difference should disappear altogether once you actually use those functions, and if it doesn’t you absolutely should file a bug report.
Carruth, while an excellent presenter, has been on a “C++ standard committee bad, why don’t we do more ABI-breaking changes, y’all suck, Abseil and Carbon rule” rant spree, with that basically materialized by Google stopping active participation in Clang (haven’t followed the drama since then so not sure if Google backtracked on that decision), and it’s hard to consider him to be objective about this since he also has the Carbon project and his recent Carbon talks are painful to watch as it’s hard to ignore how he’s going from the “C++ optimization chad” that he used to be to a Google marketing/sales person.
That’s a bad apples-to-oranges comparison, unique_ptr frees memory upon destruction, which with the raw pointer version you don’t do.
I intentionally crafted an example where the code is simply using unique_ptr (and Box) without freeing the memory, just as it uses the raw pointer without freeing it. The consumes function would of course free it, hence the name. Freeing the memory shouldn’t be all that different between free, ~unique_ptr, and Box::drop.
Moreover, the Rust code is doing the same thing the C++ code is doing; Box frees memory just like unique_ptr does.
The least you could do is use rvalue references.
I was surprised to see how much lower-overhead that looks, and I couldn’t remember why I originally wrote the example as passing by value until I reviewed Carruth’s video. But he actually talks about using rvalue references around the 22 minute mark, and then goes back to passing by value, so I assume that’s why I wrote it the way I did. I do think it’s pretty counterintuitive that a type that’s semantically a pointer needs to be passed by reference for efficiency.
The class layout of unique_ptr is also hard to optimize away (unless via LTO)…
The “class layout” of unique_ptr is just a pointer; are you talking about the struct needing to be on the stack in order to satisfy the ABI? That’s true, but people do in fact need to pass data between multiple different translation units (and even into and out of dynamically-loaded libraries), so that should be possible to do in an efficient manner. And, again, both the raw-pointer version and the Rust version manage to make this work.
you’re using Clang 9 by the way…
Oops, good catch; I crafted this example a long time ago and did try it with the most recent version, but I guess that must have been in a different tab. But it doesn’t actually make much of a difference here.
Then, if you additionally make the functions’ bodies accessible to the compiler and add a free to the raw pointer version… and the non-zero difference should disappear altogether once you actually use those functions…
Yes, sure, compiling in one translation unit helps, but as I mentioned above, passing an owning pointer between translation units shouldn’t be inherently inefficient. But also, as far as I can tell, making those changes doesn’t actually make the unique_ptr and raw-pointer assembly equivalent. The && in the signature for “consumes” is odd because the function doesn’t actually take ownership of the pointer so it doesn’t actually free it, and consequently the inlining of the function is a no-op and the destructor is called inside foo. But that doesn’t hinder the raw-pointer comparison much, because the C version just inlines consumes. I don’t read assembly well enough to understand whether the extra mov in the unique_ptr version is very significant or why it exists. (The print_global function is only here to prevent the other functions from being turned into no-ops.)
https://godbolt.org/z/83T8Gfszv
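For what it’s worth, the point about && not taking ownership can be checked without reading any assembly. Below is a minimal sketch with hypothetical helper names (consume_by_value, peek_by_rvalue_ref and consume_by_rvalue_ref are invented here and are not the functions from the linked examples). It relies only on the standard guarantee that a moved-from std::unique_ptr is null.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical helpers, named for this sketch only.

// Takes the unique_ptr by value: the parameter owns the int, and the int
// is freed when the parameter is destroyed after the call. Whether the
// caller or the callee runs that destructor is an ABI decision.
int consume_by_value(std::unique_ptr<int> p) { return *p; }

// Takes an rvalue reference but only reads through it: nothing is moved
// from, so the caller's unique_ptr still owns the int afterwards.
int peek_by_rvalue_ref(std::unique_ptr<int>&& p) { return *p; }

// Takes an rvalue reference and actually moves from it: ownership ends
// inside this function, when `local` goes out of scope.
int consume_by_rvalue_ref(std::unique_ptr<int>&& p) {
    std::unique_ptr<int> local = std::move(p);
    return *local;
}
```

Calling peek_by_rvalue_ref(std::move(a)) leaves a non-null, while the other two leave the caller’s pointer null, so a && parameter on its own is a reference to the caller’s object rather than a transfer of ownership.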
“Abseil and Carbon rule…”
Abseil is…a collection of C++ libraries? How does that make him biased against the C++ standards committee? Carbon was announced in 2022, and the talk I linked was given in 2019, so I don’t know if Carruth was on his “rant spree” in your opinion at that point. But the point of linking to Carruth’s talk was just to explain where that example originally came from and to let someone more knowledgeable than myself explain why it would require ABI breakage for C++ to optimize unique_ptr as well as Rust optimizes Box.
The reason I said to use rvalue references is because otherwise it is an apples-to-oranges comparison: in the C++ code you have implicit ABI decisions around the call convention and whose responsibility it is to destroy the temporary.
Yes, sure, compiling in one translation unit helps, but as I mentioned above, passing an owning pointer between translation units shouldn’t be inherently inefficient
https://godbolt.org/z/9875qMM6Y (or alternatively: https://godbolt.org/z/9xehs3sYP)
The assembly is identical, the ownership is clearly transferred, and this doesn’t need LTO or looking at the function bodies and is entirely done by the C++ compiler. It involves using (when available) a vendor attribute (see trivial_abi, shouldn’t be an issue given Rust devs are fine with having only one compiler anyway) and writing a UniquePtr class (shouldn’t be used in production code, what I’ve given there is only for illustration purposes) that assumes that the custom deleter cannot have an internal state.
This is a zero-runtime-cost abstraction. Now, whether zeroing that cost depends on which ABI assumptions you’re ready to make, or on whether you want to depend on LTO, is another thing. We’re literally discussing a “problem” that is not really a problem because Rust doesn’t have the luxury yet to have that problem: you’re easily forgetting that Rust has only one compiler.
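For readers who don’t want to open the links, here is a rough sketch of what such a class can look like. This is not the code from the Compiler Explorer links above; it is an independent, simplified illustration, and the SKETCH_TRIVIAL_ABI macro and this particular UniquePtr are assumptions made for the example. It uses Clang’s [[clang::trivial_abi]] attribute when the compiler reports it and expands to nothing otherwise, in which case the code still compiles but the calling convention is unchanged. Like the original, it is not meant for production and assumes a stateless deleter (plain delete).

```cpp
#include <cassert>
#include <utility>

// Use Clang's vendor attribute when available; otherwise expand to nothing.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::trivial_abi)
#    define SKETCH_TRIVIAL_ABI [[clang::trivial_abi]]
#  endif
#endif
#ifndef SKETCH_TRIVIAL_ABI
#  define SKETCH_TRIVIAL_ABI
#endif

// Illustrative only: a move-only owning pointer with a stateless deleter.
template <typename T>
class SKETCH_TRIVIAL_ABI UniquePtr {
public:
    explicit UniquePtr(T* p = nullptr) noexcept : ptr_(p) {}

    // Moving steals the pointer and leaves the source empty.
    UniquePtr(UniquePtr&& other) noexcept
        : ptr_(std::exchange(other.ptr_, nullptr)) {}

    UniquePtr& operator=(UniquePtr&& other) noexcept {
        if (this != &other) {
            delete ptr_;
            ptr_ = std::exchange(other.ptr_, nullptr);
        }
        return *this;
    }

    UniquePtr(const UniquePtr&) = delete;
    UniquePtr& operator=(const UniquePtr&) = delete;

    ~UniquePtr() { delete ptr_; }  // stateless deleter: always plain delete

    T& operator*() const { return *ptr_; }
    T* get() const noexcept { return ptr_; }
    explicit operator bool() const noexcept { return ptr_ != nullptr; }

private:
    T* ptr_;  // the only data member, so the object is pointer-sized
};

static_assert(sizeof(UniquePtr<int>) == sizeof(int*),
              "wrapper should be exactly one pointer wide");

// Takes ownership by value; with trivial_abi the callee destroys `p`.
int consume(UniquePtr<int> p) { return *p; }
```

With the attribute in effect, Clang is allowed to pass such an object the way it would pass a raw pointer, and the callee becomes responsible for destroying the parameter; that is the ABI assumption being traded for the zero cost.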
Carbon was announced in 2022
A project like that usually takes years, so, again, it’s very likely that they began working on it years before that. For instance, Google designed Go in 2007 and announced it in November 2009.