The Rustacean Station Podcast

What's New in Rust 1.58 and 1.59

Episode Page with Show Notes

Jon Gjengset: Hello, Ben.

Ben Striegel: Hello, Jon, how are you doing?

Jon: Oh man, I am— First of all, so excited my cats are getting along, and second, so excited for two and a half new Rust releases.

Ben: 2.5, you say. More like 2.1 actually, I’d say.

Jon: Ah, I see what you did there. That was very clever. Very clever.

I’m pretty excited about these. I think both of these have pretty serious improvements that I’m excited to get into.

Ben: There’s some definite really big user-facing improvements in these two releases, but I think that they’re more quality of life things than real, like, significant, dramatic changes that shake the foundations of the language. But still, I think people are going to enjoy these.

Jon: I don’t know, I don’t know, stabilized inline assembly is probably, it’s—

Ben: Yeah, well, we’ll get to that. Oh no, don’t jump the gun like that. We’ve got to have some amount of mystery.

Jon: I’m sorry, I’m sorry, I take it back, I take it back.

Ben: Come on. Okay, listeners, please forget what you just heard, because that’s not— that doesn’t exist, because right now it’s Rust 1.58. It is January 13th, 2022. Rust 1.58 just came out, and we’re talking about captured identifiers in format strings.

Jon: This, I believe, is a topic that you know a decent amount, Ben, wouldn’t you say?

Ben: I mean, I know a decent amount about that, but I don’t want to take credit for implementing this, or like, RFCing this, but I’ve been kind of like— I’ve been involved in surrounding efforts for this, and I’ve probably talked about this, I think I’ve spoken about this in a few different podcasts. But in this case, so what’s new in this release is that, if you’re familiar with format strings from other languages, like if you’re like, you know, if you’re used to C, for example, you’re used to the current, like, Rust style of you have, like, some kind of token somewhere in your format string, and then later on you have the name of the thing you want to sub in. So currently in Rust you would say like, you know, the example here is Hello {}. And then after the string you have, like some identifier, like foo. And in other languages like Python, what you can do is, instead of having to have a list of things where, first a string, then a comma, and then, like, thing one you want to print, thing two you want to print— you can just put those things inside the string itself. And so it makes things read a bit more naturally. It can get out of hand if you’re not careful. But so for example, the new hello world, if you have like "world" assigned to a variable called foo, you can just write a string of Hello {foo}. And so that way you can avoid having to have an extra list of things afterwards. And it’s actually pretty nice. It’s a nice little, like, quality of life thing.
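(editor's note: a minimal sketch of the before and after:)

```rust
fn main() {
    let foo = "world";
    // Before 1.58, the argument had to be passed separately:
    println!("Hello {}", foo);
    // Now the identifier is captured straight from the enclosing scope:
    println!("Hello {foo}");
}
```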

The thing is, though, that this is unlike, say, Python. In Python, you can put arbitrary expressions into format strings, and Rust isn’t willing to go that far just yet. This is kind of just a taste of what could be. There’s no guarantee that it will ever grow beyond this. So some people would like, for example, the ability to access struct fields here as well, because if you can access a variable, there’s really no risk in also letting you just look at the field of a struct. It’s still a constant-time operation, just a thing in memory somewhere— give me the value, all done. More radical would be to let you access, say, methods on a thing. And so now you’re running arbitrary code which might be happening just from having a format string somewhere, which some folks might not like. Might be too implicit. So we’ll see. There’s plenty of ways to possibly expand this feature, that may or may not happen. And it’s still very up in the air. But for now, if you just have simple needs, that’s a great thing. It’s a great little quality of life thing, especially for new users, because, I mean, hello world examples like this are really prevalent in a lot of documentation.

Tests can use this too. And it’s not just println!: any macro that uses the format machinery gets this for free. So the panic! macro gets this; the unreachable! macro, which we’ll talk about later, also gets this. Anything like that in std. It also works with macros from other crates, so the log! macro provided by the log crate gets this for free as well. And so it’s actually a really nice little magic upgrade that every macro can take advantage of, just by relying on the format_args! macro.

Jon: Yeah, it’s really neat. I also only learned by reading the release notes that you can use this for parameters as well. Right? Like if you have, if you’re formatting a floating point number and you want to specify the precision, like the number of decimal places you want to include, that can be a variable too, and gets the same kind of in-line captured identifiers as you get for the actual variable itself.
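(editor's note: for instance, something like this, where the precision is itself a captured variable:)

```rust
fn main() {
    let pi = 3.14159265;
    let precision = 3;
    // Both the value and the precision parameter are captured from the
    // surrounding scope; the `$` marks precision as a runtime parameter.
    println!("{pi:.precision$}"); // prints "3.142"
}
```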

Ben: That’s cool. I didn’t know you could actually capture that. That’s neat.

Jon: Yeah. I mean, I only realized by reading it, which was, you know— it’s almost like this is the point of release notes. Yeah, I think this is a change that has been in the pipeline for so long, and I’m really excited to see it land now. I think this is a major ergonomic improvement.

Ben: I know one of the other, like— the thing I’ve probably talked about before is the future work that I’ve worked on— possibly laid the groundwork for— is the ability to have actual format strings like in Python. Right now, this format machinery works in quote-unquote format strings where format strings are kind of a DSL that only works inside of certain macros. So if you had actual format strings— So today in Rust we have, say, byte strings. You stick a b in front of your string literal and you get a byte string. Or raw strings, put an r on there, you get a raw string. You can imagine like an f in front of your string literal, and it could, like, give you this behavior by default, and then that could desugar to a format_args macro invocation, say. And then you could, like, also pass that to things, and imagining someday— I’m weird.

So macros, right? Macros are cool, but like, having— okay, macros are great in the hands of users, for letting users extend the language without having to actually go through and pester the devs and get a thing upstream. It lets you extend the language in your own way, in very small ways. But my opinion is that a widely used macro is kind of like a bug report against your language, right? It’s like saying, hey, we have this thing that’s kind of a hack on the language. As a user, I can’t control the language, so I extend it. That’s great. But you control the language. This is your language. Just make it so I don’t need this. And so the argument is, what if we just had, say, variadic arguments. Right? So the reason that println! had to have this macro machinery was because it takes apart the format string at compile time. But if that’s all being delegated to an actual secret compiler-internal thing, now the only reason that println! exists is to let you pass in more than one argument to the macro. And if you just had format strings, you could have a println function that only ever took one argument, which was just the format string itself— you could have println(f"...")— and that would be, like, I don’t know, pretty cool. I don’t hate macros, to be clear. I just think that there are some cases where we can think about more regular solutions to certain problems. So who knows?

Jon: Yeah, no, I think you’re right. And I think it is a good way to think about language design, too. Right? Which is, why should your standard library get to be special? Like, why can’t you extend these same capabilities to, you know, users of the language, without having to essentially use escape hatches? Basically, reducing the number of escape hatches needed over time.

Ben: Moving on, “Reduced Windows Command search path,” you want to take this?

Jon: Yeah, so this is a fun one. This stemmed out of, I think, a CVE that was published for ripgrep, where some users realized that ripgrep executes commands as part of things like searching decompressed files. Like if you’re trying to search, you know, an .xz compressed file or something, ripgrep will decompress it using the xz command, and then search the output of that command. And on Windows, the way that the standard library executes commands is using this CreateProcess API, which is provided by Windows, but it searches for binaries in a slightly weird way, in that it also includes the current directory. So this means that if, in this case, ripgrep was executing, you know, xz, then it would look for xz.exe in the current directory and execute that over, you know, the xz provided by the system. Which is really unfortunate, right? It means that if you git clone some project from the internet, and it happens to have, like, a malicious xz.exe in there, and you run ripgrep, then ripgrep would end up executing that file for you, which is not really what anyone expected. And Andrew, the maintainer of ripgrep, I think correctly identified that while this is a problem in ripgrep, it’s really a problem in the standard library. Like, it shouldn’t be the case that we allow, on Windows, execution of binaries from the current directory, because it really is a security risk. And so what changed in 1.58 was that the Command type in the standard library— this is std::process::Command— now uses the same search path as on other platforms, which basically means it searches the PATH environment variable, it searches the system’s directory and the Windows directory and stuff, but it does not look in the current directory. Which is the same way things work on Linux, for example, where if you want to run something from the current directory, you have to say, you know, ./ and the name of the program.

Ben: Right. And I’m curious, is this some kind of, like, old recommendation from Windows to do this kind of thing, because Linux does not do this. And so I’m wondering why this deviation in behavior occurred.

Jon: You know, it’s interesting. I think this is more that this is just how Windows traditionally worked. Right? They have this CreateProcess function in the Win32 API, and this is just its behavior. Like, if you open cmd on Windows, and you’re not running PowerShell, just regular cmd, and you run a command, I think it really just runs the command from the current directory, if one matches the name of the command you’re running. Which, you know, is convenient, it’s just— it’s probably not the most safe default. But Windows cares a lot about backwards compatibility, you know, like Linux does too, arguably. And because they had the API work this way, they couldn’t really change it. Whereas I think in Rust here, we’re sort of saying, this is a bug and you shouldn’t get platform-specific behavior for Command. You should get, you know, the documented Rust behavior everywhere.

There’s another change in the standard library, and this is probably the one that’s going to make the most people notice that something changed. And this is the addition of more must_use attributes in the standard library. Do you want to talk about must_use? I think we’ve mentioned it before.

Ben: Yeah. So if you’ve used Rust, you’ve noticed how, if you call any kind of operation, say, like, an I/O operation, that returns a Result, and if you don’t then actually look at the Result— and so I use I/O as a good example of where sometimes you might want to just, like, write a thing and then I don’t care if it succeeded or whatever. I’m like, maybe just making a— kind of a shell script or some simple thing, a little test. And I might just say, you know, blah.foo.write(some_buffer). And then if you do that, and don’t actually ever inspect the Result that gets returned, you’ll get a warning.

This is called the must_use warning. The name is kind of misleading. It should probably be called “should use”, I guess, because it’s not actually an error. It’s only a warning. You can obviously promote it to an error if you want to, like any warning. But the idea is that there are certain types where not using them might be masking some kind of bug. And in this case, these I/O APIs don’t require you to actually access the result to continue. Many APIs return Results because you want to get some kind of value back, and that value is stored inside the Result, in the Ok arm. But for an I/O API, where it’s kind of like, just do it and then tell me if there was an error, it’s much easier to ignore it, and just forget to check it. And so you should be checking it, and making sure that, you know, you’re at least bubbling it up with the question mark operator—
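(editor's note: a small illustration of the warning; the file name here is made up:)

```rust
use std::fs::File;
use std::io::Write;

fn main() -> std::io::Result<()> {
    let mut f = File::create("out.txt")?;
    f.write(b"hello");       // warning: unused `Result` that must be used
    f.write_all(b"hello")?;  // better: propagate (or handle) the error
    Ok(())
}
```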

Jon: Yeah. And this happens a lot in C libraries especially, where people just, you know, don’t check the return value, because nothing screams at them if they don’t.

Ben: People make mistakes, kind of like, you know, I don’t want to say lazy, but— Programmers are kind of lazy, but we gotta get our work done, and not necessarily focus on the small stuff, but sometimes we should. And in this case, this must_use attribute just says, hey, you probably intended to look at the result of this thing. And so originally it was just for I think Result, but it’s been expanding more and more, and nowadays you can also use it on functions too, to say, you know, not just— I don’t want every value of this type to warn, but for this function, definitely, just no matter what it returns, just warn if they don’t use this result. So I believe there’s all kinds of things now tagged with this. Do you happen to know the list?

Jon: Ooh, that’s a good question. Let me see if I can dig that up. I wanted to also point out that there’s sort of a heuristic that they developed as part of the RFC for must_use, which tries to get at when it is appropriate to use. And I think it’s something like, anywhere where the return value is the primary effect of the function. So the idea being that if you have a function that doesn’t really do anything except produce its return value, then if you don’t check the return value, why were you even calling the function?
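(editor's note: a sketch of the attribute in both positions; the names here are made up:)

```rust
// On a type: any ignored value of this type warns, wherever it comes from.
#[must_use]
struct Receipt {
    id: u64,
}

// On a function: warn when this particular return value is ignored,
// since producing it is the primary effect of the call.
#[must_use = "losing the receipt means losing the transaction record"]
fn purchase() -> Receipt {
    Receipt { id: 42 }
}

fn main() {
    purchase(); // warning: unused return value that must be used
    let r = purchase();
    println!("receipt {}", r.id);
}
```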

I found the PR now, and it is a— it’s actually a tracking issue, that tracks 25 pull requests. Each of— and I think there’s about 800 additions. And as one way to see how large this is, one of the PRs is “Add #[must_use] to remaining std functions (A-N)”.

Ben: And they say Rust has a small standard library.

Jon: Yeah, I know, right? Well, I think the realization here was that there are a lot of functions where not checking the return value probably means you’re doing something wrong. Like, I know that I’ve run into problems with this in the past, with stuff like checked_add and checked_sub, because they don’t mutate the number that you’re calling them on. They return the added or subtracted number as a new value. So just calling them doesn’t actually do anything; your original number is left the way it was. And that can be really confusing, and this is definitely an error that’s hit me in the past.
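(editor's note: a quick demonstration of that gotcha:)

```rust
fn main() {
    let x: u8 = 200;
    // checked_add does not mutate x; it returns a fresh Option<u8>,
    // Some on success and None on overflow:
    x.checked_add(1); // warning: unused return value that must be used
    assert_eq!(x, 200); // x is unchanged either way
    assert_eq!(x.checked_add(1), Some(201));
    assert_eq!(x.checked_add(100), None); // 300 would overflow a u8
}
```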

Ben: Yeah. And from like, a language design perspective, there’s kind of a— Back, like in the early days of Rust, this was more of a big topic, but the idea of what is a linear type, versus, like, an affine type. The idea of a linear type is a type that must be used, where “used” is kind of a vague, nebulous concept, but the idea is that if you have a language with a linear type system, then you can guarantee that the user uses something. This is kind of like a halfway hack towards a linear type system. And I guess you would classify Rust as an affine type system, where things are required to be used zero or one time. That’s the whole ownership thing, is you can own a thing, but you can also just, like, not use it. You could just, like, never use this thing. So it’s an interesting consideration that we’re moving more towards, the sort of, like, faux-linear type paradigm over time.

Jon: Yeah, that’s true. Yeah, I’m reading through the list here, and, like— I think basically all of these are, yes, clearly you need to use the return value here. It’s stuff like char::to_digit. It’s stuff like, basically any method that’s called like as_ or is_, because there the return value is what you care about. But less obvious ones are things like Arc::downgrade and Weak::upgrade. Right? So these are things that turn an Arc into a Weak, and a Weak into an Arc, for reference counting. Where it’s not immediately obvious that when you call this, you’re not modifying the thing that you have. You’re creating a return value that you need to deal with. There’s things like Path’s with_extension, which returns a new path with the extension changed. It does not modify the current path to have that extension. It’s all sorts of things like that, where really, if you weren’t already using the return value, your code is probably wrong.
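(editor's note: for example, with made-up file names:)

```rust
use std::path::Path;

fn main() {
    let p = Path::new("notes.txt");
    // with_extension builds and returns a new PathBuf...
    let renamed = p.with_extension("md");
    assert_eq!(renamed, Path::new("notes.md"));
    // ...while the original path is left untouched:
    assert_eq!(p, Path::new("notes.txt"));
}
```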

I think that brings us to stabilized APIs. So for stabilized APIs, there’s nothing too major here. File::options is kind of nice. So the File type, std::fs::File, gained an options helper, a sort of static method that constructs an OpenOptions type for you. So OpenOptions has been stable for a while, and it’s basically a way to set a bunch of options for a file you want to open. Like for example, you could say, I want to open it in append mode, or read-only mode, or read and append but not write. Or I want it to be truncated if it already exists. All the, sort of, more advanced ways you can customize how a file is opened. And traditionally, you had to use std::fs::OpenOptions and create a new one of those, and then call .open() on it. But now that there’s a helper function on File, you can just do, you know, std::fs::File::options() and then .append(true).read(true).open(path), and get the file back. So you don’t really need to know about this additional type. It just makes things a little bit more concise, which is nice.
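(editor's note: a sketch of the shorthand; app.log is a made-up path, and create(true) is added so the example runs even when the file doesn't exist yet:)

```rust
use std::fs::File;
use std::io::Write;

fn main() -> std::io::Result<()> {
    // Same as OpenOptions::new().append(true)...open(...), but without
    // ever naming the OpenOptions type yourself:
    let mut log = File::options().append(true).create(true).open("app.log")?;
    log.write_all(b"started\n")?;
    Ok(())
}
```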

Ben: I wanted to highlight some of these: unwrap_unchecked, which are on Option and Result. And you’ll be familiar with unwrap if you use Rust, where it’s— you have an Option or Result, and either you’re pretty sure you don’t have a None or an Err case, or you don’t care, and you just want to get the thing out. Just give me the thing. So these are unchecked variants, which are analogous. So for indexing in Rust, if you use the indexing syntax on a Vec or on a slice, it will panic by default if you index out of bounds. And if you don’t want that panic, if you’re pretty sure, or you don’t care, I guess, there are get_unchecked variants, which just let you get the thing. They’re unsafe, obviously, so they’re unsafe functions, and there are no checks; you’ve got to know what you’re doing, and you can get the thing. And so these unwrap_unchecked methods exist to do the same, but for Options and Results. And I think these are interesting, actually, not because I think you should use them. I think you probably shouldn’t use these. I think you should probably, in almost all cases, just use unwrap and not use unwrap_unchecked, because unless you’re profiling and you realize that, like, any of these checks are actually causing any kind of slowdown, it really isn’t worth it to introduce unsafe code into your code base, just for the sake of using these functions, and like, I don’t know, saving a single nanosecond of checking.
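(editor's note: the two variants side by side:)

```rust
fn main() {
    let x: Option<i32> = Some(7);

    // The safe version: branches, and panics if x is None.
    assert_eq!(x.unwrap(), 7);

    // The unchecked version: no branch and no panic machinery, but it is
    // undefined behavior if x is None; the caller must guarantee Some.
    let y = unsafe { x.unwrap_unchecked() };
    assert_eq!(y, 7);
}
```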

Jon: Yeah, the benchmarks would really have to show that it’s the panic code, that never gets executed, that’s causing a slowdown or instruction bloat or something that matters, in like a super hot loop.

Ben: But despite that, I think these were good additions, even though I don’t think you should use them. And the reason is that, so standard libraries, you can kind of look to for example, like NPM with JavaScript, where JavaScript has kind of historically, I don’t know about these days, it’s much bigger these days, but historically JavaScript’s standard library was quite small, and it led to kind of a proliferation of lots of, like, nano-libraries, with like, a single function, a single convenience thing in them. Also, in Rust specifically, it really matters where your unsafe code comes from, like what underlies your thing. So currently out there, there are crates that give you unwrap_unchecked; people obviously want this kind of thing once in a while.

And so, rather than have people, one, rely on a crate that gives you, like, a single method on Option or Result, and two, contains unsafe code— it is, what I think, one of the prerogatives of std to encapsulate unsafe patterns. So that you can kind of be sure that they’re kosher, they’re legit, and to kind of prevent this proliferation of micro-libraries. Because obviously, Cargo is great, but having, you know, a big dependency tree still is kind of a pain. Just in terms of, like— the current watchword of the day is supply-chain attacks, with, like, how do you know the person who has control of your dependencies actually isn’t, you know, keylogging you, or running a crypto miner on your system, that kind of thing. Although you wouldn’t be able to tell, with the Rust compiler running at the same time, right?

Jon: Yeah, exactly.

Ben: So yeah, even though I don’t think you should use these, I think this is an interesting case study in, like, why it makes sense to add things that maybe you should be careful of using, to std because it’s just, it should be— if there was, like, a manifesto for std, it would be like, okay, unsafe code patterns, encapsulate these, have all these here. Ideally you shouldn’t have any unsafe code of your own in your own libraries, your own crates. I say ideally, obviously there are exceptions.

Jon: Yeah. And I think we’ve touched on this in the past, too. That one thing Rust really tries to do is to have a standard library that’s sort of not very broad, but very deep, very complete in the things that it does support. And I think this is more of that. Right? Where we want to make sure that the standard library has all of the, you know, functions, methods, helpers that you would want for Option. You should never need another crate to give you things on Option because we fully support Option, and sort of, take responsibility for all of the ergonomics of Option.

I don’t think I had anything super interesting from the detailed release notes under Rust or Cargo for 1.58. So I think we can move on to the point release that happened for 1.58. So this one was due to a CVE. And in this CVE— we’re not going to go into, like, a lot of detail. There’s a pretty good explanation in the security advisory that we’ll link to in the show notes.

But basically, this is a vulnerability that isn’t really Rust-specific. Like, it is an issue in the Rust standard library, but it is one where I think attackers— I think it was a group of security researchers, actually, that discovered that the standard libraries of a lot of different languages, including I think C and C++, have this vulnerability. And it’s basically a race condition in recursive removal functions from the file system. So specifically for Rust, it’s in std::fs::remove_dir_all, where remove_dir_all promises in its documentation that it does not follow symbolic links. So like, soft links, where, you know, one file or one directory is really just a pointer to a directory that lives somewhere else. remove_dir_all is supposed to not follow that link, and instead just remove the link itself, and then not recurse into such linked directories. And there’s a race condition in the implementation, where there’s a check for whether something is a symlink, and then separately there is the recursion into that directory, and it turned out that that code was written in such a way that an attacker could have something be a normal directory and then, like, remove it and replace it with a symlink, just after the check for whether it was a symlink, but just before the standard library code recurses into that path.
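(editor's note: a loose sketch of the racy check-then-recurse shape, not the actual standard library code; remove_dir_all_recursive is a made-up stand-in:)

```rust
use std::fs;
use std::io;
use std::path::Path;

// Loose sketch of the vulnerable shape; NOT the real std implementation.
fn remove_entry(path: &Path) -> io::Result<()> {
    // Step 1 (check): inspect the entry without following symlinks.
    let meta = fs::symlink_metadata(path)?;
    if meta.file_type().is_symlink() {
        // Just remove the link itself, never what it points at.
        return fs::remove_file(path);
    }
    // ...an attacker can swap the directory for a symlink right here...
    // Step 2 (use): recurse into it, now following the planted symlink.
    remove_dir_all_recursive(path) // hypothetical helper, not a std API
}

fn remove_dir_all_recursive(_path: &Path) -> io::Result<()> {
    unimplemented!("stand-in for the recursive removal step")
}
```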

And so this is something where the Linux kernel, for example, actually provides a way to mitigate this attack, but the standard library wasn’t using it, in part because I think that mitigation is somewhat new. And that also means that there are operating systems, like macOS before version 10.10, where this can’t be mitigated, because the operating system doesn’t provide a way to not have this race happen. The security advisory has more details on this. It’s a pretty interesting issue. It’s something that shouldn’t really affect a lot of users, because it requires that an attacker has the ability to, like, add symlinks and remove directories, and add directories in a place where your program is, like— in the working directory, basically, or in the directory where your program is operating and calling remove_dir_all. But if you are in that situation, this is definitely one to watch out for.

Ben: I mean, you say there’s no way to mitigate it. But what I would do, is I would just have it erase everything by default, and then check whether or not it was a symlink. Right? So there you go.

Jon: Uh, well the problem is erasing requires that you recurse.

Ben: That was a joke, by the way.

Jon: Oh, damn it. Damn it. I’m bad at those joke things. Oh wait, wait, I know what to do— (drum sound). I did it! Did I do that right, Ben?

Ben: I heard it, I heard it. It was there.

Jon: But yeah, I think this is a good lesson in like— there are a lot of latent bugs, not just in Rust, but in standard libraries generally. Like, the fact that this is a latent bug in the C++ standard library sort of tells us that ultimately there will be new classes of attacks, or even old classes, but new ways to trigger them, over time. And I’m really happy to see the Rust release team sort of have a good process for this, and making it really clear in the announcements what the bug was, what the fix is, and what the limitations of the fix were. I found their announcement for this and the security advisory really easy to read, and that’s— major props to them for that.

Ben: Now you say you’re happy, but I hear you did have to spend a lot of time on your own end fixing, or working because of this bug.

Jon: Yeah, I mean it’s like, whenever there’s a security release, right? Like anyone who consumes Rust at scale has to go deal with it, and that happens to end up being part of my job. But even so, like in that capacity too, I’m happy that the write-up about what the problem was and the mitigation was so thorough, because it made my job easier. I think that brings us to 1.59, and I’m very excited for 1.59, because it is getting very close to Rust 2.0, I hear?

Ben: What makes you say that?

Jon: Those are the rules, right?

Ben: When it gets to 99, you think it gets to 2.0?

Jon: Yeah. Is that not— Is that not the case? Have I been tricked all this time?

Ben: I think that would’ve been the case for 1.9. And so actually, by that logic we’re on Rust 15 by now, we’re about to be on Rust 16.

Jon: That’s— that seems scary. I don’t—

Ben: Rust 16.0.

Jon: Yeah, no, I’m not okay with 16.0. That sounds like something that— something went wrong somewhere in the line here.

Ben: Sounds like a browser release, now.

Jon: That’s right. What is this, Chrome?

So obviously the headline feature of 1.59 is inline assembly, which has been on the, sort of menu, listed as “coming soon” for probably years now. But it’s landed, and it is glorious. Let me tell you. I think one of the things that’s really nice about the assembly thing here is they didn’t just go, we’re just going to plop down the, you know, inline assembly syntax that’s used by, you know, GCC or Clang or anything. We’re actually going to think really, really thoroughly about the problem space, and how we want to solve this in a rustic way. And I think they really succeeded there.

Ben: Yeah. So the history of this feature— so, inline assembly, if you’re not familiar with, like, system stuff, if you’re from Python or JavaScript— so this is a feature that’s been in C for— since forever, essentially. But— I say in C, it’s not actually in C or C++, it’s in various compilers. Right? So in GCC or Clang, there are extensions that you can use, to access inline assembly. And so in that case, it’s kind of slightly more standardized in Rust now than it is in C. As far as, you know, Rust isn’t standardized at all, but someday when it is, fingers crossed, you will have actual, kind of like, if you write a compiler, here is the spec that you write it to. And so there is some amount of portability for inline assembly, which is actually pretty cool. And it’s useful and important, because even though it was unstable for so long, there are some things you just can’t do without it. It’s kind of like how, like in Rust— when it comes to unsafe code and Rust, I kind of like, split them into two worlds. One is unsafe code, because you’re like, want to go faster and you can, you know— it’s not because you can’t do a thing, it’s just because you have, like, some performance need that you just can’t do otherwise. Right? So, and you’re using it. Okay, that’s fine. You need to go faster here, this is a hot loop. I understand. The other is to do things that you literally just can’t do. And very often that means something like calling into C, because you’re calling some kind of syscall on the platform directly, and there’s just like, no way of doing it. The platform library is written in C, so you better just like make an unsafe call and you know, worry about that there.

Jon: Or even lower level, right? Like, you need to issue a specific CPU instruction? Because you’re writing, like, bootloader code.

Ben: Yeah. And that’s the exact use case that inline assembly is great at, where you just have a single instruction, you just want to run it, this is just— the CPU like, hey, CPU, do this, and it can be like, gotcha. And so like, no OS involved, nothing but the hardware right there. And so it’s kind of one of those things where, like, you know, to put on your big boy pants and be a real language, you need it— people kind of gate-keep on it. People have in the past looked at Rust and said, okay, you— like, cool, here’s your language, but where’s your inline assembly? And now we can be like, it’s right here. It’s stable now.

Jon: Yeah. Do you want to talk about why it has to be unsafe? Because this might not be immediately obvious.

Ben: I mean, so— in the same way that C is unsafe, where, so Rust— safety in Rust is a big topic, and it’s more about what you can’t do, than what you can do, right? And so when you call out to C, C lets you do plenty of things and Rust can’t prove a lot of what C does is safe, by Rust’s definition of safe. Right? And if it were so easy to define that in C, then why would anyone ever write Rust? We would just write, you know, we would just figure out how to do this in C. The problem is that C lets you do too much. Right? C is— it has too much power to do things that can’t be proven. And there’s this kind of, like, interesting dichotomy here, where the more expressive your language is, the less that you can prove about that language. So the idea is, Rust lets you do less, but by doing less, it lets you know more about your program. And so because C lets you do too much, and assembly lets you do way too much. And that’s why it’s unsafe. You can do all kinds of nasty things where Rust just isn’t— like, Rust, the language, doesn’t know anything about, like, registers or the stack or, you know, any of these things that inline assembly can theoretically change, and tweak, and so it is kind of caveat emptor, where once you get in there, like, make sure you know what you’re doing, because there are no more rails, it is all just like, up to you to know what you’re doing. And so this is again one of those things, where ideally you wouldn’t need to ever write it, right? We’re talking just about encapsulation of unsafe code in the standard library, where ideally— so for a long time now, it’s actually been fairly tractable for folks to not need inline assembly for most of what they do, just because like std provides things like, say, the black_box function, which is actually unstable but forget that for now. So the idea is that—

Jon: I guess intrinsics is maybe the example. Yeah.

Ben: Also, yeah, intrinsics too. So for various things that are, kind of, CPU-ish, you know, a single instruction— there’s a thing called intrinsics, where the compiler lowers these to LLVM constructs, which are themselves implemented with the appropriate assembly for various platforms. So by doing this, it’s kind of alleviated the need for inline assembly, but there are some things where you just can’t possibly satisfy every use case this way. If you’re doing something like writing an OS— so if you are Linux, you can’t possibly hope to ever upstream everything you need into the Rust standard library. You just sometimes need to write inline assembly, because you just need to talk to the CPU.

So this is a great day, I think, for us because yeah, this has been— there’s been a macro for this since about 2013. Well before Rust 1.0. But it— back in the day, it was pretty— well, not going to say simple, it was definitely a lot of work even back then, but it was more or less just passing LLVM assembly directly. Like, what LLVM expects of an assembler. Which is— it worked, and it was, you know, functional, but if you ever want to have any back end that isn’t LLVM, that’s going to cause problems. And so, this isn’t necessarily about making your assembly portable. If you as a user write inline assembly, then there’s a good chance that like, you know, you have to care about your platform, and maybe even what compiler you’re using exactly. But it’s more about, can any kind of Rust compiler ever re-implement the same semantics if you’re tied that closely to LLVM. And so it’s kind of a trade-off, where yes, many of the things that are currently happening in inline assembly, and the stabilized version, which we have now, are kind of tied to what LLVM is doing. But it’s much less than it used to be, where it’s like, many of the options and the features are now much more standard, and aren’t just like, okay, here’s what LLVM does, do that for everyone forever.

Jon: Yeah. We should also point out that the assembly here is also not completely abstracted away. Right? So the instructions that you have available, for example, are architecture-dependent. Right? Like it’s not as though, in stabilizing assembly, we like, stabilized all of the possible, you know, assembly instructions you could have, and what they translate into for each architecture, nothing like that. Like, you still use the appropriate verbs, in a sense, for each architecture, and which ones you can use is going to depend on the architecture you’re targeting. So it’s still, like, fairly low-level here. It’s not, like, a completely new language was invented. That would be a major undertaking.

Ben: Yeah, I want to shout out to Amanieu d’Antras, who was kind of the mastermind of inline assembly: wrote the RFC, implemented a whole lot of this, along with many other folks who contributed, but like, kind of, Amanieu has been pushing it forward for several years now, and deprecating the old LLVM asm! macro. And it’s just been a ton of work even to get this far, and this is just, like, the foundation, where— so as you mentioned, it’s not really its own language, it’s kind of like, it gives you a kind of, like, standard and reasonably stable interface to what any reasonable compiler should give you for assembly. But it’s not, like, an actual DSL. You can imagine— so I think D, the language D, has their own, kind of like, assembler DSL that’s built into the language, but that’s a step beyond what we have here. The thing is, though, that because of that, they have much less platform support, because you have to do that for every single platform that you want to support. You can imagine someone writing their own, kind of like, custom DSL, like their own crate, right? Where they provide a macro, where they provide a DSL, that lets you kind of, like, write some special, unique, very nice variant of inline assembly, that just compiles to this. So if you want a nicer assembly macro, you can make that now. And it’ll be stable, and it will continue to work. So that’s work for the library authors at this point.

Jon: Yeah. The other thing I want to shout out for this is that because this effort has been going on for so long, we’ve also gotten to the point where this is really well documented. I read through the Rust By Example, for inline assembly, and it’s really, really good. Even if you’re not super familiar with assembly yourself, read it. It’s a fascinating read, and it talks a bunch about the syntax, and why it is the way it is, and it’s going to teach you a bit of assembly as you go as well.

Ben: Yeah. And documentation is also being improved as we speak. So currently, like, you know, if you go to use this macro, it lives in std::arch::asm, and if you go to that location in the docs, it’s kind of not as good as it could be, where it just links out to the aforementioned Rust By Example pages, and to the Rust Reference. And so the former is, how does a user use this, and the latter is more like, what exactly are the rules here. And so it’s, like, very involved. And so an in-between sort of, like, API-level documentation is being written right now, to put in std, at the std::arch::asm location. And so it’s coming along; it’s going to be even better in the future.

Jon: Yeah, I also want to mention one more subtlety around inline assembly, which is the difference between the asm! macro and the global_asm! macro. This is a little subtle, but basically the asm! macro generates assembly code sort of where it is. It’s going to be integrated into the compiler-generated assembly code of a function. And I think the compiler is even allowed to turn that assembly code into a separate function, and generate calls to it, rather than inlining it into the function. So the asm! macro is a little bit constrained in how it gets to operate. The global_asm! macro is literally, like, put these assembly machine code words here. So with global_asm!, you can do things like put it at the top level of a file to generate a complete function definition; you can’t do that with the asm! macro. So that’s sort of the difference here, that global_asm! is like, literally insert instructions here, whereas asm! is a little bit more principled, in terms of how it interacts with the surrounding code.
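(editor's note: a tiny sketch of the two macros; the instructions and the System V calling convention here are platform assumptions, so this only builds on x86-64 Linux-style targets:)

```rust
use std::arch::{asm, global_asm};

// global_asm! emits free-standing assembly at module scope; here it
// defines an entire function (x86-64, System V ABI assumed):
global_asm!(
    ".global add_one",
    "add_one:",
    "    lea rax, [rdi + 1]",
    "    ret",
);

extern "C" {
    fn add_one(x: u64) -> u64;
}

fn main() {
    let x: u64;
    // asm! is embedded in the middle of a normal Rust function:
    unsafe {
        asm!("mov {0}, 5", out(reg) x);
        assert_eq!(add_one(x), 6);
    }
}
```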

Ben: Next up is destructuring assignments. And so this is actually kind of interesting. So if you’re— I like to use Python as my example. So in Python, if you have— there are tuples in Python, just like in Rust; you have (foo, bar) and that’s your tuple. And then you can swap them. It’s a fun trick in Python to teach destructuring assignment, where if you have— you can put (bar, foo) = (foo, bar), and it’ll just swap the contents of those two variables. It’ll just— it’ll say, put this in this and this in this, and it’ll behind the scenes, just do it for you, and it’s great and funny.

Rust has a similar thing, actually more powerful, in just pattern matching. So if you use pattern matching in Rust, you know, you can match on a tuple. And also one of the more intermediate things that you can do in Rust is understand that patterns work in more contexts than just match. You can use them in if let, but you can also use them in signatures for functions. If you’re taking a tuple, you don’t need to have a tuple foo and then on line one say let (a, b) = foo. You can just say (a, b), and then the type of the tuple. And then inside the function you’ll have a and b as just bound variables. Obviously that works only on infallible, quote-unquote, matching, where there’s no possible chance to fail. Because what would happen if it didn’t match— if you, like, passed in, say, the wrong kind of thing for, say, an Option— because you can match on Options too, right? You can say Some(foo) or None. And infallible just means a match where there would only be one possible arm, or where only one arm would compile.

And so— but previously, destructuring only worked in let. So in the let assignment. So when you’re originally declaring the variables. So what this new change is, it lets you use destructuring in normal assignment contexts, too. So if you have let mut a; and let mut b;, you, in the past couldn’t, say, unpack the— So imagine the API std::io::, so what’s the channel API?

Jon: mpsc::channel.

Ben: Yeah, mpsc::channel, where it gives you both a port and a channel.

Jon: It’s like, a transmitter and a receiver, or sender and receiver.

Ben: It gives you a sender and receiver, as a tuple. And so it’s very common, you know, whenever you get this, to say let (sender, receiver) = the result of this function. And so now, you can have a pre-declared let (sender, receiver);, and then later on say (sender, receiver) = the result of this function. Or you could even have mutable bindings, you could say, like, let mut sender = something, let mut receiver = some_other_thing, and then later on, you could just mutate them as normal. And then for overwrites, you could even have, like, you know— if you just want to bind only one of them— I’m not sure, this is a bad example for binding only one of them, but you can imagine a different context where you might want to only bind one output for a pre-existing variable, and that works now. So, like, you just say, you know, (foo, _), and _ is the usual thing, a pattern to ignore, drop a thing. And then you would just bind one thing. And so it’s just a way of making Rust a bit more regular, as far as users are concerned.

It’s actually kind of tricky under the hood to implement, because it’s not really feasible to implement the full space of pattern contexts here, just, like, randomly on the left hand side of an assignment. It complicates the grammar, like, far too much to really consider, because there’s a lot of different things that you can support in patterns. So what this does is just support the most common things in patterns, which just happens to be most of what you’d want this feature for, so it works out pretty well. So tuples are supported; slice patterns, so with an array, you can use, kind of like, the little .. syntax to ignore more than one thing, and get just the head or the tail of a thing; also structs too, so you can bind just, like, a single field of the struct, pattern-matching wise. So it’s pretty cool.
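(editor's note: a quick sketch of the supported shapes, modeled on the 1.59 release-notes example:)

```rust
struct Point { x: i32, y: i32 }

fn main() {
    let (mut a, mut b) = (1, 2);
    // Plain assignment can now destructure: the classic swap.
    (a, b) = (b, a);
    assert_eq!((a, b), (2, 1));

    // Deferred initialization, slice patterns with .., and struct
    // patterns all work too:
    let (first, last, x, y);
    [first, .., last] = [10, 20, 30];
    Point { x, y } = Point { x: 7, y: 8 };
    assert_eq!((first, last, x, y), (10, 30, 7, 8));
}
```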

Jon: Yeah. And I think the way that I think of this feature is that, it lets you destructure while reusing existing variables. That’s like the key part here, right? That you don’t have to, sort of, generate new variables, for the lack of a better terminology here. You don’t have to declare new variables in order to destructure. You can use the variables that you already had.

Ben: Yeah. And more than just being nice, it’s also sometimes really important. Like, if you’re inside a nested context where, you know, the variable is declared in a higher-up scope, and you want the value to exist outside of the inner scope— well, you can’t declare it inside the scope there, so originally you would have to, like, bind it to a new thing and then re-bind it to the mutable thing that’s in the higher scope. Now you don’t have to do that anymore.

Jon: Yeah, exactly.

Ben: So a pretty cool feature, I’d say.

Jon: This next one is arguably— like, it is a major feature, but it also mostly feels like an ergonomics improvement. So this is around const generics, where now, not only can you declare default values for const generics— so the example they use in the release notes is, you have an array storage struct that’s generic over T, which is the element type, and const N of usize, that is, you know, the number of things to store in the array. And now you can say const N: usize = 2 to say that, if the user doesn’t specify what N is, then assume that N is 2. This is the same thing you could do with type parameters. So this just feels like now const generics aren’t special in this way; they do what the rest of the language can do.

And the same thing now applies to allowing const generic parameters to be interleaved with generic type parameters. So in the sort of minimal implementation of const generics that got stabilized a little while ago, all of the const generics had to be at the end of the parameter list. So you could have a function or type that’s generic over, you know, T and U and const N and const M, but you couldn’t say <T, N, U, M>. You had to have the const generics at the end. Whereas now you can intermix them: you can say <T, N, U, M>, and it’s fine.
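(editor's note: a small sketch of both changes; the struct mirrors the release-notes example, while the function name is made up:)

```rust
// Mirrors the release-notes example: N defaults to 2 when unspecified.
struct ArrayStorage<T, const N: usize = 2> {
    arr: [T; N],
}

// Const parameters may now be interleaved with type parameters;
// previously every const had to come after all the types.
fn total_len<T, const N: usize, U, const M: usize>(_a: [T; N], _b: [U; M]) -> usize {
    N + M
}

fn main() {
    // No N given, so the default of 2 kicks in:
    let s: ArrayStorage<i32> = ArrayStorage { arr: [1, 2] };
    assert_eq!(s.arr.len(), 2);
    assert_eq!(total_len([1u8; 3], ["x"; 4]), 7);
}
```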

And again, like, it was pretty tricky to get that to actually work internally in the compiler, but as a user, it just means that const generics work more like type generics now. And it’s more that you’re not going to run into as many limitations as you otherwise would. So it feels very much like just an ergonomics improvement, or a lack of surprising behavior, perhaps. That’s not to diminish the value of this feature, but it’s just like, it feels like completing the feature that we already had. Which is really nice.

And then we have our good friend, future incompatibility warnings. We’ve been around the block with this a couple of times, trying to distill down exactly what it is, and what it’s for, and why it’s useful. And I think the 1.59 release notes have a pretty good explanation of exactly what’s going on here. Do you want to tackle it again, Ben?

Ben: Yeah, sure. So if you’ve ever built a Rust project that’s, you know, kind of long-lived, with plenty of dependencies, you’ve probably benefited from, but maybe didn’t appreciate, the fact that whenever you compile dependencies, it doesn’t spit out warnings at you for the dependencies you’re compiling. Right? Or at least— put it this way: so in Rust it is possible to use an attribute called #![deny(warnings)]. So any warning you ever see, like a warning for must_use, like you were saying before, you can in your own crate elevate to an error. So you can say, hey, when you compile this code, please error out whenever this happens. The problem is that Rust wants to introduce new warnings from time to time, and one of the things that you can do is just deny all warnings as errors. This is actually one of the reasons why, in GCC, adding new warnings is kind of a big deal. And so in GCC there’s, you know, -Wall and -Weverything, kind of because they don’t want to break workflows that expect the set of warnings to never change. And so Rust will kind of just suppress warnings in dependencies, as a way of kind of saying, hey, we might introduce a new warning here, but we don’t want any of our existing users to break just because of a warning, right? Because people tend to expect warnings not to break their code.

The thing is though, that warnings are also used for more than just, like, nagging you about, must_use things, right? They’re also used for a new purpose, which is, I’m not saying new, but one of the distinct purposes is that when Rust wants to deprecate something that’s really important— because normally, Rust would kind of like, bend over backwards to try and find a way not to remove a thing that didn’t need to be removed. Although sometimes there are things that are clearly broken, right? If there’s, like, a type system unsoundness, say. That might involve a compiler upgrade that makes this— makes the thing that compiled previously no longer compile. And you know, it’s probably the case that the thing that was compiling was wrong, and potentially unsafe in certain contexts—

Jon: Dangerous, maybe even.

Ben: It’s still dangerous, but it needs to, you know, be fixed. And so it’s kind of unavoidable that it will break your code, but they want to give you a heads-up beforehand, is the thing. So before they remove that feature, they want to make sure that everybody can see the warning that’s occurring. So it’s called a future incompatibility warning.

The problem is that, because Cargo is squelching all of the warnings, if you’re compiling a big, long dependency list and one of these future incompatibility warnings happens, then you aren’t seeing it. So, like, you might not notice that, hey, if you upgrade your compiler, this thing that’s slated for removal might break. And so in practice this means that almost no future incompatibility warnings have ever been promoted to errors. The Rust compiler has lots of these warnings, and the fact that they’re warnings means that, you know, people who compile their crates will see them and say, oh, this is probably a problem, I’ll upgrade this.

But the Rust team has been very reluctant to ever actually promote any of these warnings to actual errors, and thereby remove all the bad code, and all the compatibility hacks that are needed for this. Even across editions, too. And so this feature is really important because it means that future versions and editions of Rust might actually have the gumption to start removing some of these obviously broken things, that exist only for compatibility. And that shouldn’t, you know, exist in any kind of ideal version of Rust. So I think the next Rust edition is going to be interesting, because it’ll probably— I assume, I’m not, you know, I’m not the Rust dictator, but I assume that there will be many warnings elevated to errors in the next edition that are of this nature.

Jon: Yeah. And I think this is— it’s a great way, as you say, to sort of unblock work on fixing things that might break some users, but giving people enough warning ahead of time, and making sure that they will actually see those warnings, you know, if they are attentive to their projects. Even if their dependencies are not attentive to new warnings.

There’s also the associated cargo report command that gives you, like, a full report of the code that will be rejected as well. Which is a really nice way to get a sort of complete picture of all of the potential issues you might run into in the future. And the hope, of course, then is that when a user discovers that one of their dependencies has this problem, you can go back and tell the authors of the dependency, hey, look, you now have a new warning, even though you weren’t actively looking for this, here’s a fix for it. Or at least now you are aware, and hopefully we can drive the ecosystem forward in that way.
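(editor's note: roughly, the report is pulled up like this, with the id taken from the note Cargo prints at the end of a build:)

```console
$ cargo report future-incompatibilities --id 1
```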

There’s another neat addition in 1.59 that hits probably a small number of people, but it’s going to make them very happy, which is the ability to create stripped binaries. So in general, Rust tends to build binaries that include debuginfo. And in particular, it includes debuginfo from things like the standard library. It includes debuginfo from your own code. There are ways to turn it off, of course, and it’s been possible to strip that out after the fact manually. Like, you can use the strip command, for example. But now Cargo and rustc support stripping out this debuginfo when the binary is linked. You might wonder, why do you want to do this? Well, the primary driver for this tends to be file size. Usually when you end up with very large binaries, it’s really just because they have a lot of debuginfo in them. And stripping them can often make them significantly smaller. Like, we’re talking an order of magnitude here. And so if you have a binary that you know you’re deploying somewhere where you don’t need, like, you know, symbols in backtraces, and you’re not going to run GDB on this particular binary, it might make sense for you to strip out the debuginfo.

And so now you can set the strip option in your Cargo.toml, for a given profile. You can set it to "debuginfo" to remove the debuginfo. So this is stuff like, you know, the source file and the line number that is associated with any given line of instructions; it also includes things like, you know, some information about the call graph. You can also be even more aggressive and say strip = "symbols", and that removes all of the function names as well, which could also add a bunch of bloat to a binary or a shared library file. Of course, if you do that, then if your program does crash, all you’re going to get back is, like, a list of addresses and no names. Which, you know, might be too far on the side of, this gives me nothing useful. But if you’re working inside, say, an embedded context, or you’re working in a context where you know that the output from a crash will never be useful anyway— like, for example, it always gets dropped— then this might be a good way to optimize your binaries.
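(editor's note: in profile terms, this looks something like the following in Cargo.toml:)

```toml
[profile.release]
strip = "debuginfo"   # drop debug info at link time
# strip = "symbols"   # more aggressive: also drop the symbol table
```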

I think that leaves us with the last, sort of, big ticket item for 1.59, which is incremental compilation being disabled again. Do you want to talk a little bit about this, Ben?

Ben: Yeah, sure. So last year, people will remember that— it was, like, 1.52 or something? There was a release where incremental compilation was disabled. And what incremental compilation is, is after you compile your Rust code, it remembers parts of the compilation, and then will reuse bits that haven’t changed. And this is important because in Rust, unlike say, in C or C++, the compilation unit is pretty big. It’s not just one file, it’s the entire crate. And so the, like, naive model of compiling Rust is, you take the entire crate at once, you get the entire abstract syntax tree of, like, every single thing in the crate, and then you compile all of that at the same time. Better, if you only have to change, like, you know, a single line of code somewhere, would be to just figure out which branch of the syntax tree you changed, and then, like, reuse every single other bit that you’ve already compiled, and just recompile the one part you need, and kind of neatly plug what you’ve changed into the original artifact. And just use that from there.

But the thing is, it’s a very complex and wide-ranging feature. I was observing some of the discussion from the compiler team in the wake of this change. Because it came very late in the release cycle— I think a few users, like a few days before the release, reported some bugs— and fixing them caused even more bugs, they didn’t want to backport the fixes necessarily, at the risk of, like, causing a cascade of ever-larger bugs. By bugs I mean compiler crashes. This doesn’t appear to be miscompilation. So it’s still bad, but not quite as bad. A compiler crash is called an ICE, an Internal Compiler Error. And it just means that, like, oops, Rust crashed. But yeah, it’s better than giving you wrong code by default. Right?

And this is actually what happened last year: they increased these checks to make it crash more often, and reduce the risk of miscompilations happening silently. So in this case, it’s a bit frustrating, certainly, because the fix is already in 1.60. So if you don’t want to suffer the loss of by-default incremental compilation, you can just use beta. So, like, you know, rustup toolchain install beta, rustup default beta, and now you’re using beta. And that would actually help a lot, because one of the reasons that this slipped through is that not a lot of folks actually use beta for regular development. Plenty of folks use beta for CI, which is a nice service that they do for us. I say for us— you know, for the Rust ecosystem in general— because it reveals any problems in beta before they get to stable. But CI doesn’t check incremental compilation specifically, because CI always builds from scratch. Right?

Jon: Yeah, I would even go as far as to say that I think most people should be using beta for their day-to-day development.

Ben: Like, if you’re so tech-savvy that you’re listening to this podcast, you’re a pretty cool person, and you should use beta. You should upgrade to Rust beta.

Jon: Yeah. rustup default beta right now. Right now.

Ben: Live on the not-so-bleeding edge, live on the, I guess like, what’s between bleeding and not bleeding?

Jon: Lightly wounded?

Ben: Live on the lightly wounded edge.

Jon: Bruised.

Ben: Live on the bruised edge. There you go. Live on the bruised edge, and use beta. Use it today. It’s essentially as stable as stable. It has, like, maybe one change a week, one commit per week backported from nightly, on average.

Jon: Yeah. And you’re helping the Rust project. You can report issues, because you do report issues when you find them, right? Just checking with our audience here, that they know that if the compiler crashes, they should file a bug.

But otherwise, like, there’s no reason, really, not to use beta. The only downsides I can think of are, you may run into an issue. It should be rare, but if you do, that’s super valuable information to report back. And the second reason might be, you might accidentally make your code rely on a feature that is in beta, but not in stable. But that should be pretty quickly caught, once you try to push this in a PR, and the stable build fails.

Ben: To clarify, feature flags don’t work on beta, so beta has the same stability guarantees, the same stable API surface, as the stable branch. It’s just that—

Jon: Yeah, so this would just be, like, something that has been stabilized, but hasn’t landed on stable yet, because there hasn’t been a stable release since when it was stabilized. So it should be a very small set.

Ben: Mm hm. Yes.

Jon: And speaking of stabilizations, I think we can get to the stabilized APIs for 1.59 now. There aren’t any that I think are very important to call out, except maybe the available_parallelism function.

Ben: Yeah, that’s a cool one.

Jon: This is one that I’ve wanted for a while. It comes with a bunch of caveats. Like, I recommend reading the limitations section of the API. But very basically, it tells you how many parallel tasks, how much parallelism, your program is probably able to get out of whatever host it’s running on. So you might think of this as, sort of like your core count, right? Like, if you’re on a 16 core box, available_parallelism will probably give you back the number 16, and tell you that’s how many threads you should run in parallel, for something like Rayon for example. And it’s something where we’ve already had crates like num_cpus that people use for this kind of functionality, but at least now we have it in the standard library, and you don’t have to reach for that.

One of the caveats here is that it is a very lightweight mechanism, which means that it doesn’t consider a lot of other potential factors that impact your available parallelism. So this could be things like, you know, if you have multiple different NUMA regions, for, like, very large core-count boxes, then that’s tricky. It doesn’t take into account if you have processors or co-processors that have different capabilities; it just gives you a count. It doesn’t do things like measure whether the OS has, like, soft-disabled a core for your process. So there are some complications here, if you really need an accurate count. But in general, this is going to be a good place to get a sort of rough estimate of the parallelism of the computer.
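As a minimal sketch of the usage (the fallback to one thread below is just an example choice, not anything the API prescribes):

```rust
use std::thread;

fn main() {
    // available_parallelism() returns io::Result<NonZeroUsize>, since some
    // platforms can't answer the question; fall back to 1 in that case.
    let threads = thread::available_parallelism()
        .map(usize::from)
        .unwrap_or(1);

    // You might use this number to size, say, a Rayon pool or a
    // hand-rolled worker pool.
    println!("estimated available parallelism: {threads}");
}
```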

Ben: And this ties back into as well, the kind of, what I was talking about earlier, with regards to trying to avoid, like, micro-dependencies. Where I think there’s a crate called num_cpus which just provides this, and the thing is, it’s kind of a building block that’s kind of required for doing any sort of like, at-scale parallel computation, where like, okay, I don’t know what, you know, how many threads or cores my current machine has, please tell me. And Rust is like, please get a crate for that. Which isn’t, you know, the worst thing, but if it’s possible to kind of grab and take this, like, micro thing, and move it into std, if it’s, like, a very fundamental building block, that’s a pure win in my mind.

Jon: Yeah. And, I mean, the num_cpus crate I think is probably still useful, because it does implement some of that more, like, sophisticated, complex logic for determining the amount of available parallelism. If all you need is basically the core count, then use the one from the standard library. But if you really need to, like, explore the processor tree, and make sure you only look at processors that, you know, aren’t masked by the OS and stuff, then I think bringing in something like num_cpus is totally reasonable. But at least now you have, for the simple cases, something to reach for that’s provided by the standard library.

I don’t think the other stabilizations are that interesting to get into, but I did have some things from the changelog. Unless there were any of the stabilizations you wanted to highlight?

Ben: Well, I mean, there were the ARMv8 NEON intrinsics for aarch64. I’ll admit I have no idea what to do with these myself, but there were some folks I saw on the Reddit thread who do. This is SIMD for aarch64. So, I mean, SIMD is still coming along, but it’s just an enormous task, and it happens very slowly, in fits and starts.

Jon: Yeah, that’s true. Yeah. And I guess it is, armv8 is like, the sort of next-gen, or last newest-gen, I guess, of ARM processors, if I remember correctly. So it is exciting that we’re sort of continuing to improve on the ARM side of things too. I just want to know when I can get an ARM laptop running Linux, that has decent specs. But that’s neither here nor there.

For the changelog, one thing that I thought was nice was, Cargo now has support for the -r flag, which is exactly the same as --release, but shorter. So now you can do cargo run -r. And in fact, if you didn’t know this already, Cargo already has short aliases for its subcommands. So instead of typing cargo test, you can type cargo t, and it does the same thing. And now you can type cargo t -r to run all your tests in release mode. Or similarly, cargo run -r. I think you could even do cargo r -r, which is its own kind of funny. It’s just nice, now you can type less when you run things from the command line.

Ben: And you can like, have your shell alias cargo to c, so you can do c r -r. Now you can put all these in shell scripts, and someone will have to maintain them forever.

Jon: It’s the new R-only language, where all of your commands are Rs and spaces and minus. r r -r r r -r.

Ben: Pirate-lang.

Jon: Arr.

The other thing that has stabilized, and this is not going to hit you immediately, but it is, like, a longer-term effort that is cool to see come to fruition, is the symbol mangling version argument to rustc, which is now stabilized. And it’s stabilized specifically in that you can now pass -C, so a codegen argument to rustc, -Csymbol-mangling-version=v0. You cannot set it to any other value, and v0 is not the default. But the basic idea here is that the Rust compiler has implemented a symbol mangling scheme that is better, that more accurately represents Rust code when expressed in symbols, than what’s being used currently. And that’s called the v0 version, because the legacy version is just sort of a hacked-together thing that we use, but it doesn’t really have a stable name, or any kind of principled approach to it. So v0 is really the first thing that’s been stabilized. And symbol mangling is basically the way that Rust takes a given function name or type name in Rust, and turns it into, like, ASCII that can be put into the symbol and debug info tables in your binary, which can then be displayed by things like GDB. And there’s been a lot of work to get this new, better symbol mangling algorithm out, because it also meant contributing to all of these upstream tools that read debuginfo, so that they understand the new mangling syntax. That work has come really far along; there’s been a lot of effort to, like, upstream this into GDB and LLVM and perf and Valgrind, and all of those various tools, and that’s now finally getting towards the end. And they’re happy enough with the symbol mangling scheme that they’re now ready to make it stable. It is not the default yet. That PR is still open, and sort of pending some upstream patches. But it is exciting forward progress.

Ben: Yeah, this is a great example of one of those language maturity things. The existing symbol mangling scheme exists purely for convenience, because you need to mangle symbols somehow. Because unlike in C, where symbols are just named what they’re named, in Rust you have generics, and then you can have many things named the same thing if you’re not careful. And so you need to have some way of making them all unique, so that when you actually go to call the symbol, you know what you’re calling. And the current, legacy version is just to, like, hash some completely arbitrary compiler-internal details and then append that to the name. And so that’s, one, kind of terribly hacky, and no compiler other than rustc will ever, you know, be able to reproduce it. And two, you can’t reverse it. So if you are GDB, you can’t look at this hash and say, like, what originally were all of these arguments, what were the type parameters to this thing? Like, what was the path of this thing? And so the v0 RFC— I’m not sure why it’s called v0, you could have just called it v1, honestly, it’s kind of weird. No one would have stopped you. But the RFC is an interesting read. It’s based on the C++, I believe the Itanium, ABI for mangling, so there’s plenty of prior work. There’s, like, built-in compression of symbols, to help avoid giant, exploding, long symbol names. So it ends up being more precise than the current mangling, easier for tools to consume, more consistent, and actually specified, so other implementations of Rust could actually do it. So yeah, it’s pretty cool, and it’s been a big effort, having to coordinate with, like, lots of external upstream tools. So I commend everyone who’s been working on this; it’s been a long time coming.
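To give a rough feel for the difference, here’s a schematic sketch. The function and crate names are hypothetical, and the mangled string is illustrative of the shape of the legacy scheme, not byte-for-byte real compiler output:

```rust
// A generic function: every monomorphization needs its own unique symbol.
pub fn frobnicate<T>(value: T) -> T {
    value
}

// Legacy scheme (schematic): an Itanium-style path with an opaque,
// compiler-internal hash appended. Nothing but rustc can reproduce the
// hash, and nothing can reverse it:
//
//     _ZN7mycrate10frobnicate17h1f2e3d4c5b6a7988E
//
// v0 scheme (schematic): symbols start with _R and encode the full path
// plus the actual generic arguments, so a tool like GDB can decode the
// name back to something like mycrate::frobnicate::<u32> without any
// help from rustc.
```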

Jon: I think the last thing I wanted to highlight— and this is maybe arguably stupid, but the HashMap and HashSet methods into_keys and into_values were moved from one impl block to a different one in the standard library. And specifically— oh, and retain, I think. They were moved from an impl block that had the requirement that the key type implement Eq and Hash, to an impl block that doesn’t have that bound. And the observation here is, like, if you have a HashMap and you just need to grab all the keys, or grab all the values, it’s okay for the key type to not implement Eq and Hash, because you’re never comparing keys. It’s a change that probably doesn’t matter to anyone, because normally when you create a HashMap, like, you’re going to be looking things up or inserting things, in which case you need the key type to have Eq and Hash. But it is a nice way to specify the sort of minimal requirements of each method. And the reason why them being minimal matters is that every now and again, you actually do need to construct a HashMap with a key type that isn’t Eq + Hash. Because all you’re doing is, like, you know, creating it and then dropping it for some stupid reason, like you’re writing a macro and the macro needs to generate code in this particular way. And so you want it to work even if the type doesn’t meet the bounds, because you know that you will never insert anything. This should be extremely rare, but it’s one of those changes that just tidies up the bounds of each function, so they’re actually the minimal stuff that function requires. So I thought it was just nice to see a PR land that really did something that basically doesn’t matter to anyone, but sort of appeals to the OCD in me. Of: it should be right, damn it.
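Here’s a tiny sketch of that situation. The Opaque key type is hypothetical, and deliberately implements neither Eq nor Hash:

```rust
use std::collections::HashMap;

// A key type with no Eq or Hash implementation at all.
struct Opaque;

fn main() {
    // HashMap::new places no bounds on the key type, so this compiles:
    let map: HashMap<Opaque, i32> = HashMap::new();

    // And with the relaxed bounds, consuming the map through into_values
    // no longer requires Opaque: Eq + Hash either:
    let values: Vec<i32> = map.into_values().collect();
    assert!(values.is_empty());
}
```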

Ben: I’d say this is kind of an opportunity for intermediate-level Rust learning. If you actually go into the standard library, and go to the HashMap docs, you’ll see that— you’ve heard of inherent impls, where you just say impl Foo and then a block, and you can put, like, you know, various methods in there, no trait involved. Maybe you haven’t realized that you can have multiple of these, and that all of these can have different generic bounds. And so this is what HashMap uses. So if you go to, like, the fn new— you’ve seen the new method of HashMap, HashMap::new. There are actually no bounds at all on those generics. And so you can make a HashMap with any types at all. Now, the thing is, you know, as we were mentioning, the insert methods are all in a different block, which has different bounds. So it’s kind of a neat API design, where it’s like, yes, this is fine, Rust allows this, there are no problems here. And it precisely identifies which methods are available at which times. So it’s pretty cool, because it makes you realize that the methods aren’t inherent to the types themselves. They’re based on what types you want to use them with; that’s when they exist. Like, they’re not always there. The actual types that you bring to the table determine which methods even exist in the first place. So I don’t know, for me it was a pretty big revelation.
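A minimal sketch of that pattern on a made-up map type, just to show the shape of the API design:

```rust
use std::hash::Hash;

// A hypothetical map type; the storage is deliberately naive.
struct MyMap<K, V> {
    entries: Vec<(K, V)>,
}

/// Construction and consuming iteration: available for any K and V.
impl<K, V> MyMap<K, V> {
    fn new() -> Self {
        MyMap { entries: Vec::new() }
    }

    fn into_values(self) -> impl Iterator<Item = V> {
        self.entries.into_iter().map(|(_, v)| v)
    }
}

/// Insertion: these methods only exist once the key type can be
/// compared and hashed.
impl<K: Eq + Hash, V> MyMap<K, V> {
    fn insert(&mut self, key: K, value: V) {
        // (A real map would hash the key into a bucket; this sketch just
        // appends, since the point is the bounds, not the storage.)
        self.entries.push((key, value));
    }
}

fn main() {
    let mut map = MyMap::new();
    map.insert("answer", 42); // &str is Eq + Hash, so insert exists here
    let values: Vec<i32> = map.into_values().collect();
    assert_eq!(values, vec![42]);
}
```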

Jon: Yeah. And you can see this when you look at the rustdoc output, right? Like, it shows you which impl block each method exists in, and it shows you the bounds for that impl block. And there are libraries that use this to their advantage. Like, for example, I maintain a library that does, like, Selenium-style browser automation in Rust. And there, I have an impl block for each category of operation you might want to do with the browser. They all have the same bounds, but it’s a nice way to group together functions that do similar things. And you can even write documentation for a given impl block, to say: this impl block is about this kind of stuff. And it’s just a nice way in which you can be even more organized and principled about your documentation and your API design.

Jon: I think that leaves us at the end of 1.59.

Ben: Yeah, I believe that’s it.

Jon: I would like to thank you, Ben, for bringing us all the way to 1.59. I think we’re in a position now where we have to wait until 1.60 to talk again. That is the policy that we’ve instituted?

Ben: No, 1.61 even.

Jon: Oh, you’re right. It’s going to be 12 weeks, Ben, how am I going to survive?

Ben: (singing) How will I live without you?

Can we get copyright struck, on— we’re not on YouTube? I have no idea. They’ll find a way.

Jon: Someone will probably sue us. Although that said, given all the shit we talk about Rust, I’m amazed that the Rust team hasn’t sued us yet.

Ben: They’re working on it.

Jon: Yeah. I think the core team is probably coming after us any time now.

Ben: All right, well, we should get a move on then, before they find us, our locations.

Jon: Yeah, I’ll go hide.

Ben: They’re tracing the call as we speak. I’ll see you in 12 weeks.

Jon: All right. Bye, everyone.

Ben: Bye.