@lifthrasiir
Last active September 27, 2017 14:17
Why is a Rust executable large? (DRAFT)

Suppose that you are a programmer primarily working with compiled languages. Somehow you’ve got tired of those languages (there may be multiple valid reasons) and have heard of a trendy new programming language called Rust. Looking at some webpages and the official forum, it looks great and you decide to try it out. It seems that Rust was a bit cumbersome to install in the past, but thanks to rustup that problem seems to be gone by now. Cargo seems to be great, so you follow the first sections of the Book and offer a small greeting to the new language:

fn main() {
    println!("Hello, world!");
}

Amazingly, cargo run works without a hassle. It is kind of a miracle, as you used to have to configure a build script, Makefile, project file or whatever before building things. Impressed, you realize that the executable is available at target/debug/hello. You instinctively type ls -al (or is it dir?) and you cannot believe your eyes:

$ ls -al target/debug/hello
-rwxrwxr-x 1 lifthrasiir 650711 May 31 20:00 target/debug/hello*

650 kilobytes just to print a line?! You remember that Rust is probably the sole language that may possibly displace C++, and C++ is notorious for code bloat; would that mean Rust failed to fix one of C++’s big problems? Out of curiosity, you write the same program in C and compile it. The result is eye-opening:

$ cat hello-c.c
#include <stdio.h>
int main() {
    printf("Hello, World!\n");
}
$ make hello-c
$ ls -al hello-c
-rwxrwxr-x 1 lifthrasiir 8551 May 31 20:03 hello-c*

Maybe C has the benefit of bare-metal libraries, you think. This time you try a C++ program using iostream, which should be much safer than C’s naive printf. But surprisingly it is still tiny compared to Rust:

$ cat hello-cpp.cpp
#include <iostream>
using namespace std;
int main() {
    cout << "Hello, World!" << endl;
}
$ make hello-cpp
$ ls -al hello-cpp
-rwxrwxr-x 1 lifthrasiir 9168 May 31 20:06 hello-cpp*

What is wrong with Rust?


It seems that the surprisingly large size of Rust binaries is a massive concern for many. This question is by no means new; there is a well-known, albeit year-old, question on StackOverflow, and searching for “why is rust binary large” gives several more. Given the frequency of such questions, it is a bit surprising that we don’t yet have a definitive article or page dealing with them. So this is my attempt to provide one.

Just to be cautious: is it a valid question to ask after all? We have hundreds of gigabytes of storage, if not some terabytes, and people should be using decent ISPs nowadays, so the binary size should not be a concern, right? The answer is that it still may matter (though not as much as before):

  • Akamai State of the Internet shows that, while more than 80% of users enjoy 4Mbps or more in developed countries, far fewer users do in developing countries. The average connection has improved a lot (almost every country is now past a 1Mbps average), but the distribution as a whole is still stagnating. I am fortunate to live in a country where gigabit ethernet costs only $30/mo (!), but many others are not so lucky.

  • Ordinary consumers have only a shallow understanding of computing, and they are likely to relate any problem to anything they know of: one common sentiment is that executable bloat causes slowdown. That’s unfortunate but it is a fact, and you would want to avoid that sentiment.

For curious readers: all examples were tested with Rust 1.9.0 and 1.11.0-nightly (a967611d8 2016-05-30). Unless noted, the primary operating system used is Linux 3.13.0 on x86-64. Your mileage may vary.

Optimization Level

Ask about the above, and virtually every experienced Rust user will ask you back:

Have you enabled the release build?

It turns out that Cargo distinguishes the debug build (default) from the release build (--release). The Cargo documentation explains the exact differences between them, but in general the release build gets rid of development-only routines and data and enables tons of optimizations. It is not the default because, well, debug builds are requested far more often than release builds. So let’s try that!

$ ls -al target/release/hello
-rwxrwxr-x 1 lifthrasiir 646467 May 31 20:10 target/release/hello*

And that didn’t really make a difference! This is because the optimization only runs over the user code, and we don’t have much user code. Almost all of the binary comes from the standard library, and there doesn’t seem to be anything we can do about that...

Link-time Optimization (LTO)

…except that we can. Enter the world of link-time optimization.

So the story is as follows: We can individually optimize each crate, and in fact all standard libraries ship in the optimized form. Once the compiler has produced optimized binaries, they are assembled into a single executable by a program called the “linker”. But we don’t need the entire standard library: a simple “Hello, world” program definitely does not need std::net, for example. Yet the linker is so stupid that it won't try to remove unused parts of crates; it will just paste them in.

There is actually a good reason why the traditional linker behaves this way. The linker is commonly used for C and C++, among other languages, where each file is compiled individually. This is a sharp difference from Rust, where the entire crate is compiled at once. Unless the required functions are scattered throughout the files, the linker can fairly easily get rid of unused files wholesale. It’s not perfect, but it reasonably approximates what we want: removing unused functions. One disadvantage is that the compiler is unable to optimize function calls pointing to other files; it simply lacks the required information.
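To make the file-level pruning concrete, here is a toy model in Rust. It is only a sketch of the idea described above, not how any real linker is implemented: the ObjectFile type, the files_to_keep function and all file and symbol names are made up for illustration.

```rust
use std::collections::{HashMap, HashSet};

/// Toy model of an object file (hypothetical type, for illustration only):
/// the symbols it defines and the symbols it refers to.
struct ObjectFile {
    name: &'static str,
    defines: Vec<&'static str>,
    references: Vec<&'static str>,
}

/// Returns the sorted names of the files a file-granular linker would keep,
/// starting from the file that defines `entry`.
fn files_to_keep(objects: &[ObjectFile], entry: &'static str) -> Vec<&'static str> {
    // Map every symbol to the index of the file that defines it.
    let mut definer: HashMap<&'static str, usize> = HashMap::new();
    for (index, object) in objects.iter().enumerate() {
        for &symbol in &object.defines {
            definer.insert(symbol, index);
        }
    }

    // Follow references from the entry point; a file is kept as a whole
    // as soon as any one of its symbols is needed.
    let mut kept: HashSet<usize> = HashSet::new();
    let mut pending: Vec<&'static str> = vec![entry];
    while let Some(symbol) = pending.pop() {
        if let Some(&index) = definer.get(symbol) {
            if kept.insert(index) {
                pending.extend(objects[index].references.iter().copied());
            }
        }
    }

    let mut names: Vec<&'static str> = kept.into_iter().map(|i| objects[i].name).collect();
    names.sort();
    names
}
```

Given a main.o that calls into greet.o, which in turn calls write from io.o, plus an unrelated net.o, this model keeps the first three files and drops net.o. A file such as io.o is kept whole even if only one of its functions is used, which is the kind of imprecision mentioned above.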

C and C++ folks had been fine with that approximation for decades, but in the last decade they had had enough and started to provide an option to enable link-time optimization (LTO). In this scheme the compiler produces optimized binaries without looking at the others, and the linker then actively looks at all of them and tries to optimize them further. It is much harder than working with (internally simplified) sources, and it blows the compilation time up, but it is worth trying if a smaller and/or faster executable is needed.

So far we have talked about C and C++, but LTO is much more beneficial for Rust. Cargo.toml has an option to enable it:

[profile.release]
lto = true

Did that work? Well, sort of:

$ ls -al target/release/hello
-rwxrwxr-x 1 lifthrasiir 615725 May 31 20:17 target/release/hello*

It had a larger effect than the optimization level, but not by much. Maybe it is time to look at the executable itself.

So what’s in my executable?

There are several tools that work directly with the executable, but probably the most useful ones are in GNU binutils. It is available on every Unix-like system, and also on Windows (MinGW has a standalone install, for example).

There are many utilities in binutils, but strings is probably the simplest. It simply crawls the binary to find runs of printable characters (at least four by default) followed by a non-printable byte, such as the zero byte that terminates a typical C string. Thus it tries to extract readable strings out of the binary, which is quite helpful for us. So let’s try that, and prepare for the scroll:

$ strings target/release/hello | head -n 10
/lib64/ld-linux-x86-64.so.2
bki^
 B ,
libpthread.so.0
_ITM_deregisterTMCloneTable
_Jv_RegisterClasses
_ITM_registerTMCloneTable
write
pthread_mutex_destroy
pthread_self

And, wow, it already has something we didn’t expect: pthread. (More on that later, though.) There are indeed tons of strings in our executable:

$ strings target/release/hello | wc
   5893    7027   94339

The 94339 bytes counted by wc already include one newline per line, which stands in for the byte that ended each of the 5893 strings in the binary, so these strings occupy roughly 94 KB. That is about 15% of our executable for strings we don’t really use! On closer inspection this observation is not quite correct, as strings also gives many false positives, but there are some significant strings as well:

  • Those starting with jemalloc_ and je_. These are names from jemalloc, a high-performance memory allocator. So that’s what Rust uses for memory management, in place of the classic malloc/free. It is not a small library, however, and we don’t do any dynamic allocation ourselves anyway.

  • Those starting with backtrace_ and DW_. These are names from libbacktrace, a library for producing stack traces. Rust uses it to print a helpful backtrace on panic (enabled by setting the RUST_BACKTRACE=1 environment variable). We don’t panic, however.

  • Those starting with _ZN. They are “mangled” names from Rust standard libraries.
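For readers who want to reproduce this kind of inspection without binutils, the following Rust sketch mimics the scan and tallies bytes per category. It is a simplification under stated assumptions, not the actual strings implementation: "printable" here means ASCII space through tilde plus tab, and the names printable_runs, ORIGINS, origin_of and bytes_by_origin, as well as the category labels, are hypothetical.

```rust
use std::collections::BTreeMap;

/// Collects runs of at least `min_len` printable ASCII bytes.
/// Assumption: "printable" means space through tilde, plus tab.
fn printable_runs(data: &[u8], min_len: usize) -> Vec<String> {
    let mut found = Vec::new();
    let mut current: Vec<u8> = Vec::new();
    // A trailing zero byte is appended so that a run reaching the end is flushed too.
    for byte in data.iter().copied().chain(std::iter::once(0u8)) {
        if byte == b'\t' || (0x20..=0x7e).contains(&byte) {
            current.push(byte);
        } else {
            if current.len() >= min_len {
                found.push(String::from_utf8_lossy(&current).into_owned());
            }
            current.clear();
        }
    }
    found
}

/// Prefixes from the list above, paired with illustrative category labels.
const ORIGINS: [(&str, &str); 5] = [
    ("jemalloc_", "jemalloc"),
    ("je_", "jemalloc"),
    ("backtrace_", "libbacktrace"),
    ("DW_", "libbacktrace"),
    ("_ZN", "mangled"),
];

/// Returns the category label for a string, or "other" if no prefix matches.
fn origin_of(s: &str) -> &'static str {
    for (prefix, label) in ORIGINS {
        if s.starts_with(prefix) {
            return label;
        }
    }
    "other"
}

/// Sums string lengths per category, counting one extra byte per string
/// for the byte that ended it in the binary.
fn bytes_by_origin(strings: &[String]) -> BTreeMap<&'static str, usize> {
    let mut totals = BTreeMap::new();
    for s in strings {
        *totals.entry(origin_of(s)).or_insert(0) += s.len() + 1;
    }
    totals
}
```

Feeding it the bytes of an executable (for example via std::fs::read) and a minimum length of 4 gives a rough per-category size. Short accidental runs such as bki^ in the output above end up under "other", which hints at where the false positives come from: arbitrary machine code or data bytes sometimes happen to be printable.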

Why do we have those strings in the first place? They are...

Linkage

First, I have to admit that I was cheating with the size of C and C++ binaries. The fair comparison would be as follows:

$ make hello-c CFLAGS='-static'
cc -static    hello-c.c   -o hello-c
$ make hello-cpp CXXFLAGS='-static -static-libstdc++'
g++ -static -static-libstdc++    hello-cpp.cpp   -o hello-cpp
$ ls -al hello-c hello-cpp
-rwxrwxr-x 1 lifthrasiir  877175 May 31 20:10 hello-c*
-rwxrwxr-x 1 lifthrasiir 1653135 May 31 20:10 hello-cpp*

Okay, so it seems that Rust is actually far better than C and C++. But… why is it “fair”? Isn’t a 1 MB executable too much for such a simple program regardless of the language?

A binary executable is not a simple data format. It is normally processed and often altered by an OS routine called a “dynamic linker” (not to be confused with a “linker”, which combines assembled binaries into a single executable).
