Sunday, January 28, 2024

Mojo Vs Rust, Basic Test And Binary Perspective.

Hello. To be clear from the start, this is not an algorithmic benchmark, just a simple loop + print test plus some checks on the generated binaries.

The system is Debian 12 Linux and the architecture is x86-64.



Rust
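A loop + print test of this kind can be sketched roughly as below. This is a minimal sketch and not the exact listing that was timed: the iteration count (ITERATIONS), the message (MESSAGE) and the helper name print_loop are assumptions made for illustration.

```rust
// Hypothetical values: the iteration count and message actually used
// for the timings are not known, these are placeholders.
const ITERATIONS: usize = 1000;
const MESSAGE: &str = "hello";

/// Prints `message` once per iteration and returns how many lines were printed.
fn print_loop(iterations: usize, message: &str) -> usize {
    let mut printed = 0;
    for _ in 0..iterations {
        // The loop body is nothing but the print call.
        println!("{}", message);
        printed += 1;
    }
    printed
}

fn main() {
    print_loop(ITERATIONS, MESSAGE);
}
```

With a body this small, almost all of the measured time is spent in the print path and the runtime start-up, not in the loop itself.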

Mojo


Mojo doesn't accept the .py extension; the file has to be named .mojo, so there is no default nvim highlighting.


$ mojo build mojo_benchmark.mojo
$ time ./mojo_benchmark
...
real 0m0.342s
user 0m0.080s
sys 0m0.252s



$ rustc rust_benchmark.rs
$ time ./rust_benchmark
...
real 0m0.107s
user 0m0.012s
sys 0m0.049s


I noticed a speed-up when running from the fish shell instead of bash, but that could simply be down to how much environment (variables copied onto the process stack) each shell passes to the program.
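One way to check that guess would be to run a small helper from each shell and compare how many bytes of environment the process receives. The sketch below is only an assumption about how to measure it (the function name environment_bytes is made up here), and a difference in size would not by itself prove it explains the timing gap.

```rust
use std::env;
use std::ffi::OsString;

/// Approximate size of the environment strings handed to the process.
/// Each entry is laid out as "KEY=VALUE\0", hence the two extra bytes.
fn environment_bytes<I>(vars: I) -> usize
where
    I: IntoIterator<Item = (OsString, OsString)>,
{
    vars.into_iter()
        .map(|(key, value)| key.len() + 1 + value.len() + 1)
        .sum()
}

fn main() {
    // Run this once from bash and once from fish and compare the numbers.
    let total = environment_bytes(env::vars_os());
    println!("environment strings: {} bytes", total);
}
```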


So in this specific test Rust is much faster. The Rust compiler also warns that the loop variable i is unused and suggests an underscore name (_i) instead, which the Mojo compiler doesn't do.

The Rust binary is bigger, mainly because Rust statically links its standard library (symbols included) into the executable:

-rwxr-xr-x 1 sha0 sha0 1063352 Jan 10 08:55 mojo_benchmark
-rwxr-xr-x 1 sha0 sha0 4632872 Jan 10 08:57 rust_benchmark


But look at this: Mojo links against libstdc++, libtinfo and libm on top of libc, while Rust only needs libc (both also load libgcc_s).

$ ldd -d mojo_benchmark
linux-vdso.so.1 (0x00007ffd94917000)
libtinfo.so.6 => /lib/x86_64-linux-gnu/libtinfo.so.6 (0x00007fe899cb1000)
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fe899a00000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fe899921000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fe899c91000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe899740000)
/lib64/ld-linux-x86-64.so.2 (0x00007fe899d2c000)


$ ldd -d rust_benchmark
linux-vdso.so.1 (0x00007ffde67b7000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f8b3881b000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f8b3863a000)
/lib64/ld-linux-x86-64.so.2 (0x00007f8b388ae000)



Let's check the binary.
All the unused Python-style built-ins end up written into the Mojo binary, and Rust does the same with its unused library code in this case.

mojo

rust




Steps until libc write:

Mojo



Rust


OK, wait: rustc, like cargo, builds in debug mode by default, which is the slower version. The closest rustc equivalent of cargo build --release, which is much faster, is rustc -O rust_benchmark.rs

real 0m0.107s
user 0m0.005s
sys 0m0.056s


This simple program doesn't benefit from the optimizations.
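A plausible reason, given that sys time dominates user time in both Rust runs, is that the cost sits in the output path rather than in the loop. Rust's stdout is line-buffered, so each println! ends in its own write. The sketch below is not the program that was measured; it only illustrates the idea by pushing the same lines through a BufWriter so that many lines share one write. The helper name write_lines, the iteration count and the message are assumptions.

```rust
use std::io::{self, BufWriter, Write};

/// Writes `message` followed by a newline, `iterations` times, to any writer.
fn write_lines<W: Write>(out: &mut W, iterations: usize, message: &str) -> io::Result<()> {
    for _ in 0..iterations {
        writeln!(out, "{}", message)?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Lock stdout once and buffer it, instead of locking and flushing per line.
    let stdout = io::stdout();
    let mut out = BufWriter::new(stdout.lock());
    // Hypothetical iteration count and message.
    write_lines(&mut out, 1000, "hello")?;
    out.flush()
}
```

If the timings of such a variant drop noticeably, that would point at the per-line writes and not at the generated loop code.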


Rust


We went down from 30 calls to 27.
I'm not going to criticize the number of calls, because Rust does its magic and still ends up faster.

Mojo needs only 7 calls, but its runtime seems slower.

Regarding memory operations, Mojo seems to be borrow-checked at compile time, like Rust.

https://docs.modular.com/mojo/programming-manual.html#behavior-of-destructors
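For comparison, this is how the Rust side behaves: destructors run at the end of the enclosing scope, in reverse order of declaration, whereas the linked Mojo section describes values being destroyed as soon as possible after their last use. The sketch below only demonstrates the Rust half; the Tracked type and the run function are made up for the illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// A value that records its own destruction in a shared log.
struct Tracked {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Returns the ordered list of events, including the drops.
fn run() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let first = Tracked { name: "first", log: Rc::clone(&log) };
        let second = Tracked { name: "second", log: Rc::clone(&log) };
        log.borrow_mut().push(format!("last use of {}", first.name));
        log.borrow_mut().push(format!("last use of {}", second.name));
        log.borrow_mut().push("end of scope".to_string());
    } // `second` is dropped here, then `first`: not at their last use.
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in run() {
        println!("{}", event);
    }
}
```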


Rust decompiled


Rust disassembled





Mojo decompiled





Mojo disassembled



So there are two things to look at: the speed of the generated assembly, and especially the speed of the runtime.

Looking at the Rust assembly, it writes the string pointer to the stack on every iteration, even though it is the same pointer each time.

The Mojo loop, however, is better optimized: the parameter and the address to call are computed once, before the loop.
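The difference between the two loops can be mirrored at source level. This is only an illustration of the pattern seen in the disassembly, written by hand: the function names are made up, and an optimizing compiler is free to turn both versions into the same machine code.

```rust
const MESSAGE: &str = "hello";

/// Stand-in for the print routine: appends the text and a newline to a buffer.
fn emit(sink: &mut Vec<u8>, text: &str) {
    sink.extend_from_slice(text.as_bytes());
    sink.push(b'\n');
}

/// Mirrors the Rust loop: the argument slot is filled again on every
/// iteration, although the pointer stored in it never changes.
fn argument_rebuilt_each_iteration(iterations: usize) -> Vec<u8> {
    let mut sink = Vec::new();
    for _ in 0..iterations {
        let slot: [&str; 1] = [MESSAGE];
        emit(&mut sink, slot[0]);
    }
    sink
}

/// Mirrors the Mojo loop: parameter and call target are prepared once,
/// before entering the loop.
fn argument_hoisted(iterations: usize) -> Vec<u8> {
    let mut sink = Vec::new();
    let text = MESSAGE;
    let callee: fn(&mut Vec<u8>, &str) = emit;
    for _ in 0..iterations {
        callee(&mut sink, text);
    }
    sink
}

fn main() {
    // Same observable behaviour, different amount of work inside the loop.
    assert_eq!(argument_rebuilt_each_iteration(3), argument_hoisted(3));
    println!("both variants produce the same output");
}
```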


So Mojo generates well-optimized code, but its C++ runtime API seems slower, at least for print().

Regards.

















