POC: use Rust for css color parsing #2647

Draft
wants to merge 14 commits into
base: main

nyurik
Member

@nyurik nyurik commented Jul 24, 2024

This is a very incomplete proof of concept of how a Rust CSS parser can be used from C++.
This PR uses changes from #2643

  • a rustutils crate is the entry point for all rust utilities
  • uses Corrosion - a cmake plugin to compile Rust
  • uses cxx to export Rust code to C++ and generate headers
  • At the moment only does C++ to Rust calls (I just haven't tried the reverse yet)
  • Removes any vendor/csscolorparser -- to ensure we don't accidentally rely on it
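
For readers unfamiliar with the kind of code that would sit behind such a bridge, here is a minimal sketch of a hex color parser in plain Rust. It is not the parser used in this PR: the `Rgba` type and `parse_hex_color` function are hypothetical names, only the `#rgb`, `#rgba`, `#rrggbb` and `#rrggbbaa` forms are handled, and the cxx bridge declaration is left out so the block compiles without any external crate.

```rust
/// RGBA color with 8-bit channels (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Rgba {
    pub r: u8,
    pub g: u8,
    pub b: u8,
    pub a: u8,
}

/// Parses `#rgb`, `#rgba`, `#rrggbb` or `#rrggbbaa`; returns `None` otherwise.
pub fn parse_hex_color(input: &str) -> Option<Rgba> {
    let hex = input.trim().strip_prefix('#')?;
    // Reject anything that is not a plain ASCII hex digit (signs, spaces, ...).
    if !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // Expand the short forms by doubling each digit: "f80" -> "ff8800".
    let full: String = match hex.len() {
        3 | 4 => hex.chars().flat_map(|c| [c, c]).collect(),
        6 | 8 => hex.to_owned(),
        _ => return None,
    };
    // Read the two hex digits starting at byte offset `i`.
    let channel = |i: usize| u8::from_str_radix(&full[i..i + 2], 16).ok();
    Some(Rgba {
        r: channel(0)?,
        g: channel(2)?,
        b: channel(4)?,
        // Alpha defaults to fully opaque when it is not given.
        a: if full.len() == 8 { channel(6)? } else { 255 },
    })
}
```

A function of this shape returns `None` instead of panicking on bad input, which keeps error handling on the C++ side of a bridge simple.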

Help needed

  • fix cmake licensing script -- there are multiple rust tools for iterating over all deps, see this (there is a list at the bottom). In the meantime we just need some placeholder so that I don't have to disable the license generator.
  • fix CI for all targets
  • need better naming conventions

@nyurik nyurik requested a review from louwers July 24, 2024 04:35
@nyurik nyurik force-pushed the rust-css-color branch 2 times, most recently from 1a8d11a to 807f0ae Compare July 24, 2024 04:44
@nyurik nyurik changed the title from "Rust css color" to "POC: use Rust for css color parsing" Jul 24, 2024
@louwers
Collaborator

louwers commented Jul 24, 2024

I ran the Bloaty size test manually on Linux. It is reporting a +364% binary size increase. Diff here: https://gist.github.com/louwers/809e971d9ae3459bcff450487c29249c

@louwers
Collaborator

louwers commented Jul 24, 2024

Something to consider is platform support by rustc and the Rust standard library. ARM64 iOS and ARM64 Android are classified as having 'Tier 2' support.

Tier 2 targets can be thought of as "guaranteed to build". The Rust project builds official binary releases of the standard library (or, in some cases, only the core library) for each tier 2 target, and automated builds ensure that each tier 2 target can be used as build target after each change.

Tier 2 target-specific code is not closely scrutinized by Rust team(s) when modifications are made. Bugs are possible in all code, but the level of quality control for these targets is likely to be lower

https://doc.rust-lang.org/nightly/rustc/platform-support.html#tier-2-without-host-tools

There is no automated testing for Rust or the Rust standard library for iOS right now.

@ianthetechie
Collaborator

Something to consider is platform support by rustc and the Rust standard library. ARM64 iOS and ARM64 Android are classified as having 'Tier 2' support.

This is correct, but the Tier 2 label is a bit scarier than it sounds. For example, macOS on Apple Silicon is also somewhat infamously Tier 2 still 😉

It's worth noting that major projects are also using Rust in production on iOS and Android, including Firefox which uses Rust to share code across platforms.

@louwers
Collaborator

louwers commented Jul 24, 2024

Can you try building with #![no_std]? https://docs.rust-embedded.org/book/intro/no-std.html

@nyurik
Member Author

nyurik commented Jul 24, 2024

@louwers no_std would kill 90% of the Rust value -- it is mostly used for the embedded (no OS) cases (writing firmware, kernel, or bootloader code). Are there any reasons for it?

@nyurik
Member Author

nyurik commented Jul 24, 2024

Looking at ./build/bin/mbgl-render output:

  • original size is 124MB, shrinks to 7.8MB after running strip on it
  • If I enable lto = true and codegen-units = 1 in rustutils (PR is now updated), it shrinks to 122MB / 7.1MB --- 9% improvement on stripped
  • That said, compiling original produces 120 / 6.8 MB, still 4% smaller on stripped.
  • P.S. I also tried adding opt-level = "z" (optimize for size), but that produced identical size.
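
The percentages in the list above can be reproduced from the stripped sizes. This sketch assumes one reading of the numbers: 7.8 MB is this branch before the LTO change, 7.1 MB is this branch with LTO, and 6.8 MB is the build without Rust. The helper name `percent_smaller` is made up for illustration.

```rust
/// Percentage by which `after` is smaller than `before`.
pub fn percent_smaller(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

/// Stripped sizes in MB, as read from the list above (assumed interpretation).
pub const BRANCH_WITHOUT_LTO_MB: f64 = 7.8;
pub const BRANCH_WITH_LTO_MB: f64 = 7.1;
pub const BUILD_WITHOUT_RUST_MB: f64 = 6.8;

/// Saving from enabling LTO on this branch: roughly 9%.
pub fn lto_saving_percent() -> f64 {
    percent_smaller(BRANCH_WITHOUT_LTO_MB, BRANCH_WITH_LTO_MB)
}

/// How much smaller the build without Rust still is: roughly 4%.
pub fn remaining_gap_percent() -> f64 {
    percent_smaller(BRANCH_WITH_LTO_MB, BUILD_WITHOUT_RUST_MB)
}
```

Note that the two figures use different baselines, so they cannot simply be subtracted from each other.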

I looked at the size increase text dump - looks really weird. It shows a significant increase in all sorts of .cpp files, and I am really not sure why that might be the case. The key changes are in these I think:

[NEW]    +171  [ = ]       0    /rust/deps/compiler_builtins-0.1.109/src/lib.rs/@/compiler_builtins.13ee9051c16629a4-cgu.016
[NEW]    +790  [ = ]       0    /rust/deps/compiler_builtins-0.1.109/src/lib.rs/@/compiler_builtins.13ee9051c16629a4-cgu.008
[NEW]  +130Ki  [NEW] +48.6Ki    /rust/deps/memchr-2.5.0/src/lib.rs/@/memchr.7066fcbbe06873ca-cgu.0
[NEW]  +180Ki  [NEW] +25.6Ki    library/alloc/src/lib.rs/@/alloc.bfbae7e348dce413-cgu.0
[NEW]  +829Ki  [NEW]  +207Ki    library/core/src/lib.rs/@/core.868bc93c3f2beb33-cgu.0
[NEW] +1.46Ki  [NEW]    +259    /rust/deps/compiler_builtins-0.1.109/src/lib.rs/@/compiler_builtins.13ee9051c16629a4-cgu.110
[NEW] +25.0Ki  [NEW] +13.2Ki    /home/bart/build-rust/corrosion_generated/cxxbridge/rustutils_headers/src/lib.cpp
[NEW] +3.14Mi  [NEW]  +619Ki    library/std/src/lib.rs/@/std.3c8ba8ebcf555201-cgu.0
[NEW] +5.40Ki  [NEW] +1.04Ki    library/panic_unwind/src/lib.rs/@/panic_unwind.79513d39ffd1496f-cgu.0

Most of these are one-time costs, i.e. some core mem alloc, utf8, and panic handling, and a 4% increase might be a good trade in exchange for other benefits. But clearly we should pay attention to that.

repository = "https://github.com/maplibre/maplibre-native"

[lib]
crate-type = ["staticlib"]

Collaborator

I think that for Android (and probably Linux?), you only need a cdylib. staticlib is however required for iOS (at the moment).

This is actually somewhat relevant to @louwers' comment about Bloaty too, depending on how Bloaty looks at things, and how smart linking is with JNI on Android. Apologies in advance for any ignorance about its methodology, but the final size of the libraries does not necessarily translate directly into the release binary size of an application.

Some numbers... Doing a release build of the xcframework for iOS isn't a fair comparison, since that's compressed and includes all architectures (the final app on the user's phone is uncompressed AFAIK and is "thinned" to remove slices per architecture and dependencies on libs that are already in the base system). If you look inside the XCFramework for Ferrostar, you'll find the ios-arm64 folder is 22.1MB. The total reported binary size for a non-trivial app running on my iPhone, which includes MapLibre Native, is only 15.MB. It's one of the smallest apps on my phone (a Debug build is, for reference, only slightly larger at ~19MB) 😂

Android appears to be slightly heavier (screenshot at the end of the post), but slicing the bundle per arch should make things quite manageable. Also notable that this is a debug build; I couldn't find an easy way to get Android to generate a release build without a dance for signing.

The point being, it's not adding much to a release binary, even if the library sizes may look a bit scary at first. For contrast, here are the sizes of the most popular apps on my phone: Signal (134MB), LinkedIn (367MB), Gmail (502MB), Slack (392MB), Uber (412MB), CapitalOne (480MB), WhatsApp (197MB), AirBnB (220MB)...

I am not 100% sure that the build settings, Bazel integration, etc. are optimal for this PR yet, but I am confident based on experience with Ferrostar that we should be able to manage the impact to binary size. Rust does make some tradeoffs (all Rust deps must be statically linked), but I expect we will be able to manage the impact to release builds.

TL;DR - 1) let's try building a cdylib for specific platforms, and 2) let's fact-check Bloaty against what it does to an actual release binary for a demo app on several target architectures.

image

Collaborator

I think the Rust library is linked with the rest of maplibre to produce a final shared library. So the target of the Rust library should be staticlib for both platforms I believe.

@ianthetechie
Collaborator

ianthetechie commented Jul 24, 2024

On no_std, it's worth noting that you only "pay" (in final binary size) for the parts of the std lib that you actually use, and these are almost all extremely helpful. It's possible to do no_std, but you don't really gain much except for an embedded target. And even there, you actually still do have access to a large part of what you initially think is in std. Most of that's actually in core, which is available, or alloc, which you can opt into (and of course it's available on all platforms we currently target).

Small addendum: no_std is actually something very nice to do if you are publishing a public crate. But my understanding is that we would, for the foreseeable future, not be doing that. So Bart's suggestion isn't a bad one at all, but I think it doesn't apply to our use case for the moment.
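
To illustrate the point about `core`, here is a small sketch that refers only to items living in `core`, so the same function would also compile unchanged in a `#![no_std]` crate. The function name `hex_byte` is hypothetical and chosen only to fit the color-parsing theme of this PR.

```rust
/// Parses exactly two ASCII hex digits (e.g. b"7f") into a byte.
/// Uses only `core` items and performs no heap allocation.
pub fn hex_byte(digits: &[u8]) -> Option<u8> {
    if digits.len() != 2 || !digits.iter().all(u8::is_ascii_hexdigit) {
        return None;
    }
    // `from_utf8` and `from_str_radix` are both defined in `core`.
    let text = core::str::from_utf8(digits).ok()?;
    u8::from_str_radix(text, 16).ok()
}
```

Anything that needs heap-backed types such as `String` or `Vec` would additionally require opting into `alloc`, as described above.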

@nyurik
Member Author

nyurik commented Jul 24, 2024

Thx @ianthetechie! Why would we want to build a cdylib? If I understand it correctly, that creates an .so / .dll - which has to be shipped together with the final binary as a dynamically loadable file. Wouldn't it only make sense if we were building a shared lib distributed via .deb package? The current use case is to create a core component that gets linked with other C++ code into a single executable.

@ianthetechie
Collaborator

Wouldn't it only make sense if we were building a shared lib distributed via .deb package? The current usecase is to create core component that gets linked with other c++ code into a single executable

Sorry I may have missed some of which platforms we're targeting / how it's built @nyurik ;) To distribute Rust code for Android via the NDK + Java bridges (JNI/JNA), you usually build a cdylib. Yes, that's dynamic (needs to be around somewhere), but the whole library (MapLibre Native) links in all of its native dependencies (statically). Just pointing that out since I assumed we'll eventually want that.

I regrettably don't have more details on why that's the case / what technical limitations there are, but that's what all the tooling bridging cargo and the NDK requires of crates. Maybe it doesn't apply to us since we're essentially linking up a library that will look like just a regular lib with headers and a C ABI already; I guess Bazel is driving a lot of this linking and by the time it gets to our NDK step, it's all indistinguishable anyways.

@nyurik
Member Author

nyurik commented Jul 24, 2024

Ah, gotcha - yes, the resulting build target that wraps all Rust + C++ functionality could be dynamic - but that's up to the cmake/bazel/... to create and use in JNI. It would not affect how I build the low-level core components that get linked in. Otherwise you end up with JNI -> C++ cdylib -> Rust cdylib.

As a separate project, I will try to wrap the whole ml-native as a rust lib. That target will need to support both rustlib and cdylib output.

@ianthetechie
Collaborator

@louwers To put the T2 target concern to rest, I pinged a few ppl on Mastodon to get an answer closer to the source, and Esteban Küber, a member of the compiler team, responded: https://hachyderm.io/@ekuber/112841995275142925. TL;DR we can rely on stable channel Rust releases; just not nightly (which nobody is proposing here haha). It's more a reflection of CI resources than anything (and as such, perhaps unsurprisingly, x86 macOS will eventually move to T2).

@nyurik nyurik force-pushed the rust-css-color branch 2 times, most recently from fe59cf4 to cc1222f Compare July 24, 2024 19:50
@nyurik
Member Author

nyurik commented Jul 24, 2024

This PR has been rebased on the new docker implementation - so now it can be tried very easily without installing anything locally, while still not having to re-download anything on each docker run command

Collaborator

@maxammann maxammann left a comment

Looks very good! For the binary size I think we need to do some more experiments. I believe it is important to see how much constant "Rust runtime" overhead is added. E.g. if Rust uses a different allocator than maplibre-native then this would add significant "bloat". Does maplibre-native use malloc?

I'm personally not that interested in how much the standard library adds. If we see that a new Rust feature adds too much overhead we can always fall back to the C++/C std-library by using unsafe Rust.
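
As a rough illustration of that fallback, the sketch below calls the C standard library's `strlen` from Rust through an `extern "C"` declaration. The wrapper name `c_strlen` is hypothetical, the choice of `strlen` is arbitrary, and the `unsafe extern` form assumes Rust 1.82 or newer (older toolchains would write a plain `extern "C"` block).

```rust
use std::ffi::{c_char, CString};

// Declaration of C's `strlen`. The C runtime is normally already linked
// into Rust programs on the usual desktop and mobile targets.
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

/// Safe wrapper: returns the length reported by C `strlen`, or `None` if the
/// input contains an interior NUL byte and so cannot be passed as a C string.
pub fn c_strlen(text: &str) -> Option<usize> {
    let c_text = CString::new(text).ok()?;
    // SAFETY: `c_text` is a valid NUL-terminated buffer that outlives the call.
    Some(unsafe { strlen(c_text.as_ptr()) })
}
```

Keeping the `unsafe` block inside a small safe wrapper like this limits how much code has to be audited by hand.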

Collaborator

@maxammann maxammann Aug 5, 2024

Unused files

@@ -70,6 +70,9 @@ jobs:
distribution: "temurin"
java-version: "17"

- name: Add aarch64-linux-android for Rust toolchian
run: rustup target add --toolchain stable-x86_64-unknown-linux-gnu aarch64-linux-android armv7-linux-androideabi i686-linux-android

Collaborator

The dependency on rustup is not high right now, so this is fine. However, we should make sure not to depend too much on rustup. This is a mistake that I made in maplibre-rs. The mistake later shows up e.g. when integrating into other build systems.

Ideally, the only required dependency is rustc. Realistically there is no way around cargo. However, a strict requirement for rustup can be easily avoided.

Using rustup in the CI is totally fine! (always!) We should just avoid depending on rustup in e.g. cmake scripts.

Member Author

thx, wasn't aware that rustup is a problem in cmake/bazel, so will make sure to avoid it there

Collaborator

Not necessarily a problem, but rustup is just one way of installing Rust.

Note that the rust-toolchain.toml is for example a feature of rustup. It might just cause friction to depend on rustup early on :)

Collaborator

heh I love that you mention rust-toolchain.toml @maxammann ;) I think that this is one of the most underrated features for cross-platform development and I do use it in my projects. That said, your concern is completely correct in the case of MapLibre and other projects which use Bazel for build orchestration.

@nyurik nyurik force-pushed the rust-css-color branch 2 times, most recently from 188baa9 to 0d349d7 Compare August 5, 2024 17:46
@ntadej
Collaborator

ntadej commented Aug 6, 2024

While this is meant to be a POC, let me express my opinion, that I am very hesitant to depend on Rust with Qt.
Could we pre-build such dependencies?

@maxammann
Collaborator

While this is meant to be a POC, let me express my opinion, that I am very hesitant to depend on Rust with Qt. Could we pre-build such dependencies?

Why is that? Are QT projects somehow picky about how dependencies are built?

@ntadej
Collaborator

ntadej commented Aug 6, 2024

While this is meant to be a POC, let me express my opinion, that I am very hesitant to depend on Rust with Qt. Could we pre-build such dependencies?

Why is that? Are QT projects somehow picky about how dependencies are built?

There are a few users building for embedded devices where adding additional complexity may cause issues.

@louwers
Collaborator

louwers commented Aug 6, 2024

I re-ran Bloaty manually on Linux, looks a lot leaner now! https://gist.github.com/louwers/bb80f15df034061bb5ddc6022070aba0

This is a proof-of-concept to try to see how hard it is to integrate Rust into our build, because if we want to do that, that would be the first hurdle to overcome.

Whether we actually want to integrate Rust into our build, and thus allow certain parts of MapLibre Native to be written in Rust, is something that warrants a discussion. I have my own thoughts about this, but in general any change that adds significant complexity needs to be well-justified in terms of what our users are interested in.

Adding a hard dependency on Rust will be a hard sell, especially to our users that care about binary size. If you want this effort to succeed, I would recommend an approach where existing modules can have opt-in Rust implementations. Then, when you have a foot in the door, you are in a good position to write new modules that add 'killer functionality' in just Rust (e.g. a MapLibre Tiles decoder). That way you would generate buy-in, trust and excitement in our community.

@maxammann
Collaborator

I re-ran Bloaty manually Linux, looks a lot leaner now! https://gist.github.com/louwers/bb80f15df034061bb5ddc6022070aba0

I wonder why the stdlib comes with +1.32Mi. That seems a bit too much 🤔

@maxammann
Collaborator

@maxammann That is the file size, the VM size is what matters.

https://github.com/google/bloaty/blob/main/doc/using.md

Why that? I thought we care mostly about how much more space the binaries would take when e.g. shipped in an APK. Is the binary we are looking at not stripped?

@louwers
Collaborator

louwers commented Aug 7, 2024

Haha you were too quick. I guess both matter. Somehow I think I heard @mwilsnd talk more about VM size, maybe he can clarify.
