split Cargo slides #79

Merged
merged 2 commits into from
Jul 26, 2023
5 changes: 4 additions & 1 deletion training-slides/src/SUMMARY.md
@@ -21,10 +21,10 @@
# Applied Rust

- [Methods and Traits](./methods-traits.md)
- [Cargo Dependencies and Workspaces](./using-cargo.md)
- [Rust I/O Traits](./io.md)
- [Generics](./generics.md)
- [Lifetimes](./lifetimes.md)
- [Cargo Workspaces](./cargo-workspaces.md)
- [Heap Allocation (Box and Rc)](./heap.md)
- [Shared Mutability (Cell, RefCell)](./shared-mutability.md)
- [Thread Safety (Send/Sync, Arc, Mutex)](./thread-safety.md)
@@ -53,3 +53,6 @@
- [Unsafe Rust](./unsafe.md)
- [WASM](./wasm.md)
- [Working with Nightly](./working-with-nighly.md)
- [Using Cargo](./using-cargo.md)
- [Dependency Management with Cargo](./dependency-management.md)
- [Rust Projects Build Time](./rust-build-time.md)
102 changes: 102 additions & 0 deletions training-slides/src/cargo-workspaces.md
@@ -0,0 +1,102 @@
# Cargo Workspaces

## Cargo Workspaces

Allow you to split your project into several packages

* further encourages modularity
* develop multiple applications and libraries in a single tree
* synchronized dependency management, release process, etc.
* a way to parallelize compilation and speed up builds
* **your internal projects should likely be workspaces** even if you don't use monorepos

## Anatomy of a Rust Workspace

```text
my-app/
├── Cargo.toml  # a special workspace file
├── Cargo.lock  # notice that Cargo produces a common lockfile for all packages
├── packages/   # can use any directory structure
│   ├── main-app/
│   │   ├── Cargo.toml
│   │   └── src/
│   │       └── main.rs
│   ├── admin-app/
│   │   └── ...
│   ├── common-data-model/
│   │   ├── Cargo.toml
│   │   └── src/
│   │       └── lib.rs
│   ├── useful-macros
│   ├── service-a
│   ├── service-b
│   └── ...
└── tools/      # packages don't have to be in the same directory
    ├── release-bot/
    │   ├── Cargo.toml
    │   └── src/
    │       └── main.rs
    ├── data-migration-scripts/
    │   ├── Cargo.toml
    │   └── src/
    │       └── main.rs
    └── ...
```

## Workspace Cargo.toml

```toml
[workspace]
members = ["packages/*", "tools/*"]

[workspace.dependencies]
thiserror = "1.0.39"
# ...
```

Using wildcards for `members` is very handy when you add new member packages, split existing packages, etc.

## Cargo.toml for a workspace member

```toml
[package]
name = "main-app"

[dependencies]
thiserror = { workspace = true }  # version is inherited from the workspace Cargo.toml
service-a = { path = "../service-a" }
# ...
```

## Cargo commands for workspaces

* `cargo run --bin main-app`
* `cargo test -p service-a`

## Creating a workspace

```sh
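#!/usr/bin/env bash
# Create a new workspace skeleton in the current directory.
function nw() {
    local name="$1"
    local work_dir="$PWD"
    mkdir -p "$work_dir/$name/packages"
    git init -q "$work_dir/$name"
    # Workspace manifest: every directory under packages/ becomes a member.
    cat > "$work_dir/$name/Cargo.toml" << EOF
[workspace]
members = ["packages/*"]

[workspace.dependencies]
EOF
    cat > "$work_dir/$name/.gitignore" << EOF
target
EOF
    # Open the new workspace in VS Code (requires the "code" command).
    code "$work_dir/$name"
}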
```

Example:
```bash
nw spaceship
cargo new --lib spaceship/packages/fuel-control
```
130 changes: 130 additions & 0 deletions training-slides/src/dependency-management.md
@@ -0,0 +1,130 @@
# Dependency Management with Cargo

## Cargo.toml - A manifest file

```toml
[package]
name = "tcp-mailbox"
version = "0.1.0"

[dependencies]
async-std = "1" # would also choose 1.5
clap = "2.2" # would also choose 2.3
```

## Cargo.lock - A lock file

* contains a list of all project dependencies, the versions actually selected, and hashes of downloaded dependencies
* when a version is *yanked* from `Crates.io` but you have the correct hash for it in a lock file, Cargo will still let you download and use it
    * still gives you a warning about that version being problematic
* should be committed to your repository for applications

## Dependency resolution

* uses "Zero-aware" SemVer for versioning
    * `1.3.5` is compatible with versions `>= 1.3.5` and `< 2.0.0`
    * `0.3.5` is compatible with versions `>= 0.3.5` and `< 0.4.0`
    * `0.0.3` only allows `0.0.3`
* allows version-incompatible transitive dependencies
    * except C/C++ dependencies
* combines dependencies with compatible requirements as much as possible
* allows path, git, and custom registry dependencies
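
The "Zero-aware" rule above can be sketched in a few lines of Rust. This is only an illustration for full `x.y.z` requirements, not Cargo's actual implementation; the names `Version`, `caret_upper_bound`, and `is_compatible` are made up for this example.

```rust
/// A plain (major, minor, patch) version triple (hypothetical helper type).
type Version = (u64, u64, u64);

/// Returns the exclusive upper bound implied by a default requirement
/// whose lower bound is `v`.
fn caret_upper_bound(v: Version) -> Version {
    match v {
        // 0.0.z: only that exact patch version is accepted
        (0, 0, patch) => (0, 0, patch + 1),
        // 0.y.z: the minor version acts as the breaking-change marker
        (0, minor, _) => (0, minor + 1, 0),
        // x.y.z with x >= 1: anything below the next major version
        (major, _, _) => (major + 1, 0, 0),
    }
}

/// Checks whether `candidate` satisfies the requirement `req`.
fn is_compatible(req: Version, candidate: Version) -> bool {
    // Tuples compare lexicographically, which matches version ordering here.
    req <= candidate && candidate < caret_upper_bound(req)
}

fn main() {
    for req in [(1, 3, 5), (0, 3, 5), (0, 0, 3)] {
        println!("{:?} accepts [{:?}; {:?})", req, req, caret_upper_bound(req));
    }
    assert!(is_compatible((1, 3, 5), (1, 9, 0)));
    assert!(!is_compatible((0, 3, 5), (0, 4, 0)));
}
```

Pre-release tags and partial requirements such as `"1"` or `"0.3"` are deliberately left out of this sketch.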

## How a dependency version is selected

* for every requirement Cargo selects acceptable version intervals
    * `[1.1.0; 1.6.0)`, `[1.3.5; 2.0.0)`, `[2.0.0; 3.0.0)`
* Cargo checks for interval intersections to reduce the number of unique intervals
    * `[1.3.5; 1.6.0)`, `[2.0.0; 3.0.0)`
* for every unique interval it selects the most recent available version
    * `=1.5.18`, `=2.7.11`
* selected versions and corresponding package hashes are written into `Cargo.lock`
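
The steps above can be approximated with a small Rust sketch. It is a simplified model, not Cargo's real resolver: the `Interval` type, the `merge` and `select` functions, and the list of published versions are all assumptions made for illustration.

```rust
type Version = (u64, u64, u64);

/// Half-open interval `[low; high)` of acceptable versions.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Interval {
    low: Version,
    high: Version,
}

/// Intersects overlapping intervals so that fewer unique ones remain.
fn merge(mut intervals: Vec<Interval>) -> Vec<Interval> {
    intervals.sort_by_key(|i| i.low);
    let mut merged: Vec<Interval> = Vec::new();
    for next in intervals {
        let overlaps = merged.last().map_or(false, |prev| next.low < prev.high);
        if overlaps {
            // Narrow the previous interval to the part both requirements accept.
            if let Some(prev) = merged.last_mut() {
                prev.low = prev.low.max(next.low);
                prev.high = prev.high.min(next.high);
            }
        } else {
            // No overlap: a separate, incompatible version is needed.
            merged.push(next);
        }
    }
    merged
}

/// Picks the most recent published version inside the interval, if any.
fn select(interval: Interval, published: &[Version]) -> Option<Version> {
    published
        .iter()
        .copied()
        .filter(|v| interval.low <= *v && *v < interval.high)
        .max()
}

fn main() {
    let merged = merge(vec![
        Interval { low: (1, 1, 0), high: (1, 6, 0) },
        Interval { low: (1, 3, 5), high: (2, 0, 0) },
        Interval { low: (2, 0, 0), high: (3, 0, 0) },
    ]);
    // Hypothetical list of versions available in the registry.
    let published = [(1, 2, 0), (1, 5, 18), (1, 7, 0), (2, 7, 11), (3, 0, 1)];
    for interval in merged {
        println!("{:?} -> {:?}", interval, select(interval, &published));
    }
}
```

With this made-up registry content the two remaining intervals resolve to `1.5.18` and `2.7.11`, matching the numbers on the slide.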

## Dependency resolution: Example

```text
└── my-app                  May install:
    ├── A = "1"
    │   ├── X = "1"             A = "1.0.17"
    │   └── Y = "1.3"     =>    B = "1.5.0"
    └── B = "1"                 X = "2.0.3"
        ├── X = "2"             X = "1.2.14"
        └── Y = "1.5"           Y = "1.8.5"
```

I'm guessing there are some points that would be nice to include here as speaker notes:

  • X is installed twice, at different versions compatible with A and B respectively (which means bumping deps to a common version may reduce compile times)
  • The resolver may choose a newer patch release (A = 1 -> A = 1.0.17) or a newer minor release (Y = 1.5 -> Y = 1.8.5) if you don't specify more


## Where do dependencies come from?

* Crates.io
* Private registries (open-source, self-hosted, or hosted)
* Git and Path dependencies
* dependencies can be *vendored*

Notes:

* private registries
    * hosted: **Shipyard**, JFrog, CloudSmith
    * self-hosted: **Kellnr**
    * open-source: [Ktra](https://github.com/moriturus/ktra) - pronounced `['KO-to-ra]`, [Meuse](https://github.com/mcorbin/meuse) - `[Møs]`

*Shipyard and Kellnr will also generate API docs for you*

## Crates.io

* default package registry
* 100k crates and counting
* **every Rust Beta release is tested against all of them every week**
* packages aren't deleted, but *yanked*
    * if you have a correct hash for a yanked version in your `Cargo.lock`, your build won't break (you still get a warning)

## Docs.rs

* **complete API documentation for the whole Rust ecosystem**
* automatically publishes API documentation for every version of every crate on Crates.io
* documentation for old versions stays up, too. Easy to switch between versions.
* links across crates just work

## Other kinds of dependencies

* git dependencies
    * both `git+https` and `git+ssh` are allowed
    * can specify branch, tag, commit hash
    * when downloaded by Cargo, the exact commit hash used is written into `Cargo.lock`
* path dependencies
    * both relative and absolute paths are allowed
    * common in workspaces

## C Libraries as dependencies

* Rust can call functions from C libraries using `unsafe` code
    * integrate with operating system APIs, frameworks, SDKs, etc.
    * talk to custom hardware
    * reuse existing code (SQLite, OpenSSL, libgit2, etc.)
* building a crate that relies on C libraries often requires customization
    * done using a `build.rs` file

## `build.rs` file

* compiled and executed before the rest of the package
* can manipulate files, execute external programs, etc.
    * download / install custom SDKs
    * call `cc`, `cmake`, etc. to build C/C++ dependencies
    * execute `bindgen` to generate Rust bindings to C libraries
* output can be used to set Cargo options dynamically
```rust ignore
println!("cargo:rustc-link-lib=gizmo");
println!("cargo:rustc-link-search=native={}/gizmo/", library_path);
```
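
For context, a complete `build.rs` built around those two directives might look like the sketch below. The `gizmo` library, the `GIZMO_SDK_DIR` environment variable, the fallback path, and the `link_directives` helper are all hypothetical; only the `cargo:` directive names are real Cargo build-script instructions.

```rust
// build.rs (sketch): lives next to Cargo.toml and runs before the crate is compiled.
use std::env;
use std::path::{Path, PathBuf};

/// Builds the directives that tell Cargo how to link a native library.
fn link_directives(lib_name: &str, search_dir: &Path) -> Vec<String> {
    vec![
        format!("cargo:rustc-link-search=native={}", search_dir.display()),
        format!("cargo:rustc-link-lib={}", lib_name),
    ]
}

fn main() {
    // GIZMO_SDK_DIR is a made-up variable; fall back to a made-up default path.
    let sdk_dir = env::var("GIZMO_SDK_DIR")
        .map(PathBuf::from)
        .unwrap_or_else(|_| PathBuf::from("/opt/gizmo-sdk"));

    // Ask Cargo to re-run this script when the variable changes.
    println!("cargo:rerun-if-env-changed=GIZMO_SDK_DIR");

    // Everything printed with the `cargo:` prefix is interpreted by Cargo.
    for directive in link_directives("gizmo", &sdk_dir.join("gizmo")) {
        println!("{}", directive);
    }
}
```

Keeping the directive construction in a plain function makes the script easy to unit-test without running a full build.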

## `-sys` crates

* often Rust libraries that integrate with C are split into a pair of crates:
    * `library-name-sys`
        * thin wrapper around C functions
        * often all code is autogenerated by `bindgen`
    * `library-name`
        * depends on `library-name-sys`
        * exposes convenient and idiomatic Rust API to users
* examples:
    * `openssl` and `openssl-sys`
    * `zstd` and `zstd-sys`
    * `rusqlite` and `libsqlite3-sys`
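
The layering can be illustrated with two modules standing in for the two crates. Everything here is hypothetical: `gizmo_sys`, `gizmo`, and `gizmo_checksum` do not exist, and the "C" function is implemented in Rust only so that the sketch runs without a C toolchain. In a real `-sys` crate it would be an `extern "C"` declaration, typically generated by `bindgen`.

```rust
/// Stand-in for a hypothetical `gizmo-sys` crate: raw, unsafe, C-shaped API.
mod gizmo_sys {
    /// Sums all bytes of a buffer (placeholder for a real C function).
    ///
    /// # Safety
    /// `data` must point to `len` readable bytes.
    pub unsafe extern "C" fn gizmo_checksum(data: *const u8, len: usize) -> u32 {
        let bytes = unsafe { std::slice::from_raw_parts(data, len) };
        bytes.iter().fold(0u32, |acc, b| acc.wrapping_add(u32::from(*b)))
    }
}

/// Stand-in for the companion `gizmo` crate: safe, idiomatic wrapper.
mod gizmo {
    use super::gizmo_sys;

    /// Computes a checksum over a byte slice.
    pub fn checksum(data: &[u8]) -> u32 {
        // SAFETY: `data` is a live slice, so its pointer and length are valid.
        unsafe { gizmo_sys::gizmo_checksum(data.as_ptr(), data.len()) }
    }
}

fn main() {
    // Users only see the safe API; the unsafe call stays inside the wrapper.
    println!("checksum = {}", gizmo::checksum(b"abc"));
}
```

The point of the split is that the `unsafe` surface is confined to one small crate, while users depend on the safe wrapper.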
108 changes: 108 additions & 0 deletions training-slides/src/rust-build-time.md
@@ -0,0 +1,108 @@
# Rust Projects Build Time

## Understanding Rust projects build time

* Cargo keeps track of changes you make and only rebuilds what is necessary
* when building a crate `rustc` can do most of the work in parallel, but some steps still require synchronization
* depending on the type of build, times spent in different build phases may be vastly different
    * debug vs release
    * various flags for `rustc` and LLVM
    * a build from scratch vs an incremental build

## Producing a build timings report

`rm -rf target/debug && cargo build --timings`

```text
.
└── target/
    ├── cargo-timings/
    │   ├── cargo-timings.html
    │   └── cargo-timings-<timestamp>.html
    ├── debug/
    └── ...
```

## Timings Report

![Cargo Build Report for Rust Analyzer](./images/rust-analyzer-cargo-build-timings.png)

## Reading the report

* Cargo can't start building a crate until all its dependencies have been built
* Cargo only waits for `rustc` to produce LLVM IR; further compilation by LLVM can run in the background (purple)
* a crate can't start building until its `build.rs` is built and finishes running (yellow)
* if multiple crates depend on a single crate, they can often start building in parallel
* if a package is both a binary and a library, then the binary is built after the library
* integration tests, examples, benchmarks, and documentation tests all produce binaries and thus take extra time to build

## Actions you can take

## Keep your crates independent of each other

* Bad dependency graph:
```text
D -> C -> B -> A -> App
```
* Good dependency graph (A, B, and C can be built in parallel):
```text
  /-> A \
D --> B --> App
  \-> C /
```
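
To see why the second graph is better, it helps to count how many crates have to be built strictly one after another. The sketch below does this for both graphs; the `chain_length` function and the map layout (crate name to its direct dependencies) are assumptions for illustration, and the code does not handle dependency cycles.

```rust
use std::collections::HashMap;

/// Length of the longest dependency chain ending at `krate`, counted in
/// crates that must be built sequentially.
fn chain_length(krate: &str, deps: &HashMap<&str, Vec<&str>>) -> usize {
    let longest_dep = deps
        .get(krate)
        .into_iter()
        .flatten()
        .map(|dep| chain_length(dep, deps))
        .max()
        .unwrap_or(0);
    1 + longest_dep
}

fn main() {
    // D -> C -> B -> A -> App: every crate waits for the previous one.
    let chain = HashMap::from([
        ("App", vec!["A"]),
        ("A", vec!["B"]),
        ("B", vec!["C"]),
        ("C", vec!["D"]),
    ]);
    // D feeds A, B, and C, which can then be built in parallel.
    let diamond = HashMap::from([
        ("App", vec!["A", "B", "C"]),
        ("A", vec!["D"]),
        ("B", vec!["D"]),
        ("C", vec!["D"]),
    ]);
    println!("chain:   {} sequential steps", chain_length("App", &chain)); // 5
    println!("diamond: {} sequential steps", chain_length("App", &diamond)); // 3
}
```

Both graphs contain the same five crates, but the flatter one needs only three sequential steps, so more of the build can run in parallel.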

## Turn off unused features

* Before:
```toml
[dependencies]
tokio = { version = "1", features = ["full"] } # builds all of Tokio
```
* After:
```toml
[dependencies]
tokio = { version = "1", features = ["net", "io-util", "rt-multi-thread"] }
```

## Prefer pure-Rust dependencies

* a crate cannot be built before its `build.rs` is compiled and executed
* crates using C dependencies have to rely on `build.rs`
* `build.rs` might trigger C/C++ compilation, which is often slow

* e.g.: `rustls` instead of `openssl`

## Use multi-module integration tests

* Before (3 binaries)
```text
├── src/
│   └── ...
└── tests/
    ├── account-management.rs
    ├── billing.rs
    └── reporting.rs
```
* After (a single binary)
```text
├── src/
│   └── ...
└── tests/
    └── my-app-tests/
        ├── main.rs  # includes the rest as modules
        ├── account-management.rs
        ├── billing.rs
        └── reporting.rs
```
* The same applies to benchmarks and examples

## Other tips

* split your large package into a few smaller ones to improve build parallelization
* extract your binaries into separate packages
* remove unused dependencies

## Tools

* `cargo-chef` to speed up your Docker builds
* `sccache` for caching intermediary build artifacts across multiple projects and developers