init readme
MaxHalford committed Oct 6, 2023
1 parent 2f6a989 commit af32444
Showing 3 changed files with 40 additions and 5 deletions.
5 changes: 3 additions & 2 deletions CONTRIBUTING.md
@@ -17,6 +17,7 @@ ROC AUC appears roughly similar between the Python and Rust implementations. Not
 - Using `with_capacity` on each `Vec` in `HST`, as well as the list of HSTs, we gain 1 second. We are now at **~5 seconds**.
 - We can't find a nice profiler. So for now we comment code and measure time.
 - Storing all attributes in a single array, instead of one array per tree, makes us reach **~3 seconds**.
-- We tried using rayon to parallelize over trees, but it didn't bring any improvements.
 - We removed the CSV logic from the benchmark, which brings us under **~2.5 seconds**.
-- There is an opportunity to do the scoring and update logic in one fell swoop. This is because of the nature of online anomaly detection. This would bring us to **~0.7 seconds**. We are not sure if this is a good design choice though, so we may revisit this later.
+- Fixing some algorithmic issues actually brings us to **~5 seconds** :(
+- We tried using rayon to parallelize over trees, but it didn't bring any improvement whatsoever. Maybe we used it wrong, but we believe it's because our loop is too cheap to be worth the overhead of spawning threads -- or whatever it is rayon does.
+- There is an opportunity to do the scoring and update logic in one fell swoop. This is because of the nature of online anomaly detection. This would bring us to **~2.5 seconds**. We are not sure if this is a good design choice though, so we may revisit this later.
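
The `with_capacity`, single-array, and comment-and-measure notes above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the shape of the `Hst` struct, the `index` and `time_it` helpers, and the node count of a complete binary tree are assumptions. Only the `l_mass` / `r_mass` names and the `tree * n_nodes + node` indexing are taken from the `src/main.rs` changes in this commit.

```rust
use std::time::{Duration, Instant};

/// Hypothetical flat layout: one `Vec` per attribute, shared by every tree,
/// instead of one `Vec` per tree.
pub struct Hst {
    pub n_nodes: u32,
    pub l_mass: Vec<f32>,
    pub r_mass: Vec<f32>,
}

impl Hst {
    /// Assumption: each tree is a complete binary tree with `height + 1`
    /// levels, so it holds 2^(height + 1) - 1 nodes.
    pub fn new(n_trees: u32, height: u32) -> Self {
        let n_nodes = u32::pow(2, height + 1) - 1;
        let total = (n_trees * n_nodes) as usize;

        // Reserve the full capacity once, then fill with zeros, so the
        // vectors never reallocate while growing to `total` elements.
        let mut l_mass: Vec<f32> = Vec::with_capacity(total);
        let mut r_mass: Vec<f32> = Vec::with_capacity(total);
        l_mass.resize(total, 0.0);
        r_mass.resize(total, 0.0);

        Hst { n_nodes, l_mass, r_mass }
    }

    /// Position of `node` of `tree` inside the shared arrays.
    pub fn index(&self, tree: u32, node: u32) -> usize {
        (tree * self.n_nodes + node) as usize
    }
}

/// Profiler-free timing: run `f` once and return its result together with
/// the wall-clock time it took.
pub fn time_it<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}
```

Wrapping a candidate section in `time_it(|| ...)` and printing the returned `Duration` gives a coarse measurement in the spirit of the "comment code and measure time" approach. It is no substitute for a real profiler.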
35 changes: 34 additions & 1 deletion README.md
@@ -1 +1,34 @@
-# Mini river
+<h1>🦀 LightRiver • fast online machine learning</h1>
+
+<p>
+
+<!-- Tests -->
+<!-- <a href="https://github.com/online-ml/beaver/actions/workflows/unit-tests.yml">
+<img src="https://github.com/online-ml/beaver/actions/workflows/unit-tests.yml/badge.svg" alt="tests">
+</a> -->
+
+<!-- Code quality -->
+<!-- <a href="https://github.com/online-ml/beaver/actions/workflows/code-quality.yml">
+<img src="https://github.com/online-ml/beaver/actions/workflows/code-quality.yml/badge.svg" alt="code_quality">
+</a> -->
+
+<!-- License -->
+<a href="https://opensource.org/licenses/BSD-3-Clause">
+<img src="https://img.shields.io/badge/License-BSD%203--Clause-blue.svg?style=flat-square" alt="bsd_3_license">
+</a>
+
+</p>
+
+[![Discord](https://dcbadge.vercel.app/api/server/qNmrKEZMAn)](https://discord.gg/qNmrKEZMAn)
+
+<div align="center" >
+<img src="https://user-images.githubusercontent.com/8095957/202878607-9fa71045-6379-436e-9da9-41209f8b39c2.png" width="25%" align="right" />
+</div>
+
+LightRiver is an online machine learning library written in Rust. It is meant to be used in high-throughput environments, as well as TinyML systems.
+
+This library is complementary to [River](https://github.com/online-ml/river/). The latter provides a wide array of online methods, but is not ideal when it comes to performance. The idea is to take the algorithms that work best in River, and implement them in a way that is more performant. As such, LightRiver is not meant to be a general purpose library. It is meant to be a fast online machine learning library that provides a few algorithms that are known to work well in online settings. This is akin to the way [scikit-learn](https://scikit-learn.org/) and [LightGBM](https://lightgbm.readthedocs.io/en/stable/) are complementary to each other.
+
+## 📝 License
+
+LightRiver is free and open-source software licensed under the [3-clause BSD license](LICENSE).
5 changes: 3 additions & 2 deletions src/main.rs
@@ -82,7 +82,7 @@ fn main() {

 // PARAMETERS

-let window_size: u32 = 1000;
+let window_size: u32 = 1_000;
 let mut counter: u32 = 0;
 let n_trees: u32 = 50;
 let height: u32 = 6;
@@ -142,7 +142,8 @@ fn main() {
 let mut node: u32 = 0;
 for depth in 0..height {
 // Update the score
-score += hst.r_mass[(tree * n_nodes + node) as usize] * u32::pow(2, depth) as f32;
+score +=
+hst.r_mass[(tree * n_nodes + node) as usize] * f32::powf(2.0, depth as f32);

 // Update the l_mass
 hst.l_mass[(tree * n_nodes + node) as usize] += 1.0;