init readme

online-ml · Oct 6, 2023 · af32444
1 parent 2f6a989
commit af32444
Showing 3 changed files with 40 additions and 5 deletions.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -17,6 +17,7 @@ ROC AUC appears roughly similar between the Python and Rust implementations. Not
 - Using `with_capacity` on each `Vec` in `HST`, as well as the list of HSTs, we gain 1 second. We are now at **~5 seconds**.
 - We can't find a nice profiler. So for now we comment code and measure time.
 - Storing all attributes in a single array, instead of one array per tree, makes us reach **~3 seconds**.
-- We tried using rayon to parallelize over trees, but it didn't bring any improvements.
 - We removed the CSV logic from the benchmark, which brings us under **~2.5 seconds**.
-- There is an opportunity to do the scoring and update logic in one fell swoop. This is because of the nature of online anomaly detection. This would bring us to **~0.7 seconds**. We are not sure if this is a good design choice though, so we may revisit this later.
+- Fixing some algorithmic issues actually brings us to **~5 seconds** :(
+- We tried using rayon to parallelize over trees, but it didn't bring any improvement whatsoever. Maybe we used it wrong, but we believe it's because our loop is too cheap to be worth the overhead of spawning threads -- or whatever it is rayon does.
+- There is an opportunity to do the scoring and update logic in one fell swoop. This is because of the nature of online anomaly detection. This would bring us to **~2.5 seconds**. We are not sure if this is a good design choice though, so we may revisit this later.
diff --git a/README.md b/README.md
@@ -1 +1,34 @@
-# Mini river
+<h1>🦀 LightRiver • fast online machine learning</h1>
+
+<p>
+
+<!-- Tests -->
+<!-- <a href="https://github.com/online-ml/beaver/actions/workflows/unit-tests.yml">
+<img src="https://github.com/online-ml/beaver/actions/workflows/unit-tests.yml/badge.svg" alt="tests">
+</a> -->
+
+<!-- Code quality -->
+<!-- <a href="https://github.com/online-ml/beaver/actions/workflows/code-quality.yml">
+<img src="https://github.com/online-ml/beaver/actions/workflows/code-quality.yml/badge.svg" alt="code_quality">
+</a> -->
+
+<!-- License -->
+<a href="https://opensource.org/licenses/BSD-3-Clause">
+<img src="https://img.shields.io/badge/License-BSD%203--Clause-blue.svg?style=flat-square" alt="bsd_3_license">
+</a>
+
+</p>
+
+[![Discord](https://dcbadge.vercel.app/api/server/qNmrKEZMAn)](https://discord.gg/qNmrKEZMAn)
+
+<div align="center" >
+  <img src="https://user-images.githubusercontent.com/8095957/202878607-9fa71045-6379-436e-9da9-41209f8b39c2.png" width="25%" align="right" />
+</div>
+
+LightRiver is an online machine learning library written in Rust. It is meant to be used in high-throughput environments, as well as TinyML systems.
+
+This library is complementary to [River](https://github.com/online-ml/river/). The latter provides a wide array of online methods, but is not ideal when it comes to performance. The idea is to take the algorithms that work best in River, and implement them in a way that is more performant. As such, LightRiver is not meant to be a general-purpose library. It is meant to be a fast online machine learning library that provides a few algorithms that are known to work well in online settings. This is akin to the way [scikit-learn](https://scikit-learn.org/) and [LightGBM](https://lightgbm.readthedocs.io/en/stable/) are complementary to each other.
+
+## 📝 License
+
+LightRiver is free and open-source software licensed under the [3-clause BSD license](LICENSE).
diff --git a/src/main.rs b/src/main.rs
@@ -82,7 +82,7 @@ fn main() {
 
     // PARAMETERS
 
-    let window_size: u32 = 1000;
+    let window_size: u32 = 1_000;
     let mut counter: u32 = 0;
     let n_trees: u32 = 50;
     let height: u32 = 6;
@@ -142,7 +142,8 @@ fn main() {
             let mut node: u32 = 0;
             for depth in 0..height {
                 // Update the score
-                score += hst.r_mass[(tree * n_nodes + node) as usize] * u32::pow(2, depth) as f32;
+                score +=
+                    hst.r_mass[(tree * n_nodes + node) as usize] * f32::powf(2.0, depth as f32);
 
                 // Update the l_mass
                 hst.l_mass[(tree * n_nodes + node) as usize] += 1.0;
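
Note on the `src/main.rs` hunk above: the scoring line weights each node's `r_mass` by `2^depth`, and the commit swaps `u32::pow(2, depth) as f32` for `f32::powf(2.0, depth as f32)`. The sketch below is a minimal illustration, not the project's code. It compares those two forms with a bit-shift variant and sums the weighted masses over a path in a flat array. `weight_shift`, `path_score`, and the sample `r_mass` values are hypothetical names and data introduced here; only the indexing pattern `tree * n_nodes + node` and `height = 6` come from the diff.

```rust
/// Depth weight as written before this commit: integer power, then cast.
fn weight_int_pow(depth: u32) -> f32 {
    u32::pow(2, depth) as f32
}

/// Depth weight as written after this commit: floating-point power.
fn weight_powf(depth: u32) -> f32 {
    f32::powf(2.0, depth as f32)
}

/// Alternative (an assumption, not in the commit): a bit shift.
/// Only valid for depth < 32.
fn weight_shift(depth: u32) -> f32 {
    (1u32 << depth) as f32
}

/// Hypothetical helper: sums r_mass * 2^depth along a path of node indices,
/// using a flat array in which tree `tree` starts at index `tree * n_nodes`.
fn path_score(r_mass: &[f32], tree: u32, n_nodes: u32, path: &[u32]) -> f32 {
    let mut score = 0.0_f32;
    for (depth, &node) in path.iter().enumerate() {
        score += r_mass[(tree * n_nodes + node) as usize] * weight_shift(depth as u32);
    }
    score
}

fn main() {
    let height: u32 = 6; // same value as in the diff
    for depth in 0..height {
        // The integer forms agree exactly; powf is compared with a tolerance.
        assert_eq!(weight_int_pow(depth), weight_shift(depth));
        assert!((weight_powf(depth) - weight_shift(depth)).abs() < 1e-3);
    }

    // Made-up masses for two trees of three nodes each.
    let r_mass = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
    println!("{}", path_score(&r_mass, 1, 3, &[0, 1]));
}
```

For the depths used here (0 to 5) all three forms yield the same weights, so the choice between them is mostly about readability and avoiding an integer-to-float cast in the hot loop. Whether one is measurably faster in this benchmark is not something the diff shows.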