Significant performance drop after update from 0.16 to 0.18 #7222

Danvil · 2024-08-16T23:37:21Z

Describe the bug
I have a workload where I visualize about 10 small (200x200) images and tensors at at about 20 Hz.
With rerun 0.16 the visualization was basically running in realtime without problems.
After upgrading to 0.18 (both library and desktop app) ingestion speed is significantly slower and manages maybe ~0.5 Hz. The slowdown is so extreme that data keeps trickling in for minutes after the source program was already closed.
There only seems to be a problem with ingestion speed as replay when scrolling through the timeline is unaffected.

Desktop (please complete the following information):
Linux

Rerun version
0.18

Wumpf · 2024-08-17T10:17:07Z

Thanks for reporting this! That sounds quite concerning. Should be easy enough to repro, but if you have a concrete snippet/example-data to try that would be ofc very appreciated

teh-cmc · 2024-08-19T12:00:53Z

Repro

Python:

import numpy as np
import rerun as rr
import time

rr.init("rerun_example_image", spawn=True)

for i in range(0, 100):
    rr.set_time_sequence("frame", i)

    for j in range(0, 100):
        image = np.zeros((200, 300, 3), dtype=np.uint8)
        image[:, :, 0] = (i + j) % 255
        image[50:150, 50:150] = (0, 255 - ((i + j) % 255), 0)

        rr.log(f"images/{j}", rr.Image(image))

    time.sleep(0.010)

Rust:

use ndarray::{s, Array, ShapeBuilder};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let rec = rerun::RecordingStreamBuilder::new("rerun_example_image").spawn()?;

    for i in 0..100 {
        // NOTE: uncomment this line to fix the perf issue.
        rec.set_time_sequence("frame", i);

        for j in 0..100 {
            let mut image = Array::<u8, _>::zeros((200, 300, 3).f());
            image.slice_mut(s![.., .., 0]).fill((i + j) % 255);
            image
                .slice_mut(s![50..150, 50..150, 1])
                .fill(255 - ((i + j) % 255));

            rec.log(
                format!("images/{j}"),
                &rerun::Image::from_color_model_and_tensor(rerun::ColorModel::RGB, image)?,
            )?;
        }

        std::thread::sleep(std::time::Duration::from_millis(10));
    }

    Ok(())
}

teh-cmc · 2024-08-19T12:19:00Z

As I expected this is caused by the concatenation kernel being extremely slow for that specific type for some reason (ImageBuffer, aka ListArray(ListArray(Uint8Array)).

But I'm also seeing for weird RSS behavior, I have to dig deeper 🤔

teh-cmc · 2024-08-19T14:09:11Z

Regarding the performance, it looks like things are... "correctly slow" 🤷‍♂️ It seems that my expectations were just off base.

Here are standalone benchmarks that faithfully represent a similar workload:

Arrow2

use arrow2::{
    array::{Array, ListArray, PrimitiveArray},
    offset::Offsets,
};

fn main() {
    let array0 = PrimitiveArray::from_vec((0..200 * 300).map(|v| (v % 255) as u8).collect());

    concatenate_and_measure(array0.to_boxed());

    let array1 = ListArray::new(
        ListArray::<i32>::default_datatype(array0.data_type().clone()),
        Offsets::<i32>::try_from_lengths(std::iter::once(array0.len()))
            .unwrap()
            .into(),
        array0.boxed(),
        None,
    );

    concatenate_and_measure(array1.to_boxed());

    let array2 = ListArray::new(
        ListArray::<i32>::default_datatype(array1.data_type().clone()),
        Offsets::<i32>::try_from_lengths(std::iter::once(array1.len()))
            .unwrap()
            .into(),
        array1.boxed(),
        None,
    );

    concatenate_and_measure(array2.to_boxed());
}

fn concatenate_and_measure(array: Box<dyn Array>) {
    let mut concatenated = array.clone();

    let now = std::time::Instant::now();
    for _ in 0..1000 {
        concatenated =
            arrow2::compute::concatenate::concatenate(&[&*concatenated, &*array]).unwrap();
    }
    let elapsed = now.elapsed();

    // dbg!(&array);
    dbg!(concatenated.data_type());
    eprintln!(
        "1000 accumulated concatenations in {elapsed:?} ({:.3} MiB per sec)",
        how_many_bytes(concatenated) as f64 / 1024.0 / 1024.0 / elapsed.as_secs_f64(),
    );
}

fn how_many_bytes(array: Box<dyn Array>) -> u64 {
    let mut array = array;
    loop {
        match array.data_type() {
            arrow2::datatypes::DataType::UInt8 => break,
            arrow2::datatypes::DataType::List(_) => {
                let list = array.as_any().downcast_ref::<ListArray<i32>>().unwrap();
                array = list.values().to_boxed();
            }
            _ => unreachable!(),
        }
    }

    array.len() as _
}

[src/main.rs:45] concatenated.data_type() = UInt8
1000 accumulated concatenations in 3.875282565s (14.780 MiB per sec)

[src/main.rs:45] concatenated.data_type() = List(
    Field {
        name: "item",
        data_type: UInt8,
        is_nullable: true,
        metadata: {},
    },
)
1000 accumulated concatenations in 4.272323313s (13.407 MiB per sec)

[src/main.rs:45] concatenated.data_type() = List(
    Field {
        name: "item",
        data_type: List(
            Field {
                name: "item",
                data_type: UInt8,
                is_nullable: true,
                metadata: {},
            },
        ),
        is_nullable: true,
        metadata: {},
    },
)
1000 accumulated concatenations in 4.272910412s (13.405 MiB per sec)

Arrow1

use std::sync::Arc;

use arrow::{
    array::{Array, ArrayRef, ListArray, PrimitiveArray},
    buffer::OffsetBuffer,
    datatypes::{Field, UInt8Type},
};

fn main() {
    let array0: PrimitiveArray<UInt8Type> = (0..200 * 300)
        .map(|v| (v % 255) as u8)
        .collect::<Vec<_>>()
        .into();
    let array0: ArrayRef = Arc::new(array0);

    concatenate_and_measure(array0.clone());

    let array1 = ListArray::new(
        Field::new_list_field(array0.data_type().clone(), false).into(),
        OffsetBuffer::from_lengths(std::iter::once(array0.len())),
        array0.clone(),
        None,
    );
    let array1: ArrayRef = Arc::new(array1);

    concatenate_and_measure(array1.clone());

    let array2 = ListArray::new(
        Field::new_list_field(array1.data_type().clone(), false).into(),
        OffsetBuffer::from_lengths(std::iter::once(array1.len())),
        array1.clone(),
        None,
    );
    let array2: ArrayRef = Arc::new(array2);

    concatenate_and_measure(array2.clone());
}

fn concatenate_and_measure(array: ArrayRef) {
    let mut concatenated = array.clone();

    let now = std::time::Instant::now();
    for _ in 0..1000 {
        concatenated = arrow::compute::kernels::concat::concat(&[&*concatenated, &*array]).unwrap();
    }
    let elapsed = now.elapsed();

    // dbg!(&array);
    dbg!(concatenated.data_type());
    eprintln!(
        "1000 accumulated concatenations in {elapsed:?} ({:.3} MiB per sec)",
        how_many_bytes(concatenated) as f64 / 1024.0 / 1024.0 / elapsed.as_secs_f64(),
    );
}

fn how_many_bytes(array: ArrayRef) -> u64 {
    let mut array = array;
    loop {
        match array.data_type() {
            arrow::datatypes::DataType::UInt8 => break,
            arrow::datatypes::DataType::List(_) => {
                let list = array.as_any().downcast_ref::<ListArray>().unwrap();
                array = list.values().clone();
            }
            _ => unreachable!(),
        }
    }

    array.len() as _
}

[src/main.rs:49] concatenated.data_type() = UInt8
1000 accumulated concatenations in 3.866165885s (14.815 MiB per sec)

[src/main.rs:49] concatenated.data_type() = List(
    Field {
        name: "item",
        data_type: UInt8,
        nullable: false,
        dict_id: 0,
        dict_is_ordered: false,
        metadata: {},
    },
)
1000 accumulated concatenations in 7.109822764s (8.056 MiB per sec)

[src/main.rs:49] concatenated.data_type() = List(
    Field {
        name: "item",
        data_type: List(
            Field {
                name: "item",
                data_type: UInt8,
                nullable: false,
                dict_id: 0,
                dict_is_ordered: false,
                metadata: {},
            },
        ),
        nullable: false,
        dict_id: 0,
        dict_is_ordered: false,
        metadata: {},
    },
)
1000 accumulated concatenations in 7.137852288s (8.024 MiB per sec)

Native

fn main() {
    let array0 = (0..200 * 300).map(|v| (v % 255) as u8).collect::<Vec<u8>>();

    concatenate_and_measure(array0);
}

fn concatenate_and_measure(array: Vec<u8>) {
    let mut concatenated = array.clone();

    let now = std::time::Instant::now();
    for _ in 0..1000 {
        concatenated = concatenated
            .into_iter()
            .chain(array.iter().copied())
            .collect();
    }
    let elapsed = now.elapsed();

    eprintln!(
        "1000 accumulated concatenations in {elapsed:?} ({:.3} MiB per sec)",
        std::mem::size_of_val(concatenated.as_slice()) as f64
            / 1024.0
            / 1024.0
            / elapsed.as_secs_f64()
    );
}

1000 accumulated concatenations in 4.428740529s (12.933 MiB per sec)

Conclusion

If anything, arrow is the unexpectedly slow one, once you introduce list arrays.

teh-cmc · 2024-08-19T14:09:31Z

That's for the performance, still need to figure out what's going on with the memory.

RERUN_CHUNK_MAX_ROWS=4096:

RERUN_CHUNK_MAX_ROWS=0:

That simply makes no sense -- there has to be yet another arrow pointer leaking somewhere.

teh-cmc · 2024-08-19T21:41:07Z

Alright I've found the leak, although I haven't had time to look for a solution yet.

The issue stems from the implementation of the arrow concatenation kernel for things wrapped in one or more layer of ListArrays: for some reason the implementation will allocate two times the capacity than should be required in that case, and that extra capacity will just linger on forever from that point on.

I originally felt this could be the source of the problem, and tested it out on a PrimitiveArray, which didn't exhibit any issue... that's where the "wrapped in a ListArray" part comes in to play: I finally tried that after exhausting every other goose chase. I'm very sad.

Interestingly, the problem appears both in arrow and arrow2.
Also, this can very possibly mean that we can save some more memory on timeseries.

I'll look for a fix tomorrow.

Arrow

use std::sync::{
    atomic::{AtomicUsize, Ordering::Relaxed},
    Arc,
};

static LIVE_BYTES_GLOBAL: AtomicUsize = AtomicUsize::new(0);

thread_local! {
    static LIVE_BYTES_IN_THREAD: AtomicUsize = AtomicUsize::new(0);
}

pub struct TrackingAllocator {
    allocator: std::alloc::System,
}

#[global_allocator]
pub static GLOBAL_ALLOCATOR: TrackingAllocator = TrackingAllocator {
    allocator: std::alloc::System,
};

#[allow(unsafe_code)]
// SAFETY:
// We just do book-keeping and then let another allocator do all the actual work.
unsafe impl std::alloc::GlobalAlloc for TrackingAllocator {
    #[allow(clippy::let_and_return)]
    unsafe fn alloc(&self, layout: std::alloc::Layout) -> *mut u8 {
        LIVE_BYTES_IN_THREAD.with(|bytes| bytes.fetch_add(layout.size(), Relaxed));
        LIVE_BYTES_GLOBAL.fetch_add(layout.size(), Relaxed);

        // SAFETY:
        // Just deferring
        unsafe { self.allocator.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: std::alloc::Layout) {
        LIVE_BYTES_IN_THREAD.with(|bytes| bytes.fetch_sub(layout.size(), Relaxed));
        LIVE_BYTES_GLOBAL.fetch_sub(layout.size(), Relaxed);

        // SAFETY:
        // Just deferring
        unsafe { self.allocator.dealloc(ptr, layout) };
    }
}

fn live_bytes_local() -> usize {
    LIVE_BYTES_IN_THREAD.with(|bytes| bytes.load(Relaxed))
}

fn live_bytes_global() -> usize {
    LIVE_BYTES_GLOBAL.load(Relaxed)
}

/// Returns `(num_bytes_allocated, num_bytes_allocated_by_this_thread)`.
fn memory_use<R>(run: impl Fn() -> R) -> (usize, usize) {
    let used_bytes_start_local = live_bytes_local();
    let used_bytes_start_global = live_bytes_global();
    let ret = run();
    let bytes_used_local = live_bytes_local() - used_bytes_start_local;
    let bytes_used_global = live_bytes_global() - used_bytes_start_global;
    drop(ret);
    (bytes_used_global, bytes_used_local)
}

// ----------------------------------------------------------------------------

use arrow::{
    array::{Array, ArrayRef, ListArray, PrimitiveArray},
    buffer::OffsetBuffer,
    datatypes::{Field, UInt8Type},
};

fn main() {
    let array0: PrimitiveArray<UInt8Type> = (0..200 * 300)
        .map(|v| (v % 255) as u8)
        .collect::<Vec<_>>()
        .into();
    let array0: ArrayRef = Arc::new(array0);

    let (global, local) = memory_use(|| {
        let concatenated = concatenate_and_measure(array0.clone());
        eprintln!("expected: {}", how_many_bytes(concatenated.clone()));
        concatenated
    });
    eprintln!("global: {global} bytes");
    eprintln!("local: {local} bytes");

    let array1 = ListArray::new(
        Field::new_list_field(array0.data_type().clone(), false).into(),
        OffsetBuffer::from_lengths(std::iter::once(array0.len())),
        array0.clone(),
        None,
    );
    let array1: ArrayRef = Arc::new(array1);

    let (global, local) = memory_use(|| {
        let concatenated = concatenate_and_measure(array1.clone());
        eprintln!("expected: {}", how_many_bytes(concatenated.clone()));
        concatenated
    });
    eprintln!("global: {global} bytes");
    eprintln!("local: {local} bytes");

    let array2 = ListArray::new(
        Field::new_list_field(array1.data_type().clone(), false).into(),
        OffsetBuffer::from_lengths(std::iter::once(array1.len())),
        array1.clone(),
        None,
    );
    let array2: ArrayRef = Arc::new(array2);

    let (global, local) = memory_use(|| {
        let concatenated = concatenate_and_measure(array2.clone());
        eprintln!("expected: {}", how_many_bytes(concatenated.clone()));
        concatenated
    });
    eprintln!("global: {global} bytes");
    eprintln!("local: {local} bytes");
}

fn concatenate_and_measure(array: ArrayRef) -> ArrayRef {
    let mut concatenated = array.clone();

    let now = std::time::Instant::now();
    for _ in 0..1000 {
        concatenated = arrow::compute::kernels::concat::concat(&[&*concatenated, &*array]).unwrap();
    }
    let elapsed = now.elapsed();

    // dbg!(&array);
    dbg!(concatenated.data_type());
    eprintln!(
        "1000 accumulated concatenations in {elapsed:?} ({:.3} MiB per sec)",
        how_many_bytes(concatenated.clone()) as f64 / 1024.0 / 1024.0 / elapsed.as_secs_f64(),
    );

    concatenated
}

fn how_many_bytes(array: ArrayRef) -> u64 {
    let mut array = array;
    loop {
        match array.data_type() {
            arrow::datatypes::DataType::UInt8 => break,
            arrow::datatypes::DataType::List(_) => {
                let list = array.as_any().downcast_ref::<ListArray>().unwrap();
                array = list.values().clone();
            }
            _ => unreachable!(),
        }
    }

    array.len() as _
}

[src/main.rs:130] concatenated.data_type() = UInt8
1000 accumulated concatenations in 6.251046146s (9.163 MiB per sec)
expected: 60060000
global: 60060200 bytes
local: 60060200 bytes

[src/main.rs:130] concatenated.data_type() = List(
    Field {
        name: "item",
        data_type: UInt8,
        nullable: false,
        dict_id: 0,
        dict_is_ordered: false,
        metadata: {},
    },
)
1000 accumulated concatenations in 12.074819041s (4.744 MiB per sec)
expected: 60060000
global: 120004384 bytes
local: 120004384 bytes

[src/main.rs:130] concatenated.data_type() = List(
    Field {
        name: "item",
        data_type: List(
            Field {
                name: "item",
                data_type: UInt8,
                nullable: false,
                dict_id: 0,
                dict_is_ordered: false,
                metadata: {},
            },
        ),
        nullable: false,
        dict_id: 0,
        dict_is_ordered: false,
        metadata: {},
    },
)
1000 accumulated concatenations in 12.0635169s (4.748 MiB per sec)
expected: 60060000
global: 120008600 bytes
local: 120008600 bytes

Arrow2

use std::sync::atomic::{AtomicUsize, Ordering::Relaxed};

static LIVE_BYTES_GLOBAL: AtomicUsize = AtomicUsize::new(0);

thread_local! {
    static LIVE_BYTES_IN_THREAD: AtomicUsize = AtomicUsize::new(0);
}

pub struct TrackingAllocator {
    allocator: std::alloc::System,
}

#[global_allocator]
pub static GLOBAL_ALLOCATOR: TrackingAllocator = TrackingAllocator {
    allocator: std::alloc::System,
};

#[allow(unsafe_code)]
// SAFETY:
// We just do book-keeping and then let another allocator do all the actual work.
unsafe impl std::alloc::GlobalAlloc for TrackingAllocator {
    #[allow(clippy::let_and_return)]
    unsafe fn alloc(&self, layout: std::alloc::Layout) -> *mut u8 {
        LIVE_BYTES_IN_THREAD.with(|bytes| bytes.fetch_add(layout.size(), Relaxed));
        LIVE_BYTES_GLOBAL.fetch_add(layout.size(), Relaxed);

        // SAFETY:
        // Just deferring
        unsafe { self.allocator.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: std::alloc::Layout) {
        LIVE_BYTES_IN_THREAD.with(|bytes| bytes.fetch_sub(layout.size(), Relaxed));
        LIVE_BYTES_GLOBAL.fetch_sub(layout.size(), Relaxed);

        // SAFETY:
        // Just deferring
        unsafe { self.allocator.dealloc(ptr, layout) };
    }
}

fn live_bytes_local() -> usize {
    LIVE_BYTES_IN_THREAD.with(|bytes| bytes.load(Relaxed))
}

fn live_bytes_global() -> usize {
    LIVE_BYTES_GLOBAL.load(Relaxed)
}

/// Returns `(num_bytes_allocated, num_bytes_allocated_by_this_thread)`.
fn memory_use<R>(run: impl Fn() -> R) -> (usize, usize) {
    let used_bytes_start_local = live_bytes_local();
    let used_bytes_start_global = live_bytes_global();
    let ret = run();
    let bytes_used_local = live_bytes_local() - used_bytes_start_local;
    let bytes_used_global = live_bytes_global() - used_bytes_start_global;
    drop(ret);
    (bytes_used_global, bytes_used_local)
}

// ----------------------------------------------------------------------------

use arrow2::{
    array::{Array, ListArray, PrimitiveArray},
    offset::Offsets,
};

fn main() {
    let array0 = PrimitiveArray::from_vec((0..200 * 300).map(|v| (v % 255) as u8).collect());

    let (global, local) = memory_use(|| {
        let concatenated = concatenate_and_measure(array0.to_boxed());
        eprintln!("expected: {}", how_many_bytes(concatenated.to_boxed()));
        concatenated
    });
    eprintln!("global: {global} bytes");
    eprintln!("local: {local} bytes");

    let array1 = ListArray::new(
        ListArray::<i32>::default_datatype(array0.data_type().clone()),
        Offsets::<i32>::try_from_lengths(std::iter::once(array0.len()))
            .unwrap()
            .into(),
        array0.boxed(),
        None,
    );

    let (global, local) = memory_use(|| {
        let concatenated = concatenate_and_measure(array1.to_boxed());
        eprintln!("expected: {}", how_many_bytes(concatenated.to_boxed()));
        concatenated
    });
    eprintln!("global: {global} bytes");
    eprintln!("local: {local} bytes");

    let array2 = ListArray::new(
        ListArray::<i32>::default_datatype(array1.data_type().clone()),
        Offsets::<i32>::try_from_lengths(std::iter::once(array1.len()))
            .unwrap()
            .into(),
        array1.boxed(),
        None,
    );

    let (global, local) = memory_use(|| {
        let concatenated = concatenate_and_measure(array2.to_boxed());
        eprintln!("expected: {}", how_many_bytes(concatenated.to_boxed()));
        concatenated
    });
    eprintln!("global: {global} bytes");
    eprintln!("local: {local} bytes");
}

fn concatenate_and_measure(array: Box<dyn Array>) -> Box<dyn Array> {
    let mut concatenated = array.clone();

    let now = std::time::Instant::now();
    for _ in 0..1000 {
        concatenated =
            arrow2::compute::concatenate::concatenate(&[&*concatenated, &*array]).unwrap();
    }
    let elapsed = now.elapsed();

    // dbg!(&array);
    dbg!(concatenated.data_type());
    eprintln!(
        "1000 accumulated concatenations in {elapsed:?} ({:.3} MiB per sec)",
        how_many_bytes(concatenated.clone()) as f64 / 1024.0 / 1024.0 / elapsed.as_secs_f64(),
    );

    concatenated
}

fn how_many_bytes(array: Box<dyn Array>) -> u64 {
    let mut array = array;
    loop {
        match array.data_type() {
            arrow2::datatypes::DataType::UInt8 => break,
            arrow2::datatypes::DataType::List(_) => {
                let list = array.as_any().downcast_ref::<ListArray<i32>>().unwrap();
                array = list.values().to_boxed();
            }
            _ => unreachable!(),
        }
    }

    array.len() as _
}

[src/main.rs:128:5] concatenated.data_type() = UInt8
1000 accumulated concatenations in 6.377350563s (8.981 MiB per sec)
expected: 60060000
global: 60060152 bytes
local: 60060152 bytes

[src/main.rs:128:5] concatenated.data_type() = List(
    Field {
        name: "item",
        data_type: UInt8,
        is_nullable: true,
        metadata: {},
    },
)
1000 accumulated concatenations in 11.999255292s (4.773 MiB per sec)
expected: 60060000
global: 120004328 bytes
local: 120004328 bytes

[src/main.rs:128:5] concatenated.data_type() = List(
    Field {
        name: "item",
        data_type: List(
            Field {
                name: "item",
                data_type: UInt8,
                is_nullable: true,
                metadata: {},
            },
        ),
        is_nullable: true,
        metadata: {},
    },
)
1000 accumulated concatenations in 11.970515069s (4.785 MiB per sec)
expected: 60060000
global: 120012504 bytes
local: 120012504 bytes

This fixes the extra capacity from the temporary growable vector leaking into the final buffer and therefore hanging around indefinitely. See rerun-io/rerun#7222 (comment)

teh-cmc · 2024-08-23T15:33:49Z

The memory leak has been fixed in rerun-io/re_arrow2#9, which I've released as part of re_arrow2 0.17.5.

I still need to send the same patch to arrow (I assume the same bug got carried over).

Before:

After:

teh-cmc · 2024-08-24T09:58:29Z

Upstream patch for the cap leak: Fix MutableBuffer::into_buffer leaking its extra capacity into the final buffer apache/arrow-rs#6300

Danvil · 2024-08-26T05:57:42Z

@teh-cmc Thanks for investigating and fixing that memory leak! Any news on the performance drop? rerun became so slow with 0.18 that it is unusable for us in the current state 😢

teh-cmc · 2024-08-26T08:04:06Z

@teh-cmc Thanks for investigating and fixing that memory leak! Any news on the performance drop? rerun became so slow with 0.18 that it is unusable for us in the current state 😢

Both the memleak and the performance issue should be fixed on main @Danvil, I'd really appreciate it if you could try it out early before we release 0.18.1 in a few days 🙏

Danvil · 2024-08-27T22:13:30Z

With branch master (as of 8/27, 3PM PDT) it feels a little bit faster but performance is still lower than rerun 0.16.

I tried some bisection of the rerun visualizations in my application and the bottleneck now seems to be this code:

rr.set_time_seconds("timestamp", my_timestamp);
rr.log("path", &rerun::Clear::recursive())?;
for (i, pose) in path.iter().enumerate() {
    rr.log(format!("{tag}/{i}"), &rerun::Transform3D::from_translation_mat3x3(pose.t3, pose.r3))?;
}

(there are 250 poses)

Without this visualization 0.18 also is much faster. So it seems like a large part of the slowdown from 0.16 to 0.18 is due to visualization of the poses.

teh-cmc · 2024-08-28T06:46:03Z

Thanks for the update @Danvil, we'll have a look asap.

teh-cmc · 2024-08-28T07:08:20Z

This adventure continues in:

Performance regression with many poses #7292

Danvil added 👀 needs triage This issue needs to be triaged by the Rerun team 🪳 bug Something isn't working labels Aug 16, 2024

Wumpf added 🚀 performance Optimization, memory use, etc 🦟 regression A thing that used to work in an earlier release and removed 👀 needs triage This issue needs to be triaged by the Rerun team labels Aug 17, 2024

nikolausWest added this to the 0.18.1 milestone Aug 18, 2024

teh-cmc self-assigned this Aug 19, 2024

teh-cmc mentioned this issue Aug 23, 2024

fix primitive capacity leak rerun-io/re_arrow2#9

Merged

teh-cmc mentioned this issue Aug 24, 2024

Default RERUN_CHUNK_MAX_BYTES to 384kiB instead of 4MiB #7263

Merged

6 tasks

emilk closed this as completed in 20da873 Aug 26, 2024

emilk closed this as completed in #7263 Aug 26, 2024

teh-cmc mentioned this issue Aug 28, 2024

Performance regression with many poses #7292

Closed

emilk mentioned this issue Nov 25, 2024

Add shrink_to_fit to Array apache/arrow-rs#6360

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Significant performance drop after update from 0.16 to 0.18 #7222

Significant performance drop after update from 0.16 to 0.18 #7222

Danvil commented Aug 16, 2024 •

edited

Loading

Wumpf commented Aug 17, 2024 •

edited

Loading

teh-cmc commented Aug 19, 2024

teh-cmc commented Aug 19, 2024

teh-cmc commented Aug 19, 2024

teh-cmc commented Aug 19, 2024 •

edited

Loading

teh-cmc commented Aug 19, 2024 •

edited

Loading

teh-cmc commented Aug 23, 2024 •

edited

Loading

teh-cmc commented Aug 24, 2024

Danvil commented Aug 26, 2024

teh-cmc commented Aug 26, 2024

Danvil commented Aug 27, 2024 •

edited

Loading

teh-cmc commented Aug 28, 2024

teh-cmc commented Aug 28, 2024

Significant performance drop after update from 0.16 to 0.18 #7222

Significant performance drop after update from 0.16 to 0.18 #7222

Comments

Danvil commented Aug 16, 2024 • edited Loading

Wumpf commented Aug 17, 2024 • edited Loading

teh-cmc commented Aug 19, 2024

Repro

teh-cmc commented Aug 19, 2024

teh-cmc commented Aug 19, 2024

Arrow2

Arrow1

Native

Conclusion

teh-cmc commented Aug 19, 2024 • edited Loading

teh-cmc commented Aug 19, 2024 • edited Loading

Arrow

Arrow2

teh-cmc commented Aug 23, 2024 • edited Loading

teh-cmc commented Aug 24, 2024

Danvil commented Aug 26, 2024

teh-cmc commented Aug 26, 2024

Danvil commented Aug 27, 2024 • edited Loading

teh-cmc commented Aug 28, 2024

teh-cmc commented Aug 28, 2024

Danvil commented Aug 16, 2024 •

edited

Loading

Wumpf commented Aug 17, 2024 •

edited

Loading

teh-cmc commented Aug 19, 2024 •

edited

Loading

teh-cmc commented Aug 19, 2024 •

edited

Loading

teh-cmc commented Aug 23, 2024 •

edited

Loading

Danvil commented Aug 27, 2024 •

edited

Loading