You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This is a really awkward idiom. I know it's copied from the existing scan code, so not to fix in this PR, but we should fix it.
Several problems:
The split_vector has the ability to automatically extend the DV, but we don't use it here. And so the ScanResult raw mask concept (= footgun) arises, and we have to later pad the DV before consuming it. Why not just pad it reliably here and be done with it?
The method ownership/move/take shuffle is weirdly awkward. I think we can do better (see below)
The extend value is (a) optional and (b) configurable (true vs. false), which value is passed decides the semantics of a None result (extend=true is hard-wired today, with the assumption that None means "all selected", right?)
Re move/take issues, is there a reason we need to take from and then assign back to selection_vector? This split_vector seems to rely on Vec::split_off, which is more suitable for popping off end rather than popping off front like we want. In particular, split_off does not change the capacity of the source, so we would pay quadratic space overall as we repeatedly split a small chunk off the front of a larger vector?
Since we anyway own the selection vector, I wonder if we could do something like:
fntake_front_padded<'a>(slice:&mut&'a [bool],index:usize,padding:bool) -> Option<Vec<bool>>{match slice.split_at_checked(index){None | Some((_,[])) => slice.contains(&!padding).then(|| {// We consumed the whole thing, and at least one significant bit. Extend it as needed.letmut sv = Vec::from(std::mem::take(slice));
sv.resize(index, padding);
sv
}),Some((left, right)) => {*slice = right;Some(Vec::from(left))}}}
Basically, we get rid of the optional slice by treating None and &[] as equivalent. We also make the padding value non-optional, and ensure that every call to take_front_padded returns the expected size of selection vector. That lets us get rid of the raw mask ickiness. Finally, we limit memory bloat to exactly 2x (instead of worst-case quadratic).
Honestly, since all callers pass true as the padding value, we could remove the function arg and just hard-wire it. But maybe we need to leave the door open for handling both deletion and selection vectors?
The text was updated successfully, but these errors were encountered:
just inlined original comment: #580 (comment) from @scovich
regarding table_changes/scan.rs:
This is a really awkward idiom. I know it's copied from the existing scan code, so not to fix in this PR, but we should fix it.
Several problems:
split_vector
has the ability to automatically extend the DV, but we don't use it here. And so the ScanResult raw mask concept (= footgun) arises, and we have to later pad the DV before consuming it. Why not just pad it reliably here and be done with it?Re move/take issues, is there a reason we need to
take
from and then assign back toselection_vector
? Thissplit_vector
seems to rely onVec::split_off
, which is more suitable for popping off end rather than popping off front like we want. In particular,split_off
does not change the capacity of the source, so we would pay quadratic space overall as we repeatedly split a small chunk off the front of a larger vector?Since we anyway own the selection vector, I wonder if we could do something like:
where
Basically, we get rid of the optional slice by treating
None
and&[]
as equivalent. We also make the padding value non-optional, and ensure that every call totake_front_padded
returns the expected size of selection vector. That lets us get rid of the raw mask ickiness. Finally, we limit memory bloat to exactly 2x (instead of worst-case quadratic).Honestly, since all callers pass
true
as the padding value, we could remove the function arg and just hard-wire it. But maybe we need to leave the door open for handling both deletion and selection vectors?The text was updated successfully, but these errors were encountered: