bind_rows() on a list of keyed data.tables (via map_dfr or do.call) does not remove the keys, so the result carries a key that is no longer correct and later queries use a wrong index #5587
I am not sure whether to report this under data.table or dplyr. Since the issue is in bind_rows, I guess it belongs to dplyr, but I think it mostly affects data.table users. I also think it is pretty serious: it can completely mess up data integrity with no errors and no warnings, and there is no way of knowing it happened except by external means, e.g. seeing that the output just can't be right. This happened to me last week and it made a pretty big mess of things.
Here is the issue I filed under dplyr.
The same thing happens with map_dfr, which uses bind_rows under the hood. This is how I found the issue. It is very common in my workflow, and I suspect I am not the only one.
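A minimal sketch of the kind of workflow described above, not the original reproduction. The table `dt`, the columns `id` and `grp`, and the helper `make_piece` are hypothetical, and the result may depend on the installed data.table and dplyr versions. Rather than relying on `key()` alone, it checks the key column with base R's `is.unsorted()`, which reveals a stale key independently of what the attribute claims.

```r
library(data.table)
library(dplyr)

# Hypothetical example data, keyed on `id`
dt <- data.table(id = 1:4, grp = c("a", "a", "b", "b"))
setkey(dt, id)

# Hypothetical per-group function that returns a keyed data.table
make_piece <- function(g) {
  piece <- dt[grp == g]
  setkey(piece, id)
  piece
}

# Process the groups in an order that breaks the overall sort on `id`
pieces <- lapply(c("b", "a"), make_piece)
combined <- bind_rows(pieces)

# Inspect what the combined object claims about its key ...
attr(combined, "sorted")

# ... and whether the key column is actually sorted
is.unsorted(combined$id)
```

If `attr(combined, "sorted")` still reports `"id"` while `is.unsorted(combined$id)` is `TRUE`, the key is stale, which is the situation this report describes.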
Doing the same with rbind removes the keys, which is the expected behaviour, because the index built from the key is no longer valid.
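As a hedged sketch of the safer pattern, the pieces can be combined with data.table's own `rbindlist()` and the key re-established explicitly afterwards. The data and the names `dt`, `id`, `grp` and `make_piece` are again hypothetical.

```r
library(data.table)

# Hypothetical example data, keyed on `id`
dt <- data.table(id = 1:4, grp = c("a", "a", "b", "b"))
setkey(dt, id)

# Hypothetical per-group function that returns a keyed data.table
make_piece <- function(g) {
  piece <- dt[grp == g]
  setkey(piece, id)
  piece
}

pieces <- lapply(c("b", "a"), make_piece)

# Combine with data.table's own binder, then inspect the key
combined <- rbindlist(pieces)
key(combined)

# Re-create the key explicitly; setkey() physically re-sorts the rows
setkey(combined, id)
key(combined)
```

Calling `setkey()` after combining costs a sort, but it guarantees that the key attribute and the row order agree again.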
Curiously, using split() instead of lapply circumvents the issue. This is not a solution, though: typically you need to apply a function to the data, and splitting it with split() only to recombine it makes no sense. I am including it here for completeness.
sessionInfo