issue 1093
Dan Kelley edited this page Jan 5, 2017 · 5 revisions
This work is being done in branch `dk-1093-large-rdi`, not in an older branch called `dk-1093`. (I am experimenting with the idea of more informative branch names, of the form `developerInitials-issueNumber-WordsSeparatedWithHyphens`.)
- 2017 Jan 4. I think things are working now for blocks where `from` and `to` yield a subset that is small enough to fit into R. However, I do not think this is the common use case. When I work with data, I would likely prefer to use the `by` argument to get a rough overview of the whole time series before focusing on smaller time intervals. I need to write more C code to handle `by` in this way, so I would say the work is only 1/4 done. Remaining tasks:
  - Handle `by` better, by filling up an `unsigned char` array with the results of a series of `seek` and `fread` calls.
  - Handle the case of numeric `from` and `to` faster (hand these arguments to the existing C function -- easy peasy).
  - See whether the present scheme of determining the segment pointers is inefficient. The present code reads the whole file twice: a first pass merely counts the pointers (for a memory allocation) and a second pass stores them into the allocated memory. Another approach would be to have a growable allocation, so I will try that, now that I have a 6 GB file as a test case. (The worry with growable allocation is that time will be spent copying that memory, especially if the growth factor is small, but that we can still run out of memory if the growth factor is large.)
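The growable-allocation approach mentioned above could be sketched roughly as follows. This is only an illustration of the idea, not the code in the branch: `OffsetList`, `offsets_init`, `offsets_push`, `offsets_free` and the fallback increment of 16 entries are all hypothetical names and values chosen for the example. The `regrowths` counter is there only to make the copying cost visible for a given growth factor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable list of segment start offsets (illustration only). */
typedef struct {
    long *data;        /* offsets stored so far */
    size_t count;      /* number of offsets in use */
    size_t capacity;   /* number of offsets allocated */
    size_t regrowths;  /* how many times realloc() has been called */
} OffsetList;

/* Start with an empty list; nothing is allocated until the first push. */
static void offsets_init(OffsetList *list)
{
    list->data = NULL;
    list->count = 0;
    list->capacity = 0;
    list->regrowths = 0;
}

/* Append one offset, enlarging the buffer by the factor `growth` when full.
 * Returns 0 on success, -1 if memory could not be obtained. */
static int offsets_push(OffsetList *list, long offset, double growth)
{
    assert(growth > 1.0);
    if (list->count == list->capacity) {
        size_t new_capacity = (size_t)((double)list->capacity * growth);
        /* Guarantee progress when starting from zero or when the
         * factor is too small to add even one entry. */
        if (new_capacity <= list->capacity)
            new_capacity = list->capacity + 16;
        long *tmp = realloc(list->data, new_capacity * sizeof *tmp);
        if (tmp == NULL)
            return -1; /* the old buffer is untouched and still owned by the list */
        list->data = tmp;
        list->capacity = new_capacity;
        list->regrowths++;
    }
    list->data[list->count++] = offset;
    return 0;
}

/* Release the buffer and return the list to its empty state. */
static void offsets_free(OffsetList *list)
{
    free(list->data);
    offsets_init(list);
}
```

With a growth factor of 2, storing 1000 offsets in this sketch calls `realloc()` 7 times and ends with room for 1024 entries. A smaller factor leaves less unused space but reallocates (and possibly copies) more often, which is the trade-off described in the parenthetical note.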
  - Handle