This issue is meant to discuss some performance tuning ideas.
I was recently trying to write a program using directory and filepath for filesystem manipulation.
Something initially in the spirit of du from coreutils.
The problem is the performance is not great.
In writing a recursive directory traversal I must constantly check whether a path is a directory to know whether to recurse into it. The function doesDirectoryExist takes about 80% of the runtime, which might be expected since it has to read the file metadata.
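For illustration, a minimal sketch of this kind of traversal. It is not the exact program: it uses plain FilePath rather than OsPath for brevity, and countEntries is a hypothetical helper that only counts entries instead of summing sizes.

```haskell
module Main (main) where

import Control.Monad (forM)
import System.Directory (doesDirectoryExist, listDirectory, pathIsSymbolicLink)
import System.FilePath ((</>))

-- | Count every entry below a directory (hypothetical helper).
countEntries :: FilePath -> IO Int
countEntries dir = do
  names <- listDirectory dir
  counts <- forM names $ \name -> do
    let path = dir </> name
    -- doesDirectoryExist is called once per entry; this is the hot call.
    isDir <- doesDirectoryExist path
    -- Do not follow symbolic links, to avoid cycles.
    recurse <- if isDir then not <$> pathIsSymbolicLink path else pure False
    if recurse
      then (1 +) <$> countEntries path
      else pure 1
  pure (sum counts)

main :: IO ()
main = countEntries "." >>= print
```

Error handling (for example unreadable directories) is left out on purpose; listDirectory will throw in that case.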
I attempted to write a faster fastIsDir, guided by benchmark results.
This version ignores errno to go faster, but it clearly shows the use of useAsCString, which copies the path, in O(n), into a null-terminated byte array. Since I want to run this on every file in the filesystem, I think that cost has some impact.
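Roughly, the approach described above could look like the following sketch. The assumptions: a POSIX platform, a base that still exposes System.Posix.Internals (c_stat, sizeof_stat, st_mode, s_isdir), and a plain ShortByteString standing in for the bytes inside PosixString. Any failing stat is treated as "not a directory" without looking at errno.

```haskell
module Main (main) where

import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Short as SBS
import Foreign.Marshal.Alloc (allocaBytes)
import System.Posix.Internals (c_stat, s_isdir, sizeof_stat, st_mode)

-- | Sketch of a directory check that skips errno handling.
-- The path is assumed to be the raw bytes of a POSIX path.
fastIsDir :: SBS.ShortByteString -> IO Bool
fastIsDir path =
  allocaBytes sizeof_stat $ \statBuf ->
    -- useAsCString copies the bytes into a fresh buffer and appends a NUL:
    -- this is the O(n) copy in question.
    SBS.useAsCString path $ \cPath -> do
      rc <- c_stat cPath statBuf
      if rc /= 0
        then pure False -- errno deliberately ignored
        else s_isdir <$> st_mode statBuf

main :: IO ()
main = do
  fastIsDir (SBS.toShort (BC.pack ".")) >>= print
  fastIsDir (SBS.toShort (BC.pack "/no/such/path")) >>= print
```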
Now, I think that c_stat doesn't need a copy of the string (it's read-only, right?), so ideally we would pass path directly to c_stat.
The issue is that path, I believe, is not null-terminated, which would make such a call wrong. The only way I see to add a NUL at the end of the string is to copy it into an array of length+1 and terminate it manually (which is what useAsCString does).
What I want to discuss in this issue is:
Is it possible, since PosixString is opaque outside of Internal, to have PosixString be a null-terminated ByteArray# under the hood, so that libraries like directory can pass it directly to read-only functions without first copying it into a temporary buffer to add a null terminator?
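To make the idea concrete, here is a small sketch under stated assumptions, not a proposal for the actual filepath internals. NulTerminated, mkNulTerminated and cLength are hypothetical names, and strlen stands in for stat so that no struct handling is needed. It relies on GHC's UnliftedFFITypes extension, which lets an unpinned ByteArray# be passed to an unsafe foreign call, since the garbage collector cannot move the array while that call runs.

```haskell
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnliftedFFITypes #-}
module Main (main) where

import qualified Data.ByteString.Short as SBS
import Data.ByteString.Short.Internal (ShortByteString (..))
import Foreign.C.Types (CSize (..))
import GHC.Exts (ByteArray#)

-- A read-only C function taking a NUL-terminated string.
-- It must be an unsafe call for the unpinned array to stay in place.
foreign import ccall unsafe "string.h strlen"
  c_strlen :: ByteArray# -> IO CSize

-- | Hypothetical path representation that always stores a trailing NUL.
newtype NulTerminated = NulTerminated ShortByteString

-- | The NUL is appended once, at construction time.
mkNulTerminated :: ShortByteString -> NulTerminated
mkNulTerminated s = NulTerminated (s <> SBS.pack [0])

-- | Pass the underlying array straight to C, with no per-call copy.
cLength :: NulTerminated -> IO CSize
cLength (NulTerminated (SBS ba#)) = c_strlen ba#

main :: IO ()
main = cLength (mkNulTerminated (SBS.pack [47, 116, 109, 112])) >>= print -- bytes of "/tmp"
```

The trade-off is that every operation on the path would have to preserve (and hide) the extra byte, which is why this would need to live behind the opaque type.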
I might be able to attempt an implementation, given some pointers.
Thanks,
Romes
Wrt filepath itself: if you read thousands of files via pinned memory, you may cause very high memory fragmentation. The point of the new type is to avoid that fragmentation.
In GitLab by @alt-romes on Sep 29, 2022, 24:12