Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ByteString: More efficient IO read/write #129

Merged
merged 6 commits into from
Feb 14, 2025

Conversation

konsumlamm
Copy link
Contributor

Implement hPut, hGet, hGetContents using block IO and change readFile, writeFile, appendFile to use them.

Implements #112.

The code currently compiles, but it doesn't work, since primBS2FPtr, primFPtr2BS & primFPtrLen2BS aren't implemented correctly (_primitive "I" doesn't work, since while ByteString and ForeignPtr have essentially the same representation, they have different tags). I'm not sure what the best way forward is here. Ideally there would be conversion between bytestrings and forptrs without having to copy the payload, but I'm not sure that's possible. Another idea is to add mallocByteString and withByteString (so that we don't have to go through ForeignPtr), but that would mean code duplication.

@augustss
Copy link
Owner

I think the right thing is to use the same tag for both. The tags only differ because of serialization. But that's better done with some subfield of a foreign pointer. I think it's a straightforward change. I'll do it later today.

@augustss
Copy link
Owner

I've pushed a change that gets rid of T_BSTR.

I'm on vacation in New Zealand, so responses are a bit slow. :)

@konsumlamm
Copy link
Contributor Author

Unfortunately, I think this just moved the problem. It's still not possible to simply convert between a foreign pointer and a bytestring using I, since the latter needs the fptype to be set to FP_BSTR. Looking at addForPtr, I think something similar should work here, but since the foreign pointer and the bytestring need to share the finalizer, the tag can't be stored there. I think it would've been possible before, but maybe I'm missing something.

@augustss
Copy link
Owner

It can probably be fixed by just removing the sanity checks. The fptype doesn't matter, except for serialization.

@konsumlamm
Copy link
Contributor Author

It can probably be fixed by just removing the sanity checks. The fptype doesn't matter, except for serialization.

Ok, I'll try that.

@konsumlamm
Copy link
Contributor Author

I added two new primitives, fp2bs & bs2fp. fp2bs takes a foreign pointer and an int and returns the pointer as a bytestring (setting the fptype and the size), while bs2fp simply returns the bytestring forptr unchanged. I'm not entirely sure if this is sound, but at least I haven't encountered any segfault yet.

@konsumlamm konsumlamm changed the title ByteString: More efficient IO read/write (WIP) ByteString: More efficient IO read/write Feb 12, 2025
@konsumlamm konsumlamm marked this pull request as ready for review February 12, 2025 21:00
@augustss
Copy link
Owner

Just setting the type to bytestring isn't sound unless the pointer was originally a bytestring. A bytestring has a size that is used in serialization. The serialization code needs a sanity check for NOSIZE.

I'll look at it some more tonight.

@konsumlamm
Copy link
Contributor Author

Just setting the type to bytestring isn't sound unless the pointer was originally a bytestring. A bytestring has a size that is used in serialization. The serialization code needs a sanity check for NOSIZE.

fp2bs also sets the size.

@augustss
Copy link
Owner

Having fp2bs and bs2fp is a good idea.

@augustss augustss merged commit c7338fb into augustss:master Feb 14, 2025
13 checks passed
@konsumlamm konsumlamm deleted the bytestring-io branch February 14, 2025 09:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants