Adding in memory support for xyz files #5866

Merged (10 commits) on Dec 22, 2023

Conversation

samypr100
Contributor

@samypr100 samypr100 commented Jan 26, 2023

Type

  • Bug fix (non-breaking change which fixes an issue): Fixes #
  • New feature (non-breaking change which adds functionality). Resolves Support reading and writing file-like objects #1560 (partially)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected) Resolves #

Motivation and Context

The ability to read and write formats in memory is a common request across a variety of use cases. Open3D already has in-memory options for image formats such as PNG/JPG (e.g. ReadPNGFromMemory), so it would be useful for other file formats to have the same kind of support.

Checklist:

  • I have run python util/check_style.py --apply to apply Open3D code style
    to my code.
  • This PR changes Open3D behavior or adds new functionality.
    • Both C++ (Doxygen) and Python (Sphinx / Google style) documentation is
      updated accordingly.
    • I have added or updated C++ and / or Python unit tests OR included test
      results
      (e.g. screenshots or numbers) here.
  • I will follow up and update the code if CI fails.
  • For fork PRs, I have selected Allow edits from maintainers.

Description

The changes in this PR don't fully close #1560, since I believe there are other formats still to support, such as xyzn, xyzrgb, pts, pcd, and maybe others.

This PR only adds full "in memory" support for the xyz file format in order to keep the PR small, simple, and reviewable. It's also early, so if something needs to change in the approach, that is easier to do now than to retrofit across all formats later.

Since we're reading contents from memory, "auto" would not quite work because these files have no magic headers, so some changes were made so that the format is specified explicitly for both reading and writing point clouds. I'm using a prefix of mem::{format} for the new in-memory formats.
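
To illustrate why an explicit format string is needed, here is a minimal Python sketch of prefix-based dispatch. It is not Open3D's implementation: `READERS`, `read_from_bytes`, and `_read_xyz` are hypothetical names, and the parser assumes three whitespace-separated floats per line.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float, float]


def _read_xyz(data: bytes) -> List[Point]:
    """Parse 'x y z' lines from a byte buffer; blank lines are skipped."""
    points: List[Point] = []
    for line in data.decode("utf-8").splitlines():
        fields = line.split()
        if not fields:
            continue
        x, y, z = (float(v) for v in fields[:3])
        points.append((x, y, z))
    return points


# Hypothetical registry keyed by the explicit in-memory format string.
READERS: Dict[str, Callable[[bytes], List[Point]]] = {"mem::xyz": _read_xyz}


def read_from_bytes(data: bytes, fmt: str) -> List[Point]:
    # A raw buffer has no file name or extension to inspect, so "auto"
    # cannot be resolved and the caller must name the format.
    if fmt not in READERS:
        raise ValueError(f"unsupported in-memory format: {fmt!r}")
    return READERS[fmt](data)
```

Supporting another format in this sketch would only mean registering another entry in `READERS`.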

It simplifies Python code that currently has to go through temporary files to achieve similar behavior, as shown below.

>>> import open3d as o3d
>>> xyz_bytes = b"x y z\nx y z\n"
>>> pcr = o3d.io.read_point_cloud_from_bytes(xyz_bytes, "mem::xyz")
>>> pcb = o3d.io.write_point_cloud_to_bytes(pcr, "mem::xyz")
>>> pcb.decode("utf8")
"x y z\nx y z\n"
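
For comparison, the temporary-file workaround that the new functions replace looks roughly like the sketch below. `read_via_tempfile` is a hypothetical helper, and `read_path` stands in for any reader that only accepts a file path (for example `o3d.io.read_point_cloud`). The reader is passed in as a parameter so the sketch runs without Open3D.

```python
import os
import tempfile
from typing import Callable, TypeVar

T = TypeVar("T")


def read_via_tempfile(data: bytes,
                      read_path: Callable[[str], T],
                      suffix: str = ".xyz") -> T:
    """Write bytes to a temporary file so a path-only reader can load them."""
    # delete=False lets the reader reopen the file by name (needed on Windows).
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(data)
        path = tmp.name
    try:
        return read_path(path)
    finally:
        # Always clean up the temporary file, even if the reader raises.
        os.remove(path)
```

Every call pays for a disk round trip and cleanup, which is the overhead the in-memory functions avoid.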

The issue asked to mimic something like the file-like interface in BytesIO(), but Pybind currently doesn't "officially" offer that functionality despite being discussed in pybind/pybind11#1477.

There's a solution to it in that issue, but it would mean bundling a custom pybind extension into the build process, which we could explore. The benefit is that we'd be able to retain a file-like interface rather than passing a buffer around, although I'm not sure whether it would work with FILE and some of the other more advanced file system IO that's done throughout the codebase.

Lastly, with the current approach it's possible there'll be a decent amount of code duplication (similar to FileJPG.cpp and FilePNG.cpp). The existing code seems tightly coupled with the file IO, so I'm open to suggestions/recommendations/alternatives.
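
Until a file-like binding exists, a thin Python-side adapter can approximate the BytesIO-style interface on top of a bytes-based reader. This is only a sketch: `read_from_filelike` is a hypothetical helper, and `read_bytes` stands in for a function such as the `read_point_cloud_from_bytes` proposed in this PR.

```python
from typing import BinaryIO, Callable, TypeVar

T = TypeVar("T")


def read_from_filelike(stream: BinaryIO,
                       read_bytes: Callable[[bytes, str], T],
                       fmt: str) -> T:
    """Drain a binary file-like object and pass its contents to a bytes reader."""
    data = stream.read()
    if not isinstance(data, bytes):
        # Text-mode streams return str, which a bytes-based reader cannot take.
        raise TypeError("stream must be opened in binary mode")
    return read_bytes(data, fmt)
```

Under these assumptions a call would look like `read_from_filelike(io.BytesIO(xyz_bytes), o3d.io.read_point_cloud_from_bytes, "mem::xyz")`. The adapter still reads the whole stream into one buffer, so it offers the interface but not streaming.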


@update-docs

update-docs bot commented Jan 26, 2023

Thanks for submitting this pull request! The maintainers of this repository would appreciate if you could update the CHANGELOG.md based on your changes.

@samypr100
Contributor Author

Please retrigger workflows when possible, thank you 😅

Review thread on cpp/tests/io/PointCloudIO.cpp (outdated, resolved)
Review thread on cpp/pybind/io/class_io.cpp (resolved)
@ssheorey ssheorey added this to the v0.17 milestone Jan 27, 2023
Review thread on cpp/pybind/io/class_io.cpp (outdated, resolved)
Review thread on cpp/open3d/io/PointCloudIO.cpp (outdated, resolved)
@samypr100
Contributor Author

@yxlao I've made some updates, thanks

@ssheorey ssheorey requested a review from yxlao February 3, 2023 22:35
@samypr100
Contributor Author

@ssheorey Please re-run workflows, I've pushed a change that hopefully fixes the Windows build, thanks

@samypr100
Contributor Author

samypr100 commented Feb 11, 2023

Any pointers as to why I'd be getting these errors on the Windows pipelines are welcome. Short of setting up a development environment on Windows myself (I mainly use macOS and Linux), I'm not 100% sure how this builds on macOS/Linux but not on Windows, unless I missed something obvious during building. I haven't touched the code under t, and maybe that's the issue?

D:\a\Open3D\Open3D\cpp\tests\t\io\PointCloudIO.cpp(423,1): error C2440: '<function-style-cast>': cannot convert from 'initializer list' to 'open3d::io::WritePointCloudOption' [C:\Open3D\build\cpp\tests\tests.vcxproj]
D:\a\Open3D\Open3D\cpp\tests\t\io\PointCloudIO.cpp(426,1): message : No constructor could take the source type, or constructor overload resolution was ambiguous [C:\Open3D\build\cpp\tests\tests.vcxproj]
D:\a\Open3D\Open3D\cpp\tests\t\io\PointCloudIO.cpp(440,1): error C2440: '<function-style-cast>': cannot convert from 'initializer list' to 'open3d::io::WritePointCloudOption' [C:\Open3D\build\cpp\tests\tests.vcxproj]
D:\a\Open3D\Open3D\cpp\tests\t\io\PointCloudIO.cpp(443,1): message : No constructor could take the source type, or constructor overload resolution was ambiguous [C:\Open3D\build\cpp\tests\tests.vcxproj]

Edit: Fixed in 4284a20

@samypr100
Contributor Author

samypr100 commented Feb 13, 2023

@ssheorey workflow run please :), I tested on Windows and it seems to build now 🙌 (also rebased on the latest master for consistency)

@ssheorey
Member

Hi @yxlao can you take another look? Thanks.

Review thread on cpp/pybind/io/class_io.cpp (outdated, resolved)
@samypr100 samypr100 force-pushed the in-memory-xyz branch 3 times, most recently from 7f17aaf to 89fa0eb Compare February 18, 2023 18:35
@samypr100 samypr100 requested a review from yxlao February 19, 2023 15:32
@ssheorey
Member

ssheorey commented Mar 2, 2023

Hi @yxlao can you take another look at this PR? Thanks.

@Chen-Suyi

Thanks for your contributions!

Is this feature available now? It would be really helpful for me, but I cannot find any API description in the documentation.

@ssheorey ssheorey self-requested a review November 7, 2023 17:24
@charangodwin

I could help review this PR.

@charangodwin charangodwin left a comment

Reviewable status: 0 of 11 files reviewed, 9 unresolved discussions (waiting on @samypr100, @ssheorey, and @yxlao)


cpp/open3d/io/PointCloudIO.h line 95 at r4 (raw file):

            // Attention: when you update the defaults, update the docstrings in
            // pybind/io/class_io.cpp
            std::string format = "auto",

ES.23: Prefer the {}-initializer syntax


cpp/open3d/io/file_format/FileXYZ.cpp line 67 at r5 (raw file):

        reporter.SetTotal(static_cast<int64_t>(length));

        std::string content(reinterpret_cast<const char *>(buffer), length);

This creates another copy of the buffer, since std::string deep-copies the input. We also already use std::memcpy in class_io.cpp.
Is there a way to process the buffer directly through the char* pointer? Maybe using std::istrstream.

@samypr100
Contributor Author

@charangodwin re istrstream, see #5866 (comment) for some history on how changing this may affect benchmarks significantly.

@ssheorey ssheorey self-assigned this Dec 12, 2023
@ssheorey ssheorey requested a review from nsaiapova December 12, 2023 15:25
Collaborator

@nsaiapova nsaiapova left a comment

The PR looks good to me with just a couple of minor suggestions.

Review thread on cpp/open3d/io/file_format/FileXYZ.cpp (outdated, resolved)
Review thread on python/test/io/rpc/test_io.py (outdated, resolved)
@ssheorey ssheorey merged commit d7a2cf6 into isl-org:main Dec 22, 2023
29 of 30 checks passed
@ssheorey
Member

Thanks @samypr100 for this very useful contribution and sorry for the delay in getting this merged - we'll try to merge faster in the future....

Minor comment about the other formats (xyzn, xyzi, xyzrgb): these are better supported by the more generic Tensor-based data types (open3d::t::io namespace). Functions for reading them from a file are already available there.


Successfully merging this pull request may close these issues.

Support reading and writing file-like objects
6 participants