✨ A read_geotiff function for reading GeoTIFF into ndarray #3

Merged
weiji14 merged 14 commits into main from feat/read_geotiff
Feb 28, 2024

Conversation

weiji14
Owner

@weiji14 weiji14 commented Feb 26, 2024

Rust-based function for reading GeoTIFF files!

Uses the tiff crate for the I/O, ndarray crate to store the 2D array in Rust, and numpy crate to convert to numpy.ndarray in Python.

Note: For now, this only works on single-band GeoTIFF files with a float32 dtype.

Usage:

In Rust:

use ndarray::Array2;
use std::fs::File;
use cog3pio::io::geotiff::read_geotiff;

let path: &str = "path/to/file.tif";
let file: File = File::open(path).expect("Cannot find GeoTIFF file");
let arr: Array2<f32> = read_geotiff(file).unwrap();
assert_eq!(arr.dim(), (20, 20));
assert_eq!(arr.mean(), Some(19.0));

In Python:

import numpy as np
from cog3pio import read_geotiff

array: np.ndarray = read_geotiff(path="georaster/data/tiff/float32.tif")
assert array.shape == (20, 20)
assert array.dtype == "float32"
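
To make the single-band float32 restriction more concrete, here is a minimal pure-Python sketch of what decoding such a file involves. It is not how cog3pio works internally (this PR delegates decoding to the Rust `tiff` crate). The helper names `read_float32_tiff` and `make_tiny_tiff` are hypothetical, and the reader assumes an uncompressed image stored in a single strip with inline SHORT/LONG tag values.

```python
import io
import struct
from typing import BinaryIO


def read_float32_tiff(stream: BinaryIO) -> list[list[float]]:
    """Decode an uncompressed, single-strip, single-band float32 TIFF.

    Hypothetical helper for illustration only, not part of cog3pio.
    """
    stream.seek(0)
    order = {b"II": "<", b"MM": ">"}[stream.read(2)]  # byte-order mark
    magic, ifd_offset = struct.unpack(order + "HI", stream.read(6))
    if magic != 42:
        raise ValueError("not a classic TIFF file")

    # The image file directory (IFD) is an entry count plus 12-byte entries.
    stream.seek(ifd_offset)
    (n_entries,) = struct.unpack(order + "H", stream.read(2))
    tags: dict[int, int] = {}
    for _ in range(n_entries):
        tag, dtype, _count, raw = struct.unpack(order + "HHI4s", stream.read(12))
        if dtype == 3:  # SHORT: value sits in the first two bytes
            (tags[tag],) = struct.unpack(order + "H", raw[:2])
        elif dtype == 4:  # LONG: value fills all four bytes
            (tags[tag],) = struct.unpack(order + "I", raw)

    # 258 = BitsPerSample, 259 = Compression, 277 = SamplesPerPixel, 339 = SampleFormat
    if tags.get(259, 1) != 1 or tags.get(258) != 32 or tags.get(339) != 3:
        raise ValueError("only uncompressed 32-bit float data is supported")
    if tags.get(277, 1) != 1:
        raise ValueError("only single-band images are supported")

    width, height = tags[256], tags[257]  # ImageWidth, ImageLength
    stream.seek(tags[273])  # StripOffsets (assumes exactly one strip)
    flat = struct.unpack(f"{order}{width * height}f", stream.read(4 * width * height))
    return [list(flat[r * width:(r + 1) * width]) for r in range(height)]


def make_tiny_tiff(rows: list[list[float]]) -> bytes:
    """Build a small little-endian float32 TIFF in memory (demo data only)."""
    height, width = len(rows), len(rows[0])
    pixels = struct.pack(f"<{width * height}f", *[v for row in rows for v in row])
    entries = [  # (tag, type, count, value), in ascending tag order
        (256, 4, 1, width), (257, 4, 1, height), (258, 3, 1, 32),
        (259, 3, 1, 1), (262, 3, 1, 1), (273, 4, 1, 8), (277, 3, 1, 1),
        (278, 4, 1, height), (279, 4, 1, len(pixels)), (339, 3, 1, 3),
    ]
    header = struct.pack("<2sHI", b"II", 42, 8 + len(pixels))  # IFD follows the pixels
    ifd = struct.pack("<H", len(entries))
    ifd += b"".join(struct.pack("<HHII", *entry) for entry in entries)
    ifd += struct.pack("<I", 0)  # no further IFDs
    return header + pixels + ifd


rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.5]]
array = read_float32_tiff(io.BytesIO(make_tiny_tiff(rows)))
assert array == rows
```

A real GeoTIFF can also be tiled, compressed, multi-band, or carry georeferencing tags, which is why the PR leans on an existing decoder rather than hand-rolled parsing like this.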

TODO:

  • Initial implementation to allow reading single-band Float32 dtype GeoTIFF files
  • Improve documentation
  • Add unit tests in Python and Rust

TODO in future:

References:

TIFF decoding and encoding library in pure Rust!
An n-dimensional array for general elements and for numerics!
PyO3-based Rust bindings of the NumPy C-API!
Rust-based function for reading GeoTIFF files! Uses the `tiff` crate for the I/O, `ndarray` crate to store the 2D array in Rust, and `numpy` crate to convert to numpy.ndarray in Python.
@weiji14 weiji14 added the feature New feature or request label Feb 26, 2024
@weiji14 weiji14 added this to the 0.1.0 milestone Feb 26, 2024
@weiji14 weiji14 self-assigned this Feb 26, 2024
@weiji14 weiji14 changed the title from "Feat/read geotiff" to "✨ A read_geotiff function for reading GeoTIFF into ndarray" Feb 26, 2024
Needed to prevent `error while loading shared libraries: libpython3.12.so.1.0: cannot open shared object file` error when running `cargo test`.
A library for managing temporary files and directories!
Allow passing anything that implements Read + Seek into the `read_geotiff` function, such as an in-memory stream buffer. The `read_geotiff_py` function still accepts an &str reference to a filepath.
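
The split described above (a core function that takes any readable, seekable stream, plus a thin wrapper that takes a file path) can be sketched in Python as a rough analogue of the Rust `Read + Seek` bound. The function names `tiff_byte_order` and `tiff_byte_order_from_path` are hypothetical and only inspect the 4-byte TIFF header; they are not part of cog3pio.

```python
import io
import os
import tempfile
from typing import BinaryIO


def tiff_byte_order(stream: BinaryIO) -> str:
    """Core function: accepts any readable, seekable binary stream."""
    stream.seek(0)
    mark = stream.read(4)
    if mark == b"II*\x00":  # little-endian mark followed by magic number 42
        return "little"
    if mark == b"MM\x00*":  # big-endian mark followed by magic number 42
        return "big"
    raise ValueError("not a classic TIFF header")


def tiff_byte_order_from_path(path: str) -> str:
    """Thin wrapper: opens the file and delegates to the stream-based core."""
    with open(path, "rb") as file:
        return tiff_byte_order(file)


header = b"II*\x00\x08\x00\x00\x00"

# In-memory buffer: no file on disk is needed.
in_memory = tiff_byte_order(io.BytesIO(header))

# File on disk: same core logic, reached through the path-based wrapper.
with tempfile.NamedTemporaryFile(suffix=".tif", delete=False) as tmp:
    tmp.write(header)
try:
    from_disk = tiff_byte_order_from_path(tmp.name)
finally:
    os.remove(tmp.name)

assert in_memory == from_disk == "little"
```

Keeping the core generic over streams makes it straightforward to test against in-memory bytes, while callers who only have a path still get a convenient entry point.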
Not exactly a GeoTIFF file, but at least it's something!
Test reading an actual single-band float32 GeoTIFF file downloaded from the internet. Done in Python.
NumPy is the fundamental package for array computing with Python! Setting minimum pin of 1.23 following SPEC 0.

Prevent error like `pyo3_runtime.PanicException: Failed to access NumPy array API capsule: PyErr { type: <class 'ModuleNotFoundError'>, value: ModuleNotFoundError("No module named 'numpy'"), traceback: None }`.
@@ -57,6 +57,7 @@ jobs:
install: |
apt-get update
apt-get install -y --no-install-recommends python3 python3-pip
apt build-dep numpy
Owner Author

Original error on s390x build at https://github.com/weiji14/cog3pio/actions/runs/8071580043/job/22051501437#step:7:410 before this line was added:

    error: subprocess-exited-with-error
    
    × Preparing metadata (pyproject.toml) did not run successfully.
    │ exit code: 1
    ╰─> [20 lines of output]
        + /usr/bin/python3 /tmp/pip-install-6l_88pqi/numpy_281cd73ebd0a445c8bd7bbbb7479992b/vendored-meson/meson/meson.py setup /tmp/pip-install-6l_88pqi/numpy_281cd73ebd0a445c8bd7bbbb7479992b /tmp/pip-install-6l_88pqi/numpy_281cd73ebd0a445c8bd7bbbb7479992b/.mesonpy-q6jzljl7 -Dbuildtype=release -Db_ndebug=if-release -Db_vscrt=md --native-file=/tmp/pip-install-6l_88pqi/numpy_281cd73ebd0a445c8bd7bbbb7479992b/.mesonpy-q6jzljl7/meson-python-native-file.ini
        The Meson build system
        Version: 1.2.99
        Source dir: /tmp/pip-install-6l_88pqi/numpy_281cd73ebd0a445c8bd7bbbb7479992b
        Build dir: /tmp/pip-install-6l_88pqi/numpy_281cd73ebd0a445c8bd7bbbb7479992b/.mesonpy-q6jzljl7
        Build type: native build
        Project name: NumPy
        Project version: 1.26.4
        
        ../meson.build:1:0: ERROR: Unknown compiler(s): [['cc'], ['gcc'], ['clang'], ['nvc'], ['pgcc'], ['icc'], ['icx']]
        The following exception(s) were encountered:
        Running `cc --version` gave "[Errno 2] No such file or directory: 'cc'"
        Running `gcc --version` gave "[Errno 2] No such file or directory: 'gcc'"
        Running `clang --version` gave "[Errno 2] No such file or directory: 'clang'"
        Running `nvc --version` gave "[Errno 2] No such file or directory: 'nvc'"
        Running `pgcc --version` gave "[Errno 2] No such file or directory: 'pgcc'"
        Running `icc --version` gave "[Errno 2] No such file or directory: 'icc'"
        Running `icx --version` gave "[Errno 2] No such file or directory: 'icx'"
        
        A full log can be found at /tmp/pip-install-6l_88pqi/numpy_281cd73ebd0a445c8bd7bbbb7479992b/.mesonpy-q6jzljl7/meson-logs/meson-log.txt
        [end of output]
    
    note: This error originates from a subprocess, and is likely not a problem with pip.
  error: metadata-generation-failed
  
  × Encountered error while generating package metadata.
  ╰─> See above for output.
  
  note: This is an issue with the package mentioned above, not pip.
  hint: See above for details.
  ::error::The process '/home/runner/work/_actions/uraimo/run-on-arch-action/517085f0367c8256bcfa753e3e13e1550af09954/src/run-on-arch.sh' failed with exit code 1

So at b54ab26, I added this `apt build-dep numpy` line following https://numpy.org/devdocs/building/#system-level-dependencies, but got `E: You must put some 'deb-src' URIs in your sources.list` at https://github.com/weiji14/cog3pio/actions/runs/8071955898/job/22052723659?pr=3#step:7:330.

It is probably best to just install the compilers individually instead of messing with /etc/apt/sources.list.
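
As a rough diagnostic for the Meson failure quoted above, the following sketch checks which of the compiler names from that error message are available on `PATH`. It is an illustration rather than part of this PR, and the function name `find_c_compilers` is hypothetical. It only looks up executables by name and does not run them.

```python
import shutil

# Compiler names taken from the Meson "Unknown compiler(s)" error above.
CANDIDATES = ["cc", "gcc", "clang", "nvc", "pgcc", "icc", "icx"]


def find_c_compilers(candidates: list[str] = CANDIDATES) -> dict[str, str]:
    """Map each compiler name found on PATH to its resolved location."""
    found = {}
    for name in candidates:
        path = shutil.which(name)  # returns None when the executable is missing
        if path is not None:
            found[name] = path
    return found


compilers = find_c_compilers()
if compilers:
    print("C compilers found:", compilers)
else:
    print("No C compiler on PATH; building NumPy from source would fail.")
```

Running a check like this inside the emulated container before `pip install` would show whether a compiler is missing without waiting for the NumPy build to fail.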

Owner Author

Ok, added the NumPy build deps manually at 41259a1, and also needed to `apt install ninja-build` to get NumPy to compile on armv7/s390x/ppc64le (9dfa03a). The CI on those three platforms takes ~24min to run though, compared to <5min for linux/aarch64 🙃

Manually specifying GCC and other build libraries to install with `apt` instead of using `apt build-dep numpy`, following https://numpy.org/devdocs/building/#system-level-dependencies.
Try to get build dependencies to compile ninja and prevent `ERROR: Failed building wheel for ninja`.
Add a title/heading for the crate docs, and some extra descriptions of what the crate does. Added the `missing_docs` lint to warn when docs are missing. Also improved the module-level docs, and made some minor rustfmt fixes.
@weiji14 weiji14 marked this pull request as ready for review February 28, 2024 04:58
@weiji14 weiji14 merged commit ab83aa6 into main Feb 28, 2024
23 checks passed
@weiji14 weiji14 deleted the feat/read_geotiff branch February 28, 2024 04:59