simpler hierarchy
Benjamin Lefaudeux committed Oct 31, 2024
1 parent 57838a1 commit 5c115ae
Showing 17 changed files with 28 additions and 42 deletions.
16 changes: 6 additions & 10 deletions .github/workflows/go.yml
@@ -15,17 +15,13 @@ jobs:
steps:
- uses: actions/checkout@v4

- name: Set up Go
uses: golangci/golangci-lint-action@v6

# uses: actions/setup-go@v4
# with:
# go-version: "1.20"

- name: Install linux deps
run: |
sudo apt-get update
sudo apt-get -y install libvips-dev
sudo apt-get -y install libvips-dev libjpeg-turbo8-dev
- name: Set up Go
uses: golangci/golangci-lint-action@v6

- name: Install pre-commit
run: |
@@ -36,12 +32,12 @@ jobs:
run: pre-commit run --all-files

- name: Build
run: cd src/cmd/main && go build -v main.go
run: cd cmd/main && go build -v main.go

- name: Test
env:
DATAROOM_API_KEY: ${{ secrets.DATAROOM_API_KEY }}
DATAROOM_TEST_SOURCE: ${{ secrets.DATAROOM_TEST_SOURCE }}
DATAROOM_API_URL: ${{ secrets.DATAROOM_API_URL }}

run: cd src/tests && go test -v .
run: cd tests && go test -v .
4 changes: 2 additions & 2 deletions .github/workflows/gopy.yml
@@ -38,7 +38,7 @@ jobs:
- name: Build python module
run: |
cd src/pkg/client
cd pkg
gopy pkg -author="Photoroom" -email="[email protected]" -name="datago" .
export DESTINATION="../../../build"
mkdir -p $DESTINATION/datago
@@ -47,7 +47,7 @@ jobs:
mv Makefile $DESTINATION/.
mv README.md $DESTINATION/.
rm LICENSE MANIFEST.in
cd ../../../build
cd ../build
- name: Install python module
run: |
1 change: 0 additions & 1 deletion .pre-commit-config.yaml
@@ -11,4 +11,3 @@ repos:
- id: go-fmt
- id: go-imports
- id: golangci-lint
args: ["run", "src"]
41 changes: 16 additions & 25 deletions README.md
@@ -1,12 +1,12 @@
[![Build & Test](https://github.com/Photoroom/datago/actions/workflows/go.yml/badge.svg)](https://github.com/Photoroom/datago/actions/workflows/go.yml)
[![Gopy](https://github.com/Photoroom/datago/actions/workflows/gopy.yml/badge.svg)](https://github.com/Photoroom/datago/actions/workflows/gopy.yml)

datago
======
# datago

A golang-based data loader which can be used from Python. Compatible with a soon-to-be open sourced VectorDB-enabled data stack, which exposes HTTP requests.

Datago handles, outside of the Python GIL

- per sample IO from object storage
- deserialization (jpg and png decompression)
- some optional vision processing (aligning different image payloads)
@@ -19,11 +19,9 @@ Datago is rank and world-size aware, in which case the samples are dispatched de

<img width="922" alt="Screenshot 2024-09-24 at 9 39 44 PM" src="https://github.com/user-attachments/assets/b58002ce-f961-438b-af72-9e1338527365">


<details> <summary><strong>Use it</strong></summary>

Use the package from Python
---------------------------
## Use the package from Python

```python
from datago import datago
@@ -40,26 +38,22 @@ for _ in range(10):

Please note that the image buffers will be passed around as raw pointers, they can be re-interpreted in python with the attached helpers


Match the raw exported buffers with typical python types
--------------------------------------------------------
## Match the raw exported buffers with typical python types

See helper functions provided in `polyglot.py`, should be self explanatory
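
As a rough illustration of what re-interpreting a raw pointer involves, here is a minimal standard-library sketch. It is not the content of `polyglot.py`, which is not shown here; the helper name `bytes_from_raw_pointer` and the assumption that a buffer arrives as an integer address plus height, width and channel count are hypothetical.

```python
import ctypes


def bytes_from_raw_pointer(address: int, height: int, width: int, channels: int) -> bytes:
    """Copy height * width * channels bytes starting at a raw address.

    Hypothetical helper for illustration only: the name and the
    (address, shape) layout are assumptions, not datago's API.
    """
    size = height * width * channels
    # ctypes.string_at copies `size` bytes starting at the given address
    return ctypes.string_at(address, size)


# Stand-in for a buffer owned by native code: a 2x3 single-channel image
native_buffer = (ctypes.c_uint8 * 6)(0, 1, 2, 3, 4, 5)
address = ctypes.addressof(native_buffer)

pixels = bytes_from_raw_pointer(address, height=2, width=3, channels=1)
# Split the flat byte string back into rows of `width` pixels
rows = [list(pixels[r * 3:(r + 1) * 3]) for r in range(2)]
```

Copying the bytes out, as done here, keeps the Python object valid even if the native side later frees the buffer; a zero-copy view would be faster but would tie the object's lifetime to the native allocation.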

</details><details> <summary><strong>Build it</strong></summary>

Install deps
------------
## Install deps

```bash
$ sudo apt install golang libjpeg-turbo8-dev libvips-dev
$ sudo ldconfig
```

Build a benchmark CLI
---------------------
## Build a benchmark CLI

From the root of this project `datago_src`:
From the root of this project:

```bash
$ go build cmd/main/main.go
@@ -77,23 +71,20 @@ Running it with additional sanity checks
$ go run -race cmd/main/main.go
```

Run the go test suite
---------------------
## Run the go test suite

From the src folder
From the root folder

```bash
$ go test -v tests/client_test.go
```

Refresh the python package and its binaries
-------------------------------------------
## Refresh the python package and its binaries

- Install the dependencies as detailed in the next point
- Run the `generate_python_package.sh` script

Generate the python package binaries manually
---------------------------------------------
## Generate the python package binaries manually

```bash
$ python3 -m pip install pybindgen
@@ -103,30 +94,30 @@ $ go install golang.org/x/image/draw
```

NOTE:

- you may need to add `~/go/bin` to your PATH so that gopy is found.
- - Either `export PATH=$PATH:~/go/bin` or add it to your .bashrc
- you may need this to make sure that LDD looks at the current folder `export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:.`

then from the /pkg/client folder:
then from the /pkg folder:

```bash
$ gopy pkg -author="Photoroom" -email="[email protected]" -url="" -name="datago" -version="0.0.1" .
```

then you can `pip install -e .` from here.

## Update the pypi release (maintainers)

Update the pypi release (maintainers)
-------------------------------------
```
python3 setup.py sdist
python3 -m twine upload dist/* --verbose
```
</details>
# License
License
=======
MIT License
Copyright (c) 2024 Photoroom
File renamed without changes.
2 changes: 1 addition & 1 deletion src/cmd/main/main.go → cmd/main.go
@@ -1,7 +1,7 @@
package main

import (
datago "datago/pkg/client"
datago "datago/pkg"
"flag"
"fmt"
"os"
4 changes: 2 additions & 2 deletions generate_python_package.sh
@@ -11,7 +11,7 @@ DESTINATION="../../../python_$python_version"
rm -rf $DESTINATION

# Build the python package via the gopy toolchain
cd src/pkg/client
cd pkg
gopy pkg -author="Photoroom" -email="[email protected]" -url="" -name="datago" -version="0.3" .
mkdir -p $DESTINATION/datago
mv datago/* $DESTINATION/datago/.
@@ -21,4 +21,4 @@ mv README.md $DESTINATION/.
rm LICENSE
rm MANIFEST.in

cd ../../..
cd ..
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
2 changes: 1 addition & 1 deletion src/tests/client_test.go → tests/client_test.go
@@ -4,7 +4,7 @@ import (
"os"
"testing"

datago "datago/pkg/client"
datago "datago/pkg"

"github.com/davidbyttow/govips/v2/vips"
)