Sqlitedb API implementation #60

Merged
merged 34 commits into from
Aug 23, 2024
Changes from 32 commits
Commits
4a6dcf6
WIP
thorbjoernl Jul 31, 2024
9bf4395
WIP
thorbjoernl Jul 31, 2024
37d4ec5
All main tests work
thorbjoernl Jul 31, 2024
9bd7895
WIP
thorbjoernl Aug 1, 2024
9e04c2f
New URI design for sqlite
thorbjoernl Aug 1, 2024
da974a8
Use new URI in jsonfiledb
thorbjoernl Aug 1, 2024
e1f5d6a
All tests work again
thorbjoernl Aug 1, 2024
59a76df
Remove commented code
thorbjoernl Aug 2, 2024
4836e13
List api returns new uri
thorbjoernl Aug 2, 2024
dffdfa4
WIP
thorbjoernl Aug 5, 2024
aaf6eab
Fix some tests
thorbjoernl Aug 5, 2024
715c11c
WIP
thorbjoernl Aug 6, 2024
44509b6
Make more tests apply to both json and sqlite
thorbjoernl Aug 9, 2024
01d0aec
Add per-entry ctime/mtime
thorbjoernl Aug 14, 2024
a5572c2
Remove performance claim about locking in docs
thorbjoernl Aug 14, 2024
3c566a1
merge
thorbjoernl Aug 14, 2024
65f8730
Merge branch 'main' into sqlitedb
thorbjoernl Aug 15, 2024
59533c1
Fix parameter validation in jsonfiledb
thorbjoernl Aug 15, 2024
63b63d6
Merge branch 'main' into sqlitedb
thorbjoernl Aug 15, 2024
cc0f011
Extract lookup code out of jsondb
thorbjoernl Aug 15, 2024
0feb274
WIP
thorbjoernl Aug 15, 2024
a6bfdae
Almost all tests work
thorbjoernl Aug 16, 2024
53ef2fb
Remove pytest hook
thorbjoernl Aug 16, 2024
3768c68
models-style test works
thorbjoernl Aug 16, 2024
5965935
All tests pass
thorbjoernl Aug 16, 2024
f6c15f7
Improve documentation
thorbjoernl Aug 16, 2024
ebad44f
Uncomment
thorbjoernl Aug 16, 2024
225ac13
Test list_timeseries for both json and sqlite
thorbjoernl Aug 16, 2024
132d8e3
Fix profile route (again)
thorbjoernl Aug 20, 2024
6c46606
Merge remote-tracking branch 'origin/main' into sqlitedb
thorbjoernl Aug 20, 2024
9d9090d
WIP
thorbjoernl Aug 20, 2024
1db7dcf
Merge in main and resolve conflicts
thorbjoernl Aug 21, 2024
ce6bb17
Code review
thorbjoernl Aug 23, 2024
ea63434
Code review 2
thorbjoernl Aug 23, 2024
10 changes: 0 additions & 10 deletions .pre-commit-config.yaml
@@ -5,16 +5,6 @@ repos:
rev: 23.9.1
hooks:
- id: black
- repo: local
hooks:
- id: tox-code-checks
name: Run tox targets -- tests
stages: [commit]
language: system
types: [python]
pass_filenames: false
verbose: true
entry: tox -v
- repo: https://github.com/pre-commit/mirrors-mypy
rev: 'v1.10.0'
hooks:
2 changes: 1 addition & 1 deletion docs/locking.rst
@@ -5,7 +5,7 @@ To ensure consistent writes, aerovaldb provides a locking mechanism which can be

For :class:`AerovalJsonFileDB` the locking mechanism uses a folder of lock files (`~/.aerovaldb/` by default) to coordinate the lock. It is important that the file system where the lock files are stored supports `fcntl <https://linux.die.net/man/2/fcntl>`.

By default locking is disabled as it has large effect on performance. To enable, set the environment variable `AVDB_USE_LOCKING=1`.
By default locking is disabled as it may impact performance. To enable, set the environment variable `AVDB_USE_LOCKING=1`.
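
A minimal sketch of enabling locking from Python instead of the shell. It assumes aerovaldb reads `AVDB_USE_LOCKING` from the process environment; when exactly the variable is read is not stated here, so setting it before importing the package is the conservative choice.

```python
import os

# Assumption: the variable has to be visible in the process environment
# before aerovaldb consults it, so it is set before the package is imported.
os.environ["AVDB_USE_LOCKING"] = "1"

# import aerovaldb  # import only after the variable has been set
```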

Overriding the lock-file directory
----------------------------------
34 changes: 34 additions & 0 deletions scripts/build_sqlite_test_database.py
@@ -0,0 +1,34 @@
import os
import aerovaldb

# This script is a helper script to create an sqlite database
# with the same contents as the json test database. It does so
# by copying each asset from the jsondb into test.sqlite

if os.path.exists("tests/test-db/sqlite/test.sqlite"):
    os.remove("tests/test-db/sqlite/test.sqlite")

jsondb = aerovaldb.open("json_files:tests/test-db/json")
sqlitedb = aerovaldb.open("sqlitedb:tests/test-db/sqlite/test.sqlite")

for i, uri in enumerate(list(jsondb.list_all())):
    print(f"[{i}] - Processing uri {uri}")
    data = jsondb.get_by_uri(
        uri, access_type=aerovaldb.AccessType.JSON_STR, default="{}"
    )
    sqlitedb.put_by_uri(data, uri)

json_list = list(jsondb.list_all())
sqlite_list = list(sqlitedb.list_all())
print("The following URIs exist in jsondb but not sqlitedb")
for x in json_list:
    if not (x in sqlite_list):
        print(x)

print("The following URIs exist in sqlitedb but not jsondb")
for x in sqlite_list:
    if not (x in json_list):
        print(x)

print(f"jsondb number of assets: {len(list(jsondb.list_all()))}")
print(f"sqlite number of assets: {len(list(sqlitedb.list_all()))}")
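
The two comparison loops at the end of the script test membership against plain lists, which scales quadratically with the number of assets. A possible alternative, sketched here under the assumption that URIs are plain strings (as the `Generator[str, None, None]` annotations suggest), is to compare sets. The function name `uri_differences` and the sample URIs are hypothetical and not part of this PR.

```python
def uri_differences(left, right):
    """Return (only_in_left, only_in_right) as sorted lists."""
    left_set, right_set = set(left), set(right)
    return sorted(left_set - right_set), sorted(right_set - left_set)


# Hypothetical URI strings, used for illustration only.
json_uris = ["a", "b", "c"]
sqlite_uris = ["b", "c", "d"]

missing_in_sqlite, missing_in_json = uri_differences(json_uris, sqlite_uris)
print("In jsondb but not sqlitedb:", missing_in_sqlite)
print("In sqlitedb but not jsondb:", missing_in_json)
```

In the script, `json_uris` and `sqlite_uris` would correspond to `json_list` and `sqlite_list`.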
1 change: 1 addition & 0 deletions setup.cfg
@@ -44,6 +44,7 @@ where=src
[options.entry_points]
aerovaldb =
json_files = aerovaldb.jsondb:AerovalJsonFileDB
sqlitedb = aerovaldb.sqlitedb:AerovalSqliteDB

[tox:tox]
labels =
73 changes: 64 additions & 9 deletions src/aerovaldb/aerovaldb.py
@@ -5,6 +5,7 @@
from .types import AccessType
from .utils import async_and_sync
from .routes import *
from .lock import FakeLock, FileLock


def get_method(route):
@@ -211,7 +212,11 @@ async def put_glob_stats(
raise NotImplementedError

def list_glob_stats(
self, project: str, experiment: str
self,
project: str,
experiment: str,
/,
access_type: str | AccessType = AccessType.URI,
) -> Generator[str, None, None]:
"""Generator that lists the URI for each glob_stats object.
@@ -339,7 +344,11 @@ async def put_timeseries(
raise NotImplementedError

def list_timeseries(
self, project: str, experiment: str
self,
project: str,
experiment: str,
/,
access_type: str | AccessType = AccessType.URI,
) -> Generator[str, None, None]:
"""Returns a list of URIs of all timeseries files for
a given project and experiment id.
@@ -761,7 +770,13 @@ async def put_map(
"""
raise NotImplementedError

def list_map(self, project: str, experiment: str) -> Generator[str, None, None]:
def list_map(
self,
project: str,
experiment: str,
/,
access_type: str | AccessType = AccessType.URI,
) -> Generator[str, None, None]:
"""Lists all map files for a given project / experiment combination.
:param project: The project ID.
@@ -1140,9 +1155,9 @@ async def get_by_uri(
Note:
-----
URI is implementation specific. While AerovalJsonFileDB returns
a file path, this behaviour should not be relied upon as other
implementations may not.
URI is intended to be consistent between implementations. An object fetched
with get_by_uri() can therefore be written to another connector by passing
the same URI to its put_by_uri() method.
"""
raise NotImplementedError

@@ -1156,9 +1171,9 @@ async def put_by_uri(self, obj, uri: str):
Note:
-----
URI is implementation specific. While AerovalJsonFileDB returns
a file path as the uri, this behaviour should not be relied upon
as other implementations will not.
URI is intended to be consistent between implementations. An object fetched
with get_by_uri() can therefore be written to another connector by passing
the same URI to its put_by_uri() method.
"""
raise NotImplementedError

@@ -1170,3 +1185,43 @@ def lock(self):
See also: https://aerovaldb.readthedocs.io/en/latest/locking.html
"""
raise NotImplementedError

    def _normalize_access_type(
        self, access_type: AccessType | str | None, default: AccessType = AccessType.OBJ
    ) -> AccessType:
        """Normalizes the access_type to an instance of AccessType enum.
        :param access_type: AccessType instance or string convertible to AccessType
        :param default: The type to return if access_type is None. Defaults to AccessType.OBJ
        :raises ValueError: If str access_type can't be converted to AccessType.
        :raises ValueError: If access_type is not str or AccessType
        :return: The normalized AccessType.
        """
        if isinstance(access_type, AccessType):
            return access_type

        if isinstance(access_type, str):
            try:
                return AccessType[access_type]
            except:
                raise ValueError(
                    f"String '{access_type}' can not be converted to AccessType."
                )
        if access_type is None:
            return default

        assert False

    def list_all(
        self, access_type: str | AccessType = AccessType.URI
    ) -> Generator[str, None, None]:
        """Iterator over the URI of each object stored in the
        current aerovaldb connection.
        :param access_type: What to return (this is implementation specific, but in general
            each implementation should support URI).
        :raises UnsupportedOperation: For non-supported access types.
        """
        raise NotImplementedError
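
The docstring of `_normalize_access_type` in this diff documents a `ValueError` for unsupported input types, while the final `assert False` would raise `AssertionError` (and is skipped entirely under `python -O`); the bare `except:` also catches more than a failed enum lookup. A standalone sketch of one possible tightening follows. The `AccessType` enum here is a stand-in containing only the members named in this diff, not the real `aerovaldb.types.AccessType`, and the free function `normalize_access_type` is hypothetical.

```python
from enum import Enum, auto


class AccessType(Enum):
    # Stand-in enum: only the members that appear in this diff.
    OBJ = auto()
    JSON_STR = auto()
    URI = auto()


def normalize_access_type(access_type, default=AccessType.OBJ):
    """Return an AccessType for an AccessType, a member name, or None."""
    if isinstance(access_type, AccessType):
        return access_type
    if access_type is None:
        return default
    if isinstance(access_type, str):
        try:
            # Lookup by member name; unknown names raise KeyError.
            return AccessType[access_type]
        except KeyError as e:
            raise ValueError(
                f"String '{access_type}' can not be converted to AccessType."
            ) from e
    # Any other type is rejected explicitly instead of via an assertion.
    raise ValueError(
        f"access_type must be str or AccessType, got {type(access_type).__name__}."
    )
```

Catching only `KeyError` keeps unrelated errors visible, and the explicit final `raise` matches the documented behaviour regardless of interpreter flags.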