Partial context updates #93

Open
wants to merge 340 commits into
base: dev
Changes from 184 commits
83547a6
update property added
pseusys May 23, 2023
89bdf54
typos fixed
pseusys May 24, 2023
6c17b1e
new scheme proposal
pseusys May 24, 2023
e263354
new subscript type (no subscript)
pseusys May 31, 2023
9d00610
private key, 3 default values and test fix
pseusys Jun 1, 2023
ec543d2
sample example added
pseusys Jun 1, 2023
8950e60
example db cleaned
pseusys Jun 1, 2023
157a18a
mongo completed
pseusys Jun 6, 2023
38b06f0
sql operational
pseusys Jun 6, 2023
d1e1b4f
ydb operational (for current test set)
pseusys Jun 8, 2023
acfb571
sql redefinition fixed
pseusys Jun 9, 2023
02eedff
attributes moved to vars
pseusys Jun 11, 2023
9e77d90
type checks and restrictions added
pseusys Jun 13, 2023
fcfaf06
function order fixed
pseusys Jun 13, 2023
885c899
policies tests added
pseusys Jun 13, 2023
6ae19ef
_hilarious_ YDB random bug fixed **again** just for `FUN`
pseusys Jun 13, 2023
155867d
lint applied
pseusys Jun 14, 2023
2a81e93
tests restored
pseusys Jun 14, 2023
cf2e8db
Merge branch 'dev' into feat/partial_context_updates
pseusys Jun 14, 2023
d4ad968
errors fixed
pseusys Jun 14, 2023
ad9eb62
fixed two more errors
pseusys Jun 14, 2023
e57cfd1
docstrings added
pseusys Jun 14, 2023
0dc2070
mongo bug fixed
pseusys Jun 16, 2023
04e27c6
redis optimized
pseusys Jun 16, 2023
bea2155
mongo indexes added
pseusys Jun 16, 2023
9eef2ca
one less query for redis
pseusys Jun 16, 2023
ab0aaf8
sql requests made async
pseusys Jun 20, 2023
98f427d
one other option for SQL storages
pseusys Jun 23, 2023
5588dc6
sqls fixed
pseusys Jun 23, 2023
8121a25
duplicate indexes removed
pseusys Jun 25, 2023
5fe4d31
sqlite async error fixed
pseusys Jun 26, 2023
74e76b5
sync writes
pseusys Jun 27, 2023
b00fc30
async disabling possibility added, query parameters overflow fixed
pseusys Jun 28, 2023
5755d06
sql log read finished
pseusys Jun 28, 2023
5d70793
load test added, all items fixed
pseusys Jun 29, 2023
221fa01
ydb implemented
pseusys Jun 29, 2023
dd5c4d0
mongo finished
pseusys Jun 29, 2023
b44484d
mongo passes all tests correctly
pseusys Jun 29, 2023
ecc92bf
single log behavior added as default
pseusys Jun 29, 2023
8ce83ce
limit removed
pseusys Jul 3, 2023
08f2999
sql reworked
pseusys Jul 5, 2023
77a3d79
overread disabled
pseusys Jul 10, 2023
6bd931c
and now really updated
pseusys Jul 10, 2023
1c5e170
sparse logging
pseusys Jul 10, 2023
84a84ad
double writing disabled
pseusys Jul 11, 2023
9b94df9
faster (probably) serialization setup
pseusys Jul 11, 2023
b32aa73
faster pickle fixed
pseusys Jul 11, 2023
01f8b46
potential data loss prevented
pseusys Jul 12, 2023
c28b792
serializer interface added, datetime args added
pseusys Jul 13, 2023
5cb1b45
mongo ready
pseusys Jul 14, 2023
ccbc07a
redis done + active_ctx returned
pseusys Jul 18, 2023
49435da
ydb ready
pseusys Jul 18, 2023
39d0da7
file-based
pseusys Jul 19, 2023
1ca66ed
with_stem removed
pseusys Jul 19, 2023
9bb3eb7
ydb ??? again??
pseusys Jul 19, 2023
a8c6497
len and prune
pseusys Jul 21, 2023
2bbf6e4
redis delete number of args changed
pseusys Jul 21, 2023
7aefa5b
Update community.rst, revert some changes
pseusys Jul 21, 2023
6fa0542
one line reverted
pseusys Jul 21, 2023
fa9359f
double serialization removed
pseusys Jul 21, 2023
9fdf5bd
no_dependencies_tests_fixed
pseusys Jul 21, 2023
c70157a
serializer changed
pseusys Jul 21, 2023
05f0d94
serializer unchanged (example)
pseusys Jul 21, 2023
95ba296
partial tutorials started
pseusys Jul 30, 2023
cd020c9
context storages made async
pseusys Jul 31, 2023
687ba7e
tutorials added
pseusys Jul 31, 2023
425a744
example context storage removed
pseusys Jul 31, 2023
2403aed
docs added
pseusys Aug 1, 2023
b4546b4
storages docs updated
pseusys Aug 1, 2023
bdda5ff
reviewed problems fixed
pseusys Aug 2, 2023
414e4a0
file-based dbs made sync
pseusys Aug 3, 2023
e5357fc
quickle removed
pseusys Aug 3, 2023
edeb376
Excessive description removed
pseusys Aug 4, 2023
895011c
Merge branch 'dev' into feat/partial_context_updates
pseusys Aug 4, 2023
78d2ccc
migrated to pydantic 2.0
pseusys Aug 4, 2023
d4bff86
Documentation building fixes (#186)
pseusys Aug 4, 2023
821713c
add patch for json context storage
ruthenian8 Aug 4, 2023
fca7c42
json storage fixed
pseusys Aug 4, 2023
33dabc8
Merge branch 'feat/partial_context_updates' of https://github.com/dee…
pseusys Aug 4, 2023
c5ad6d5
test pickle save and load with logging
pseusys Aug 4, 2023
8deaabd
timestamp conversion test for windows
pseusys Aug 7, 2023
cbe7c70
time in nanoseconds for windows
pseusys Aug 7, 2023
b14239e
ok ok windows take this
pseusys Aug 7, 2023
ed888d4
some other idea to trick windows
pseusys Aug 7, 2023
998fb2c
excessive logging removed
pseusys Aug 8, 2023
12f938e
config dicts fixed + module docstrings added
pseusys Aug 9, 2023
dbc8928
linting and formatting fixed
pseusys Aug 9, 2023
ab43a98
s's removed from docstrings
pseusys Aug 10, 2023
cc18acc
Merge branch 'dev' into feat/partial_context_updates
pseusys Aug 10, 2023
7f850ee
type defined
pseusys Aug 11, 2023
9fe28c9
property docstring added
pseusys Aug 14, 2023
6e4eb75
Merge branch 'dev' into feat/partial_context_updates
pseusys Aug 30, 2023
c25c48d
dff installation cell added to tutorial 8
pseusys Aug 30, 2023
6856ee5
shelve improved
pseusys Sep 5, 2023
5314e31
partial review reaction
pseusys Sep 19, 2023
7f1835e
more documentation added
pseusys Sep 20, 2023
cd76105
finished review
pseusys Sep 24, 2023
4238f9b
Merge branch 'dev' into feat/partial_context_updates
RLKRo Mar 22, 2024
50cda47
put benchmark tutorial after partial updates one
RLKRo Mar 22, 2024
4cc055a
Merge branch 'dev' into feat/partial_context_updates
pseusys Jul 4, 2024
7f77c8f
context storages updated
pseusys Jul 4, 2024
2617255
old naming reset
pseusys Jul 4, 2024
4fb8f67
context merge fixed
pseusys Jul 4, 2024
1230d16
context ids removed
pseusys Jul 4, 2024
c3d82da
context equality tested
pseusys Jul 4, 2024
0bd6347
framework data comparison removed
pseusys Jul 4, 2024
4a15bf0
context id removed from everywhere
pseusys Jul 4, 2024
9b3dd80
lint applied
pseusys Jul 4, 2024
4f0562a
documentation building fixed
pseusys Jul 4, 2024
ef0a9ee
RST syntax fixed
pseusys Jul 4, 2024
3d364bc
context dict added
pseusys Jul 29, 2024
e7ad269
async + pydantic
pseusys Jul 30, 2024
be34714
fixes
pseusys Jul 31, 2024
b8701a0
hashes manipulation only on `write_full_diff`
pseusys Jul 31, 2024
a58eace
ctx_dict + ctx updated
pseusys Aug 5, 2024
33f2823
setting removed
pseusys Aug 5, 2024
c4f9fce
sets added
pseusys Aug 6, 2024
e892a52
serialization added, sample context storage class created
pseusys Aug 6, 2024
1b8aa0d
iterative async access made synchronous
pseusys Aug 6, 2024
173b1fe
sql prototype
pseusys Aug 6, 2024
9665038
context API updated proposal
pseusys Aug 7, 2024
3468af5
context schema and serializer removed
pseusys Aug 7, 2024
71bd9f3
context API updated once again
pseusys Aug 7, 2024
2e6b334
review notes fixed
pseusys Aug 8, 2024
830ea40
ContextDictView made mutable
pseusys Aug 8, 2024
5d3dd95
context dict file split
pseusys Aug 8, 2024
f00ba02
turn introduction reverted
pseusys Aug 8, 2024
1af24db
turns separated (again)
pseusys Aug 13, 2024
3616ac0
key deletion now nullifies value
pseusys Aug 13, 2024
81ce7ba
memory storage
pseusys Aug 16, 2024
1f9e653
ctx_dict tests done
pseusys Aug 17, 2024
c981cc5
general context storages tests created
pseusys Aug 27, 2024
5002dda
ctx_dict updated not to use serializer
pseusys Sep 18, 2024
3e6a8f4
merge dev
RLKRo Sep 18, 2024
6991fb6
Merge branch 'refs/heads/dev' into feat/partial_context_updates
RLKRo Sep 18, 2024
5b80818
fix imports in newly added files
RLKRo Sep 19, 2024
96af9bc
hide circular imports behind type checking
RLKRo Sep 19, 2024
000fb0d
fix imports in test files
RLKRo Sep 19, 2024
2c2ab9d
merge context.init into context.connected
RLKRo Sep 19, 2024
2eb5a2c
remove get_last_index imports
RLKRo Sep 19, 2024
06d54b9
update pipeline.context_storage type
RLKRo Sep 19, 2024
f80e6a3
fix bug with setting sequence type values under a single key
RLKRo Sep 19, 2024
c5311f6
revert primary_id renaming
RLKRo Sep 19, 2024
d43752a
memory test (almost!) finished
pseusys Sep 23, 2024
1ae3e4f
ctx_dict tests fixed
pseusys Sep 23, 2024
85315a6
add overload for getitem
RLKRo Sep 23, 2024
351a43e
split typevar definitions
RLKRo Sep 23, 2024
e9eb2fb
remove asyncio mark
RLKRo Sep 23, 2024
6d93399
allow using negative indexes for context dict
RLKRo Sep 23, 2024
e2053dc
add validation on setitem for context dict
RLKRo Sep 24, 2024
acdcd3c
fixes
RLKRo Sep 24, 2024
16a3d77
allow non-str context ids
RLKRo Sep 24, 2024
9a76ae3
add current_turn_id
RLKRo Sep 24, 2024
5e37651
fix tests
RLKRo Sep 24, 2024
d376e49
update doc
RLKRo Sep 24, 2024
256e296
integer keysreversed
pseusys Sep 24, 2024
e2ffa0a
sql storage update function fix
pseusys Sep 24, 2024
9043dca
move context factory and pipeline fixtures to global conftest
RLKRo Sep 24, 2024
d58ce7c
unbound V from BaseModel
RLKRo Sep 24, 2024
6905bcd
remove default marker; return None by default
RLKRo Sep 24, 2024
0ac3c1e
fix key slicing
RLKRo Sep 24, 2024
3956348
use current_turn_id in check_happy_path
RLKRo Sep 24, 2024
d37c4e2
use context_factory to initialize context in non-core tests
RLKRo Sep 24, 2024
2bf82f9
fix: await misc get
RLKRo Sep 24, 2024
8a4d8be
update pipeline tutorials
RLKRo Sep 24, 2024
6404eb4
allow initializing MemoryContextStoraeg via context_storage_factory
RLKRo Sep 25, 2024
240cded
move all db tests into a single parametrized test class
RLKRo Sep 25, 2024
535d524
SQL testing fixed
pseusys Sep 27, 2024
6e0a103
Merge branch 'feat/partial_context_updates' of https://github.com/dee…
pseusys Sep 27, 2024
862e7d3
test_dbs fixed
pseusys Sep 27, 2024
e82d086
file context storages implemented
pseusys Sep 27, 2024
59f91c1
file and sql fixed
pseusys Sep 28, 2024
1c97303
async file dependency removed
pseusys Sep 30, 2024
f5ceb2f
rename delete_main_info to delete_context
RLKRo Sep 30, 2024
cf27afa
fix load_field_items typing
RLKRo Oct 1, 2024
c1a24ee
rewrite db tests
RLKRo Oct 1, 2024
f2ec013
Merge branch 'refs/heads/dev' into feat/partial_context_updates
RLKRo Oct 1, 2024
cb22d12
small None checking update
pseusys Oct 3, 2024
8ba5aed
Merge branch 'feat/partial_context_updates' of https://github.com/dee…
pseusys Oct 3, 2024
d9b95f6
tests updated
pseusys Oct 3, 2024
7277bf9
mongo done
pseusys Oct 3, 2024
e1cb50d
redis done
pseusys Oct 4, 2024
782bf66
ydb finished
pseusys Oct 4, 2024
0fb487b
raise error in abstract method
RLKRo Oct 4, 2024
ff70324
update service tests
RLKRo Oct 7, 2024
b59cf95
Merge remote-tracking branch 'origin/feat/partial_context_updates' in…
RLKRo Oct 7, 2024
d3af3b2
update lock file
RLKRo Oct 7, 2024
e38e2d4
fieldconfig removed
pseusys Oct 10, 2024
de739f2
update benchmark utils
RLKRo Oct 11, 2024
eaa8a87
aiofile reverted
pseusys Oct 13, 2024
53bf877
misc tables removed
pseusys Oct 13, 2024
7629fbc
Merge branch 'feat/partial_context_updates' of https://github.com/dee…
pseusys Oct 13, 2024
757fe48
denchmark awaiting removed
pseusys Oct 17, 2024
a001c27
Merge branch 'refs/heads/dev' into feat/partial_context_updates
RLKRo Oct 18, 2024
96d05dc
update lock file
RLKRo Oct 18, 2024
1430544
fix context size calculation
RLKRo Oct 18, 2024
403e2e1
change model_dump mode
RLKRo Oct 18, 2024
5340256
key filter implementation
pseusys Oct 21, 2024
9aad1bb
Merge branch 'feat/partial_context_updates' of https://github.com/dee…
pseusys Oct 21, 2024
b32b367
ctx_dict hashes update added
pseusys Oct 24, 2024
edc85bd
added and removed sets cleared upon storage
pseusys Oct 24, 2024
e61b1b7
Revert "key filter implementation"
RLKRo Oct 24, 2024
d114d42
sql and file logging added
pseusys Oct 28, 2024
3619125
Merge branch 'feat/partial_context_updates' of https://github.com/dee…
pseusys Oct 28, 2024
5618484
debug logging added
pseusys Oct 28, 2024
5e6e223
use standard logging practices
RLKRo Oct 30, 2024
4323871
make logging more uniform across the methods and collapse long lists
RLKRo Oct 31, 2024
93144df
fix potential error in prefix parsing
RLKRo Oct 31, 2024
83c7b33
Merge branch 'refs/heads/dev' into feat/partial_context_updates
RLKRo Oct 31, 2024
b763f21
create tmp file only for file dbs
RLKRo Nov 1, 2024
69d1520
add test for load_field_items
RLKRo Nov 2, 2024
291396f
test fix: misc no longer context dict
RLKRo Nov 2, 2024
c3d8c73
test fix: load_field_items no longer returns dict
RLKRo Nov 2, 2024
4bb6ca7
test fix: field config was removed
RLKRo Nov 2, 2024
dbbbb28
remove debug artefact
RLKRo Nov 2, 2024
710554c
all user input escapedin ydb
pseusys Nov 6, 2024
20b6b5f
ctx_dict moved
pseusys Nov 8, 2024
2b6eebf
async lock introduced
pseusys Nov 8, 2024
6c458c6
codestyle fixed
pseusys Nov 14, 2024
46e0112
Merge branch 'dev' into feat/partial_context_updates
pseusys Nov 14, 2024
e263fa1
SOME of the errors FIXED!!!
pseusys Nov 20, 2024
1f96f6d
rebuild script updated
pseusys Nov 22, 2024
ce6c8b6
turns added, empty ctx_dict method also added
pseusys Nov 22, 2024
9e7cf47
context creation field set removed
pseusys Nov 22, 2024
c34f8e7
contex storage class splitted
pseusys Nov 22, 2024
1d3859c
rebuild was cleaned (once again)
pseusys Nov 22, 2024
5514c7b
turns added and tested
pseusys Nov 25, 2024
2b9b947
splitted database methods + locks and validations
pseusys Nov 27, 2024
86d745c
insert limit removed
pseusys Nov 27, 2024
214fb92
_locks removed from subclasses
pseusys Nov 27, 2024
5a8d0d5
lazy connection
pseusys Nov 27, 2024
abbd920
uuid length and name changed
pseusys Nov 28, 2024
b9a0680
logs location changed
pseusys Nov 28, 2024
0115b83
none and empty subscript forbidden
pseusys Nov 28, 2024
0587881
names extracted to a special class
pseusys Nov 28, 2024
e756f75
set strings removed
pseusys Nov 28, 2024
61619e3
configuration name changed
pseusys Nov 28, 2024
aad2c49
literal keys instead of strings
pseusys Nov 28, 2024
539005d
loggers from SQL removed
pseusys Nov 29, 2024
2feb094
connect before load in file
pseusys Nov 29, 2024
2ac91a2
logging moved to commect
pseusys Nov 29, 2024
f4e5f33
context dict made abstract
pseusys Nov 29, 2024
68a1c5f
connect moved to pipeline.run
pseusys Nov 29, 2024
8671233
ctx_dict overloads fixed
pseusys Nov 29, 2024
48b6444
configuration renamed
pseusys Nov 29, 2024
e40786c
context_info dataclass added
pseusys Nov 29, 2024
a54df18
test-time comparison fixed
pseusys Nov 29, 2024
49d3bff
lock staticmethod extracted
pseusys Nov 29, 2024
6fd0e1a
initial locking system fixed
pseusys Nov 29, 2024
47edbda
codestyle
pseusys Nov 29, 2024
1 change: 1 addition & 0 deletions .gitignore
@@ -3,6 +3,7 @@
dist/
venv/
build/
dbs/
docs/source/apiref
docs/source/release_notes.rst
docs/source/tutorials
2 changes: 2 additions & 0 deletions dff/context_storages/__init__.py
@@ -10,3 +10,5 @@
from .mongo import MongoContextStorage, mongo_available
from .shelve import ShelveContextStorage
from .protocol import PROTOCOLS, get_protocol_install_suggestion
from .context_schema import ContextSchema, ALL_ITEMS
from .serializer import DefaultSerializer
286 changes: 286 additions & 0 deletions dff/context_storages/context_schema.py
@@ -0,0 +1,286 @@
"""
Context Schema
--------------
The `ContextSchema` module provides a class for managing context storage rules.
The :py:class:`~.Context` will be stored in two instances, `CONTEXT` and `LOGS`,
that can be either files, databases or namespaces. The context itself, along with
the several latest requests, responses and labels, is stored in the `CONTEXT` table,
while the older ones are kept in the `LOGS` table and are accessed less often.
"""

import time
from asyncio import gather
from uuid import uuid4
from enum import Enum
from pydantic import BaseModel, Field
from typing import Any, Coroutine, List, Dict, Optional, Callable, Tuple, Union, Awaitable
from typing_extensions import Literal

from dff.script import Context

ALL_ITEMS = "__all__"
"""
A special `subscript` value:
it means that all keys of the dictionary will be read or written.
Can be used as the value of the `subscript` parameter of :py:class:`SchemaField`.
"""

_ReadPackedContextFunction = Callable[[str], Awaitable[Tuple[Dict, Optional[str]]]]
"""
Type alias of asynchronous function that should be called in order to retrieve context
data from `CONTEXT` table. Matches type of :py:func:`DBContextStorage._read_pac_ctx` method.
"""

_ReadLogContextFunction = Callable[[Optional[int], str, str], Awaitable[Dict]]
"""
Type alias of asynchronous function that should be called in order to retrieve context
data from `LOGS` table. Matches type of :py:func:`DBContextStorage._read_log_ctx` method.
"""

_WritePackedContextFunction = Callable[[Dict, int, int, str, str], Awaitable]
"""
Type alias of asynchronous function that should be called in order to write context
data to `CONTEXT` table. Matches type of :py:func:`DBContextStorage._write_pac_ctx` method.
"""

_WriteLogContextFunction = Callable[[List[Tuple[str, int, Any]], int, str], Coroutine]
"""
Type alias of asynchronous function that should be called in order to write context
data to `LOGS` table. Matches type of :py:func:`DBContextStorage._write_log_ctx` method.
"""


class SchemaField(BaseModel, validate_assignment=True):
    """
    Schema for :py:class:`~.Context` fields that are dictionaries with numeric keys.
    Used for controlling read and write policy of the particular field.
    """

    name: str = Field(default_factory=str, frozen=True)
    """
    `name` is the name of backing :py:class:`~.Context` field.
    It cannot (and should not) be changed at runtime.
    """

    subscript: Union[Literal["__all__"], int] = 3
    """
    `subscript` is used for limiting keys for reading and writing.
    It can be the string `__all__`, meaning all existing keys, or a number **N**,
    meaning the last **N** keys.
    Keys should be sorted as numbers.
    Default: 3.
    """


class ExtraFields(str, Enum):
    """
    Enum containing special :py:class:`~.Context` field names.
    These fields can only be used for data manipulation within context storage.
    """

    active_ctx = "active_ctx"
    primary_id = "_primary_id"
    storage_key = "_storage_key"
    created_at = "_created_at"
    updated_at = "_updated_at"
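
As a rough illustration of how an integer `subscript` on `SchemaField` limits the keys kept in `CONTEXT`, the sketch below mirrors the slicing used by the schema's read and write methods. The helper name `select_keys` is hypothetical and not part of this PR; it only assumes that keys are integers sorted in ascending order.

```python
from typing import Any, Dict, Union

ALL_ITEMS = "__all__"


def select_keys(nest_dict: Dict[int, Any], subscript: Union[str, int]) -> Dict[int, Any]:
    """Hypothetical helper: return the entries that would stay in `CONTEXT`."""
    if not isinstance(subscript, int):
        # `__all__`: every key is kept.
        return dict(nest_dict)
    # Same slicing as in the schema methods: the last `subscript` keys in numeric order.
    last_keys = sorted(nest_dict.keys())[-subscript:]
    return {k: v for k, v in nest_dict.items() if k in last_keys}


labels = {i: f"label_{i}" for i in range(7)}
kept = select_keys(labels, 3)
print(sorted(kept))
```

With seven entries and the default subscript of 3, only the three highest keys remain; passing `ALL_ITEMS` keeps everything.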


class ContextSchema(BaseModel, validate_assignment=True, arbitrary_types_allowed=True):
    """
    Schema describing how :py:class:`~.Context` fields should be stored in and retrieved from storage.
    The default behaviour is the following: all the context data except for the fields that are
    dictionaries with numeric keys is serialized and stored in the `CONTEXT` **table** (it is a table
    for SQL context storages only; it can also be a file or a namespace for other backends).
    For the dictionaries with numeric keys, their entries are sorted by key and the last
    few are included into the `CONTEXT` table, while the rest are stored in the `LOGS` table.

    This behaviour allows context storage to minimize the number of operations for context reading and
    writing.
    """

    requests: SchemaField = Field(default_factory=lambda: SchemaField(name="requests"), frozen=True)
    """
    Field for storing Context field `requests`.
    """

    responses: SchemaField = Field(default_factory=lambda: SchemaField(name="responses"), frozen=True)
    """
    Field for storing Context field `responses`.
    """

    labels: SchemaField = Field(default_factory=lambda: SchemaField(name="labels"), frozen=True)
    """
    Field for storing Context field `labels`.
    """

    append_single_log: bool = True
    """
    If set, only one value per field will be written to the `LOGS` table each turn.

    Example:
    If `labels` field contains 7 entries and its subscript equals 3 (that means that 4 labels
    were added during current turn), and `duplicate_context_in_logs` is set to False:

    - If `append_single_log` is True:
      only one label (the newest one that does not fit into `CONTEXT`) will be written to `LOGS`.
    - If `append_single_log` is False:
      all 4 first labels will be written to `LOGS`.

    """

    duplicate_context_in_logs: bool = False
    """
    If set, all items in the `CONTEXT` table will *always* be backed up in the `LOGS` table.

    Example:
    If `labels` field contains 7 entries and its subscript equals 3 and `append_single_log`
    is set to False:

    - If `duplicate_context_in_logs` is False:
      the last 3 entries will be stored in `CONTEXT` table and 4 first will be stored in `LOGS`.
    - If `duplicate_context_in_logs` is True:
      the last 3 entries will be stored in `CONTEXT` table and all 7 will be stored in `LOGS`.

    """

    supports_async: bool = False
    """
    If set, the schema will try to perform *some* operations asynchronously.

    WARNING! Be careful with this flag. Some databases support asynchronous reads and writes,
    and some do not. For all `DFF` context storages it will be set automatically.
    Change it only if you implement a custom context storage.
    """

    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    async def read_context(
        self, pac_reader: _ReadPackedContextFunction, log_reader: _ReadLogContextFunction, storage_key: str
    ) -> Context:
        """
        Read context from storage.
        Calculate what fields to read, call reader function and cast result to context.
        Also set `primary_id` and `storage_key` attributes of the read context.

        :param pac_reader: the function used for reading context from
            `CONTEXT` table (see :py:const:`~._ReadPackedContextFunction`).
        :param log_reader: the function used for reading context from
            `LOGS` table (see :py:const:`~._ReadLogContextFunction`).
        :param storage_key: the key the context is stored with.

        :return: the read :py:class:`~.Context` object.
        """
        ctx_dict, primary_id = await pac_reader(storage_key)
        if primary_id is None:
            # Report the key that was looked up; `primary_id` is always None here.
            raise KeyError(f"No entry for key {storage_key}.")

        tasks = dict()
        for field_props in [value for value in dict(self).values() if isinstance(value, SchemaField)]:
            field_name = field_props.name
            nest_dict: Dict[int, Any] = ctx_dict[field_name]
            if isinstance(field_props.subscript, int):
                sorted_dict = sorted(list(nest_dict.keys()))
                last_read_key = sorted_dict[-1] if len(sorted_dict) > 0 else 0
                if len(nest_dict) > field_props.subscript:
                    limit = -field_props.subscript
                    last_keys = sorted(nest_dict.keys())[limit:]
                    ctx_dict[field_name] = {k: v for k, v in nest_dict.items() if k in last_keys}
                elif len(nest_dict) < field_props.subscript and last_read_key > field_props.subscript:
                    limit = field_props.subscript - len(nest_dict)
                    tasks[field_name] = log_reader(limit, field_name, primary_id)
            else:
                tasks[field_name] = log_reader(None, field_name, primary_id)

        if self.supports_async:
            tasks = dict(zip(tasks.keys(), await gather(*tasks.values())))
        else:
            tasks = {key: await task for key, task in tasks.items()}

        for field_name in tasks.keys():
            log_dict = {k: v for k, v in tasks[field_name].items()}
            ctx_dict[field_name].update(log_dict)

        ctx = Context.cast(ctx_dict)
        setattr(ctx, ExtraFields.primary_id.value, primary_id)
        setattr(ctx, ExtraFields.storage_key.value, storage_key)
        return ctx
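
The two ways of awaiting the `LOGS` read tasks, concurrently when `supports_async` is set and one by one otherwise, can be sketched in isolation. This is only an illustration under stated assumptions: `resolve_tasks` and `fake_log_reader` are hypothetical names, and the fake reader stands in for a real database query.

```python
import asyncio
from typing import Awaitable, Dict


async def resolve_tasks(tasks: Dict[str, Awaitable[dict]], supports_async: bool) -> Dict[str, dict]:
    """Hypothetical helper mirroring the two awaiting strategies."""
    if supports_async:
        # Run all reads concurrently; gather keeps results in input order.
        results = await asyncio.gather(*tasks.values())
        return dict(zip(tasks.keys(), results))
    # Await the reads one after another.
    return {key: await task for key, task in tasks.items()}


async def fake_log_reader(field_name: str) -> dict:
    # Stand-in for a `LOGS` reader (assumption; a real one would query storage).
    await asyncio.sleep(0)
    return {0: f"{field_name}_0"}


async def main() -> Dict[str, dict]:
    tasks = {name: fake_log_reader(name) for name in ("requests", "responses")}
    return await resolve_tasks(tasks, supports_async=True)


result = asyncio.run(main())
print(result)
```

Both branches produce the same mapping; the flag only decides whether the backend is asked to serve several reads at once.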

    async def write_context(
        self,
        ctx: Context,
        pac_writer: _WritePackedContextFunction,
        log_writer: _WriteLogContextFunction,
        storage_key: str,
        chunk_size: Union[Literal[False], int] = False,
    ):
        """
        Write context to storage.
        Calculate what fields to write, split large data into chunks if needed and call writer function.
        Also update `updated_at` attribute of the given context with current time, set `primary_id` and `storage_key`.

        :param ctx: the context to store.
        :param pac_writer: the function used for writing context to
            `CONTEXT` table (see :py:const:`~._WritePackedContextFunction`).
        :param log_writer: the function used for writing context to
            `LOGS` table (see :py:const:`~._WriteLogContextFunction`).
        :param storage_key: the key to store the context with.
        :param chunk_size: maximum number of items that can be inserted simultaneously, False if no such limit exists.
        """
        updated_at = time.time_ns()
        setattr(ctx, ExtraFields.updated_at.value, updated_at)
        created_at = getattr(ctx, ExtraFields.created_at.value, updated_at)

        ctx_dict = ctx.model_dump()
        logs_dict = dict()
        primary_id = getattr(ctx, ExtraFields.primary_id.value, str(uuid4()))

        for field_props in [value for value in dict(self).values() if isinstance(value, SchemaField)]:
            nest_dict = ctx_dict[field_props.name]
            last_keys = sorted(nest_dict.keys())

            if (
                self.append_single_log
                and isinstance(field_props.subscript, int)
                and len(nest_dict) > field_props.subscript
            ):
                unfit = -field_props.subscript - 1
                logs_dict[field_props.name] = {last_keys[unfit]: nest_dict[last_keys[unfit]]}
            else:
                if self.duplicate_context_in_logs or not isinstance(field_props.subscript, int):
                    logs_dict[field_props.name] = nest_dict
                else:
                    limit = -field_props.subscript
                    logs_dict[field_props.name] = {key: nest_dict[key] for key in last_keys[:limit]}

            if isinstance(field_props.subscript, int):
                limit = -field_props.subscript
                last_keys = last_keys[limit:]

            ctx_dict[field_props.name] = {k: v for k, v in nest_dict.items() if k in last_keys}

        await pac_writer(ctx_dict, created_at, updated_at, storage_key, primary_id)

        flattened_dict: List[Tuple[str, int, Dict]] = list()
        for field, payload in logs_dict.items():
            for key, value in payload.items():
                flattened_dict += [(field, key, value)]
        if len(flattened_dict) > 0:
            if not bool(chunk_size):
                await log_writer(flattened_dict, updated_at, primary_id)
            else:
                tasks = list()
                for ch in range(0, len(flattened_dict), chunk_size):
                    next_ch = ch + chunk_size
                    chunk = flattened_dict[ch:next_ch]
                    tasks += [log_writer(chunk, updated_at, primary_id)]
                if self.supports_async:
                    await gather(*tasks)
                else:
                    for task in tasks:
                        await task

        setattr(ctx, ExtraFields.primary_id.value, primary_id)
        setattr(ctx, ExtraFields.storage_key.value, storage_key)
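
To make the write-side split easier to follow, here is a minimal standalone sketch of how one numeric-keyed field could be divided between `CONTEXT` and `LOGS`, and how log entries could be chunked. It is an approximation of the logic in `write_context`, not the PR's code: `split_field` and `chunked` are hypothetical names, and no storage is involved.

```python
from typing import Any, Dict, List, Tuple, Union


def split_field(
    nest_dict: Dict[int, Any],
    subscript: Union[str, int],
    append_single_log: bool = True,
    duplicate_context_in_logs: bool = False,
) -> Tuple[Dict[int, Any], Dict[int, Any]]:
    """Hypothetical helper: return (`CONTEXT` part, `LOGS` part) for one field."""
    keys = sorted(nest_dict.keys())
    is_int = isinstance(subscript, int)

    if append_single_log and is_int and len(nest_dict) > subscript:
        # Only the newest entry that no longer fits into `CONTEXT` is logged.
        unfit = keys[-subscript - 1]
        logs = {unfit: nest_dict[unfit]}
    elif duplicate_context_in_logs or not is_int:
        # Everything is copied to `LOGS`.
        logs = dict(nest_dict)
    else:
        # Only the entries that do not fit into `CONTEXT` are logged.
        logs = {k: nest_dict[k] for k in keys[:-subscript]}

    context_keys = keys[-subscript:] if is_int else keys
    return {k: nest_dict[k] for k in context_keys}, logs


def chunked(items: List[Any], chunk_size: int) -> List[List[Any]]:
    """Split a flat list of log entries into chunks of at most `chunk_size` items."""
    return [items[i : i + chunk_size] for i in range(0, len(items), chunk_size)]


labels = {i: f"label_{i}" for i in range(7)}
context_part, logs_part = split_field(labels, 3)
print(sorted(context_part), sorted(logs_part))
```

With the default flags, the last three labels stay in `CONTEXT` and a single label goes to `LOGS`; turning `append_single_log` off logs the first four, and additionally turning `duplicate_context_in_logs` on logs all seven.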