
Refactor Conversation Memory class and drivers #1084

Merged · 4 commits into dev · Aug 27, 2024

Conversation

@vachillo (Member) commented Aug 19, 2024

Describe your changes

  • Update `LocalConversationMemoryDriver` with an optional `persist_file`, consistent with `LocalVectorStoreDriver`
  • Update the return value of `load` to a tuple instead of a memory instance (see the signature below)
  • Update `BaseConversationMemory.to_prompt_stack` to take a prompt driver
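
For reference, the new `load` contract (this exact signature appears in the diff later in the thread; the previous version returned a memory instance):

```python
def load(self) -> tuple[list[Run], dict[str, Any]]: ...
```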

Issue ticket number and link

@vachillo force-pushed the conv_mem branch 7 times, most recently from 50d8e36 to 5cd4f87 on August 21, 2024 17:07
@vachillo marked this pull request as ready for review on August 21, 2024 17:07

codecov bot commented Aug 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅


```python
# (field name on the first line inferred from the Factory default; the diff excerpt begins mid-declaration)
conversation_memory_driver: BaseConversationMemoryDriver = field(
    default=Factory(lambda: Defaults.drivers_config.conversation_memory_driver), kw_only=True
)
prompt_driver: BasePromptDriver = field(
    default=Factory(lambda: Defaults.drivers_config.prompt_driver), kw_only=True
)
runs: list[Run] = field(factory=list, kw_only=True, metadata={"serializable": True})
metadata: dict = field(factory=dict, kw_only=True, metadata={"serializable": True})
```
Member

Should be renamed to meta for consistency with other places.

```python
[self.add_run(r) for r in memory.runs]
if self.autoload:
    runs, metadata = self.conversation_memory_driver.load()
    runs.extend(self.runs)
```
Member

If we're going to merge Driver Runs in (I like this idea), I think we should add them after user-defined runs.

Member Author

Went back and forth. My thought was that if you passed new Runs here, they would probably be the most recent, and the data that gets loaded would be "historical".

Member Author

Revisiting this: I think we have the same intention, but my implementation is wrong.

Comment on lines 23 to 24
```python
if self.persist_file is not None and not self.lazy_load:
    self._load_file()
```
Member

How would one use lazy_load? Should this be present in all Drivers?

Member Author

Mostly for "backwards compatibility" with the previous implementation: it wouldn't try to create the file unless accessed. We could just make one behavior the default and stick with it.

Contributor

I vote for picking one way.

I kind of lean towards the lazy way; otherwise it'll just create an empty file, right? Though I could see the predictable behavior being nice, I suppose... if the only goal is to load it back into the same driver, then the driver can check whether the file exists. Yeah, I vote for the lazy way. That said, I don't feel too strongly about it.

Member Author

Don't really care either way. I'll remove the param, default to the existing lazy behavior, and see how that looks.

Member

Maybe I'm misunderstanding the functionality, but can we just check if persist_file is set and only save then?

Member Author

It's about the timing of when the file is actually created: `LocalVectorStoreDriver` creates the file on post_init, whereas the original implementation of `LocalConversationMemoryDriver` waited until `.store()` to create the file if needed.

Member

Got it, so when would we ever want lazy_load=False? Seems easiest to just check if persist_file is set during store and that's the only time we create a file.
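
A minimal sketch of the lazy behavior being converged on here (a hypothetical, simplified driver, not the PR's exact code): nothing touches disk until `store()` runs, and only when `persist_file` is set.

```python
import json
from pathlib import Path
from typing import Optional


class LocalConversationMemoryDriver:  # simplified stand-in for the real driver
    def __init__(self, persist_file: Optional[str] = None) -> None:
        self.persist_file = persist_file

    def store(self, data: dict) -> None:
        # Created lazily: only on the first store(), and only if persist_file
        # was provided. Nothing is written at construction time.
        if self.persist_file is not None:
            Path(self.persist_file).write_text(json.dumps(data))
```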

Comment on lines 32 to 38
```python
if not overwrite:
    loaded_str = Path(self.persist_file).read_text()
    if loaded_str:
        loaded_data = json.loads(loaded_str)
        loaded_data["runs"] += data["runs"]
        loaded_data["metadata"] = dict_merge(loaded_data["metadata"], data["metadata"])
        data = loaded_data
```
Member

This may be a slightly cleaner method of appending to the JSON file.
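
The linked suggestion itself isn't preserved in this thread, but one cleaner shape (an assumption, not the PR's final code) folds the read-merge into a helper; note the shallow `update` here stands in for the deep `dict_merge` above:

```python
import json
from pathlib import Path


def merge_into_file(path: Path, data: dict) -> dict:
    # Merge new runs/metadata into whatever is already persisted on disk.
    text = path.read_text() if path.exists() else ""
    existing = json.loads(text) if text else {}
    existing.setdefault("runs", []).extend(data.get("runs", []))
    existing.setdefault("metadata", {}).update(data.get("metadata", {}))  # shallow merge
    return existing
```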

Comment on lines 89 to 108
```python
thread_metadata = response.json().get("metadata", {})
thread_metadata = dict_merge(thread_metadata, metadata)
response = requests.patch(
    self._get_url(f"/threads/{self.thread_id}"),
    json={"metadata": thread_metadata},
    headers=self.headers,
)
response.raise_for_status()
```
Member

Can this be DRY'd up with the overwrite logic?

Member Author (@vachillo, Aug 21, 2024)

On the cloud side? Never mind, I'll see how I can update it.

Member

Yeah, just meant that we should try to have only a single `requests.patch` for both conditional flows (see the sketch below).
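
A sketch of that suggestion, reusing names from the snippet above (`dict_merge`, `self._get_url`, and `self.headers` come from the driver context; the `overwrite` branch shape is assumed):

```python
# Both branches only decide what thread_metadata should be;
# exactly one PATCH request is issued afterwards.
if overwrite:
    thread_metadata = metadata
else:
    thread_metadata = dict_merge(response.json().get("metadata", {}), metadata)

response = requests.patch(
    self._get_url(f"/threads/{self.thread_id}"),
    json={"metadata": thread_metadata},
    headers=self.headers,
)
response.raise_for_status()
```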

Comment on lines 58 to 63
```python
loaded_str = self.client.hget(self.index, self.conversation_id)
if loaded_str is not None:
    loaded_data = json.loads(loaded_str)
    loaded_data["runs"] += data["runs"]
    loaded_data["metadata"] = dict_merge(loaded_data["metadata"], data["metadata"])
    data = loaded_data
```
Member

Can we use an LPUSH to append to the list?

Member Author

Does that work for pushing to nested lists?

Member

I'm honestly not sure, but it does seem like something we should at least try.
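
For context: Redis list commands such as LPUSH/RPUSH operate on dedicated list keys, not on lists nested inside a JSON string stored in a hash field, so the `hget`-based layout above can't use them directly. A hedged sketch (hypothetical key names) of a layout that would support appends:

```python
import json

import redis

client = redis.Redis()

# Current layout, as in the snippet above: the whole conversation is one JSON
# blob in a hash field, so appending a run means read-modify-write.
client.hset("memory", "conversation-1", json.dumps({"runs": [], "metadata": {}}))

# Alternative layout: one Redis list per conversation, each run its own JSON
# element, so appending becomes a single RPUSH.
client.rpush("memory:conversation-1:runs", json.dumps({"input": "hi", "output": "hello"}))
runs = [json.loads(r) for r in client.lrange("memory:conversation-1:runs", 0, -1)]
```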

Contributor (@dylanholmes) left a comment

Nice work!

Comments are mainly topics for discussion. Some may be out of scope for this PR.


```diff
 class BaseConversationMemoryDriver(SerializableMixin, ABC):
     @abstractmethod
-    def store(self, memory: BaseConversationMemory) -> None: ...
+    def store(self, runs: list[Run], metadata: dict, *, overwrite: bool = False) -> None: ...
```
Contributor

Alternative idea: how about adding a `clear()` (or `reset()`) method instead of `overwrite`?

(Or am I misunderstanding the behavior?)
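
A sketch of what that alternative could look like (hypothetical signatures; `SerializableMixin` and `Run` as in the diff above):

```python
from abc import ABC, abstractmethod


class BaseConversationMemoryDriver(SerializableMixin, ABC):
    @abstractmethod
    def store(self, runs: list[Run], metadata: dict) -> None: ...  # always appends

    @abstractmethod
    def clear(self) -> None: ...  # explicit reset instead of overwrite=True
```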

Member (@collindutter) left a comment

Nice, just a couple of minor questions and bits of feedback.

Comment on lines 19 to 23
```python
def _get_dict(self, runs: list[Run], metadata: Optional[dict] = None) -> dict:
    data: dict = {"runs": [run.to_dict() for run in runs]}
    if metadata is not None:
        data["metadata"] = metadata
    return data
```
Member

`_get_dict` doesn't tell me much about why this method exists. Maybe rename to something like `_to_params`?

Comment on lines 60 to 66
```python
def dict_merge_opt(dct: Optional[dict], merge_dct: Optional[dict], *, add_keys: bool = True) -> Optional[dict]:
    if dct is None:
        return merge_dct
    if merge_dct is None:
        return dct

    return dict_merge(dct, merge_dct, add_keys=add_keys)
```
Member

Why does this method need to exist? Can its functionality be merged into dict_merge?

Member Author

Every instance of its usage would need an updated type hint; tried to be unobtrusive.

Member Author

Deleted.

Comment on lines 62 to 72
```python
for run in runs:
    run_dict = dict_merge_opt(
        {
            "input": run.input.to_json(),
            "output": run.output.to_json(),
            "metadata": {"run_id": run.id},
        },
        run.meta,
    )
```

Member

Use a list comprehension instead.
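
That is, something like the following (`run_dicts` is an assumed name; the diff doesn't show where the results are collected):

```python
run_dicts = [
    dict_merge_opt(
        {
            "input": run.input.to_json(),
            "output": run.output.to_json(),
            "metadata": {"run_id": run.id},
        },
        run.meta,
    )
    for run in runs
]
```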

```python
input: BaseArtifact = field(kw_only=True, metadata={"serializable": True})
output: BaseArtifact = field(kw_only=True, metadata={"serializable": True})
id: str = field(default=Factory(lambda: uuid.uuid4().hex), metadata={"serializable": True})
meta: Optional[dict] = field(default=None, metadata={"serializable": True})
```
Member

I think this should default to `meta: dict[str, Any]` instead of `Optional`. More consistent with other `meta` implementations, and then we don't need `dict_merge_opt`.
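
The suggested default, in the same attrs style as the surrounding fields (a sketch):

```python
meta: dict[str, Any] = field(factory=dict, metadata={"serializable": True})
```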

Member Author

Must have been looking at `BaseVectorStoreDriver.Entry`.

Member

Sorry to be annoying, can you remove this change from the PR?

Member

And this.

Member (@collindutter) left a comment

Approved outside of the last two requested changes.

MIGRATION.md Outdated
Comment on lines 61 to 87
### LocalConversationMemoryDriver `file_path` renamed to `persist_file`

`LocalConversationMemoryDriver.file_path` has been renamed to `persist_file` and is now `Optional[str]`. If `persist_file` is not passed as a parameter, nothing will be persisted and no errors will be raised. `LocalConversationMemoryDriver` is now the default driver in the global `Defaults` object.

#### 0.30.X
```python
local_driver_with_file = LocalConversationMemoryDriver(
    file_path="my_file.json"
)

local_driver = LocalConversationMemoryDriver()

assert local_driver_with_file.file_path == "my_file.json"
assert local_driver.file_path == "griptape_memory.json"
```

#### 0.31.X
```python
local_driver_with_file = LocalConversationMemoryDriver(
    persist_file="my_file.json"
)

local_driver = LocalConversationMemoryDriver()

assert local_driver_with_file.persist_file == "my_file.json"
assert local_driver.persist_file is None
```
Member

Let's move this higher up since the other examples use it.

MIGRATION.md Outdated

### Changes to BaseConversationMemoryDriver

`BaseConversationMemoryDriver` has updated parameter names and different method signatures for `.store` and `.load`.
Member

Should also call out that `driver` was renamed to `conversation_memory_driver`.

Contributor (@dylanholmes) left a comment

Nice work!


```python
        key = self.index
        memory_json = self.client.hget(key, self.conversation_id)

    def load(self) -> tuple[list[Run], dict[str, Any]]:
```
Contributor

So much cleaner!

@vachillo requested a review from collindutter on August 27, 2024 15:25
@vachillo merged commit ef61c53 into dev on Aug 27, 2024
13 checks passed
@vachillo deleted the conv_mem branch on August 27, 2024 15:49