Refactor Artifacts (#1114)
collindutter authored Sep 13, 2024
1 parent 86100db commit 37d5582
Showing 69 changed files with 574 additions and 557 deletions.
21 changes: 21 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,27 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## Unreleased

### Added
- `BaseArtifact.to_bytes()` method to convert an Artifact's value to bytes.
- `BlobArtifact.base64` property for converting a `BlobArtifact`'s value to a base64 string.
- `CsvLoader`/`SqlLoader`/`DataframeLoader` `formatter_fn` field for customizing how each loaded row is formatted into a `TextArtifact`.
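
As a rough illustration of what `to_bytes()` and the `base64` property provide, the sketch below uses a hypothetical stand-in class (`BlobSketch`, not Griptape's `BlobArtifact`) built only on the standard library. It assumes that `base64` returns the standard Base64 encoding of the raw bytes as a string; check the Griptape reference for the exact behavior.

```python
from base64 import b64decode, b64encode
from dataclasses import dataclass


@dataclass
class BlobSketch:
    """Hypothetical stand-in for a blob-like artifact; not the Griptape class."""

    value: bytes

    def to_bytes(self) -> bytes:
        # The payload is already bytes, so it is returned unchanged.
        return self.value

    @property
    def base64(self) -> str:
        # Assumption: standard Base64 alphabet, returned as a str.
        return b64encode(self.value).decode("utf-8")


blob = BlobSketch(b"hello")
print(blob.to_bytes())  # b'hello'
print(blob.base64)  # aGVsbG8=
```

A Base64 string of this kind is what is typically embedded in JSON payloads or data URLs, where raw bytes are not allowed.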

### Changed
- **BREAKING**: Removed `CsvRowArtifact`. Use `TextArtifact` instead.
- **BREAKING**: Removed `MediaArtifact`, use `ImageArtifact` or `AudioArtifact` instead.
- **BREAKING**: `CsvLoader`, `DataframeLoader`, and `SqlLoader` now return `list[TextArtifact]`.
- **BREAKING**: Removed `ImageArtifact.media_type`.
- **BREAKING**: Removed `AudioArtifact.media_type`.
- **BREAKING**: Removed `BlobArtifact.dir_name`.
- **BREAKING**: Moved `ImageArtifact.prompt` and `ImageArtifact.model` into `ImageArtifact.meta`.
- **BREAKING**: `ImageArtifact.format` is now required.
- Updated `JsonArtifact` value converter to properly handle more types.
- `AudioArtifact` now subclasses `BlobArtifact` instead of `MediaArtifact`.
- `ImageArtifact` now subclasses `BlobArtifact` instead of `MediaArtifact`.
- Removed `__add__` method from `BaseArtifact`, implemented it where necessary.
- Added generic type support to `ListArtifact`.
- Added iteration support to `ListArtifact`.
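
To make the last two entries concrete, here is a minimal sketch of a generically typed, iterable list container. `ListSketch` is a hypothetical stand-in written for illustration; it is not Griptape's `ListArtifact`, and the real class may differ in its fields and behavior.

```python
from dataclasses import dataclass, field
from typing import Generic, Iterator, List, TypeVar

T = TypeVar("T")


@dataclass
class ListSketch(Generic[T]):
    """Hypothetical stand-in for a typed list container; not Griptape's ListArtifact."""

    value: List[T] = field(default_factory=list)

    def __iter__(self) -> Iterator[T]:
        # Delegating to the underlying list is what enables `for x in container`.
        return iter(self.value)

    def __len__(self) -> int:
        return len(self.value)


# The type parameter lets a type checker know each element is a str.
names: ListSketch[str] = ListSketch(["a", "b"])
for name in names:
    print(name.upper())
```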

## [0.31.0] - 2024-09-03

**Note**: This release includes breaking changes. Please refer to the [Migration Guide](./MIGRATION.md#030x-to-031x) for details.
136 changes: 136 additions & 0 deletions MIGRATION.md
@@ -1,6 +1,142 @@
# Migration Guide

This document provides instructions for migrating your codebase to accommodate breaking changes introduced in new versions of Griptape.

## 0.31.X to 0.32.X

### Removed `MediaArtifact`

`MediaArtifact` has been removed. Use `ImageArtifact` or `AudioArtifact` instead.

#### Before

```python
image_media = MediaArtifact(
    b"image_data",
    media_type="image",
    format="jpeg"
)

audio_media = MediaArtifact(
    b"audio_data",
    media_type="audio",
    format="wav"
)
```

#### After
```python
image_artifact = ImageArtifact(
    b"image_data",
    format="jpeg"
)

audio_artifact = AudioArtifact(
    b"audio_data",
    format="wav"
)
```

### `ImageArtifact.format` is now required

`ImageArtifact.format` is now a required parameter. Update any code that does not provide a `format` parameter.

#### Before

```python
image_artifact = ImageArtifact(
    b"image_data"
)
```

#### After
```python
image_artifact = ImageArtifact(
    b"image_data",
    format="jpeg"
)
```
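
When image bytes come from files, one possible way to supply the now-required `format` is to derive it from the file extension. The helper below is only a sketch: `guess_image_format`, its `default` argument, and the `jpg` to `jpeg` alias are assumptions made for illustration, not part of Griptape.

```python
from pathlib import Path


def guess_image_format(filename: str, default: str = "png") -> str:
    """Derive a format string from a file extension (hypothetical helper)."""
    suffix = Path(filename).suffix.lower().lstrip(".")
    # Map common aliases to the name used in MIME types, e.g. image/jpeg.
    aliases = {"jpg": "jpeg"}
    # Fall back to the default when the file has no extension.
    return aliases.get(suffix, suffix) or default


print(guess_image_format("cat.JPG"))  # jpeg
print(guess_image_format("photo.png"))  # png
print(guess_image_format("no_extension"))  # png
```

The result could then be passed as the `format` argument when constructing the artifact.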

### Removed `CsvRowArtifact`

`CsvRowArtifact` has been removed. Use `TextArtifact` instead.

#### Before

```python
artifact = CsvRowArtifact({"name": "John", "age": 30})
print(artifact.value) # {"name": "John", "age": 30}
print(type(artifact.value)) # <class 'dict'>
```

#### After
```python
artifact = TextArtifact("name: John\nage: 30")
print(artifact.value) # name: John\nage: 30
print(type(artifact.value)) # <class 'str'>
```

If you require storing a dictionary as an Artifact, you can use `GenericArtifact` instead.
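
If existing code builds rows as dictionaries, a small helper can produce the text form shown in the "After" example. This is a sketch under the assumption that one `key: value` pair per line is the desired layout; `row_to_text` is a hypothetical name, not a Griptape function.

```python
def row_to_text(row: dict) -> str:
    """Render a dictionary as one 'key: value' pair per line (hypothetical helper)."""
    return "\n".join(f"{key}: {value}" for key, value in row.items())


row = {"name": "John", "age": 30}
text = row_to_text(row)
print(text)
# name: John
# age: 30
```

The resulting string can be passed to `TextArtifact` in place of the dictionary that `CsvRowArtifact` used to accept.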

### `CsvLoader`, `DataframeLoader`, and `SqlLoader` return types

`CsvLoader`, `DataframeLoader`, and `SqlLoader` now return a `list[TextArtifact]` instead of `list[CsvRowArtifact]`.

If you require a dictionary, set a custom `formatter_fn` and then parse the text to a dictionary.

#### Before

```python
results = CsvLoader().load(Path("people.csv").read_text())

print(results[0].value) # {"name": "John", "age": 30}
print(type(results[0].value)) # <class 'dict'>
```

#### After
```python
import json
from pathlib import Path

from griptape.loaders import CsvLoader

results = CsvLoader().load(Path("people.csv").read_text())

print(results[0].value) # name: John\nage: 30
print(type(results[0].value)) # <class 'str'>

# Customize formatter_fn to emit JSON text for each row
results = CsvLoader(formatter_fn=lambda x: json.dumps(x)).load(Path("people.csv").read_text())
print(results[0].value) # {"name": "John", "age": 30}
print(type(results[0].value)) # <class 'str'>

# Parse the JSON text back into dictionaries
dict_results = [json.loads(result.value) for result in results]
print(dict_results[0]) # {"name": "John", "age": 30}
print(type(dict_results[0])) # <class 'dict'>
```
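
If setting a custom `formatter_fn` is not practical, the default text can also be parsed back into a dictionary. The sketch below assumes the default output is one `key: value` pair per line, as the comment in the example above suggests; `parse_row_text` is a hypothetical helper, and every value comes back as a string.

```python
from typing import Dict


def parse_row_text(text: str) -> Dict[str, str]:
    """Parse 'key: value' lines back into a dictionary (hypothetical helper)."""
    row: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the first ': ' only, so values may themselves contain colons.
        key, _, value = line.partition(": ")
        row[key] = value
    return row


print(parse_row_text("name: John\nage: 30"))  # {'name': 'John', 'age': '30'}
```

This approach breaks down when a value contains a newline, which is one reason the JSON-based `formatter_fn` is the more robust option.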

### Moved `ImageArtifact.prompt` and `ImageArtifact.model` to `ImageArtifact.meta`

`ImageArtifact.prompt` and `ImageArtifact.model` have been moved to `ImageArtifact.meta`.

#### Before

```python
image_artifact = ImageArtifact(
    b"image_data",
    format="jpeg",
    prompt="Generate an image of a cat",
    model="DALL-E"
)

print(image_artifact.prompt, image_artifact.model) # Generate an image of a cat, DALL-E
```

#### After
```python
image_artifact = ImageArtifact(
    b"image_data",
    format="jpeg",
    meta={"prompt": "Generate an image of a cat", "model": "DALL-E"}
)

print(image_artifact.meta["prompt"], image_artifact.meta["model"]) # Generate an image of a cat, DALL-E
```
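
For codebases that build `ImageArtifact` keyword arguments in many places, the move can be done mechanically. The following is a sketch only: `migrate_image_kwargs` is a hypothetical helper that rewrites a dictionary of old-style keyword arguments into the new shape, and it does not touch Griptape itself.

```python
def migrate_image_kwargs(kwargs: dict) -> dict:
    """Move 'prompt' and 'model' under 'meta' (hypothetical helper)."""
    migrated = dict(kwargs)  # copy so the caller's dictionary is not mutated
    meta = dict(migrated.pop("meta", None) or {})
    for key in ("prompt", "model"):
        if key in migrated:
            meta[key] = migrated.pop(key)
    if meta:
        migrated["meta"] = meta
    return migrated


old_kwargs = {"format": "jpeg", "prompt": "Generate an image of a cat", "model": "DALL-E"}
print(migrate_image_kwargs(old_kwargs))
# {'format': 'jpeg', 'meta': {'prompt': 'Generate an image of a cat', 'model': 'DALL-E'}}
```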


## 0.30.X to 0.31.X

56 changes: 23 additions & 33 deletions docs/griptape-framework/data/artifacts.md
@@ -5,60 +5,50 @@ search:

## Overview

**[Artifacts](../../reference/griptape/artifacts/base_artifact.md)** are used for passing different types of data between Griptape components. All tools return artifacts that are later consumed by tasks and task memory.
Artifacts make sure framework components enforce contracts when passing and consuming data.
**[Artifacts](../../reference/griptape/artifacts/base_artifact.md)** are the core data structure in Griptape. They are used to encapsulate data and enhance it with metadata.

## Text

A [TextArtifact](../../reference/griptape/artifacts/text_artifact.md) for passing text data of arbitrary size around the framework. It can be used to count tokens with [token_count()](../../reference/griptape/artifacts/text_artifact.md#griptape.artifacts.text_artifact.TextArtifact.token_count) with a tokenizer.
It can also be used to generate a text embedding with [generate_embedding()](../../reference/griptape/artifacts/text_artifact.md#griptape.artifacts.text_artifact.TextArtifact.generate_embedding)
and access it with [embedding](../../reference/griptape/artifacts/text_artifact.md#griptape.artifacts.text_artifact.TextArtifact.embedding).
[TextArtifact](../../reference/griptape/artifacts/text_artifact.md)s store textual data. They offer methods such as [token_count()](../../reference/griptape/artifacts/text_artifact.md#griptape.artifacts.text_artifact.TextArtifact.token_count) for counting tokens with a tokenizer, and [generate_embedding()](../../reference/griptape/artifacts/text_artifact.md#griptape.artifacts.text_artifact.TextArtifact.generate_embedding) for creating text embeddings. You can also access the embedding via the [embedding](../../reference/griptape/artifacts/text_artifact.md#griptape.artifacts.text_artifact.TextArtifact.embedding) property.

[TaskMemory](../../reference/griptape/memory/task/task_memory.md) automatically stores [TextArtifact](../../reference/griptape/artifacts/text_artifact.md)s returned by tool activities and returns artifact IDs back to the LLM.
When `TextArtifact`s are returned from Tools, they will be stored in [Task Memory](../../griptape-framework/structures/task-memory.md) if the Tool has set `off_prompt=True`.

## Csv Row
## Blob

A [CsvRowArtifact](../../reference/griptape/artifacts/csv_row_artifact.md) for passing structured row data around the framework. It inherits from [TextArtifact](../../reference/griptape/artifacts/text_artifact.md) and overrides the
[to_text()](../../reference/griptape/artifacts/csv_row_artifact.md#griptape.artifacts.csv_row_artifact.CsvRowArtifact.to_text) method, which always returns a valid CSV row.
[BlobArtifact](../../reference/griptape/artifacts/blob_artifact.md)s store binary large objects (blobs).

## Info
When `BlobArtifact`s are returned from Tools, they will be stored in [Task Memory](../../griptape-framework/structures/task-memory.md) if the Tool has set `off_prompt=True`.

An [InfoArtifact](../../reference/griptape/artifacts/info_artifact.md) for passing short notifications back to the LLM without task memory storing them.
### Image

## Error
[ImageArtifact](../../reference/griptape/artifacts/image_artifact.md)s store image data. This includes binary image data along with metadata such as MIME type and dimensions. They are a subclass of [BlobArtifacts](#blob).

An [ErrorArtifact](../../reference/griptape/artifacts/error_artifact.md) is used for passing errors back to the LLM without task memory storing them.
### Audio

## Blob
[AudioArtifact](../../reference/griptape/artifacts/audio_artifact.md)s store audio content. This includes binary audio data and metadata such as format and duration. They are a subclass of [BlobArtifacts](#blob).

A [BlobArtifact](../../reference/griptape/artifacts/blob_artifact.md) for passing binary large objects (blobs) back to the LLM.
Treat it as a way to return unstructured data, such as images, videos, audio, and other files back from tools.
Each blob has a [name](../../reference/griptape/artifacts/base_artifact.md#griptape.artifacts.base_artifact.BaseArtifact.name) and
[dir](../../reference/griptape/artifacts/blob_artifact.md#griptape.artifacts.blob_artifact.BlobArtifact.dir_name) to uniquely identify stored objects.
## List

[TaskMemory](../../reference/griptape/memory/task/task_memory.md) automatically stores [BlobArtifact](../../reference/griptape/artifacts/blob_artifact.md)s returned by tool activities that can be reused by other tools.
[ListArtifact](../../reference/griptape/artifacts/list_artifact.md)s store lists of Artifacts.

## Image
When `ListArtifact`s are returned from Tools, their elements will be stored in [Task Memory](../../griptape-framework/structures/task-memory.md) if the element is either a `TextArtifact` or a `BlobArtifact` and the Tool has set `off_prompt=True`.

An [ImageArtifact](../../reference/griptape/artifacts/image_artifact.md) is used for passing images back to the LLM. In addition to binary image data, an Image Artifact includes image metadata like MIME type, dimensions, and prompt and model information for images returned by [image generation Drivers](../drivers/image-generation-drivers.md). It inherits from [BlobArtifact](#blob).
## Info

## Audio
[InfoArtifact](../../reference/griptape/artifacts/info_artifact.md)s store small pieces of textual information. These are useful for conveying messages about the execution or results of an operation, such as "No results found" or "Operation completed successfully."

An [AudioArtifact](../../reference/griptape/artifacts/audio_artifact.md) allows the Framework to interact with audio content. An Audio Artifact includes binary audio content as well as metadata like format, duration, and prompt and model information for audio returned by generative models. It inherits from [BlobArtifact](#blob).
## JSON

## Boolean
[JsonArtifact](../../reference/griptape/artifacts/json_artifact.md)s store JSON-serializable data. Any data assigned to the `value` property is processed using `json.dumps(json.loads(value))`.
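
The round trip named above can be tried with the standard library alone. This sketch shows only what `json.dumps(json.loads(value))` does to a JSON string (it validates the input and normalizes whitespace); it does not use `JsonArtifact`, and how the artifact's converter treats non-string values is not covered here.

```python
import json

raw = '{"name":   "John",\n "age": 30}'

# Parsing validates the input; re-serializing normalizes spacing and line breaks.
normalized = json.dumps(json.loads(raw))
print(normalized)  # {"name": "John", "age": 30}

# Invalid JSON fails at the parsing step.
try:
    json.dumps(json.loads("{not valid json}"))
except json.JSONDecodeError as error:
    print(f"Invalid JSON: {error.msg}")
```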

A [BooleanArtifact](../../reference/griptape/artifacts/boolean_artifact.md) is used for passing boolean values around the framework.
## Error

!!! info
    Any object passed on init to `BooleanArtifact` will be coerced into a `bool` type. This might lead to unintended behavior: `BooleanArtifact("False").value is True`. Use [BooleanArtifact.parse_bool](../../reference/griptape/artifacts/boolean_artifact.md#griptape.artifacts.boolean_artifact.BooleanArtifact.parse_bool) to convert case-insensitive string literal values `"True"` and `"False"` into a `BooleanArtifact`: `BooleanArtifact.parse_bool("False").value is False`.
[ErrorArtifact](../../reference/griptape/artifacts/error_artifact.md)s store exception information, providing a structured way to convey errors.

## Generic
## Action

A [GenericArtifact](../../reference/griptape/artifacts/generic_artifact.md) can be used as an escape hatch for passing any type of data around the framework.
It is generally not recommended to use this Artifact type, but it can be used in a handful of situations where no other Artifact type fits the data being passed.
See [talking to a video](../../examples/talk-to-a-video.md) for an example of using a `GenericArtifact` to pass a Gemini-specific video file.
[ActionArtifact](../../reference/griptape/artifacts/action_artifact.md)s represent actions taken by an LLM. Currently, the only supported action type is [ToolAction](../../reference/griptape/common/actions/tool_action.md), which is used to execute a [Tool](../../griptape-framework/tools/index.md).

## Json
## Generic

A [JsonArtifact](../../reference/griptape/artifacts/json_artifact.md) is used for passing JSON-serializable data around the framework. Anything passed to `value` will be converted using `json.dumps(json.loads(value))`.
[GenericArtifact](../../reference/griptape/artifacts/generic_artifact.md)s provide a flexible way to pass data that does not fit into any other artifact category. While not generally recommended, they can be useful for specific use cases. For instance, see [talking to a video](../../examples/talk-to-a-video.md), which demonstrates using a `GenericArtifact` to pass a Gemini-specific video file.
6 changes: 3 additions & 3 deletions docs/griptape-framework/data/loaders.md
@@ -22,15 +22,15 @@ Inherits from the [TextLoader](../../reference/griptape/loaders/text_loader.md)

## SQL

Can be used to load data from a SQL database into [CsvRowArtifact](../../reference/griptape/artifacts/csv_row_artifact.md)s:
Can be used to load data from a SQL database into [TextArtifact](../../reference/griptape/artifacts/text_artifact.md)s:

```python
--8<-- "docs/griptape-framework/data/src/loaders_2.py"
```

## CSV

Can be used to load CSV files into [CsvRowArtifact](../../reference/griptape/artifacts/csv_row_artifact.md)s:
Can be used to load CSV files into [TextArtifact](../../reference/griptape/artifacts/text_artifact.md)s:

```python
--8<-- "docs/griptape-framework/data/src/loaders_3.py"
@@ -42,7 +42,7 @@ Can be used to load CSV files into [CsvRowArtifact](../../reference/griptape/art
!!! info
    This loader requires the `loaders-dataframe` [extra](../index.md#extras).

Can be used to load [pandas](https://pandas.pydata.org/) [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html)s into [CsvRowArtifact](../../reference/griptape/artifacts/csv_row_artifact.md)s:
Can be used to load [pandas](https://pandas.pydata.org/) [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html)s into [TextArtifact](../../reference/griptape/artifacts/text_artifact.md)s:

```python
--8<-- "docs/griptape-framework/data/src/loaders_4.py"
4 changes: 0 additions & 4 deletions griptape/artifacts/__init__.py
@@ -5,9 +5,7 @@
from .json_artifact import JsonArtifact
from .blob_artifact import BlobArtifact
from .boolean_artifact import BooleanArtifact
from .csv_row_artifact import CsvRowArtifact
from .list_artifact import ListArtifact
from .media_artifact import MediaArtifact
from .image_artifact import ImageArtifact
from .audio_artifact import AudioArtifact
from .action_artifact import ActionArtifact
@@ -22,9 +20,7 @@
"JsonArtifact",
"BlobArtifact",
"BooleanArtifact",
"CsvRowArtifact",
"ListArtifact",
"MediaArtifact",
"ImageArtifact",
"AudioArtifact",
"ActionArtifact",
13 changes: 9 additions & 4 deletions griptape/artifacts/action_artifact.py
@@ -5,15 +5,20 @@
from attrs import define, field

from griptape.artifacts import BaseArtifact
from griptape.mixins.serializable_mixin import SerializableMixin

if TYPE_CHECKING:
    from griptape.common import ToolAction


@define()
class ActionArtifact(BaseArtifact, SerializableMixin):
class ActionArtifact(BaseArtifact):
    """Represents the LLM taking an action to use a Tool.

    Attributes:
        value: The Action to take. Currently only supports ToolAction.
    """

    value: ToolAction = field(metadata={"serializable": True})

    def __add__(self, other: BaseArtifact) -> ActionArtifact:
        raise NotImplementedError
    def to_text(self) -> str:
        return str(self.value)
24 changes: 19 additions & 5 deletions griptape/artifacts/audio_artifact.py
@@ -1,12 +1,26 @@
from __future__ import annotations

from attrs import define
from attrs import define, field

from griptape.artifacts import MediaArtifact
from griptape.artifacts import BlobArtifact


@define
class AudioArtifact(MediaArtifact):
    """AudioArtifact is a type of MediaArtifact representing audio."""
class AudioArtifact(BlobArtifact):
    """Stores audio data.

    media_type: str = "audio"
    Attributes:
        format: The audio format, e.g. "wav" or "mp3".
    """

    format: str = field(kw_only=True, metadata={"serializable": True})

    @property
    def mime_type(self) -> str:
        return f"audio/{self.format}"

    def to_bytes(self) -> bytes:
        return self.value

    def to_text(self) -> str:
        return f"Audio, format: {self.format}, size: {len(self.value)} bytes"
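
To see what the new `mime_type` property and `to_text()` method produce without installing Griptape, the sketch below copies their logic into a hypothetical stand-in class. `AudioSketch` is an assumption for illustration; it uses a plain dataclass rather than `attrs` and omits the serialization metadata.

```python
from dataclasses import dataclass


@dataclass
class AudioSketch:
    """Hypothetical stand-in copying the methods in the diff; not the Griptape class."""

    value: bytes
    format: str

    @property
    def mime_type(self) -> str:
        # The MIME type is derived from the format instead of being stored.
        return f"audio/{self.format}"

    def to_text(self) -> str:
        # A short description is returned rather than the binary payload.
        return f"Audio, format: {self.format}, size: {len(self.value)} bytes"


audio = AudioSketch(b"audio_data", format="wav")
print(audio.mime_type)  # audio/wav
print(audio.to_text())  # Audio, format: wav, size: 10 bytes
```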