Commit

Clean up docstrings
FullMetalMeowchemist committed Sep 16, 2023
1 parent c1a6085 commit 81a86da
Showing 1 changed file with 9 additions and 9 deletions.
18 changes: 9 additions & 9 deletions python/starpoint/embedding.py
@@ -72,8 +72,8 @@ def embed_and_join_metadata_by_columns(
metadatas: List[Dict],
model: EmbeddingModel,
) -> Dict[str, List[Dict]]:
- """Takes some texts and creates embeddings using a model in starpoint. Prefer using `embed_items`
- instead, as mismatched `texts` and `metadatas` will output undesirable results.
+ """Takes some texts and creates embeddings using a model in starpoint. Prefer using `embed_and_join_metadata` or
+ `embed_items` instead, as mismatched `texts` and `metadatas` will output undesirable results.
Under the hood this is using `embed_items`.
Args:
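
Reviewer note: the docstring warns that mismatched `texts` and `metadatas` give undesirable results. The sketch below shows one way column-wise input can go wrong in plain Python, independent of what starpoint does internally: `zip` silently drops unmatched entries. `pair_columns_strict` is a hypothetical helper written for this note, not part of the library.

```python
from typing import Dict, List, Tuple


def pair_columns_strict(texts: List[str], metadatas: List[Dict]) -> List[Tuple[str, Dict]]:
    """Hypothetical helper: pair each text with its metadata, refusing mismatched lengths."""
    if len(texts) != len(metadatas):
        raise ValueError(
            f"texts has {len(texts)} entries but metadatas has {len(metadatas)}"
        )
    return list(zip(texts, metadatas))


texts = ["first", "second", "third"]
metadatas = [{"id": 1}, {"id": 2}]

# Plain zip stops at the shorter column, so "third" is dropped without any error.
silently_truncated = list(zip(texts, metadatas))
```

Passing one dict per item, as the row-wise functions do, avoids this class of mistake because each text travels with its own metadata.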
@@ -111,13 +111,14 @@ def embed_and_join_metadata(
embedding_key: Hashable,
model: EmbeddingModel,
) -> Dict[str, List[Dict]]:
- """Takes some texts and creates embeddings using a model in starpoint. Prefer using `embed_items`
- instead, as mismatched `texts` and `metadatas` will output undesirable results.
- Under the hood this is using `embed_items`.
+ """Takes some texts and creates embeddings using a model in starpoint, and joins them to
+ all additional data as metadata. Under the hood this is using `embed_and_join_metadata_by_columns`
+ which is using `embed_items`.
Args:
text_embedding_items: List of dicts of data to create embeddings from.
- embedding_key: the key in the embedding items to use to generate the embeddings against
+ embedding_key: the key in each item used to create embeddings from.
+ e.g. `"context"` would be passed if each item looks like this: `{"context": "embed this text"}`
model: An enum choice from EmbeddingModel.
Returns:
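
Reviewer note: the `embedding_key` contract described in this docstring implies every item must contain that key, and the next hunk shows an error message listing offending indices. A minimal sketch of such a check, assuming items are plain dicts; `find_items_missing_key` is a hypothetical name and not taken from the file:

```python
from typing import Dict, Hashable, List


def find_items_missing_key(items: List[Dict], embedding_key: Hashable) -> List[int]:
    """Hypothetical helper: return the indices of items that lack `embedding_key`."""
    return [index for index, item in enumerate(items) if embedding_key not in item]


items = [
    {"context": "embed this text", "source": "a"},
    {"source": "b"},  # no "context" key, so this item cannot be embedded
]
missing = find_items_missing_key(items, "context")
```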
Expand All @@ -140,9 +141,8 @@ def embed_and_join_metadata(
f"{embedding_key}:\n {unqualified_indices}"
)

- # TODO: Figure out if we should do a deep copy here instead of editing the original dict
- # We can also do this operation in the first map, but that might make additional operations we might consider
- # doing in here a lot more annoying. Feels like an optimization that shouldn't happen right now.
+ # We can also do this operation in the first map that creates texts, but that might make additional operations
+ # in here a lot more annoying. It's an optimization that shouldn't happen right now.
metadatas = list(
map(lambda item: item.pop(embedding_key), text_embedding_items)
)
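
Reviewer note: `dict.pop` returns the removed value and mutates the dict in place, so the `map` as shown in this hunk would collect the popped values rather than the remaining metadata dicts, and it edits the caller's items (the concern raised by the TODO this commit removes). The surrounding code is not visible here, so treat this as a sketch of a non-mutating alternative under that reading; `split_text_and_metadata` is a hypothetical helper.

```python
from typing import Dict, Hashable, List, Tuple


def split_text_and_metadata(
    items: List[Dict], embedding_key: Hashable
) -> Tuple[List[str], List[Dict]]:
    """Hypothetical helper: separate the text to embed from the rest of each item.

    Builds new dicts instead of popping, so the caller's items are left untouched.
    """
    texts = [item[embedding_key] for item in items]
    metadatas = [
        {key: value for key, value in item.items() if key != embedding_key}
        for item in items
    ]
    return texts, metadatas


items = [{"context": "embed this text", "source": "a"}]

# For comparison: pop hands back the value, not the leftover dict.
popped = {"context": "embed this text", "source": "a"}.pop("context")

texts, metadatas = split_text_and_metadata(items, "context")
```

The dict comprehension makes shallow copies only; nested values are still shared with the original items, which is usually enough when the metadata is not modified afterwards.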
