Pinecone Native Integration #485

monami44 · 2024-11-03T00:03:49Z

Pinecone Integration with Comprehensive Testing Suite

Overview

This PR introduces a comprehensive Pinecone integration along with extensive testing coverage across vector operations, RAG implementations, inference capabilities, and assistant functionalities.

Key Features

`PineconeProvider` Class

Implements a robust provider class that wraps Pinecone's functionality with instrumentation.
Handles both data plane and control plane operations.
Includes comprehensive error handling and event tracking.
Supports both traditional Pinecone operations and newer features like the Assistant API.
Note: Currently, there is no support for evaluating answers using the Assistant API.

Testing Suites

RAG Pipeline Test (`pinecone_rag_test.py`)

Tests the complete RAG (Retrieval-Augmented Generation) workflow.
Covers index creation, document embedding, and semantic search.
Implements vector operations (upsert, update, delete).
Tests collection management.
Includes real-world query testing with context retrieval.

Inference API Test (`pinecone_inference_test.py`)

Tests Pinecone's embedding generation capabilities.
Implements document reranking functionality.
Validates usage tracking and response formatting.
Tests various embedding models and configurations.

Assistant API Test (`pinecone_assistant_test.py`)

Tests Pinecone's Assistant API functionality.
Covers assistant creation, updating, and deletion.
Implements file upload and management.
Tests chat completions using an OpenAI-compatible interface.
Note: There is currently no support for evaluating answers within the Assistant API (a premium-tier feature).

Implementation Details

Comprehensive error handling and logging are included.
Event tracking for all operations is implemented.
Support for both synchronous and streaming responses is provided.
Proper cleanup of resources after each test is ensured.
Integrated with AgentOps for monitoring and tracking.

Testing Instructions

Ensure that the required environment variables are set:
- PINECONE_API_KEY
- OPENAI_API_KEY (for RAG testing)

Run individual test suites with the following commands:

python tests/core_manual_tests/providers/pinecone_rag_test.py
python tests/core_manual_tests/providers/pinecone_inference_test.py
python tests/core_manual_tests/providers/pinecone_assistant_test.py
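A quick pre-flight check can catch a missing key before any test touches the API. This is only a sketch: the helper name `missing_env_vars` is hypothetical and is not part of the PR, and it assumes the two variables listed above are the only ones the suites need.

```python
import os

# Variable names taken from the testing instructions above.
REQUIRED_VARS = ("PINECONE_API_KEY", "OPENAI_API_KEY")


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or empty in the given environment.

    `environ` defaults to os.environ; passing a dict makes the check testable.
    """
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
    print("All required environment variables are set.")
```

Reading keys from the environment this way also keeps them out of the committed test files.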

Notes

All tests are designed to be idempotent, so repeated runs do not affect the results.
Comprehensive resource cleanup is included in all test flows.
Tests include rate limiting and appropriate delays to ensure API stability.
Extensive error handling and detailed reporting are part of the suite.

Future Improvements

Add support for batch operations to enhance efficiency.
Implement retry logic for handling transient failures.
Integrate custom embedding model support.
Enhance streaming response handling for smoother real-time interactions.

For more information, refer to the Pinecone Documentation.

gitguardian · 2024-11-03T00:04:25Z

⚠️ GitGuardian has uncovered 4 secrets following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

Since your pull request originates from a forked repository, GitGuardian is not able to associate the secrets uncovered with secret incidents on your GitGuardian dashboard.
Skipping this check run and merging your pull request will create secret incidents on your GitGuardian dashboard.

🔎 Detected hardcoded secrets in your pull request

| GitGuardian id | GitGuardian status | Secret | Commit | Filename |
| --- | --- | --- | --- | --- |
| - | - | Generic High Entropy Secret | `4570e65` | agentops/llms/pinecone_test.py |
| - | - | Generic High Entropy Secret | `5b0a5b1` | agentops/llms/pinecone_test.py |
| - | - | Generic High Entropy Secret | `5b0a5b1` | agentops/llms/pinecone_test.py |
| - | - | Generic High Entropy Secret | `4570e65` | agentops/llms/pinecone_test.py |

🛠 Guidelines to remediate hardcoded secrets

- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secrets safely, following best practices.
- Revoke and rotate these secrets.
- If possible, rewrite git history. Rewriting git history is not a trivial act: you might completely break other contributing developers' workflow, and you risk accidentally deleting legitimate data.

To avoid such incidents in the future, consider:

- following best practices for managing and storing secrets, including API keys and other credentials;
- installing secret detection on pre-commit to catch secrets before they leave your machine and to ease remediation.

🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

areibman · 2024-11-04T07:43:36Z

Amazing-- this is actually the first vectorDB integration PR we've seen so far. Will take a look soon @monami44.

In the meantime-- can you resolve the merge conflict?

monami44 · 2024-11-04T23:05:56Z

Hey @areibman, I just resolved the merge conflict. Also, here is a video walkthrough I sent to Adam when I submitted the PR. Hit me up if you want anything else changed.

areibman · 2024-11-05T18:06:25Z

Super cool! @the-praxs and I can take a look

teocns · 2024-11-08T18:11:30Z

tests/core_manual_tests/providers/pinecone_assistant_test.py

+        provider.delete_assistant(pc, "test-assistant")
+        print("Assistant deleted")
+
+        os.remove("test_data.txt")


TL;DR: tests should not write to a physical file; they should simulate doing so by writing to memory.

This is a dangerous practice. In rare but not impossible cases it can lead to race conditions, and it will certainly pollute the host directory. I suggest either the standard pyfakefs or mktemp / tempfile, to avoid file collisions at the very least.

Both provisioning and teardown belong in a fixture or a provisioner (if I remember correctly, pyfakefs already handles this for you).

Added pyfakefs in PR #490
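For reference, a minimal sketch of the tempfile variant, assuming the test only needs a small text file whose path is handed to an upload call. The helper `run_upload_step` and the file contents are hypothetical; in the real test the `upload` callable would be the provider's file-upload method.

```python
import os
import tempfile


def run_upload_step(upload):
    """Write test data to a uniquely named temporary file, pass its path
    to `upload`, and always remove the file afterwards."""
    # delete=False so the file can be reopened by path on every platform;
    # cleanup is done explicitly in the finally block instead.
    tmp = tempfile.NamedTemporaryFile(
        mode="w", suffix=".txt", prefix="test_data_", delete=False
    )
    try:
        tmp.write("This is a test document.")
        tmp.close()
        return upload(tmp.name)
    finally:
        tmp.close()  # no-op if already closed
        if os.path.exists(tmp.name):
            os.remove(tmp.name)
```

The unique file name avoids collisions between concurrent runs, and the `finally` block covers teardown even when the upload raises. With pytest, the same provisioning and teardown would normally live in a fixture.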

the-praxs · 2024-11-10T15:19:15Z

@monami44 I'd like to request the following modifications:

- Ensure that the chat_completions are stored as LLMEvent instead of ActionEvent. On the dashboard, the LLM Chat tab shows the conversation between the user and the agents, so the contents of the chat_completions object can be used for that tracking.
- Create example notebooks under the examples directory so that users can see how to use Pinecone and AgentOps together.

monami44 · 2024-11-11T12:28:23Z

Hey @areibman @the-praxs @teocns !

I've updated the PR to match the new version, created example notebooks, applied LLMEvent to the Pinecone assistant chat function, and switched the tests to temporary files as asked. Please tell me if anything else needs changes.

Cheers,
Maksym

the-praxs · 2024-11-11T13:21:17Z

> Hey @areibman @the-praxs @teocns !
>
> I've updated to match the new version, created example notebooks, applied LLMEvent to pinecone assistant chat function as well as did tests with temporary files as asked. Please tell me if anything else needs changes.
>
> Cheers,
> Maksym

I will look in a while. Thanks a lot!

the-praxs · 2024-11-14T13:31:02Z

Linting is failing so please resolve that.

the-praxs

Most of the required modifications are in the exception blocks, where a pprint would help the user understand what caused the error.

The other changes are needed for code consistency and efficiency.

Also, please run the linter and resolve the errors in the CI tests.

the-praxs · 2024-11-15T11:19:32Z

agentops/event.py

+@dataclass
+class VectorEvent(Event):
+    """Event class for vector operations"""
+    event_type: str = "action"


Incorrect type annotation and value. This should be EventType with the value as EventType.VECTOR.value.

Since that event type is missing from the EventType class, adding a VECTOR enum member will resolve the issue.
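Roughly what this could look like, as a sketch only. The `EventType` and `Event` classes below are simplified stand-ins rather than the real agentops definitions, and the `"vectors"` value string is an assumption. The annotation and default follow the comment above literally.

```python
from dataclasses import dataclass
from enum import Enum


class EventType(Enum):
    # Stand-in members; only VECTOR is the proposed addition.
    LLM = "llms"
    ACTION = "actions"
    VECTOR = "vectors"  # assumed value string


@dataclass
class Event:
    """Simplified stand-in for the Event base class."""
    event_type: EventType


@dataclass
class VectorEvent(Event):
    """Event class for vector operations"""
    # Default comes from the enum instead of the hardcoded "action" string.
    event_type: EventType = EventType.VECTOR.value
```

With this, vector operations get their own event type and no longer share one with actions.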

the-praxs · 2024-11-15T11:23:29Z

agentops/llms/pinecone.py

+                self._safe_record(session, event)
+                return response
+            except Exception as e:
+                error_event = ErrorEvent(


Would be good to use pprint and print the Exception in the console for the user to see what's causing it.
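Something along these lines, as a sketch. The helper names `describe_exception` and `call_with_reporting` are hypothetical, and the real block would still build and record its ErrorEvent where the comment indicates.

```python
from pprint import pprint


def describe_exception(operation, exc, kwargs):
    """Collect the details worth showing to the user when a call fails."""
    return {
        "operation": operation,
        "error_type": type(exc).__name__,
        "error": str(exc),
        "kwargs": kwargs,
    }


def call_with_reporting(operation, func, **kwargs):
    """Call `func`, pretty-printing the failure details before re-raising."""
    try:
        return func(**kwargs)
    except Exception as e:
        # The real provider would also record its ErrorEvent here.
        pprint(describe_exception(operation, e, kwargs))
        raise
```

Printing a dict through pprint keeps nested kwargs readable in the console, which is the point of the suggestion.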

the-praxs · 2024-11-15T11:23:53Z

agentops/llms/pinecone.py

+                    "query": kwargs.get("query")
+                })
+        except Exception as e:
+            details["error"] = str(e)


Same as the other Exception block, pprint will look good!

the-praxs · 2024-11-15T11:25:28Z

agentops/llms/pinecone.py

+                self._safe_record(session, event)
+                return response
+            except Exception as e:
+                error_event = ErrorEvent(


Same as the other Exception block - pprint

the-praxs · 2024-11-15T11:25:45Z

agentops/llms/pinecone.py

+                    result = orig(*args, **kwargs)
+                    return self.handle_response(result, event_kwargs, init_timestamp, session=session)
+                except Exception as e:
+                    # Create ActionEvent for the error


You know what I mean :)

the-praxs · 2024-11-15T11:26:06Z

agentops/llms/pinecone.py

+            response = pc_instance.assistant.create_assistant(**kwargs)
+            return self._handle_assistant_response(response, "create_assistant", kwargs, init_timestamp, session)
+        except Exception as e:
+            error_event = ErrorEvent(


the-praxs · 2024-11-15T11:27:11Z

agentops/llms/pinecone.py

+            response = assistant.chat_completions(messages=message_objects, stream=stream, model=model)
+
+            # Debug logging
+            print(f"Debug - Raw response: {response}")


Replace with logger.debug to ensure consistent logging.

the-praxs · 2024-11-15T11:27:18Z

agentops/llms/pinecone.py

+            return completion_text
+
+        except Exception as e:
+            print(f"Debug - Exception in chat_completions: {str(e)}")


Replace with logger.debug to ensure consistent logging.
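A sketch of the replacement for both debug prints. The module-level `logger` below is a stand-in created with the standard library; the actual code would presumably reuse the project's existing logger. The function names are hypothetical.

```python
import logging

# Stand-in logger; the project's own logger would be imported instead.
logger = logging.getLogger(__name__)


def log_raw_response(response):
    """Replaces print(f"Debug - Raw response: {response}")."""
    # Lazy %-formatting: the string is only built if DEBUG is enabled.
    logger.debug("Raw response: %s", response)


def log_chat_completions_error(exc):
    """Replaces print(f"Debug - Exception in chat_completions: {str(e)}")."""
    logger.debug("Exception in chat_completions: %s", exc)
```

Unlike print, these messages stay silent unless the debug level is enabled, and they go through the same handlers and formatting as the rest of the logs.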

monami44 added 8 commits October 31, 2024 23:09

draft

da3ab63

creation deletion

5b0a5b1

rerank + embed

f87f32f

kinda done

cfb8b45

almost done

4570e65

first commit

ac9f92b

done

cfeeca2

x

6b78a64

the-praxs requested review from areibman and the-praxs November 8, 2024 14:59

teocns reviewed Nov 8, 2024

the-praxs linked an issue Nov 10, 2024 that may be closed by this pull request

[Feature]: Database Integration: Pinecone. #457

Open

3 tasks

examples, better tests, llmevent

11ad351

monami44 force-pushed the pinecone branch from 00599bb to 11ad351 Compare November 11, 2024 12:12

monami44 added 2 commits November 11, 2024 13:24

Merge remote-tracking branch 'upstream/main'

991aef0

Merge branch 'main' into pinecone

42024c8

the-praxs added 2 commits November 14, 2024 19:01

Merge branch 'main' into pinecone

d21c12b

fix imports for PineconeProvider in tests

6b5b3e7

the-praxs requested changes Nov 15, 2024

AgentOps-AI deleted a comment from entelligence-ai-pr-reviews bot Nov 18, 2024

the-praxs added 2 commits November 18, 2024 14:36

Merge branch 'main' into pinecone

015bdbf

Merge branch 'main' into pinecone

4d77b44
