Generate Your First Golden - Using Custom model #1216

Open
pratikchhapolika opened this issue Dec 8, 2024 · 3 comments
Comments

pratikchhapolika commented Dec 8, 2024

I am following this link: https://docs.confident-ai.com/docs/synthesizer-introduction#:~:text=begin%20generating%20goldens.-,from%20deepeval.synthesizer%20import%20Synthesizer,-...

Browser: Chrome
Python: 3.12
deepeval version: '2.0.3'
Jupyter Notebook on Macbook 16 Pro

Note: This custom model works fine when evaluating on metrics.

Custom model using AzureOpenAI

from deepeval.models import DeepEvalBaseLLM  # base class for custom deepeval models
from langchain_openai.chat_models import AzureChatOpenAI
from langchain_openai.embeddings import AzureOpenAIEmbeddings

class AzureOpenAI(DeepEvalBaseLLM):
    def __init__(
        self,
        model
    ):
        self.model = model

    def load_model(self):
        return self.model

    def generate(self, prompt: str) -> str:
        chat_model = self.load_model()
        return chat_model.invoke(prompt).content

    async def a_generate(self, prompt: str) -> str:
        chat_model = self.load_model()
        res = await chat_model.ainvoke(prompt)
        return res.content

    def get_model_name(self):
        return "Custom Azure OpenAI Model"

# Replace these with real values
custom_model = AzureChatOpenAI(
    api_version=config.pf_api_version,
    azure_endpoint=config.pf_oa_endpoint,
    azure_ad_token=token,
    max_tokens=config.max_tokens,
    model=config.pf_llm_deployment,
)
# init the embeddings for answer_relevancy, answer_correctness and answer_similarity
azure_embeddings = AzureOpenAIEmbeddings(
    openai_api_version=config.pf_api_version,
    azure_endpoint=config.pf_oa_endpoint_embed,
    azure_ad_token=token,
    model=config.pf_embedding_engine,
)
azure_openai = AzureOpenAI(model=custom_model)

Generate Your First Golden

from deepeval.synthesizer import Synthesizer

...
synthesizer = Synthesizer(model=azure_openai)
synthesizer.generate_goldens_from_docs(
    document_paths=['abc.pdf'],
    include_expected_output=True
)
print(synthesizer.synthetic_goldens)

ERROR TRACE

---------------------------------------------------------------------------
OpenAIError                               Traceback (most recent call last)
Cell In[9], line 2
      1 # Use gpt-3.5-turbo instead
----> 2 synthesizer = Synthesizer(model=azure_openai)
      4 synthesizer.generate_goldens_from_docs(
      5     document_paths=['abc.pdf'],
      6     include_expected_output=True,
      7     max_goldens_per_document=2
      8 )
      9 print(synthesizer.synthetic_goldens)

File ~/Library/Python/3.12/lib/python/site-packages/deepeval/synthesizer/synthesizer.py:93, in Synthesizer.__init__(self, model, async_mode, max_concurrent, filtration_config, evolution_config, styling_config)
     88 self.synthetic_goldens: List[Golden] = []
     89 self.context_generator = None
     90 self.filtration_config = (
     91     filtration_config
     92     if filtration_config is not None
---> 93     else FiltrationConfig()
     94 )
     95 self.evolution_config = (
     96     evolution_config
     97     if evolution_config is not None
     98     else EvolutionConfig()
     99 )
    100 self.styling_config = (
    101     styling_config if styling_config is not None else StylingConfig()
    102 )

File <string>:6, in __init__(self, synthetic_input_quality_threshold, max_quality_retries, critic_model)

File ~/Library/Python/3.12/lib/python/site-packages/deepeval/synthesizer/config.py:18, in FiltrationConfig.__post_init__(self)
     17 def __post_init__(self):
---> 18     self.critic_model, _ = initialize_model(self.critic_model)

File ~/Library/Python/3.12/lib/python/site-packages/deepeval/metrics/utils.py:269, in initialize_model(model)
    267     return model, False
    268 # Otherwise (the model is a string or None), we initialize a GPTModel and use as a native model
--> 269 return GPTModel(model=model), True

File ~/Library/Python/3.12/lib/python/site-packages/deepeval/models/gpt_model.py:103, in GPTModel.__init__(self, model, _openai_api_key, base_url, *args, **kwargs)
    101 self.args = args
    102 self.kwargs = kwargs
--> 103 super().__init__(model_name)

File ~/Library/Python/3.12/lib/python/site-packages/deepeval/models/base_model.py:35, in DeepEvalBaseLLM.__init__(self, model_name, *args, **kwargs)
     33 def __init__(self, model_name: Optional[str] = None, *args, **kwargs):
     34     self.model_name = model_name
---> 35     self.model = self.load_model(*args, **kwargs)

File ~/Library/Python/3.12/lib/python/site-packages/deepeval/models/gpt_model.py:156, in GPTModel.load_model(self)
    144     return CustomChatOpenAI(
    145         model_name=model_name,
    146         openai_api_key=openai_api_key,
   (...)
    153         **self.kwargs,
    154     )
    155 else:
--> 156     return ChatOpenAI(
    157         model_name=self.model_name,
    158         openai_api_key=self._openai_api_key,
    159         *self.args,
    160         **self.kwargs,
    161     )

File ~/Library/Python/3.12/lib/python/site-packages/langchain_core/load/serializable.py:125, in Serializable.__init__(self, *args, **kwargs)
    123 def __init__(self, *args: Any, **kwargs: Any) -> None:
    124     """"""
--> 125     super().__init__(*args, **kwargs)

    [... skipping hidden 1 frame]

File ~/Library/Python/3.12/lib/python/site-packages/langchain_openai/chat_models/base.py:551, in BaseChatOpenAI.validate_environment(self)
    549         self.http_client = httpx.Client(proxy=self.openai_proxy)
    550     sync_specific = {"http_client": self.http_client}
--> 551     self.root_client = openai.OpenAI(**client_params, **sync_specific)  # type: ignore[arg-type]
    552     self.client = self.root_client.chat.completions
    553 if not self.async_client:

File ~/Library/Python/3.12/lib/python/site-packages/openai/_client.py:105, in OpenAI.__init__(self, api_key, organization, project, base_url, timeout, max_retries, default_headers, default_query, http_client, _strict_response_validation)
    103     api_key = os.environ.get("OPENAI_API_KEY")
    104 if api_key is None:
--> 105     raise OpenAIError(
    106         "The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable"
    107     )
    108 self.api_key = api_key
    110 if organization is None:

OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable
kritinv (Collaborator) commented Dec 9, 2024

Hey @pratikchhapolika , if you don't supply an OpenAI API key, DeepEval uses the OpenAI model as the critic model for filtering unqualified goldens. You can easily avoid this by defining your own custom FiltrationConfig with the custom model you've defined for generation.

pratikchhapolika (Author) commented Dec 10, 2024

> Hey @pratikchhapolika , if you don't supply an OpenAI API key, DeepEval uses the OpenAI model as the critic model for filtering unqualified goldens. You can easily avoid this by defining your own custom FiltrationConfig with the custom model you've defined for generation.

from deepeval.synthesizer.config import FiltrationConfig  # import path per the trace below

filtration_config = FiltrationConfig(
    critic_model=azure_openai,
    synthetic_input_quality_threshold=0.6,
)
synthesizer = Synthesizer(filtration_config=filtration_config, model=azure_openai)

synthesizer.generate_goldens_from_docs(
    document_paths=['abc.pdf'],
    include_expected_output=True,
)
print(synthesizer.synthetic_goldens)
df = synthesizer.to_pandas()

I am seeing the same error @kritinv

---------------------------------------------------------------------------
OpenAIError                               Traceback (most recent call last)
Cell In[5], line 4
      1 filtration_config = FiltrationConfig(critic_model=azure_openai,synthetic_input_quality_threshold=0.6)
      2 synthesizer = Synthesizer(filtration_config=filtration_config,model=azure_openai)
----> 4 synthesizer.generate_goldens_from_docs(
      5     document_paths=['abc.pdf'],
      6     include_expected_output=True,
      7 )
      8 print(synthesizer.synthetic_goldens)
      9 df = synthesizer.to_pandas()

File ~/Library/Python/3.12/lib/python/site-packages/deepeval/synthesizer/synthesizer.py:117, in Synthesizer.generate_goldens_from_docs(self, document_paths, include_expected_output, max_goldens_per_context, context_construction_config, _send_data)
    108 def generate_goldens_from_docs(
    109     self,
    110     document_paths: List[str],
   (...)
    114     _send_data=True,
    115 ):
    116     if context_construction_config is None:
--> 117         context_construction_config = ContextConstructionConfig()
    119     if self.async_mode:
    120         loop = get_or_create_event_loop()

File <string>:11, in __init__(self, embedder, critic_model, max_contexts_per_document, chunk_size, chunk_overlap, context_quality_threshold, context_similarity_threshold, max_retries)

File ~/Library/Python/3.12/lib/python/site-packages/deepeval/synthesizer/config.py:57, in ContextConstructionConfig.__post_init__(self)
     56 def __post_init__(self):
---> 57     self.critic_model, _ = initialize_model(self.critic_model)
     58     if self.embedder is None:
     59         self.embedder = OpenAIEmbeddingModel()

File ~/Library/Python/3.12/lib/python/site-packages/deepeval/metrics/utils.py:269, in initialize_model(model)
    267     return model, False
    268 # Otherwise (the model is a string or None), we initialize a GPTModel and use as a native model
--> 269 return GPTModel(model=model), True

File ~/Library/Python/3.12/lib/python/site-packages/deepeval/models/gpt_model.py:103, in GPTModel.__init__(self, model, _openai_api_key, base_url, *args, **kwargs)
    101 self.args = args
    102 self.kwargs = kwargs
--> 103 super().__init__(model_name)

File ~/Library/Python/3.12/lib/python/site-packages/deepeval/models/base_model.py:35, in DeepEvalBaseLLM.__init__(self, model_name, *args, **kwargs)
     33 def __init__(self, model_name: Optional[str] = None, *args, **kwargs):
     34     self.model_name = model_name
---> 35     self.model = self.load_model(*args, **kwargs)

File ~/Library/Python/3.12/lib/python/site-packages/deepeval/models/gpt_model.py:156, in GPTModel.load_model(self)
    144     return CustomChatOpenAI(
    145         model_name=model_name,
    146         openai_api_key=openai_api_key,
   (...)
    153         **self.kwargs,
    154     )
    155 else:
--> 156     return ChatOpenAI(
    157         model_name=self.model_name,
    158         openai_api_key=self._openai_api_key,
    159         *self.args,
    160         **self.kwargs,
    161     )

File ~/Library/Python/3.12/lib/python/site-packages/langchain_core/load/serializable.py:125, in Serializable.__init__(self, *args, **kwargs)
    123 def __init__(self, *args: Any, **kwargs: Any) -> None:
    124     """"""
--> 125     super().__init__(*args, **kwargs)

    [... skipping hidden 1 frame]

File ~/Library/Python/3.12/lib/python/site-packages/langchain_openai/chat_models/base.py:551, in BaseChatOpenAI.validate_environment(self)
    549         self.http_client = httpx.Client(proxy=self.openai_proxy)
    550     sync_specific = {"http_client": self.http_client}
--> 551     self.root_client = openai.OpenAI(**client_params, **sync_specific)  # type: ignore[arg-type]
    552     self.client = self.root_client.chat.completions
    553 if not self.async_client:

File ~/Library/Python/3.12/lib/python/site-packages/openai/_client.py:105, in OpenAI.__init__(self, api_key, organization, project, base_url, timeout, max_retries, default_headers, default_query, http_client, _strict_response_validation)
    103     api_key = os.environ.get("OPENAI_API_KEY")
    104 if api_key is None:
--> 105     raise OpenAIError(
    106         "The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable"
    107     )
    108 self.api_key = api_key
    110 if organization is None:

OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable

@penguine-ip
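
Note that this second trace fails one level deeper than the first: generate_goldens_from_docs builds a default ContextConstructionConfig, whose __post_init__ also calls initialize_model on its own critic_model, so the custom model has to be supplied there too. A minimal sketch of passing it explicitly, assuming deepeval 2.0.3's import paths and reusing the azure_openai model defined earlier in this thread:

```python
# Sketch only: the trace above shows ContextConstructionConfig.__post_init__
# calling initialize_model(self.critic_model), which falls back to an OpenAI
# GPTModel (and hence OPENAI_API_KEY) when critic_model is None.
# Import paths assume deepeval 2.0.3; `azure_openai` is the custom
# DeepEvalBaseLLM instance defined earlier in this thread.
from deepeval.synthesizer import Synthesizer
from deepeval.synthesizer.config import ContextConstructionConfig, FiltrationConfig

filtration_config = FiltrationConfig(critic_model=azure_openai)
context_construction_config = ContextConstructionConfig(
    critic_model=azure_openai,
    # Caveat: the `embedder` field (visible in the trace's __init__ signature)
    # expects a deepeval embedding model, not a raw langchain embeddings
    # object; left unset it defaults to OpenAIEmbeddingModel, which again
    # requires OPENAI_API_KEY.
)

synthesizer = Synthesizer(model=azure_openai, filtration_config=filtration_config)
synthesizer.generate_goldens_from_docs(
    document_paths=["abc.pdf"],
    include_expected_output=True,
    context_construction_config=context_construction_config,
)
```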

@pratikchhapolika
Copy link
Author

pratikchhapolika commented Dec 10, 2024

I also find this doc to be misleading: https://docs.confident-ai.com/docs/guides-using-custom-embedding-models#:~:text=from%20deepeval.synthesizer%20import%20Synthesizer%0A...%0A%0Asynthesizer%20%3D%20Synthesizer(embedder%3DCustomEmbeddingModel())


from deepeval.synthesizer import Synthesizer
...

synthesizer = Synthesizer(embedder=CustomEmbeddingModel())

Should we pass the chat model to both Synthesizer and filtration_config, or the embedding model?
Also, Synthesizer has no embedder parameter.

Which model does it use to convert the PDF to text?
