Feature request
When querying a base model with an adapter that has NaN or Inf weight tensors, LoRAX returns the following error:
The output tensors do not match for key base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight
It would be more helpful if the error message indicated that the reason the tensors don't match during the merge is that LoRAX detected NaN/Inf tensors in the adapter weights.
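A minimal sketch of the kind of check that could produce a clearer message. Plain Python floats stand in for tensor values, and the `check_finite` function name is hypothetical, not part of LoRAX:

```python
import math

def check_finite(name, values):
    """Raise a descriptive error if any adapter weight is NaN or Inf.

    `values` is a flat list of floats standing in for a weight tensor;
    in LoRAX the same check would run on the torch tensor itself.
    """
    if any(math.isnan(v) or math.isinf(v) for v in values):
        raise ValueError(
            f"Adapter weight tensor {name!r} contains NaN/Inf values; "
            "fix the trained adapter and try again."
        )

# A healthy tensor passes silently.
check_finite("lora_A.weight", [0.1, -0.2, 0.3])
```

Surfacing the offending key name in the message, as the existing error already does, lets users locate the bad layer in their adapter.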
Motivation
This would give users who fine-tuned models and are testing them out an actionable path: it tells them the problem isn't an issue with LoRAX, but rather an issue with their trained adapter weights.
Your contribution
Happy to help surface a better error message! It seems the issue is raised from this line in particular: lorax/server/lorax_server/utils/convert.py, line 92 (commit 360ad4c).
@arnavgarg1 I just looked at it; it seems adding the following code here should fix this:
if torch.any(torch.isnan(<your tensor>)):
    raise ValueError("output tensor contains NaN; please fix the adapter and try again")
But I would argue we should catch a custom exception based on the type of error we encounter, something like this:
try:
    # your code
except NanError:
    raise NanError("<your message>")
except NotEqualError:
    raise NotEqualError("<your message>")
# simple custom exception classes
class CustomError(Exception):
    """Base class for all errors."""

class NanError(CustomError):
    """Exception raised when a tensor contains NaN values."""

class NotEqualError(CustomError):
    """Exception raised when values compared between two tensors do not match."""
Great suggestion, @asingh9530! I put together a quick PR (#168) to test. Let me know if this addresses the issue!
The one thing I'm not sure about is how to handle NaNs if the adapter was written in safetensors format. It seems (based on the error above) that they're handled differently, so we may need to think about that separately.