MCP-8 Add: Asset Search with Custom Metadata Filters in Atlan MCP Server #116
287b718 to fcd70a5
@staticmethod
def make_request(url: str) -> Optional[Dict[str, Any]]:
    current_settings = Settings()
why is this initialization required?
The following variables are defined as class variables:
ATLAN_BASE_URL: str
ATLAN_API_KEY: str
ATLAN_AGENT_ID: str = "NA"
ATLAN_AGENT: str = "atlan-mcp"
ATLAN_MCP_USER_AGENT: str = f"Atlan MCP Server {MCP_VERSION}"
ATLAN_TYPEDEF_API_ENDPOINT: Optional[str] = "/api/meta/types/typedefs/"
The values for ATLAN_BASE_URL and ATLAN_API_KEY are loaded from environment variables.
Since Settings inherits from Pydantic's BaseSettings, the environment variables (ATLAN_BASE_URL, ATLAN_API_KEY) are only loaded when an instance is created, because that is when Pydantic reads the environment and the .env file.
Both values are needed in the following static methods:
- build_api_url() requires ATLAN_BASE_URL
- make_request() requires ATLAN_API_KEY
Hence the initialization (instance creation) is necessary.
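To make the point about instantiation concrete, here is a minimal sketch that imitates the behaviour with the standard library only. It is not Pydantic: SettingsSketch and _env are hypothetical stand-ins, and the dataclass default_factory plays the role of BaseSettings reading the environment when an instance is created.

```python
import os
from dataclasses import dataclass, field


def _env(name: str) -> str:
    """Read a required environment variable, failing loudly if it is missing."""
    try:
        return os.environ[name]
    except KeyError:
        raise RuntimeError(f"{name} is not set") from None


@dataclass(frozen=True)
class SettingsSketch:
    # Hypothetical stand-in for Settings(BaseSettings).
    # Values are resolved when an instance is created, not at class definition.
    ATLAN_BASE_URL: str = field(default_factory=lambda: _env("ATLAN_BASE_URL"))
    ATLAN_API_KEY: str = field(default_factory=lambda: _env("ATLAN_API_KEY"))
    ATLAN_TYPEDEF_API_ENDPOINT: str = "/api/meta/types/typedefs/"


def build_api_url(path: str) -> str:
    settings = SettingsSketch()  # the environment is read here
    return f"{settings.ATLAN_BASE_URL.rstrip('/')}/{path.lstrip('/')}"
```

In this sketch the class itself carries no ATLAN_BASE_URL value, so a static method has to create an instance before it can use it, which is the situation described above.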
…nd merge generic ones into search_assets_tool
# Search for assets with custom metadata having a specific property filter (eq)
assets = search_assets(
    custom_metadata_conditions=[{
why create a specific filter for this here? Why not make it part of the normal conditions?
You wouldn't have to define all the context and converters as well. If the LLM understands how to use the CM context, it can use the unique ids as well.
Current Search Implementation
The search implementation in /Users/satabrata.paul/Desktop/atlan-github-repos/agent-toolkit-internal/modelcontextprotocol/tools/search.py processes conditions through:
Standard Asset Attributes Processing
- Attribute Resolution: Uses SearchUtils._get_asset_attribute(attr_name), which calls getattr(Asset, attr_name.upper(), None) to get built-in Asset class attributes
- Condition Processing: Uses SearchUtils._process_condition(), which applies operators like eq, contains, startswith, etc. directly on Asset attributes
Why Custom Metadata Conditions Need Separate Handling
The separation between normal conditions and custom_metadata_conditions is necessary because they use fundamentally different PyAtlan APIs and attribute resolution mechanisms:
Comparison Table
Aspect | Normal Conditions (Standard Attributes) | Custom Metadata Conditions |
---|---|---|
Attribute Resolution | Asset.NAME, Asset.DESCRIPTION (direct class attributes) | CustomMetadataField(set_name="...", attribute_name="...") (requires set name + field name) |
API Classes | Uses Asset class attributes directly | Requires CustomMetadataField class instantiation |
Search Query Construction | Built-in Elasticsearch fields | Nested field queries with different syntax |
This architectural difference in PyAtlan necessitates separate processing logic.
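The two resolution paths in the table can be sketched roughly as follows. This uses stand-in classes rather than pyatlan itself (the real CustomMetadataField needs a tenant connection to resolve names), so AssetStandIn, CustomMetadataFieldStandIn, both resolve_* helpers, the placeholder field values and the "sensitivity" attribute are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


class AssetStandIn:
    """Stand-in for pyatlan's Asset; the attribute values are placeholders."""

    NAME = "field:name"
    DESCRIPTION = "field:description"


@dataclass(frozen=True)
class CustomMetadataFieldStandIn:
    """Stand-in for pyatlan's CustomMetadataField (same two keyword arguments)."""

    set_name: str
    attribute_name: str


def resolve_standard(attr_name: str) -> Optional[str]:
    # Same lookup pattern as getattr(Asset, attr_name.upper(), None).
    return getattr(AssetStandIn, attr_name.upper(), None)


def resolve_custom(set_name: str, attribute_name: str) -> CustomMetadataFieldStandIn:
    # A custom metadata field is not a class attribute that can be looked up
    # by name; it is constructed from the set name plus the attribute name.
    return CustomMetadataFieldStandIn(set_name=set_name, attribute_name=attribute_name)


standard = resolve_standard("description")
custom = resolve_custom("Data Classification", "sensitivity")
```

The standard path needs only one string, while the custom path needs two and returns an object, which is why the same condition-processing code cannot simply be reused unchanged.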
Eventually, this tool helps to prepare the payload for search_assets tool, when users
want to search for assets with filters on custom metadata.
This tool can only be called once in a chat conversation.
Why is this required?
I wanted the docstring to make sure that get_custom_metadata_context_tool() is not called every time a user searches for assets with custom metadata filters within a single chat window.
If the LLM (MCP client) already keeps the context for the duration of a chat window by default, we can remove this.
@firecast Do let me know if we need to remove it.
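As an aside, a server-side alternative to relying on the docstring would be to cache the context so that repeated calls are cheap. The sketch below is only an illustration of that idea with functools.lru_cache; _fetch_custom_metadata_context and get_custom_metadata_context_cached are hypothetical names, and the returned dictionary is a placeholder rather than the tool's real output.

```python
from functools import lru_cache
from typing import Any, Dict

# Counts how often the (hypothetical) expensive fetch actually runs.
FETCH_CALLS = {"count": 0}


def _fetch_custom_metadata_context() -> Dict[str, Any]:
    """Hypothetical stand-in for the typedef fetch behind the tool."""
    FETCH_CALLS["count"] += 1
    return {"business_metadata": []}  # placeholder payload


@lru_cache(maxsize=1)
def get_custom_metadata_context_cached() -> Dict[str, Any]:
    # The first call performs the fetch; later calls return the cached result.
    return _fetch_custom_metadata_context()


first = get_custom_metadata_context_cached()
second = get_custom_metadata_context_cached()
```

A cache like this would go stale if custom metadata definitions change while the server is running, so a real version would likely need an expiry or a manual refresh.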
            }]
        }
    }],
    include_attributes=["name", "qualified_name", "type_name", "description", "certificate_status"]
If they are searching on the CMs, add them to the include attributes as well.
context = get_custom_metadata_context_tool()
# Step 2: Use the context to prepare custom_metadata_conditions for search_assets_tool
# Example context result might show business metadata like "Data Classification" with attributes
Also is there a need for adding these here compared to the search tool?
Generic examples are already part of the search_assets tool.
However, some examples of calling the search_assets tool are also included in the docstring of the get_custom_metadata_context tool, so that the LLM has better context on what the next step should be after the custom metadata definitions are fetched.
current_settings = Settings()
headers = {
    "Authorization": f"Bearer {current_settings.ATLAN_API_KEY}",
    "x-atlan-client-origin": "atlan-search-app",
Why add these and not just leverage the CustomMetadataCache pyatlan class to fetch them?
The API call is added to get additional context for custom metadata attributes of Enum type (i.e. Options), which have a fixed set of values.
CustomMetadataCache does have a method that returns information on all custom metadata definitions, including their attribute definitions, but it cannot return the enum definitions.
Hence the API call covers both: the custom metadata definitions (with their attribute definitions) and the extra context for attribute definitions that are of Enum type.
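A rough sketch of how enum values could be attached to custom metadata attributes once the typedef response is available. The payload keys (enumDefs, elementDefs, businessMetadataDefs, attributeDefs, typeName) are an assumption modelled on Apache Atlas typedef JSON, the sample data is invented, and attach_enum_values is a hypothetical helper, not the function in this PR.

```python
from typing import Any, Dict, List

# Invented sample, shaped like an Atlas-style typedef response (assumption).
SAMPLE_TYPEDEFS: Dict[str, Any] = {
    "enumDefs": [
        {"name": "SensitivityLevel",
         "elementDefs": [{"value": "Public"}, {"value": "Restricted"}]},
    ],
    "businessMetadataDefs": [
        {"name": "Data Classification",
         "attributeDefs": [
             {"name": "sensitivity", "typeName": "SensitivityLevel"},
             {"name": "owner_team", "typeName": "string"},
         ]},
    ],
}


def attach_enum_values(typedefs: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return custom metadata sets with allowed values added to enum attributes."""
    # Index enum definitions by name for quick lookup.
    enum_values = {
        enum_def["name"]: [el["value"] for el in enum_def.get("elementDefs", [])]
        for enum_def in typedefs.get("enumDefs", [])
    }
    result = []
    for cm_def in typedefs.get("businessMetadataDefs", []):
        attributes = []
        for attr in cm_def.get("attributeDefs", []):
            entry = {"name": attr["name"], "type": attr["typeName"]}
            # Only attributes whose type names an enum get a fixed value list.
            if attr["typeName"] in enum_values:
                entry["allowed_values"] = enum_values[attr["typeName"]]
            attributes.append(entry)
        result.append({"name": cm_def["name"], "attributes": attributes})
    return result


context = attach_enum_values(SAMPLE_TYPEDEFS)
```

With the allowed values in the context, the LLM can pick a valid option for an eq filter instead of guessing a string.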
Changes Summary
Linear Issue Solved