Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DISC-378 : Create persona filters on hierachical qualifiedName filters #3399

Merged
merged 4 commits into from
Aug 26, 2024

Conversation

sumandas0
Copy link

@sumandas0 sumandas0 commented Aug 19, 2024

Hierarchical filters on qualifiedName to eliminate the wildcard persona filters.

We are creating an internal attributes in ES which will store the hierarchical data of qualifiedName which we will use to filter assets based on hierarchy and get rid of wildcard filters in most of the cases. More details is available on https://www.notion.so/atlanhq/Problem-Statement-Optimizing-Persona-Aliases-in-Elasticsearch-for-Improved-Search-Performance-7840c233e6dd4d96b017dff791c88282?pvs=4

@sumandas0 sumandas0 self-assigned this Aug 19, 2024
@sumandas0 sumandas0 changed the title PLT-378 : Fix connectionQName inn hierarchical filter DISC-378 : Create persona filters on hierachical qualifiedName filters Aug 19, 2024
asset = asset.substring(0, asset.length() - 2);
}
boolean isWildcard = asset.contains("*") || asset.contains("?");
if (isWildcard) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For asset value default/redshift/1705171197/dev/public/*
isHierarchical will be true
isWildcard will be true

Given this, it will never go to else block & hence metadataPolicyQualifiedNames.add(asset); will not be executed instead it will create wildcard as isWildcard is true

Ideally in this case asset should be added into metadataPolicyQualifiedNames (substring by -2) & no need of wildcard then

Copy link

@nikhilbonte21 nikhilbonte21 Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, honestly, I feel this logic is complex to understand & maintain, and different check for useHierarchicalQualifiedNameFilter

can we separate them by useHierarchicalQualifiedNameFilter?

like


boolean isWildcard = asset.contains("*") || asset.contains("?");

if (useHierarchicalQualifiedNameFilter) {
    if (isWildcard) {
        boolean isHierarchical = asset.endsWith("/*");

        if (isHierarchical) {
	    asset = asset.substring(0, asset.length() - 2);
	    metadataPolicyQualifiedNames.add(asset);
	} else {
	    allowClauseList.add(mapOf("wildcard", mapOf(QUALIFIED_NAME, asset)));
	    allowClauseList.add(mapOf("wildcard", mapOf(QUALIFIED_NAME, asset + "/*")));
	}
    } else {
    	metadataPolicyQualifiedNames.add(asset);
    }

} else {
    // this is as it is previous logic
    if (isWildcard) {
        allowClauseList.add(mapOf("wildcard", mapOf(QUALIFIED_NAME, asset)));
    } else {
        terms.add(asset);
    }
    allowClauseList.add(mapOf("wildcard", mapOf(QUALIFIED_NAME, asset + "/*")));
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For asset value default/redshift/1705171197/dev/public/*
isHierarchical will be true
isWildcard will be true

No asset = asset.substring(0, asset.length() - 2); will make sure we we cut /* from the string and consider it then. So isWildcard will be false.

Copy link
Author

@sumandas0 sumandas0 Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is like this because if we wanted to optimize for the case of /* in the end. If there is any we trim last part and proceed.

And added extra 2 conditions if we have hierarchical flag enabled, when we enable it for everyone and remove flag condition the conditions will be simpler and easier to write code then. Its bit messy for the feature flag part actually. Otherwise its pretty simple.

Its already been tested a couple of time, I think we can proceed with this one.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood

if (entity.hasAttribute(QUALIFIED_NAME)) {
String qualifiedName = (String) entity.getAttribute(QUALIFIED_NAME);
if (StringUtils.isNotEmpty(qualifiedName)) {
vertex.removeProperty(QUALIFIED_NAME_HIERARCHY_PROPERTY_KEY);
Copy link

@nikhilbonte21 nikhilbonte21 Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: call this statement only if necessary as it is redundant in case of GCTs, Readme, Link

Comment on lines +1835 to +1837
if (i > 0) {
currentPath.append("/");
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor suggestion:
This If can be avoided by moving currentPath.append("/"); to the very end of the block (inside)
if (StringUtils.isNotEmpty(part)) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.g.

if (StringUtils.isNotEmpty(part))
    currentPath.append(part);
    // i>1 reason: we don't want to add the first part of the qualifiedName as it is the entity name
    // Example qualifiedName : default/snowflake/123/db_name we only want `default/snowflake/123` and `default/snowflake/123/db_name`
    if (i > qualifiedNameOffset) {
        if (isDataMeshType && (part.equals("domain") || part.equals("product"))) {
            continue;
        }
        AtlasGraphUtilsV2.addEncodedProperty(vertex, QUALIFIED_NAME_HIERARCHY_PROPERTY_KEY, currentPath.toString());
    }
    currentPath.append("/");
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its there if there is no / in the qName then the split would be of len 1, and we will basically just not add / which will not be handled in this case. This condition makes sure it doesn't add unnecessary / after qualifiedName of glossary like assets.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All we are doing is i > 0 before currentPath.append(part);
Above suggestion results into exact same output & avoids i>0 check many times in case QN is longer & with many /, but anyway this is not a blocker to merge

@sumandas0 sumandas0 merged commit 9d8dc78 into master Aug 26, 2024
8 of 9 checks passed
@sumandas0 sumandas0 deleted the disc-378-qname-hierarchy branch August 26, 2024 09:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants