
DYN-7535 Record python engine package information in graphs when using engines served from them #15515

Closed
wants to merge 1 commit into from

Conversation

@zeusongit (Contributor)

Purpose

This PR adds package information to Dynamo graphs that contain python nodes using the latest engine.
When a python engine is served from a package, the package name and version are currently lost, so we cannot display a dependency mismatch in workspace references. This PR preserves the package name and version as part of any python node that uses a python engine served from a package.

The following steps populate a map from each python engine to its respective package name and version (a sketch of this flow follows the list):

  • Get all package paths.
  • Get all dlls/binaries from all package paths.
  • Use the MLC to inspect all binaries found in those paths and select the one whose assembly and namespace names are the same as the PythonEngine's.
  • From the MLC-inspected assembly that contains the engine, get the corresponding loaded assembly from the AppDomain.
  • Cast the type to PythonEngine to get the engine name, and validate that the engine was added to Dynamo.
  • Using the path to that binary, get the package path.
  • Extract the package name and version and populate the PackageEngineMap with the engine name and package info.
  • When a Python node's engine is updated, update the engine package and version as well.
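In code, the flow above might look roughly like the sketch below. It is an illustration of the approach, not the PR's implementation: the "PythonEngine" base-type name check, the assumed package layout (a bin folder directly under the package root), and the placeholder version string are all assumptions.

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Reflection;
using System.Runtime.InteropServices;

internal static class EnginePackageScanner
{
    // Maps engine name -> (package name, package version).
    internal static Dictionary<string, Tuple<string, string>> Scan(IEnumerable<string> packagePaths)
    {
        var map = new Dictionary<string, Tuple<string, string>>();

        // Steps 1-2: collect every dll under every package path.
        var packageBinPaths = packagePaths
            .SelectMany(p => Directory.GetFiles(p, "*.dll", SearchOption.AllDirectories))
            .ToList();

        // Step 3: inspect the binaries with a MetadataLoadContext; the runtime
        // assemblies are added so the resolver can find core references.
        var runtimePaths = Directory.GetFiles(RuntimeEnvironment.GetRuntimeDirectory(), "*.dll");
        var resolver = new PathAssemblyResolver(packageBinPaths.Concat(runtimePaths));
        using var mlc = new MetadataLoadContext(resolver);

        // Step 4: assemblies actually loaded in the process (see the ALC discussion below).
        var loaded = AppDomain.CurrentDomain.GetAssemblies();

        foreach (var dll in packageBinPaths)
        {
            try
            {
                var inspected = mlc.LoadFromAssemblyPath(dll);

                // Names are matched rather than types because a type-identity
                // check does not resolve across the MLC boundary.
                var engineType = inspected.GetTypes()
                    .FirstOrDefault(t => t.BaseType?.Name == "PythonEngine");
                if (engineType == null) continue;

                // Find the corresponding loaded assembly by simple name.
                var runtimeAsm = loaded.FirstOrDefault(a =>
                    string.Equals(a.GetName().Name, inspected.GetName().Name,
                                  StringComparison.OrdinalIgnoreCase));
                if (runtimeAsm == null) continue;

                // Steps 6-7: walk from the binary up to the package root and record the info.
                // Reading the real version from the package manifest is elided here.
                var packageRoot = Directory.GetParent(Path.GetDirectoryName(dll)).FullName;
                var packageName = Path.GetFileName(packageRoot);
                var packageVersion = "0.0.0"; // placeholder: would come from the package manifest
                map[engineType.Name] = Tuple.Create(packageName, packageVersion);
            }
            catch (Exception)
            {
                // Native binaries or unresolved references: skip and keep scanning.
            }
        }
        return map;
    }
}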

(Note: I know the variable names are a bit vague; I will update them as the review goes on.)

Declarations

Check these if you believe they are true

  • The codebase is in a better state after this PR
  • Is documented according to the standards
  • The level of testing this PR includes is appropriate
  • User facing strings, if any, are extracted into *.resx files
  • All tests pass using the self-service CI.
  • Snapshot of UI changes, if any.
  • Changes to the API follow Semantic Versioning and are documented in the API Changes document.
  • This PR modifies some build requirements and the readme is updated
  • This PR contains no files larger than 50 MB

Release Notes

  • Record python engine package information in graphs when using engines served from them

Reviewers

@DynamoDS/dynamo

@zeusongit changed the title from "Record python engine package information in graphs when using engines served from them" to "DYN-7535 Record python engine package information in graphs when using engines served from them" on Sep 30, 2024
@github-actions (bot)

See the ticket for this pull request: https://jira.autodesk.com/browse/DYN-7535

@mjkkirschner (Member) commented Sep 30, 2024

To me, on first glance, the approach seems a bit backward. I am especially leery of the use of AppDomain because it doesn't really exist anymore; it just points to the default ALC, but as you know packages might be loaded into non-default ALCs.

I think a less brittle approach could be

  1. to shrink the APIs for actually injecting python engines so the API has a well defined boundary.
  2. record what package just loaded the engine at the time it loads it.

@zeusongit (Contributor, Author) commented Sep 30, 2024

> To me, on first glance, the approach seems a bit backward. I am especially leery of the use of AppDomain because it doesn't really exist anymore; it just points to the default ALC, but as you know packages might be loaded into non-default ALCs.
>
> I think a less brittle approach could be
>
>   1. to shrink the APIs for actually injecting python engines so the API has a well defined boundary.
>   2. record what package just loaded the engine at the time it loads it.
  • (1) would introduce breaking changes, which I am not sure is possible for 3.4. One of the concerns was the public AvailableEngines collection, which is marked obsolete; it would be really great to limit that, but maybe in a separate task with more time.
  • (2) is what I also wanted to do and tried. In the IronPython and PythonNet3 cases the engines are loaded from an extension, and at that point Dynamo does not have much information beyond the extension manifest file, so we would have to perform similar steps at that time. Second, the loading happens inside the extension code after Ready() is called, but can it also occur from package loading? I wanted to generalize the approach instead of checking whether the AvailableEngines count increased after packages are loaded and then again after extensions are loaded.

@QilongTang added this to the 3.4 milestone Sep 30, 2024
packageBinPaths = packageBinPaths.Concat(Directory.GetFiles(ppath, "*.dll", SearchOption.AllDirectories)).ToList();
}

var dotNetRuntimePaths = Directory.GetFiles(RuntimeEnvironment.GetRuntimeDirectory(), "*.dll");
(Member, review comment)

executing these calls over and over is a waste - disk access can be slow.
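One way to address this (a sketch, not the PR's code) is to compute the runtime dll list once and reuse it across scans; the class and field names here are made up for illustration.

using System;
using System.IO;
using System.Runtime.InteropServices;

internal static class RuntimeDllCache
{
    // Evaluated once on first access, then reused for every subsequent package scan.
    internal static readonly Lazy<string[]> DotNetRuntimeDlls = new Lazy<string[]>(
        () => Directory.GetFiles(RuntimeEnvironment.GetRuntimeDirectory(), "*.dll"));
}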

MetadataLoadContext mlc = null;
var resolver = new PathAssemblyResolver(standardPaths);
mlc = new MetadataLoadContext(resolver);
var all = AppDomain.CurrentDomain.GetAssemblies().ToList();
(Member, review comment)

I worry about this because you'll only find assemblies loaded into the default ALC; I don't think this will let you see all the assemblies loaded into the current process. You may need to dig around in the dotnet source to confirm or deny that.

AppDomain is now just a facade on top of assembly load contexts.
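If that concern holds, one option (sketched here, not part of the PR) is to enumerate assemblies through AssemblyLoadContext.All, which walks every load context, including the isolated ones packages may be loaded into.

using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Runtime.Loader;

internal static class LoadedAssemblies
{
    // Collects assemblies from every AssemblyLoadContext in the process,
    // rather than relying only on AppDomain.CurrentDomain.GetAssemblies().
    internal static List<Assembly> FromAllContexts() =>
        AssemblyLoadContext.All
            .SelectMany(alc => alc.Assemblies)
            .Distinct()
            .ToList();
}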

}

//ignoring assembly version
private bool IsEqualPythonEngineAssemblyName(Type t)
(Member, review comment)

you found that this failed if the base types were of different versions? You cannot just check that the type is assignable to a PythonEngine?

@zeusongit (Contributor, Author) Sep 30, 2024

The assembly returned from the MLC is different from the assembly loaded into the app; it has only partial information, and the IsAssignableFrom check against the PythonEngine type did not resolve (even though I can see the same types). That is why I had to go with this comparison.
Somewhat related: https://stackoverflow.com/questions/10439668/getting-custom-assembly-attributes-without-loading-into-current-appdomain
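A name-based check is one way to express that comparison across the MetadataLoadContext boundary. The helper below is a sketch; the method name and the exact full name to match against are assumptions.

using System;

internal static class MlcTypeChecks
{
    // Walks the base-type chain comparing full names, since IsAssignableFrom
    // cannot be used against types materialized by a MetadataLoadContext.
    internal static bool DerivesFromByName(Type t, string baseTypeFullName)
    {
        for (var b = t.BaseType; b != null; b = b.BaseType)
        {
            if (b.FullName == baseTypeFullName) return true;
        }
        return false;
    }
}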

if (a != null)
{
//Fetch its loaded instance from the AppDomain
var b = all.FirstOrDefault(x => x.GetName().Name.ToLower().Equals(Path.GetFileNameWithoutExtension(mlcassem.GetName().Name).ToLower()), null);
(Member, review comment)

this could be very slow.
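A sketch of one way to avoid the repeated linear scan: build a case-insensitive lookup of loaded assemblies once, then resolve each MLC assembly name with a dictionary hit. Names here are illustrative.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

internal static class LoadedAssemblyIndex
{
    // Build once: simple name -> loaded assembly (first wins on duplicates).
    internal static Dictionary<string, Assembly> Build(IEnumerable<Assembly> loaded) =>
        loaded.GroupBy(a => a.GetName().Name, StringComparer.OrdinalIgnoreCase)
              .ToDictionary(g => g.Key, g => g.First(), StringComparer.OrdinalIgnoreCase);
}

// Usage (per candidate assembly found by the MLC):
//   index.TryGetValue(mlcAssembly.GetName().Name, out var runtimeAssembly);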

{
return enginePackageName;
}
set
(Member, review comment)

definitely weird that this is a public setter.

{
return enginePackageVersion;
}
set
(Member, review comment)

same here.

{
if (Instance.PythonEnginePackageMap.Count > 0)
{
if (Instance.PythonEnginePackageMap.TryGetValue(engineName, out var tuple))
(Member, review comment)

what happens when there are engines with the same name?

@zeusongit (Contributor, Author)

will add a check
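One possible shape for that check (a sketch; the names and the logging call are illustrative):

using System;
using System.Collections.Generic;

internal static class EngineMapGuard
{
    // Keep the first package registered for an engine name and ignore duplicates,
    // so two packages exposing the same engine name cannot silently overwrite each other.
    internal static void AddIfAbsent(
        Dictionary<string, Tuple<string, string>> map,
        string engineName, string packageName, string packageVersion)
    {
        if (!map.TryAdd(engineName, Tuple.Create(packageName, packageVersion)))
        {
            Console.WriteLine($"Engine '{engineName}' is already mapped to a package; skipping duplicate.");
        }
    }
}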

@@ -222,36 +258,35 @@ private void LoadPythonEngine(Assembly assembly)
PropertyInfo instanceProp = null;
try
{
//eType = assembly.GetTypes().FirstOrDefault(x => typeof(PythonEngine).IsAssignableFrom(x) && !x.IsInterface && !x.IsAbstract);
(Member, review comment)

??

@mjkkirschner (Member)

this PR should have some tests asserting that all of our python packages (or mock ones that mimic them) are identified correctly.
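A hedged sketch of the kind of test being asked for, using NUnit-style assertions; the mock engine and package names are placeholders, and direct access to the internal map (e.g. via InternalsVisibleTo) is an assumption, not an existing test API.

using NUnit.Framework;

[TestFixture]
public class PythonEnginePackageMapTests
{
    [Test]
    public void EngineServedFromPackageIsMappedToThatPackage()
    {
        // Assumes a mock package exposing an engine named "MockEngine" has been loaded.
        var map = PythonEngineManager.Instance.PythonEnginePackageMap;

        Assert.That(map.TryGetValue("MockEngine", out var pkg), Is.True);
        Assert.That(pkg.Item1, Is.EqualTo("MockPythonPackage")); // package name recorded
        Assert.That(string.IsNullOrEmpty(pkg.Item2), Is.False);  // package version recorded
    }
}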

RaisePropertyChanged(nameof(EngineName));
}
}
}

[JsonProperty("EnginePackageName")]
(Member, review comment)

Is there value in serializing this with each node?

@zeusongit (Contributor, Author)

The same graph can contain multiple python nodes pointing to different engines, imported from different packages.

@pinzart90 (Contributor)

@zeusongit I’ll take a look at this tonight. The python engine APIs are still somewhat WIP and in the past have proved tricky to get right.

@@ -125,8 +124,10 @@ internal static readonly Lazy<PythonEngineManager>
[Obsolete("AvailableEngines field will be replaced in a future Dynamo release.")]
public ObservableCollection<PythonEngine> AvailableEngines;

internal Dictionary<string, Tuple<string, string>> PythonEnginePackageMap;
(Contributor, review comment)

why not add the Package and Version properties to the PythonEngine class?
All python engines are loaded from DSCPython (part of Dynamo) or from PackageManager.PackageLoader.TryLoadPackageIntoLibrary. You could easily send the pkg information through the LoadPythonEngine arguments.
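A minimal sketch of that alternative, with the engine record carrying its origin so no post-hoc assembly scanning is needed. The type and property names below are illustrative, not Dynamo's API.

using System.Collections.ObjectModel;

internal class PythonEngineDescriptor
{
    public string EngineName { get; set; }
    public string PackageName { get; set; }     // null for built-in engines
    public string PackageVersion { get; set; }
}

internal static class EngineRegistry
{
    internal static readonly ObservableCollection<PythonEngineDescriptor> AvailableEngines =
        new ObservableCollection<PythonEngineDescriptor>();

    // Called by whatever loads the engine (e.g. the package loader), which already
    // knows which package it is loading from, so the information is never rediscovered.
    internal static void Register(string engineName, string packageName, string packageVersion)
    {
        AvailableEngines.Add(new PythonEngineDescriptor
        {
            EngineName = engineName,
            PackageName = packageName,
            PackageVersion = packageVersion
        });
    }
}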

@zeusongit (Contributor, Author)

Yes, but it is not straightforward: the python engine is loaded as an extension, and extensions are loaded after packages, so we do not know whether an engine was added after the packages were loaded.

{
enginePackageName = pkgDetails.Item1;
enginePackageVersion = pkgDetails.Item2;
}
(Contributor, review comment)

I would expect that all engine names stored throughout Dynamo should be changed to something like FullyQualifiedEngineName (PackageName.EngineName@PkgVersion), so we should not need to match the EngineName to packages.
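A sketch of what that fully qualified form could look like in code; the format (PackageName.EngineName@PkgVersion) is taken from the comment above, and everything else is an assumption.

using System;

internal static class EngineNames
{
    internal static string ToFullyQualified(string packageName, string engineName, string pkgVersion)
        => $"{packageName}.{engineName}@{pkgVersion}";

    // Splits "PackageName.EngineName@PkgVersion" back into its parts.
    internal static (string PackageName, string EngineName, string Version) Parse(string fqn)
    {
        var at = fqn.LastIndexOf('@');
        var version = fqn.Substring(at + 1);
        var qualified = fqn.Substring(0, at);
        var dot = qualified.LastIndexOf('.');
        return (qualified.Substring(0, dot), qualified.Substring(dot + 1), version);
    }
}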

@pinzart90 (Contributor)

Is this change motivated by the fact that we can now load multiple versions of the same Python package in the same Dynamo session (because of assembly isolation)?


private void PopulatePythonEnginePackageMap()
{
var packageBinPaths = new List<string>();
(Contributor, review comment)

why are you doing all this assembly searching and loading here?
PythonEngines are already loaded in the PythonEngineManager constructor and in the PackageManager.PackageLoader.

The PythonEnginesList from this class will always be kept up to date when PythonEngineManager.Instance.AvailableEngines is changed. We should be able to store all the info we need in the PythonEngine classes when they are initialized.

@zeusongit (Contributor, Author)

> why are you doing all this assembly searching and loading here?

I had to do all this to avoid breaking changes; the idea was also to not depend on the extension/engine to tell us which package it comes from (at least for me). If we were allowed to change PythonEngine, or to add a dependency on the extension itself to declare which package it belongs to, it would have been much simpler. I agree.

@zeusongit (Contributor, Author)

> Is this change motivated by the fact that we can now load multiple versions of the same Python package in the same Dynamo session (because of assembly isolation)?

Hi @pinzart90,
The change is motivated by the fact that we cannot currently show the user a warning when they open a graph that consumes an older version of a PythonEngine while a newer version is loaded/installed in Dynamo. By preserving which package version the engine was loaded from, we can match it against the loaded python engine and show a warning or trigger the workspace references flow.
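A sketch of the check this information is meant to enable when a graph is opened; the map shape matches the PythonEnginePackageMap in this PR, and everything else is illustrative.

using System;
using System.Collections.Generic;

internal static class EngineDependencyCheck
{
    // True when the engine the graph was saved with is missing, or is present
    // but served from a different package version than the one recorded in the graph.
    internal static bool NeedsWarning(
        string savedEngineName, string savedPackageVersion,
        IReadOnlyDictionary<string, Tuple<string, string>> loadedEngineMap)
    {
        return !loadedEngineMap.TryGetValue(savedEngineName, out var loaded)
            || !string.Equals(loaded.Item2, savedPackageVersion, StringComparison.Ordinal);
    }
}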

@pinzart90 (Contributor)

> The change is motivated by the fact that we cannot currently show the user a warning when they open a graph that consumes an older version of a PythonEngine while a newer version is loaded/installed in Dynamo. By preserving which package version the engine was loaded from, we can match it against the loaded python engine and show a warning or trigger the workspace references flow.

So the problem is that newer versions of a Python package are not, or are not expected to be, backward compatible, thus causing graphs to break?

@zeusongit (Contributor, Author)

> So the problem is that newer versions of a Python package are not, or are not expected to be, backward compatible, thus causing graphs to break?

Yes, it could be, considering the current state, which has some bugs that, when fixed in a future version, may break graphs. It also goes the other way: a user saves the graph with a newer version and then shares it with a user who has an older version installed.

@pinzart90 (Contributor) commented Oct 1, 2024

> Yes, it could be, considering the current state, which has some bugs that, when fixed in a future version, may break graphs. It also goes the other way: a user saves the graph with a newer version and then shares it with a user who has an older version installed.

We could very well just store all the loaded packages (name, version, etc.) in the graph and then have each serialized node point to one of those packages (so all nodes could restore specific versions of packages). Or does this already exist in some form?
Oh I see, "NodeLibraryDependencies": []
Why does this not work for python nodes?

@zeusongit (Contributor, Author)

> Oh I see, "NodeLibraryDependencies": []. Why does this not work for python nodes?

I think because the PythonNode itself is not dependent on the package, but the engine it uses could be.

@pinzart90 (Contributor)

> I think because the PythonNode itself is not dependent on the package, but the engine it uses could be.

I think we could make the workspace dependency system work for python nodes. I would vote to use that system since it is already doing the same thing for other nodes.

@pinzart90 (Contributor) commented Oct 1, 2024

> I think we could make the workspace dependency system work for python nodes. I would vote to use that system since it is already doing the same thing for other nodes.

It would probably need to be adapted to take into account AssemblyLoadContexts and the ability of the python node to change its target package at runtime (after it is created).

@zeusongit (Contributor, Author)

> I think we could make the workspace dependency system work for python nodes. I would vote to use that system since it is already doing the same thing for other nodes.

Yes, that is the second part of this task: to use this information and trigger the workspace dependency flow (not yet implemented).
This PR is just to record the package information of the python engine used in the graph, so that we have some way to compare states between the used and loaded engines.

@pinzart90 (Contributor)

> Yes, that is the second part of this task: to use this information and trigger the workspace dependency flow (not yet implemented). This PR is just to record the package information of the python engine used in the graph, so that we have some way to compare states between the used and loaded engines.

Yeah, but the workspace dependency system already serializes the package version information in the graph. I do not see why we would want to duplicate that information with another custom serialization of the python node models.

@zeusongit (Contributor, Author)

> It would probably need to be adapted to take into account AssemblyLoadContexts and the ability of the python node to change its target package at runtime (after it is created).

Yes, I need to look into ALCs, and yes to dynamically changing the package info.

@zeusongit (Contributor, Author)

> Yeah, but the workspace dependency system already serializes the package version information in the graph. I do not see why we would want to duplicate that information with another custom serialization of the python node models.

Ok, got it. I can look into reusing NodeLibraryDependencies for this purpose.

@pinzart90 (Contributor)

> PythonEngine

Yes, you can add internal data to that class

@pinzart90 (Contributor)

> Yes, you can add internal data to that class

But I am not sure you will need to, since the PackageManagerExtension already has private Dictionary<string, List<PackageInfo>> NodePackageDictionary; to store package information.

@zeusongit (Contributor, Author)

Alternate implementation: #15530 PTAL

@zeusongit added the DNM (Do not merge. For PRs.) label Oct 8, 2024
@zeusongit closed this Oct 9, 2024
Labels: DNM (Do not merge. For PRs.), WIP
4 participants