Releases: neo4j/graph-data-science

Graph Data Science 2.4.1

27 Jun 11:41

Bug fixes

  • Fixed a bug in K-Core decomposition that could return invalid values if core values were not consecutive.
  • Fixed a bug where setting mutateProperty to the name of an existing node property could fail. Affected procedures include:
    • gds.alpha.knn.filtered.mutate
    • gds.alpha.nodeSimilarity.filtered.mutate
    • gds.beta.pipeline.linkPrediction.predict.mutate
    • gds.beta.steinerTree.mutate
    • gds.beta.spanningTree.mutate
    • gds.knn.mutate
    • gds.nodeSimilarity.mutate

Improvements

  • Improved error handling when negative node ids are used as input in the sourceNode, targetNode, sourceNodes, and targetNodes fields.
  • Improved performance when projecting larger in-memory graphs.

Graph Data Science 2.4.0

14 Jun 15:39

Breaking changes

  • The concurrency set when training a pipeline is now passed to the node property steps. Previously, these steps were executed with the default concurrency of 4 unless overridden. This affects:
    • gds.beta.pipeline.linkPrediction.train
    • gds.beta.pipeline.nodeClassification.train
    • gds.alpha.pipeline.nodeClassification.train

New features


  • Added Bellman-Ford algorithm
  • Added K-Core Decomposition algorithm
  • Added new Common Neighbour Aware Random Walk graph sampling algorithm
  • Add Random Forest and MLP classifier serialization support. This makes all node classification and link prediction models serializable


  • You can rename node properties when writing them back to the Neo4j database using gds.nodeProperties.write by placing them inside a map in the form nodeProperty: 'renamedProperty'.

  • Added minCommunitySize|minComponentSize parameter to more procedures to allow filtering the result. (Contributed by @airtyon)

  • Added new procedure gds.alpha.drop.cypherdb to drop created in-memory databases

  • Added upperDegreeCutoff parameter to the Node-Similarity and filtered Node-Similarity algorithms, which allows skipping nodes whose degree is higher than the provided value.

  • Added aggregation to gds.beta.toUndirected to allow the aggregation of the new undirected relationships.

  • Added new optional parameter storeModelToDisk that automatically saves serializable models after training for licensed users. This affects gds.beta.pipeline.[linkPrediction|nodeClassification].train and gds.beta.graphSage.train.

  • Added procedure gds.graph.relationshipProperties.write that allows writing relationships with multiple properties to Neo4j.

  • Cypher Aggregation has graduated, which comes with a new name and API changes:

    • The method of projection is now generally called "Cypher projection", possibly with an additional "new" or "v2" qualifier.
      • The existing 'Cypher projection' (gds.graph.project.cypher) is now called "Legacy Cypher projection"
    • The procedure name is losing the alpha qualifier and is now called gds.graph.project.
    • The old name gds.alpha.graph.project is deprecated and usages will forward to the new name while also adapting to the new API.
    • The 4th and 5th parameters nodeConfig and relationshipConfig have been merged into a single dataConfig parameter.
    • The properties configuration key in this merged dataConfig parameter has been renamed to relationshipProperties.
    • The overall projection configuration (e.g. readConcurrency) has moved from the 6th parameter to the 5th parameter.
  • Graph data retrieved via the GDS Arrow endpoint can now be partitioned via the FlightInfo endpoint.
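The parameter changes listed above for the graduated Cypher projection can be pictured as a small argument migration. The following is only a sketch under stated assumptions: the helper name migrate_projection_args is hypothetical, the example keys sourceNodeLabels and relationshipType are used purely for illustration, and the code rearranges plain Python values without calling Neo4j or GDS.

```python
def migrate_projection_args(graph_name, source_node, target_node,
                            node_config=None, relationship_config=None,
                            configuration=None):
    """Map the old six-argument form onto the new five-argument form.

    Hypothetical helper: it only rearranges plain Python values and
    does not talk to a database.
    """
    data_config = {}
    # The former 4th and 5th parameters are merged into one dataConfig map.
    data_config.update(node_config or {})
    for key, value in (relationship_config or {}).items():
        # The 'properties' key is renamed to 'relationshipProperties'.
        new_key = "relationshipProperties" if key == "properties" else key
        data_config[new_key] = value
    # The overall configuration moves from the 6th to the 5th position.
    return (graph_name, source_node, target_node, data_config,
            configuration or {})


old_args = (
    "g", "source", "target",
    {"sourceNodeLabels": "labels(source)"},  # assumed example key
    {"relationshipType": "type(r)", "properties": "r { .weight }"},
    {"readConcurrency": 4},
)
new_args = migrate_projection_args(*old_args)
```

Here new_args has five entries: the merged map sits in fourth position and the overall configuration in fifth.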

Bug fixes

  • Fixed: the Arrow server no longer allows projecting graphs with blank names
  • Fixed: Arrow now validates dangling relationships when creating an in-memory graph
  • Fixed: if an Arrow process is aborted, creating a new process with the same name is now possible
  • Fixed a bug where gds.graph.export could fail when exporting larger graphs
  • Fixed a bug where gds.alpha.kSpanningTree returned incorrect results when called with the nodeLabels parameter.
  • Fixed a bug where gds.triangleCount would throw an ArrayIndexOutOfBoundsException when called with the nodeLabels parameter.
  • Fixed a bug where link prediction mutate results could fail when predicted probability is extremely close to zero.


Improvements

  • Improved parallel runtime of several algorithms through improvements to our degree-based partitioning. Note that this is highly dataset dependent and may not be visible for all datasets. Affected algorithms are:
    • FastRP
    • HashGNN
    • Leiden
    • Approxmaxkcut
    • Conductance
    • LinkPrediction training
    • ToUndirected
  • Improved partitioning. This affects the parallel runtime of gds.alpha.hits, gds.beta.graph.project.subgraph and gds.beta.pipeline.linkPrediction.predict if sampleRate = 0


  • Improve progress tracking for gds.beta.graphSage.train. This will enable progress bars on the python client.
  • Improve error message for invalid nodeLabels and relationshipTypes for procedures supporting memory estimation.
  • Allow gds.debug.sysInfo and gds.debug.arrow to run against the system database.
  • Improve automatic conversion of array property values during graph projection.
  • The Yens algorithm can now be run in parallel.
  • Node regression now verifies upfront that all provided targetProperty values are valid when calling gds.alpha.pipeline.nodeRegression.train.
  • The scale properties algorithm has been promoted:
    • Added new procedures gds.scaleProperties.[stream,mutate] which replace gds.alpha.scaleProperties.[stream,mutate] that are now deprecated
      • The scalers L1Norm and L2Norm are not supported in the new procedures.
    • Added new procedures gds.scaleProperties.[stats,write] to return statistics from a scale properties computation and write scaled properties back to a database respectively
    • Procedures gds.scaleProperties.[mutate,stats,stream,write] support progress tracking with volumes. This will enable progress bars on the python client
    • Procedures gds.scaleProperties.[mutate,stats,write] return statistics from the performed scale computation
    • Added new parameter offset to the log scaler. This also affects procedures:
      • gds.pageRank
      • gds.eigenvector
      • gds.articleRank
    • Added new procedures gds.scaleProperties.[mutate|stats|stream|write].estimate for estimating the memory requirements of running the scale properties algorithm
    • Nodes with missing properties (null or NaN) are now omitted in the scale computation. Their scale value is set to NaN in the output.
  • Reduce the memory footprint of the binary embeddings saved by gds.beta.hashgnn.mutate.
  • Promote random forest classifier to beta tier. Added gds.beta.pipeline.[nodeClassification,linkPrediction].addRandomForest which replace gds.alpha.pipeline.[nodeClassification,linkPrediction].addRandomForest that are now deprecated.
  • Reduced memory allocation for the Spanning Tree algorithm.
  • A more effective rerouting algorithm is applied for the minimum Directed Steiner-Tree algorithm when the inverted index is present.
  • Improve memory usage when projecting very large graphs with very high degree nodes.
  • Additional validation for Cypher projection configuration to guide migration and avoid common mistakes.
  • The import of nodes with negative id via arrow into a database is now forbidden.
  • Graph restore now attempts to use the same id map implementation that has been used for the original graph.
  • Setting the useBadCollector option to true for the arrow database import will now actually trigger errors if the collector encountered a problem.
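The log scaler's new offset parameter and the handling of missing values described above can be pictured roughly as follows. This is a minimal sketch, assuming the offset is added to each value before the natural logarithm is taken; the function name log_scale is hypothetical and this is not the GDS implementation.

```python
import math


def log_scale(values, offset=0.0):
    """Apply log(value + offset); missing inputs yield NaN.

    Assumption: the offset is added before the logarithm is taken.
    """
    scaled = []
    for value in values:
        # None or NaN marks a missing property: skip it and emit NaN.
        if value is None or math.isnan(value):
            scaled.append(math.nan)
        else:
            scaled.append(math.log(value + offset))
    return scaled


# An offset of 1.0 keeps a zero-valued property inside the log's domain.
result = log_scale([0.0, math.e - 1.0, None, float("nan")], offset=1.0)
```

With an offset of 1.0, a property value of 0.0 maps to 0.0 instead of being undefined, while the two missing entries come back as NaN.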

Graph Data Science 2.4.0 PREVIEW

02 Jun 13:00

Neo4j Graph Data Science version 2.4.0 is compatible with Neo4j version 4.4 and Neo4j versions 5.1 through 5.8.

Breaking changes

  • The concurrency set when training a pipeline is now passed to the node property steps. Previously, these steps were executed with the default concurrency of 4 unless overridden. This affects:
    • gds.beta.pipeline.linkPrediction.train
    • gds.beta.pipeline.nodeClassification.train
    • gds.alpha.pipeline.nodeClassification.train

New features

  • You can rename node properties when writing them back to the Neo4j database using gds.nodeProperties.write by placing them inside a map in the form nodeProperty: 'renamedProperty'.
  • Added minCommunitySize|minComponentSize parameter to more procedures to allow filtering the result. (Contributed by @airtyon) This includes:
    • gds.beta.k1coloring.[stream|write]
    • gds.beta.leiden.[stream|write]
    • gds.beta.modularityOptimization.[stream|write]
  • Added new procedure gds.alpha.drop.cypherdb to drop created in-memory databases
  • Added Bellman-Ford algorithm:
    • gds.bellmanFord.stats
    • gds.bellmanFord.stats.estimate
    • gds.bellmanFord.mutate
    • gds.bellmanFord.mutate.estimate
    • gds.bellmanFord.write
    • gds.bellmanFord.write.estimate
  • Add Random Forest and MLP classifier serialization support. This makes all node classification and link prediction models serializable. This affects and gds.alpha.model.load.
  • Added upperDegreeCutoff parameter to the Node-Similarity and filtered Node-Similarity algorithms, which allows skipping nodes whose degree is higher than the provided value.
  • Added aggregation to gds.beta.toUndirected to allow the aggregation of the new undirected relationships.
  • Added new optional parameter storeModelToDisk that automatically saves serializable models after training for licensed users. This affects gds.beta.pipeline.[linkPrediction|nodeClassification].train and gds.beta.graphSage.train.
  • Added K-Core Decomposition algorithm:
    • gds.kcore.stats
    • gds.kcore.stats.estimate
    • gds.kcore.mutate
    • gds.kcore.mutate.estimate
    • gds.kcore.write
    • gds.kcore.write.estimate
  • Added procedure gds.graph.relationshipProperties.write that allows writing relationships with multiple properties to Neo4j.
  • Added new Common Neighbour Aware Random Walk graph sampling algorithm gds.graph.sample.cnarw. Available under beta tier.
  • Cypher Aggregation has graduated, which comes with a new name and API changes:
    • The method of projection is now generally called "Cypher projection", possibly with an additional "new" or "v2" qualifier.
      • The existing 'Cypher projection' (gds.graph.project.cypher) is now called "Legacy Cypher projection"
    • The procedure name is losing the alpha qualifier and is now called gds.graph.project.
    • The old name gds.alpha.graph.project is deprecated and usages will forward to the new name while also adapting to the new API.
    • The 4th and 5th parameters nodeConfig and relationshipConfig have been merged into a single dataConfig parameter.
    • The properties configuration key in this merged dataConfig parameter has been renamed to relationshipProperties.
    • The overall projection configuration (e.g. readConcurrency) has moved from the 6th parameter to the 5th parameter.
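For readers unfamiliar with the Bellman-Ford algorithm added in this release, the textbook version can be sketched in a few lines. This is a plain sequential illustration of the general technique, including negative-cycle detection; it is not the GDS implementation, and the function name and edge-list format are assumptions made for the example.

```python
def bellman_ford(node_count, edges, source):
    """Return (distances, has_negative_cycle) for a directed weighted graph.

    edges is a list of (from_node, to_node, weight) tuples with nodes
    numbered 0..node_count-1.
    """
    inf = float("inf")
    dist = [inf] * node_count
    dist[source] = 0.0
    # Relax every edge up to node_count - 1 times.
    for _ in range(node_count - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break  # distances have converged early
    # One more pass: any further improvement implies a negative cycle
    # reachable from the source.
    has_negative_cycle = any(dist[u] + w < dist[v] for u, v, w in edges)
    return dist, has_negative_cycle


edges = [(0, 1, 4.0), (0, 2, 5.0), (2, 1, -2.0), (1, 3, 1.0)]
distances, negative_cycle = bellman_ford(4, edges, source=0)
```

Unlike Dijkstra, this approach tolerates negative relationship weights, which is why the final pass for negative cycles is needed.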

Bug fixes

  • Fixed: the Arrow server no longer allows projecting graphs with blank names
  • Fixed: Arrow now validates dangling relationships when creating an in-memory graph.

Improvements

  • Improve progress tracking for gds.beta.graphSage.train. This will enable progress bars on the python client.
  • Improve error message for invalid nodeLabels and relationshipTypes for procedures supporting memory estimation.
  • Allow gds.debug.sysInfo and gds.debug.arrow to run against the system database.
  • Improve automatic conversion of array property values during graph projection.
  • The Yens algorithm can now be run in parallel.
  • Node regression now verifies upfront that all provided targetProperty values are valid when calling gds.alpha.pipeline.nodeRegression.train.
  • The scale properties algorithm has been promoted:
    • Added new procedures gds.scaleProperties.[stream,mutate] which replace gds.alpha.scaleProperties.[stream,mutate] that are now deprecated
      • The scalers L1Norm and L2Norm are not supported in the new procedures.
    • Added new procedures gds.scaleProperties.[stats,write] to return statistics from a scale properties computation and write scaled properties back to a database respectively
    • Procedures gds.scaleProperties.[mutate,stats,stream,write] support progress tracking with volumes. This will enable progress bars on the python client
    • Procedures gds.scaleProperties.[mutate,stats,write] return statistics from the performed scale computation
    • Added new parameter offset to the log scaler. This also affects procedures:
      • gds.pageRank
      • gds.eigenvector
      • gds.articleRank
    • Added new procedures gds.scaleProperties.[mutate|stats|stream|write].estimate for estimating the memory requirements of running the scale properties algorithm
    • Nodes with missing properties (null or NaN) are now omitted in the scale computation. Their scale value is set to NaN in the output.
  • Reduce the memory footprint of the binary embeddings saved by gds.beta.hashgnn.mutate.
  • Promote random forest classifier to beta tier. Added gds.beta.pipeline.[nodeClassification,linkPrediction].addRandomForest which replace gds.alpha.pipeline.[nodeClassification,linkPrediction].addRandomForest that are now deprecated.
  • Reduced memory allocation for the Spanning Tree algorithm.
  • A more effective rerouting algorithm is applied for the minimum Directed Steiner-Tree algorithm when the inverted index is present.
  • Improve runtime of gds.alpha.hits for concurrency > 1 due to a better partitioning.
  • Improved parallel runtime of several algorithms through improvements to our degree-based partitioning. Note that this is highly dataset dependent and may not be visible for all datasets. Affected algorithms are:
    • FastRP
    • HashGNN
    • Leiden
    • Approxmaxkcut
    • Conductance
    • LinkPrediction training
    • ToUndirected
  • Improve parallel runtime of gds.beta.graph.project.subgraph when filtering relationships due to a better partitioning.
  • Improve parallel runtime of gds.beta.pipeline.linkPrediction.predict if sampleRate = 0 due to a better partitioning.
  • Improve memory usage when projecting very large graphs with very high degree nodes.
  • Additional validation for Cypher projection configuration to guide migration and avoid common mistakes.


27 Apr 10:34

Bug fixes

  • gds.beta.pipeline.linkPrediction.train sampled relationships now only contain valid node ids and will avoid ArrayIndexOutOfBoundException during training.

Graph Data Science 2.3.3

21 Apr 13:10

New features

Neo4j Database Compatibility

  • This release is compatible with all Neo4j 5.x database versions <= 5.7.0. Please see our compatibility matrix above.

  • Added includeGraphs parameter to gds.alpha.backup to allow backups without graphs.

Bug fixes

  • Multiclass node classification compatible with non-consecutive class ids
  • RandomWalk is now stable across multiple runs (user contribution by GitHub user hindog)

Improvements

  • Make gds.alpha.restore more failsafe
    • Continue to restore graphs and models also after the first failure for a user.
    • Improve logging around failures

Full Changelog: 2.3.2...2.3.3

Graph Data Science 2.3.2

11 Apr 10:18

GDS 2.3.2 is compatible with Neo4j 5 & 4.4 versions (≥ 4.4.9).

For GDS compatibility with previous releases, please use GDS Compatibility Table.

New features

Neo4j Database Compatibility

  • This release is compatible with all Neo4j 5.x database versions <= 5.6.0. Please see our compatibility matrix above.

Bug fixes

  • Graphs imported via Arrow no longer cause invalid node mappings that produced ArrayIndexOutOfBoundsExceptions
  • Correct memory estimation of Leiden for very small graphs
  • KNN no longer results in an AIOOB exception if the array node properties did not exist for some nodes
  • CELF no longer returns negative gains for some nodes
  • GraphSage will no longer return NaN values because of incorrect neighbor sampling

Improvements

  • More accurate memory estimation on Node Similarity and filtered Node Similarity algorithms for high topN or topK values.
  • The gds.alpha.modularity procedures for computing modularity no longer require each community to be smaller than the size of the graph.
  • Improve the progress logging of gds.graph.project.cypher to be more accurate. Especially, this avoids underestimating when the relationship query is more complex.

Graph Data Science 2.3.1

16 Feb 15:42

GDS 2.3.1 is compatible with Neo4j 5 & 4.4 versions (≥ 4.4.9) & 4.3 versions (≥ 4.3.15) Database.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

New features

Neo4j Database Compatibility

  • This release is compatible with all Neo4j 5.x database versions <= 5.5.0. Please see our compatibility matrix above.

Log Progress

  • New optional configuration parameter logProgress allows you to specify whether percentage logging for that procedure call is on or off.

Bug fixes

  • Louvain no longer reports the incorrect modularity
  • Leiden now reports communities correctly on weighted graphs
  • Persisted Models no longer cause false positive error logs when loaded into the Model Catalog
  • Yens on graphs without parallel relationships would cause issues

Improvements

  • Filtered Node Similarity progress logging has been improved

Graph Data Science 2.3.0

01 Feb 08:31

GDS 2.3.0 is compatible with Neo4j 5 & 4.4 versions (≥ 4.4.9) & 4.3 versions (≥ 4.3.15) Database.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

Breaking changes

  • Leiden was promoted to the beta tier. It is now called via the gds.beta.leiden command instead of the gds.alpha.leiden command.
  • K-means was promoted to the beta tier. It is now called via the gds.beta.kmeans command instead of the gds.alpha.kmeans command.
  • Minimum weighted spanning tree algorithm was promoted to the beta tier. It is now called via the gds.beta.spanningTree command instead of gds.alpha.spanningTree
    • The procedures gds.alpha.spanningTree.minimum and gds.alpha.spanningTree.maximum have been removed. You can get the same behaviour by specifying the new parameter objective in gds.beta.spanningTree.
    • The weightWriteProperty has been removed as a configuration parameter. To supply the Relationship Type and Property for the produced relationship, use:
      • mutateRelationshipType
      • mutateProperty
    • gds.alpha.spanningTree.kmin and gds.alpha.spanningTree.kmax have been removed, as the K-Spanning Tree algorithm has moved into its own namespace, gds.alpha.kSpanningTree.
    • The parameter startNodeId in all Spanning Tree algorithms has been replaced with sourceNode.
  • Arrow: when projecting graphs, null will be translated to NaN for floating point values. This enables users of either the GDS Python Client or PyArrow to load NaN properties stored in Pandas DataFrames
  • Cypher Aggregations will become the primary surface for creating projections with Cypher, offering a more intuitive and expressive interface than Cypher Projections that can also be used in Fabric or Composite Database setups.
  • The algorithm gds.alpha.influenceMaximization.greedy has been removed. Its replacement is the gds.beta.influenceMaximization.celf algorithm, which has the same configuration parameters and offers better performance.
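The idea behind replacing the separate minimum and maximum spanning tree procedures with a single objective parameter can be illustrated with a textbook algorithm. The sketch below uses Kruskal's algorithm, which is not necessarily what GDS uses internally; the function name and the string values 'minimum' and 'maximum' are assumptions made for the example.

```python
def spanning_tree(node_count, edges, objective="minimum"):
    """Kruskal's algorithm over undirected (u, v, weight) edges."""
    if objective not in ("minimum", "maximum"):
        raise ValueError("objective must be 'minimum' or 'maximum'")
    parent = list(range(node_count))

    def find(x):
        # Find the component root, halving paths along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Sort ascending for a minimum tree, descending for a maximum tree.
    ordered = sorted(edges, key=lambda e: e[2],
                     reverse=(objective == "maximum"))
    tree = []
    for u, v, w in ordered:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two separate components
            parent[root_u] = root_v
            tree.append((u, v, w))
    return tree


edges = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 3.0)]
min_weight = sum(w for _, _, w in spanning_tree(3, edges, "minimum"))
max_weight = sum(w for _, _, w in spanning_tree(3, edges, "maximum"))
```

The only difference between the two objectives is the order in which candidate relationships are considered, which is why one procedure with a parameter can cover both cases.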

New features

Neo4j Database Compatibility

  • This release is compatible with all Neo4j 5.x database versions <= 5.4.0. Please see compatibility matrix above.

Minimum Directed Steiner Tree

  • Added heuristic for minimum directed Steiner Tree under the gds.beta.steinerTree domain.
    • Added stats mode with gds.beta.steinerTree.stats
    • Added stream mode with
    • Added mutate mode with gds.beta.steinerTree.mutate
    • Added write mode with gds.beta.steinerTree.write
    • Now available in progress tracking - gds.list.progress()

Leiden

  • New parameter consecutiveIds that assigns consecutive ids for the discovered communities.
  • New parameter seedProperty to seed initial communities for nodes.
  • New parameter tolerance to enable convergence criteria based on differences in modularity from one iteration to another.
  • Now available in progress tracking - gds.list.progress()
  • Added memory estimation mode:
    • gds.beta.leiden.mutate.estimate
    • gds.beta.leiden.stats.estimate
    • gds.beta.leiden.write.estimate
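The effect of the consecutiveIds parameter can be pictured as a relabeling step applied to the community ids after the algorithm has finished. This is only an illustrative sketch: the function name is hypothetical, and assigning new ids in order of first appearance is an assumption rather than documented GDS behaviour.

```python
def to_consecutive_ids(community_ids):
    """Relabel community ids as 0, 1, 2, ... in order of first appearance."""
    mapping = {}
    relabeled = []
    for community in community_ids:
        # Assign the next free id the first time a community is seen.
        if community not in mapping:
            mapping[community] = len(mapping)
        relabeled.append(mapping[community])
    return relabeled


# Sparse ids such as 42, 7 and 1000 collapse onto 0, 1 and 2.
result = to_consecutive_ids([42, 7, 42, 1000, 7])
```

Nodes that shared a community before the relabeling still share one afterwards; only the id values change.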

Logistic Regression & MLP

  • New configuration parameters classWeights and focusWeight for training methods, supported by procedures:
    • gds.beta.pipeline.nodeClassification.addLogisticRegression
    • gds.beta.pipeline.nodeClassification.addMLP
    • gds.beta.pipeline.linkPrediction.addLogisticRegression
    • gds.beta.pipeline.linkPrediction.addMLP
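To give an intuition for classWeights and focusWeight, the sketch below shows the standard class-weighted focal loss for a single training example. Whether GDS computes the loss in exactly this form is an assumption; the function name and argument layout are hypothetical.

```python
import math


def weighted_focal_loss(probabilities, true_class, class_weights,
                        focus_weight=0.0):
    """Class-weighted focal loss for one example.

    probabilities: predicted probability per class.
    focus_weight:  exponent of the focal factor; 0 gives plain
                   weighted cross-entropy.
    """
    p = probabilities[true_class]
    # (1 - p) ** focus_weight shrinks the loss of examples the model
    # already classifies confidently.
    modulation = (1.0 - p) ** focus_weight
    return -class_weights[true_class] * modulation * math.log(p)


uniform = [1.0, 1.0]
plain = weighted_focal_loss([0.9, 0.1], 0, uniform)
easy = weighted_focal_loss([0.9, 0.1], 0, uniform, focus_weight=2.0)
hard = weighted_focal_loss([0.2, 0.8], 0, uniform, focus_weight=2.0)
```

With a positive focus weight, the well-classified example (easy) contributes far less to the loss than the misclassified one (hard), while the class weights scale the contribution of each class, which can help with imbalanced data.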

HashGNN

  • New algorithm gds.alpha.hashgnn.{mutate,stream} to create HashGNN node embeddings
  • New estimation procedures gds.alpha.hashgnn.{mutate,stream}.estimate to estimate the memory required to run HashGNN

Link Prediction

  • Added new optional configuration parameter negativeRelationshipType to gds.beta.pipeline.linkPrediction.configureSplit

Spanning Tree

  • New modes supported: gds.beta.spanningTree.{stats, stream, mutate}
  • New yield outputs for gds.beta.spanningTree:
    • the sum of weights in the discovered spanning tree.
    • the number of relationships written or added for write and mutate mode respectively.
  • Added memory estimation mode:
    • gds.beta.spanningTree.mutate.estimate
    • gds.beta.spanningTree.stats.estimate
    • gds.beta.spanningTree.write.estimate

Write Labels

  • gds.alpha.graph.nodeLabel.mutate allows for the Graph Projection to be mutated with new labels
  • gds.alpha.graph.nodeLabel.write allows for Node Labels to be written back from projections to a Neo4j Database

Graph Projections

  • Arrow now supports specifying undirected relationship types using the undirected_relationship_types configuration argument
  • Cypher Aggregations (gds.alpha.graph.project) now support specifying undirected relationship types using the undirectedRelationshipTypes configuration option
  • New procedure to turn directed relationships into undirected relationships: gds.beta.graph.relationships.toUndirected
  • Projections created using the Native, Arrow, or Cypher Aggregation APIs can now be "inverse indexed", which enables more efficient algorithm implementations


  • Added the jobId and username to the ongoingGdsProcedures return field of gds.alpha.systemMonitor.
  • Added username as a new return field to gds.beta.listProgress.
  • Added a new return field to gds.graph.list called schemaWithOrientation which also includes the orientation.
  • Administrators can now see all running tasks from all users with gds.beta.listProgress

Bug fixes

  • Minimum Weighted Spanning Tree: Graphs with parallel edges could make the discovered tree have wrong weights on relationships
  • Cypher Aggregations: When using gds.alpha.graph.project:
    • The projected graph would list relationship types with zero relationships
    • AIOOB exceptions could surface due to sizing errors
  • Arrow: the CREATE_DATABASE action would throw a NullPointerException if ID fields were missing in the Arrow record. A more descriptive exception is now thrown
  • gds.graph.list could cause issues on some JDKs when calculating the memory usage of Projections
  • Export relationship progress logging (gds.beta.graph.export.csv) reports the correct progress
  • Graph constructed with Cypher Aggregation using arbitrary IDs are now blocked from write procedures
  • The k-Spanning Tree algorithm no longer returns disconnected partitions
  • Multi-threading bug when creating projections via Cypher Aggregation or Arrow could lead to lost labels
  • Node label filtering could lead to streamed node properties being null when filters are applied
  • Cypher projections and Cypher aggregation would throw the wrong error message when loading an invalid relationship
  • Node label filtering that would lead to the wrong results. This also affected: gds.beta.graphSage and



  • graph import now fully supports external node ids in the 64 Bit space.
  • graph import now supports 16, 32 or 64 Bit node identifiers.
  • Arrow server will now check user RBAC permissions for creating and accessing databases
  • Database import now creates a Relationship Type index


  • Better parallelization and improved overall performance


  • Now supports a new and faster sampling strategy (undirected and directed graphs) by using the new inverse index.

Machine Learning

  • Inner components of pipeline field returned by gds.pipeline.{ linkPrediction | nodeClassification | nodeRegression }.train procedures are now present directly as part of modelInfo. The pipeline field is now deprecated for removal in a future version.

Other Improvements

  • Speed improvements for Dijkstra, Astar, Yens, CELF, weighted Betweenness Centrality, and the Spanning Tree algorithms. The improvements will see a slight increase in the memory consumption of these algorithms.
  • Improved error message for invalid node labels and relationship types
  • Pregel now supports bidirectional computations (allows for messages to be sent along incoming relationships) using the new inverse index.
  • The procedure gds.graph.export now creates a Relationship Type index
  • Extended node property validation to reject projection configuration mappings with the same property keys, but different default values.

Other changes

  • Histograms returned such as degreeDistribution in gds.graph.list can have slightly different values for specific percentiles due to changes in floating point operations.
  • Progress tracking in the Spanning Tree algorithm has been reworked. Progress reporting may differ from earlier versions.
  • Mark the yielded field schema as deprecated in gds.graph.list and gds.graph.drop. In the next major release, the schema field will use the semantics of schemaWithOrientation
  • In, the positional argument failIfUnsupportedType is renamed to failIfUnsupported. Both will be supported until it is promoted to the beta tier.
  • Progress tracking for Betweenness Centrality has been reworked. Progress reporting may differ from earlier versions.

Pre-release changes

  • The Steiner Tree procedures in gds.beta.SteinerTree were originally introduced as gds.alpha.SteinerTree. The naming changed in 2.3.0-alpha04.

Graph Data Science 2.2.7

27 Jan 09:49

GDS 2.2.7 is compatible with Neo4j 5 & 4.4 versions (≥ 4.4.9) & 4.3 versions (≥ 4.3.15) Database.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

New features

  • Added compatibility for Neo4j database 5.4.0.

Bug fixes

  • Missing id fields in the Arrow records for the CREATE_DATABASE action would throw a NullPointerException. It now throws a more descriptive exception instead.
  • Graphs with long node or relationship property names would fail during the restore process.
  • Yens algorithm would ignore edges in multigraphs and yield incorrect results.
  • Multi-threading bug when creating projections via Cypher Aggregation or Arrow could lead to lost labels.
  • Node label filtering could lead to streamed node properties being null when filters are applied.
  • Cypher projections and Cypher aggregation would throw the wrong error message when loading an invalid relationship.
  • Node label filtering that would lead to the wrong results. This also affected: gds.beta.graphSage and

Graph Data Science 2.3.0-Alpha04

05 Jan 15:58

GDS 2.3.0-alpha04 is compatible with Neo4j 5 & 4.4 versions (≥ 4.4.9) & 4.3 versions (≥ 4.3.15) Database.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

Breaking changes

  • Leiden was promoted to the beta tier. It is now called via the gds.beta.leiden command instead of the gds.alpha.leiden command.
  • K-means was promoted to the beta tier. It is now called via the gds.beta.kmeans command instead of the gds.alpha.kmeans command.
  • Minimum weighted spanning tree algorithm was promoted to the beta tier. It is now called via the gds.beta.spanningTree command instead of gds.alpha.spanningTree
    • The procedures gds.alpha.spanningTree.minimum and gds.alpha.spanningTree.maximum have been removed. You can get the same behaviour by specifying the new parameter objective in gds.beta.spanningTree.
    • The weightWriteProperty has been removed as a configuration parameter. To supply the Relationship Type and Property for the produced relationship, use:
      • mutateRelationshipType
      • mutateProperty
    • gds.alpha.spanningTree.kmin and gds.alpha.spanningTree.kmax have been removed, as the K-Spanning Tree algorithm has moved into its own namespace, gds.alpha.kSpanningTree.
    • The parameter startNodeId in all Spanning Tree algorithms has been replaced with sourceNode.
  • Arrow: when projecting graphs, null will be translated to NaN for floating point values. This enables users of either the GDS Python Client or PyArrow to load NaN properties stored in Pandas DataFrames
  • Cypher Aggregations will become the primary surface for creating projections with Cypher, offering a more intuitive and expressive interface than Cypher Projections that can also be used in Fabric or Composite Database setups.
  • The algorithm gds.alpha.influenceMaximization.greedy has been removed. Its replacement is the already existing gds.beta.influenceMaximization.celf algorithm, which has the same configuration parameters and offers better performance.

New features

Minimum Directed Steiner Tree

  • Added heuristic for minimum directed Steiner Tree under the gds.beta.steinerTree domain.
    • Added stats mode with gds.beta.steinerTree.stats
    • Added stream mode with
    • Added mutate mode with gds.beta.steinerTree.mutate
    • Added write mode with gds.beta.steinerTree.write
    • Now available in progress tracking - gds.list.progress()

Leiden

  • New parameter consecutiveIds that assigns consecutive ids for the discovered communities.
  • New parameter seedProperty to seed initial communities for nodes.
  • New parameter tolerance to enable convergence criteria based on difference in modularity from one iteration to another.
  • Now available in progress tracking - gds.list.progress()
  • Added memory estimation mode:
    • gds.beta.leiden.mutate.estimate
    • gds.beta.leiden.stats.estimate
    • gds.beta.leiden.write.estimate

Logistic Regression & MLP

  • New configuration parameters classWeights and focusWeight for training methods, supported by procedures:
    • gds.beta.pipeline.nodeClassification.addLogisticRegression
    • gds.beta.pipeline.nodeClassification.addMLP
    • gds.beta.pipeline.linkPrediction.addLogisticRegression
    • gds.beta.pipeline.linkPrediction.addMLP

HashGNN

  • New algorithm gds.alpha.hashgnn.{mutate,stream} to create HashGNN node embeddings
  • New procedures gds.alpha.hashgnn.{mutate,stream}.estimate to estimate the memory required to run HashGNN

Link Prediction

  • Added new optional configuration parameter negativeRelationshipType to gds.beta.pipeline.linkPrediction.configureSplit

Spanning Tree

  • New modes supported: gds.beta.spanningTree.(stats, stream, mutate)
  • New yield output for gds.beta.spanningTree that outputs the sum of weights in the discovered spanning tree.
  • New yield output for gds.beta.spanningTree that outputs the number of relationships written or added for write and mutate mode respectively.
  • Added memory estimation mode:
    • gds.beta.spanningTree.mutate.estimate
    • gds.beta.spanningTree.stats.estimate
    • gds.beta.spanningTree.write.estimate

Write Labels

  • Added gds.alpha.graph.nodeLabel.write to allow for Node Labels to be written back from projections to a Neo4j Database

Graph Projections

  • Arrow now supports specifying undirected relationship types using the undirected_relationship_types configuration argument
  • Cypher Aggregations (gds.alpha.graph.project) now support specifying undirected relationship types using the undirectedRelationshipTypes configuration option
  • New procedure to turn directed relationships into undirected relationships: gds.beta.graph.relationships.toUndirected


  • Added the jobId and username to the ongoingGdsProcedures return field of gds.alpha.systemMonitor.
  • Added username as a new return field to gds.beta.listProgress.
  • Added a new return field to gds.graph.list called schemaWithOrientation which also includes the orientation.
  • Administrators can now see all running tasks from all users with gds.beta.listProgress

Bug fixes

  • Minimum Weighted Spanning Tree: Graphs with parallel edges could make the discovered tree have wrong weights on relationships
  • Cypher Aggregations: When using gds.alpha.graph.project:
    • The projected graph would list relationship types with zero relationships
    • AIOOB exceptions could surface due to sizing errors
  • Arrow: the CREATE_DATABASE action would throw an NPE if id fields were missing in the Arrow record. A more descriptive exception is now thrown



  • graph import now fully supports external node ids in the 64 Bit space.
  • graph import now supports 16, 32 or 64 Bit node identifiers.


  • Better parallelization and improved overall performance

Other Improvements

  • Speed improvements for Dijkstra, Astar, Yens, CELF, weighted Betweenness Centrality, and the Spanning Tree algorithms. The improvements will see a slight increase in the memory consumption of these algorithms.
  • Improved error message for invalid node labels and relationship types

Other changes

  • Histograms returned such as degreeDistribution in gds.graph.list can have slightly different values for specific percentiles due to changes in floating point operations.
  • Progress tracking in the Spanning Tree algorithm has been reworked. Progress reporting may differ from earlier versions.
  • Mark the yielded field schema as deprecated in gds.graph.list and gds.graph.drop. In the next major release, the schema field will use the semantics of schemaWithOrientation
  • In, the positional argument failIfUnsupportedType is renamed to failIfUnsupported. Both will be supported until it is promoted to the beta tier.
  • Progress tracking for Betweenness Centrality has been reworked. Progress reporting may differ from earlier versions.