Here's some example code to generate pairwise Lin similarity scores for all node pairs in an NXOntology. For now I'm just posting it in case it's helpful, but we could also add a function that populates a matrix with a chosen similarity metric.
import logging
from itertools import combinations_with_replacement

import fsspec
import numpy as np
import numpy.typing as npt
from nxontology import NXOntology


def generate_similarity_matrix(nxo: NXOntology[str]) -> npt.NDArray[np.float32]:
    nxo.freeze()
    nodes = list(nxo.graph.nodes)
    # ensure nodes are sorted, since matrix does not store row/column names
    assert sorted(nodes) == nodes
    similarity_array = np.zeros(shape=(nxo.n_nodes, nxo.n_nodes), dtype=np.float32)
    logging.info(
        f"Initialized {similarity_array.shape} array:\n{similarity_array[:5, :5]}"
    )
    # lin is symmetric, so we use combinations_with_replacement rather than product
    for (row, row_efo), (col, col_efo) in combinations_with_replacement(
        list(enumerate(nodes)), r=2
    ):
        similarity = nxo.similarity(row_efo, col_efo)
        similarity_array[row, col] = similarity.lin
        # mirror the value across the diagonal; only works for symmetric metrics
        similarity_array[col, row] = similarity.lin
    logging.info(f"Populated array with similarity:\n{similarity_array[:5, :5]}")
    return similarity_array


# nxo is assumed to be an already-loaded NXOntology instance
similarity_array = generate_similarity_matrix(nxo)
path = "similarity-lin.npy.xz"
with fsspec.open(path, "wb", compression="infer") as write_file:
    np.save(write_file, similarity_array)
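The function above hard-codes the Lin metric. As a rough sketch of the more general "populate a matrix with a similarity metric" idea, the loop could take any symmetric scoring callable. The name `populate_symmetric_matrix` and its `score` parameter are hypothetical and not part of nxontology:

```python
from itertools import combinations_with_replacement
from typing import Callable, Sequence

import numpy as np
import numpy.typing as npt


def populate_symmetric_matrix(
    nodes: Sequence[str],
    score: Callable[[str, str], float],
) -> npt.NDArray[np.float32]:
    """Fill an n x n matrix from a pairwise score assumed to be symmetric.

    Hypothetical helper: `score` is any callable returning a float for a
    pair of node names. Rows and columns follow the order of `nodes`.
    """
    n = len(nodes)
    matrix = np.zeros(shape=(n, n), dtype=np.float32)
    # Visit each unordered pair once (diagonal included) and mirror the value.
    for (row, a), (col, b) in combinations_with_replacement(enumerate(nodes), r=2):
        value = score(a, b)
        matrix[row, col] = value
        matrix[col, row] = value
    return matrix
```

With an NXOntology, `score` could be something like `lambda a, b: nxo.similarity(a, b).lin`, with `nodes` set to the sorted node list. An asymmetric metric would need the full product of node pairs instead.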
On EFO, saving as an XZ-compressed npy file worked well. scipy.sparse matrices are another option, but depending on the workload they can be slower or faster to work with than a dense array.
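For reading the file back, here is a minimal sketch that uses the standard-library `lzma` module in place of fsspec (a swapped-in choice, to keep the example dependency-light). Because the matrix does not store row/column names, it also builds a name-to-index lookup from the sorted node list. The helper names `save_matrix`, `load_matrix`, and `node_index` are hypothetical:

```python
import lzma
from typing import Dict, Iterable

import numpy as np


def save_matrix(path: str, matrix: np.ndarray) -> None:
    """Write the array in npy format through an XZ-compressing file object."""
    with lzma.open(path, "wb") as write_file:
        np.save(write_file, matrix)


def load_matrix(path: str) -> np.ndarray:
    """Read an XZ-compressed npy file back into memory."""
    with lzma.open(path, "rb") as read_file:
        return np.load(read_file)


def node_index(nodes: Iterable[str]) -> Dict[str, int]:
    """Map node names to row/column positions, assuming sorted node order."""
    return {node: position for position, node in enumerate(sorted(nodes))}
```

A lookup would then read `matrix[index[node_a], index[node_b]]`, provided the matrix was generated from the same sorted node list.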
dhimmel
transferred this issue from related-sciences/nxontology-data
Apr 3, 2023