[FEA]: Implementing cuGraph.node2vec on Kaggle or other datasets #4094
Comments
Hi @ShivanjaliR. Thank you for raising this; I am looking into it. cuGraph should be able to handle any dataset, including those from Kaggle. Is the
https://snap.stanford.edu/data/soc-sign-bitcoin-alpha.html It's a publicly available dataset. I have attached the CSV file as well.
Thanks. Looking into it now.
I was able to reproduce the error and I figured out what is wrong. The error you are getting is
The first option ensures that The second option ensures that Please let me know if this solves your issue or if you have any further questions. I will also push a PR ensuring that the
@jnke2016 - I suggest making the change in both the Python and C APIs, since this could affect users coming in at any layer. C++ is already enforced by the template parameters.
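A minimal sketch of what a Python-layer guard could look like. It assumes the failure comes from a dtype mismatch between start_vertices and the graph's vertex IDs. The reproducer does build start_vertices as int32, but the elided parts of this thread do not confirm that this is the root cause. The helper names align_start_vertices and check_vertex_dtype are hypothetical and are not part of cuGraph.

```python
def align_start_vertices(start_vertices, vertex_dtype):
    """Return start_vertices cast to vertex_dtype when the dtypes differ.

    Accepts any array-like that exposes .dtype and .astype(), for example
    cudf.Series, pandas.Series, or numpy.ndarray.
    """
    if start_vertices.dtype != vertex_dtype:
        return start_vertices.astype(vertex_dtype)
    return start_vertices


def check_vertex_dtype(start_vertices, vertex_dtype):
    """Raise a descriptive error rather than passing a mismatch to lower layers."""
    if start_vertices.dtype != vertex_dtype:
        raise TypeError(
            f"start_vertices has dtype {start_vertices.dtype}, "
            f"but the graph's vertex IDs have dtype {vertex_dtype}"
        )


# Hypothetical usage with the reproducer's names (requires a GPU, so shown as comments).
# Assumption: vertex_dtype is the dtype of the source column given to from_cudf_edgelist.
#   vertex_dtype = edges_df["source"].dtype
#   start_vertices = align_start_vertices(start_vertices, vertex_dtype)
#   paths, weights, path_sizes = cugraph.node2vec(G, start_vertices, 10, True, p, q)
```

Casting on the caller's side is a workaround; a check such as check_vertex_dtype inside the library would instead surface a readable message in place of CUGRAPH_UNKNOWN_ERROR.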
@jnke2016 do you think this should be a bug or a feature request? I'm leaning toward classifying this as a bug since it failed to provide an accurate error message.
I think my previous request can be considered a bug, but I am now facing new issues that need new feature requests:
Waiting for reply.
@ShivanjaliR - please make this a separate issue within git. This will allow us to resolve the bug when the linked PR is resolved. The new feature request will need to be tracked separately.
I raised the new issue as follows: Please look into it as soon as possible.
Thanks!
Any updates on this issue? Any progress?
Is this a new feature, an improvement, or a change to existing functionality?
New Feature
How would you describe the priority of this feature request
Critical (currently preventing usage)
Please provide a clear description of problem this feature solves
I created a directed graph using cuGraph from a dataset that is not one of the datasets bundled with cuGraph. When I try to run node2vec on this directed graph, I get the error below.
If I use the bundled datasets (from cugraph.datasets import karate, karate_asymmetric), node2vec works as expected.
Traceback (most recent call last):
  File "/app/web_reading.py", line 239, in
    paths, weights, path_sizes = cugraph.node2vec(original_bitcoin_graph, start_vertices, 10, True, p, q)
  File "/opt/conda/lib/python3.10/site-packages/cugraph/sampling/node2vec.py", line 123, in node2vec
    vertex_set, edge_set, sizes = pylibcugraph_node2vec(
  File "node2vec.pyx", line 160, in pylibcugraph.node2vec.node2vec
  File "utils.pyx", line 53, in pylibcugraph.utils.assert_success
RuntimeError: non-success value returned from cugraph_node2vec: CUGRAPH_UNKNOWN_ERROR CUDA error encountered at: file=/opt/conda/include/raft/util/cudart_utils.hpp line=148:
['/app', '/opt/conda/lib/python310.zip', '/opt/conda/lib/python3.10', '/opt/conda/lib/python3.10/lib-dynload', '/opt/conda/lib/python3.10/site-packages']
import cudf
import cugraph
import numpy as np

def createDirectedWebGraph(sources, targets):
    # Build a directed cuGraph graph from parallel arrays of edge endpoints.
    G = cugraph.Graph(directed=True)
    edges_df = cudf.DataFrame({'source': sources, 'target': targets})
    G.from_cudf_edgelist(edges_df, source='source', destination='target', renumber=True)
    return G

# bitcoin_inputfile, p, and q are defined elsewhere in the reporter's script.
node_content = cudf.read_csv(bitcoin_inputfile)
numpy_array = node_content.to_pandas().to_numpy()
source = numpy_array[:, 0]
target = numpy_array[:, 1]
original_bitcoin_graph = createDirectedWebGraph(source, target)
# Start vertices are explicitly created as int32 here.
start_vertices = cudf.Series(original_bitcoin_graph.nodes(), dtype=np.int32)
print(original_bitcoin_graph.nodes())
paths, weights, path_sizes = cugraph.node2vec(original_bitcoin_graph, start_vertices, 10, True, p, q)
print(paths)
print(weights)
print(path_sizes)
Can node2vec be used on a cuGraph Graph that is created from another dataset?
Describe your ideal solution
We should be able to run node2vec on other Kaggle datasets. Please let me know if I am doing something wrong in the code above.
Describe any alternatives you have considered
No response
Additional context
No response