Vertices in unconnected_vertex_pairs does not exists #1
Comments
Hey mahmoodtareq. Another participant here. It appears that there is a significant number of nodes with degree 0. Did you reach the same conclusion?
Hi @mahmoodtareq and @franciscoandrades, I use this code:
And get the output:
Let me know if this helps. Happy to explain more.
Dear @mahmoodtareq and @franciscoandrades, it doesn't change the competition, but makes it more interesting, because now you can exploit this fact in building your solutions. We will post a more detailed analysis of that fact on the GitHub main page in a few days. Thanks a lot for investigating the data so thoroughly - you are certainly on the right track! :)
@franciscoandrades Yeah, same conclusion. I got confused because the tutorial notebook mentions that "unconnected_vertex_pairs: This is a list of vertex pairs v1,v2 with deg(v1)>=10, deg(v2)>=10". I thought I had made a mistake reading the data. Thanks to @MarioKrenn6240 for mentioning "This fact results in situations where you are asked to predict edges with vertices of degree=0". So the issue was in the data after all.
Hi @mahmoodtareq and @franciscoandrades. First up, thanks for your interest in our competition and for your astute observations. As @MarioKrenn6240 said, this does not change the competition, but you might want to change the training data set we suggested using (TrainSet2014_3). In it, 27% of the vertices that appear in the unconnected vertex pairs data set are never connected to the rest of the graph. This is suboptimal, as one has no information on these vertices (you could permute them ... ). Removing these vertices from the training set might lead to better model performance. Note that in our competition data set (CompetitionSet2017_3) you only find a small percentage of such vertices.
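For readers who want to try the reduction suggested above, here is a minimal sketch of one way it could be done. It is not the organizers' code: the function names and the toy data are made up for illustration, and it assumes that each row's first two entries are the vertex ids (rows may be tuples or NumPy array rows).

```python
from collections import Counter

def vertex_degrees(edges):
    """Count how often each vertex appears as an endpoint in the edge list."""
    deg = Counter()
    for row in edges:
        deg[int(row[0])] += 1
        deg[int(row[1])] += 1
    return deg

def drop_isolated_pairs(pairs, edges):
    """Keep only pairs whose two vertices both occur in the edge list."""
    deg = vertex_degrees(edges)
    kept = []
    for row in pairs:
        a, b = int(row[0]), int(row[1])
        # A Counter returns 0 for vertices that never appear in an edge.
        if deg[a] > 0 and deg[b] > 0:
            kept.append((a, b))
    return kept

# Toy stand-ins (made-up values), not the competition data.
toy_edges = [(0, 1), (1, 2)]
toy_pairs = [(0, 2), (1, 7), (8, 9)]
filtered_pairs = drop_isolated_pairs(toy_pairs, toy_edges)
print(filtered_pairs)
```

In this toy case only the pair (0, 2) survives, because vertices 7, 8 and 9 never appear in any edge. Whether dropping such pairs actually helps a given model is something to check empirically.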
Hi @mkk20 - some confusion here I hope you can resolve. In the competition prediction set, 18.08% of vertices are 0-degree in 2017, which is not a small percentage (see code). Should we read your comments as:
As you can imagine, this makes quite the difference in the inductive biases one would like to inject. Thanks!
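The code behind the percentage quoted above is not shown in this thread. As a rough sketch of how such a share might be computed, the snippet below takes the vertices that appear in the pairs to predict and measures what fraction of them never occur in the edge list. The choice of denominator, the function name and the toy data are all assumptions made for illustration.

```python
def zero_degree_share(pairs, edges):
    """Fraction of vertices in `pairs` that never appear in `edges`."""
    connected = set()
    for row in edges:
        connected.add(int(row[0]))
        connected.add(int(row[1]))

    pair_vertices = set()
    for row in pairs:
        pair_vertices.add(int(row[0]))
        pair_vertices.add(int(row[1]))

    if not pair_vertices:
        return 0.0
    isolated = pair_vertices - connected
    return len(isolated) / len(pair_vertices)

# Toy stand-ins (made-up values), not the competition data.
toy_edges = [(0, 1), (1, 2)]
toy_pairs = [(0, 2), (1, 7), (8, 0)]
share = zero_degree_share(toy_pairs, toy_edges)
print(f"{share:.2%}")
```

A different denominator (for example, all vertices in the graph rather than only those in the pairs) would give a different percentage, so it is worth stating which one is used when comparing numbers.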
Dear @trdavidson, thanks for your question. The 0-degree vertices will be used (i.e. your 2nd comment is the true one). You have information about edges formed with the zero-degree vertices that can be used (these edges are formed with other vertices, from which you can extract information). You could consider this subtask as a first implicit attempt to perform predictions of new vertices. Please let me know if you have additional questions on this, I am happy to explain more. Thanks!
From what I understood, the vertices mentioned in unconnected_vertex_pairs should also be present among the vertices mentioned in full_dynamic_graph_sparse. But after running the following snippet,
u_v = set(unconnected_vertex_pairs[:, 0]) | set(unconnected_vertex_pairs[:, 1])
c_v = set(full_dynamic_graph_sparse[:, 0]) | set(full_dynamic_graph_sparse[:, 1])
the output of len(u_v - c_v) is 13152. That means that many vertices in the unconnected list do not appear in the full dynamic graph. I may have understood it wrong, so please clarify.
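For anyone who wants to reproduce the check without the competition files, here is a small self-contained sketch of the same set-difference idea on made-up data. The helper name vertex_set and the toy values are illustrative only, and the third column of the toy edge list is an assumed extra field (such as a timestamp) that the check ignores.

```python
def vertex_set(rows):
    """Collect every vertex id found in the first two entries of each row."""
    vertices = set()
    for row in rows:
        vertices.add(int(row[0]))
        vertices.add(int(row[1]))
    return vertices

# Toy stand-ins for the competition arrays (made-up values).
full_dynamic_graph_sparse = [(0, 1, 100), (1, 2, 105), (2, 3, 110)]
unconnected_vertex_pairs = [(0, 3), (1, 4), (5, 2)]

u_v = vertex_set(unconnected_vertex_pairs)
c_v = vertex_set(full_dynamic_graph_sparse)

# Vertices we are asked about that never occur in any edge.
missing = u_v - c_v
print(len(missing), sorted(missing))
```

Here vertices 4 and 5 appear in the pairs but in no edge, so they end up in the difference. The same helper should also work on 2-D NumPy arrays, since iterating over such an array yields its rows.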