Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fastSmoothEMD code review suggestions #129

Closed
wants to merge 199 commits into from
Closed

Conversation

ots22
Copy link
Member

@ots22 ots22 commented Apr 17, 2020

No description provided.

martintoreilly and others added 30 commits May 2, 2019 18:37
Moving towards:
* A netdis function that takes two input graphs and a reference graph.
* Sub-functions that follow through paper steps, sequentially applied with output of one providing input for next.
Code to filter ego networks with fewer than threshold nodes/edges was duplicated many places. Added that functionality to make_named_ego_graph, where ego networks originally generated.

Also, remove unnecessary duplication of simplifying in read_simple_graphs function, as it's already performed in read_simple_graph (which read_simple_graphs calls).
Default behaviour of ego network generating functions is now min_ego_nodes <- 3 and min_ego_edges <- 1. Set these to zero in function calls so orca tests now passing.

Should implement tests to check that filtering of small ego networks is working as expected (i.e. when min_ego_edges and min_ego_nodes > 0)
ego network generating functions now take min_ego_edges and min_ego_nodes arguments to filter out small ego networks, which default to 1 and 3 respectively. Set to zero for tests.

Filtering is probably happening explicitly elsewhere in the tests - could be removed with min_ego_edges and min_ego_nodes given non-zero values as appropriate.
Vignette to calculate netdis for 2 query graphs and a reference network:
* Follows steps of paper.
* Output of previous step used as input to next step (as much as possible).
* Calculates netdis for all graphlet sizes up to max_graphlet_size
* No explicit calls to purrr - all hidden in functions for each step.
* Needs a lot of tidying of function names, docs, removal of potentially obsolete functions etc.

Questions:

* What should be in top level netdis function - current netdis function takes centred graphlet counts (i.e. output of all but last step). Should that be parent function taking two query graphs, ref graph, and parameters?

* How to implement comparison of many graphs? This may differ depending on choice above - e.g. when comparing many graphs ideally don't want to be re-calculating ego nets and graphlet counts each time.
but bugs -
netdis_expected_graphlet_counts applies scaling so shouldn't be used for GP approx or constant count subtraction as written
jack89roberts and others added 28 commits June 6, 2022 16:36
pending decision re #138
…te/network-comparison into fastSmoothEMD-review
@ots22
Copy link
Member Author

ots22 commented Jun 10, 2022

superceded by #149

@ots22 ots22 closed this Jun 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants