Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resolve issue #1029 #1170

Open
wants to merge 11 commits into
base: main
Choose a base branch
from
Open

Conversation

gluonhiggs
Copy link

@gluonhiggs gluonhiggs commented Apr 23, 2024

In fact, I do not want to merge my code straightforwardly, because I need some advice!

  1. I added the function minimum_cycle_basis and some tests. It works, but I believe it it need some modifications to be more well programmed, including the choice of having error handler or not.
  2. With the assumption that the Rust core of minimum_cycle_basis is working fine, we need to add a Python layer to wrap around it. And there are some errors in src/connectivity/mod.rs where we use #[pyfunction] annotation. I have no idea to fix this. Perhaps, to fix it, I need to modify even the minimum_cycle_basis.rs core in rustworkx-core/src/connectivity/.
  3. [for fun] By the way, I don't feel like this is a good first issue from 0 knowledge about Rust to 7 months developing this :D
    Link to the issue https://github.com/Qiskit/rustworkx/issues/1029

@CLAassistant
Copy link

CLAassistant commented Apr 23, 2024

CLA assistant check
All committers have signed the CLA.

Copy link
Collaborator

@IvanIsCoding IvanIsCoding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try to review this when I have time. I am sorry we mislabeled #1029, maybe we underestimated the effort to implement the algorithm?

Nevertheless, congrats on completing the work and submitting the PR! It is a big accomplishment

@gluonhiggs
Copy link
Author

gluonhiggs commented Apr 26, 2024

For convenience, I think I need to explain my idea why I perform the subgraph creation and the lifted graph construction as in the minimum_cycle_basis.rs so that you can find it easier to review.
Our ultimate goal is to return a (minimal) cycle basis which contains the information of the input graph nodes (i.e NodeIndex(n) for node n). Because in Rust, if we create a new graph which preserves some structure of another graph, we would have the NodeIndex() reset (this phenomenon doesn't happen in Python, and I can't think of any other way to overcome). Therefore, I have to use the name of the original nodes and pass them across the flow to track the node order. We manipulate the subgraphs, the lifted graphs and get the data, using the names to map afterwards to obtain the NodeIndex.

Copy link
Collaborator

@IvanIsCoding IvanIsCoding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments, but overall I think we will need to rewrite most of the hashmap usage because we should avoid the overhead of creating and hashing intermediate strings.

Also, you should replace A* with Dijkstra

IntoNodeIdentifiers, IntoNodeReferences, NodeIndexable, Visitable,
};
use petgraph::Undirected;
use std::fmt::Debug;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should remove std::fmt::Debug trait from here and the function signatures

Comment on lines 60 to 61
let source_name = format!("{}", graph.to_index(edge.source()));
let target_name = format!("{}", graph.to_index(edge.target()));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above about hashing

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering without the name, how can we know which node is which after putting these into calculations related to the subgraphs and the lifted graph. Imagine NodeIndex(1) in the original graph (user input), which might be NodeIndex(0) in a subgraph, and NodeIndex(2) in the lifted graph. I can only think of mapping these NodeIndex(i) from one graph with respect to another, and apparently we still need have to use HashMap. Does it sound good?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can create a custom struct to store data e.g. a pair of integers. And then implement the hash trait with derive https://doc.rust-lang.org/std/hash/trait.Hash.html#implementing-hash.

Strings contain at least 24 bits even if they are empty. So we should find better representations to store intermediate data in a more compact way. You don’t want string hashing to be the bottleneck of the algorithm

.map(|component| {
let mut subgraph = Graph::<String, i32>::new();
// Create map index to NodeIndex of the nodes in each component
let mut name_idx_map: HashMap<String, usize> = HashMap::new();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not necessary, you should be able to use the NodeId directly because it is hashable. Or use the usize as a hash

for &node_id in &component {
let node_index = graph.to_index(node_id);
// Create the name of the node in subgraph from the original graph
let node_name = format!("{}", node_index).trim_matches('"').to_string();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above about hashing

let mst = min_spanning_tree(&graph);
let mut mst_edges: Vec<(usize, usize)> = Vec::new();
for element in mst {
// println!("Element: {:?}", element);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the debug statements

graph: G,
orth: HashSet<(usize, usize)>,
mut weight_fn: F,
name_idx_map: &HashMap<String, usize>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need to propagate the hashing changes to _min_cycle as well

fn _min_cycle_basis<G, F, E>(
graph: G,
weight_fn: F,
name_idx_map: &HashMap<String, usize>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need to propagate the hashing changes to _min_cycle as well

for node in path {
let node_name = gi.node_weight(node).unwrap();
if node_name.contains("_lifted") {
let original_node_name = node_name.replace("_lifted", "");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you need to add additional information to the hash you can also hash tuples in Rust. Or you can come up with a custom struct and write a hasher for it

let nodeidx = NodeIndex::new(node);
let lifted_node = gi_name_to_node_index[&lifted_node_name];
let lifted_nodeidx = NodeIndex::new(lifted_node);
let result = astar(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably just be a call for dijkstra given the A* heuristic is a constant

@1ucian0 1ucian0 linked an issue May 14, 2024 that may be closed by this pull request
@gluonhiggs
Copy link
Author

I have added some modifications based on your advice. However, I don't think my approach using the new trait EdgeWeightToNumber is a good idea. I will try to generalize the edge weight type instead of forcefully using i32.

Comment on lines 921 to 971
#[pyfunction]
#[pyo3(text_signature = "(graph, /)")]
pub fn minimum_cycle_basis(py: Python, graph: &PyGraph) -> PyResult<Vec<Vec<usize>>> {
py.allow_threads(|| {
let result = connectivity::minimum_cycle_basis(&graph.graph);
match result {
Ok(min_cycle_basis) => {
// Convert Rust Vec<Vec<NodeIndex>> to Python Vec<Vec<usize>>
let py_min_cycle_basis = min_cycle_basis
.iter()
.map(|cycle| {
cycle
.iter()
.map(|&node_index| graph.graph.to_index(node_index))
.collect::<Vec<usize>>()
})
.collect::<Vec<Vec<usize>>>();
Ok(py_min_cycle_basis)
}
Err(e) => {
// Handle errors by converting them into Python exceptions
Err(PyErr::new::<pyo3::exceptions::PyRuntimeError, _>(format!(
"An error occurred: {:?}",
e
)))
}
}
})
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My piece of advice is the rustworkx-core function looks good but we need to do some work on the Python bindings. I think we can just remove the Python bindings and add it later on

@IvanIsCoding
Copy link
Collaborator

I think the code is looking better now, the compiler is complaining about some unused variables + trait missing for Python but overall this was a big improvement! Thanks

@gluonhiggs
Copy link
Author

gluonhiggs commented May 28, 2024

I have modified the functions in minimum_cycle_basis.rs so that they are able to take weight_fn that returns Result<K,E>. By this, I guess we can deal with more general weights (e.g not only i32). The thing is I also want to and the python wrapper for this core, but I don't quite understand how precisely to implement this. It would be wonderful if you provide some keywords or document that I can search to read and learn, I think I can also do this (it will take a while). Thank you!

@IvanIsCoding
Copy link
Collaborator

I have modified the functions in minimum_cycle_basis.rs so that they are able to take weight_fn that returns Result<K,E>. By this, I guess we can deal with more general weights (e.g not only i32). The thing is I also want to and the python wrapper for this core, but I don't quite understand how precisely to implement this. It would be wonderful if you provide some keywords or document that I can search to read and learn, I think I can also do this (it will take a while). Thank you!

If you know that you need to access the weights multiple times I recommend following this pattern:

let edge_cost = |e: EdgeIndex| -> PyResult<f64> {
.

If you only need to call this once you can wrap this pattern

let edge_weight = weight_callable(py, &weight_fn, &weight, default_weight)?;
into a function that you pass to the Rust code.

@gluonhiggs
Copy link
Author

I've added what I think necessary. Thank you a lot!

@coveralls
Copy link

coveralls commented Jun 20, 2024

Pull Request Test Coverage Report for Build 9597606406

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 290 (0.0%) changed or added relevant lines in 1 file are covered.
  • 16 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-1.5%) to 94.354%

Changes Missing Coverage Covered Lines Changed/Added Lines %
rustworkx-core/src/connectivity/minimal_cycle_basis.rs 0 290 0.0%
Files with Coverage Reduction New Missed Lines %
src/connectivity/mod.rs 16 96.75%
Totals Coverage Status
Change from base Build 9473947759: -1.5%
Covered Lines: 17413
Relevant Lines: 18455

💛 - Coveralls

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

minimal_cycle_basis
5 participants