Don't see infrequent edges. Is it possible? #30
Comments
Can anyone answer my question? Thanks so much for your attention.
Thanks for your patience.
What we could do is implement a filter that removes cases with the aim of making the process map less complex. I'll look into that.
Thank you so much! You are really kind.
I have added a new function to edeaR called filter_infrequent_flows. (You can install it from GitHub.) This will NOT remove individual flows (i.e. arcs) from the process map. However, it will remove cases that contain a flow which is infrequent. For example:

    sepsis %>% filter_infrequent_flows(min_n = 50) %>% process_map()

This will consider all flows in the sepsis data with fewer than 50 occurrences. All cases containing these flows will be removed.

The difference between filter_infrequent_flows and filter_trace_frequency is subtle. The latter looks at the end-to-end frequency of a sequence, while the former looks at the frequency of each step. The reasoning is that you can have infrequent traces that share a lot of flows and thus do not lead to infrequent arcs in the process map. Removing these infrequent traces will not improve the map. The infrequent flows filter instead starts from the flows that are infrequent, so it has a direct effect on the number of arcs drawn in the process map. But, importantly, it always filters cases completely, i.e. all of their events, not just individual arcs (which have no direct equivalent in the data, but are the result of the events).

The current function expects an absolute frequency (min_n), which should be 2 or higher (2 meaning: remove all cases in which one or more flows occur only once). Of course, other options are possible, such as a percentage of cases to keep.

It might not be exactly what you are looking for, but I hope this helps. Any feedback is welcome. (Documentation for the new function still needs to be written, but I hope my description here is clear.)
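To make the description above concrete, here is a minimal base-R sketch of the same idea on a made-up four-case log. It is not the edeaR implementation: the `log` data frame and all variable names are assumptions for illustration, and only `min_n` mirrors the argument described above.

```r
# Toy event log: one row per event, already ordered within each case.
# Hypothetical data, for illustration only.
log <- data.frame(
  case     = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4),
  activity = c("A", "B", "C",
               "A", "B", "C",
               "A", "B", "C",
               "A", "C", "B"),
  stringsAsFactors = FALSE
)

# Build the directly-follows flows (from -> to) within each case.
flows_per_case <- lapply(split(log, log$case), function(events) {
  n <- nrow(events)
  data.frame(
    case = events$case[-n],
    flow = paste(events$activity[-n], events$activity[-1], sep = " -> "),
    stringsAsFactors = FALSE
  )
})
flows <- do.call(rbind, flows_per_case)

# Count how often each flow occurs over the whole log.
flow_counts <- table(flows$flow)

# Flows that occur fewer than min_n times are treated as infrequent.
min_n <- 2
infrequent <- names(flow_counts)[as.vector(flow_counts) < min_n]

# Drop every case that contains at least one infrequent flow,
# together with all of its events.
cases_to_drop <- unique(flows$case[flows$flow %in% infrequent])
filtered_log <- log[!log$case %in% cases_to_drop, ]
```

In this toy log, case 4 is dropped because its flows A -> C and C -> B each occur only once, while cases 1 to 3 are kept with all their events.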
Just an idea, if you really want to just hide the edges: you could take a look at the DiagrammeR object returned (so don't render the process map) and then filter edges based on the label. It is a bit of a hack, and Gert's warning that the result may be very hard to interpret applies, but it could give you exactly what you want. Treat it as a visualisation aid rather than as 'model discovery'.
Thank you so much for your help! Thanks again!
@ffalcolini Here is an example of @fmannhardt's approach for removing edges from the DiagrammeR object:

library(tidyverse)
library(bupaR)
library(processmapR)
library(eventdataR)
library(DiagrammeR)

# Build the process map without rendering it, so the DiagrammeR graph is returned.
hospital_billing_process_map <- eventdataR::hospital_billing %>%
  process_map(
    type_nodes = frequency("relative_case"),
    type_edges = frequency("relative_case"),
    sec_edges = performance(median, "secs", flow_time = "inter_start_time"),
    render = FALSE,
    rankdir = "TB"
  )

# Render the unfiltered map for comparison.
DiagrammeR::render_graph(hospital_billing_process_map)

# Flows that occur in fewer than 5% of the cases.
edges_to_remove <- hospital_billing_process_map %>%
  processmapR::get_flows() %>%
  dplyr::filter(value < 0.05) %>%
  transmute(from = from_id, to = to_id)

# Look up the DiagrammeR edge ids of those flows.
edges_ids_to_remove <- hospital_billing_process_map %>%
  DiagrammeR::get_edge_df() %>%
  as_tibble() %>%
  semi_join(edges_to_remove, by = c("from", "to")) %>%
  pull(id)

# Delete the selected edges, then any nodes left without edges.
filtered_graph <- hospital_billing_process_map %>%
  DiagrammeR::select_edges_by_edge_id(edges = edges_ids_to_remove) %>%
  DiagrammeR::delete_edges_ws() %>%
  DiagrammeR::select_nodes_by_degree("deg == 0") %>%
  DiagrammeR::delete_nodes_ws()

DiagrammeR::render_graph(filtered_graph)
Very cool. Thanks for the contribution. Maybe we can add it as a vignette. @gertjanssenswillen, I actually just watched Sander Leemans's presentation on the Directly-follows Miner at ICPM 2020. Is the added filtering method actually implementing something similar to his proposal or could we implement it for
I'll check this out!
@vpanfilov thanks a lot for your help! I saw the presentation about the directly-follows miner and tried to use the ProM tool based on it.
Hello! Following up on the above discussion: if we assume that the directly-follows graph approach is acceptable, I found that we could explore methods from network filtering based on directed edges. I have played a little with the disparity filter (package skynet), and also with the following method for filtering 'insignificant' edges: Tumminello M, Miccichè S, Lillo F, Piilo J, Mantegna RN (2011) Statistically Validated Networks in Bipartite Complex Systems.
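As a rough illustration of the disparity filter idea mentioned above, here is a minimal base-R sketch applied to the outgoing edges of a made-up directly-follows graph. It does not use the skynet package; the `edges` data frame, the 0.05 level, and the choice to always keep the only outgoing edge of a node are assumptions for illustration.

```r
# Hypothetical directly-follows edge list with absolute frequencies.
edges <- data.frame(
  from   = c("A", "A", "A", "B", "B", "C"),
  to     = c("B", "C", "D", "C", "D", "D"),
  weight = c(90, 5, 5, 50, 50, 10),
  stringsAsFactors = FALSE
)

# Total outgoing weight and number of outgoing edges per source node.
strength <- tapply(edges$weight, edges$from, sum)
degree   <- tapply(edges$weight, edges$from, length)

# Share of the source node's outgoing weight carried by each edge.
p <- as.vector(edges$weight / strength[edges$from])
k <- as.vector(degree[edges$from])

# Disparity score: probability of seeing a share at least this large
# if the node's weight were split uniformly at random over its k edges.
edges$alpha <- (1 - p)^(k - 1)

# Keep edges that are significant at the chosen level. Nodes with a single
# outgoing edge always score 1, so that edge is kept here by assumption.
level <- 0.05
backbone <- edges[edges$alpha < level | k == 1, ]
```

With these numbers only A -> B (which carries 90% of A's outgoing weight) and C -> D (C's only outgoing edge) remain. A lower level prunes more aggressively.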
Hi!
I'd like to simplify the process map.
I can reduce the number of activities displayed: to do this I use the edeaR::filter_activity_frequency function.
I'd also like to reduce the number of edges displayed. I tried using the "layout" argument of the process_map function, specifically the "edge_cutoff" parameter, but unfortunately it makes no difference, as if the parameter had no effect.
Is it possible to use the "edge_cutoff" parameter to display only the edges that have a frequency higher than a certain limit?
Can you give me an example of use?
Thanks so much!
You did a great job!