Issue Translating Trained MAGIK Model to Real Data #207
Comments
I have checked your example carefully and have some recommendations.

Why do you get no trajectories? This issue is related to the thresholding parameter (

Why has the model not been trained properly? The reason is related to both your dataset and MAGIK's data generation pipeline. Your dataset consists mostly of confined trajectories, which is completely fine! However,

How do we solve the problem? At the moment, I see two possible solutions. First, try to simulate multiple videos instead of a single one. You can set the ID of each video in

Additionally,

In this case, I recommend modifying the default augmentation of the graph extractor as follows:

import random
import numpy as np
import deeptrack as dt
def GetFeature(full_graph, **kwargs):
    return (
        dt.Value(full_graph)
        # Pick one set (video) index at random.
        >> dt.Lambda(
            GetSubSet,
            randset=lambda: np.random.randint(
                np.max(full_graph[-1][0][:, 0]) + 1),
        )
        # Randomly rotate, translate, and flip the centroids.
        >> dt.Lambda(
            AugmentCentroids,
            rotate=lambda: np.random.rand() * 2 * np.pi,
            translate=lambda: np.random.randn(2) * 0.05,
            flip_x=lambda: np.random.randint(2),
            flip_y=lambda: np.random.randint(2),
        )
        >> dt.Lambda(NoisyNode)
        >> dt.Lambda(NodeDropout, dropout_rate=0.03)
    )

generator = GraphGenerator(
    nodesdf=traindf,
    properties=["centroid"],
    min_data_size=51,
    max_data_size=52,
    batch_size=4,
    feature_function=GetFeature,
    **variables.properties()
)

Now the

The second option is to increase the number of trajectories used in creating the training subgraphs without modifying the existing dataset.

import random
def GetFeature(full_graph, **kwargs):
    return (
        dt.Value(full_graph)
        # Sample trajectory labels at random; the count is drawn from [min, max).
        >> dt.Lambda(
            GetSubGraphFromLabel,
            samples=lambda: np.array(
                sorted(
                    random.sample(
                        list(full_graph[-1][0][:, -1]),
                        np.random.randint(200, 300),  # Modify this: (min, max)
                    )
                )
            ),
        )
        # Randomly rotate, translate, and flip the centroids.
        >> dt.Lambda(
            AugmentCentroids,
            rotate=lambda: np.random.rand() * 2 * np.pi,
            translate=lambda: np.random.randn(2) * 0.05,
            flip_x=lambda: np.random.randint(2),
            flip_y=lambda: np.random.randint(2),
        )
        >> dt.Lambda(NoisyNode)
        >> dt.Lambda(NodeDropout, dropout_rate=0.03)
    )

generator = GraphGenerator(
    nodesdf=traindf,
    properties=["centroid"],
    min_data_size=51,
    max_data_size=52,
    batch_size=4,
    feature_function=GetFeature,
    **variables.properties()
)

Both solutions will require more computational resources. Therefore I recommend using the

model = dt.models.gnns.MAGIK(
    dense_layer_dimensions=(64, 96),  # Number of features in each dense encoder layer
    base_layer_dimensions=(96,) * 4,  # Latent dimension throughout the message passing layers
    number_of_node_features=2,  # Number of node features in the graphs
    number_of_edge_features=1,  # Number of edge features in the graphs
    number_of_edge_outputs=1,  # Number of predicted features
    edge_output_activation="sigmoid",  # Activation function for the output layer
    output_type=_OUTPUT_TYPE,  # Output type. Either "edges", "nodes", or "graph"
    graph_block="MPN",
)
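
To illustrate why the threshold applied to the predicted edge scores can leave you with no trajectories at all, here is a minimal sketch in plain Python. It is not DeepTrack's linking code: `link_edges`, the example scores, and the union-find grouping are assumptions made only for illustration.

```python
from collections import defaultdict


def link_edges(edges, threshold):
    """Group detections into trajectories from scored candidate edges.

    edges: iterable of (node_a, node_b, score), with score in [0, 1]
    (for example a sigmoid output). Hypothetical helper, not a DeepTrack API.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Keep only edges whose score reaches the threshold and merge their ends.
    for a, b, score in edges:
        if score >= threshold:
            parent[find(a)] = find(b)

    # Collect the connected detections; unlinked detections never appear.
    groups = defaultdict(list)
    for node in list(parent):
        groups[find(node)].append(node)
    return sorted(sorted(group) for group in groups.values())


# Made-up scores: one confident track and one weakly scored track.
edges = [(0, 1, 0.92), (1, 2, 0.88), (3, 4, 0.35), (4, 5, 0.30)]

print(link_edges(edges, 0.25))  # [[0, 1, 2], [3, 4, 5]]
print(link_edges(edges, 0.50))  # [[0, 1, 2]]
print(link_edges(edges, 0.95))  # []
```

If a poorly trained model outputs uniformly low scores, any reasonable threshold rejects every edge, so inspecting the score distribution before linking is a quick sanity check.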
Thank you for your suggestions! When applying this model to my data, however, the trajectories are reconstructed quite slowly due to the number of localizations (100k to 1000k). I was able to parallelize some aspects of the to_trajectories function and run slices of the data (256x256 pixels x 1000 frames) through in batches, but some samples still take hours to process. Are there plans to speed this functionality up? Do you have any recommendations for how I could speed up this function?
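
For readers who want to try the same slicing, here is a rough sketch of the bucketing step in plain Python. The function name `tile_localizations`, the `(frame, x, y)` tuple layout, and the default sizes are assumptions taken from the description above, not part of DeepTrack.

```python
from collections import defaultdict


def tile_localizations(localizations, tile_size=256, frames_per_chunk=1000):
    """Bucket (frame, x, y) localizations into independent space-time tiles.

    Hypothetical helper: returns a dict mapping
    (chunk_index, tile_x, tile_y) -> list of localizations in that tile.
    """
    tiles = defaultdict(list)
    for frame, x, y in localizations:
        key = (
            int(frame) // frames_per_chunk,  # temporal chunk
            int(x // tile_size),             # tile column
            int(y // tile_size),             # tile row
        )
        tiles[key].append((frame, x, y))
    return dict(tiles)


# Made-up localizations: two in the same tile, one in a neighbouring tile,
# and one in a later temporal chunk.
locs = [(0, 10.0, 20.0), (1, 12.0, 21.0), (0, 300.0, 20.0), (1500, 10.0, 20.0)]
tiles = tile_localizations(locs)

for key, points in sorted(tiles.items()):
    print(key, len(points))
```

Each tile can then be linked independently, for example by separate worker processes. Note that a hard grid like this splits any trajectory that crosses a tile or chunk boundary, so overlapping margins and a stitching step may be needed.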
Hello,
I have been trying to link detections using a MAGIK model trained on simulations, and I'm having issues getting any reasonable trajectories from real data. I have attached my train.py and test.py scripts for generating simulated trajectories with varying density and applying to a dataset from real data. I am having a lot of difficulty understanding if there is an issue with the framework or if I am perhaps overtraining on my simulations, as I am getting no trajectories from the edge dataframe.
Here is an example of my training simulations from traindf.csv:
Here is some of the data from dfIndexToEvaluate.csv:
I am training and evaluating with (I have to upload as a txt file):
train.py.txt
test.py.txt
dfIndexedToEvaluate.csv
traindf.csv
If the results are what I think they are, there should be many tracks rather than none. Am I interpreting the results of dt.models.gnns.get_traj correctly?
Thank you!