
Can we use Trackpy’s DataFrame format as input for TrackAstra? #17

Open
po60nani opened this issue Oct 29, 2024 · 10 comments
@po60nani commented Oct 29, 2024

Hi, thanks for developing this great package! I’m currently working with Trackpy for particle tracking, and it outputs trajectory data as a Pandas DataFrame with columns like frame, particle, x, and y. I would like to load this directly into TrackAstra for further analysis but am unsure if it’s compatible.

Example:

Here’s a small sample of the DataFrame output from Trackpy. I expected particle_id to be estimated by TrackAstra in test mode.

frame  particle_id     x     y
    0            1  15.1  35.2
    0            2  20.3  40.5
    1            1  15.4  35.6
    1            2  20.8  41.0
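For reference, a DataFrame with this layout can be built directly in pandas (a minimal sketch; note that Trackpy itself names the track column `particle`, so `particle_id` here is a rename for illustration):

```python
import pandas as pd

# Sample matching the table above; Trackpy calls the track column
# "particle", renamed here to "particle_id" for illustration.
df = pd.DataFrame(
    {
        "frame": [0, 0, 1, 1],
        "particle_id": [1, 2, 1, 2],
        "x": [15.1, 20.3, 15.4, 20.8],
        "y": [35.2, 40.5, 35.6, 41.0],
    }
)
print(df)
```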

However, I'm unsure if this is the right approach, and I’d appreciate any guidance on making Trackpy's format compatible with TrackAstra.

Thanks in advance for your help!

@maweigert (Contributor) commented Oct 29, 2024

This would be easy to do, I suppose.

E.g. try the following:

import networkx as nx
import numpy as np
import pandas as pd
from tqdm import tqdm
from trackastra.tracking.utils import ctc_tracklets
from skimage.measure import regionprops

def graph_to_trackpy(
    graph: nx.DiGraph,
    masks_original: np.ndarray,
    frame_attribute: str = "time",
) -> tuple[pd.DataFrame, np.ndarray]:
    """Convert a track graph to a pandas DataFrame with label, time, and position information.

    Args:
        graph: directed graph with node attributes `frame_attribute`, "label" and "coords"
        masks_original: list of masks with unique labels
        frame_attribute: name of the frame attribute in the graph nodes

    Returns:
        pd.DataFrame: track dataframe with columns ['frame', 'particle_id', 'parent', 'x', 'y']
        np.ndarray: masks with a unique label for each track
    """
    # each tracklet is a linear chain in the graph
    tracklets = ctc_tracklets(graph, frame_attribute=frame_attribute)

    regions = tuple(
        dict((reg.label, reg.slice) for reg in regionprops(m))
        for t, m in enumerate(masks_original)
    )

    masks = np.stack([np.zeros_like(m) for m in masks_original])
    rows = []
    # To map parent references to tracklet ids. -1 means no parent, which is mapped to 0 in CTC format.
    node_to_tracklets = dict({-1: 0})

    # Sort tracklets by parent id
    for i, _tracklet in tqdm(
        enumerate(sorted(tracklets)),
        total=len(tracklets),
        desc="Converting graph to CTC results",
    ):
        _parent = _tracklet.parent
        _nodes = _tracklet.nodes
        label = i + 1

        end = _nodes[-1]
        node_to_tracklets[end] = label

        # relabel masks
        for _n in _nodes:
            node = graph.nodes[_n]
            t = node[frame_attribute]
            lab = node["label"]
            ss = regions[t][lab]
            m = masks_original[t][ss] == lab
            # check for empty masks first, since .max() on an empty selection raises
            if np.count_nonzero(m) == 0:
                raise RuntimeError(f"Empty mask at t={t}, label={lab}")
            if masks[t][ss][m].max() > 0:
                raise RuntimeError(f"Overlapping masks at t={t}, label={lab}")
            masks[t][ss][m] = label

            d = dict(frame=t, particle_id=label, parent=node_to_tracklets[_parent])
            coords = node["coords"]
            if len(coords) == 2:
                d.update({"x": coords[1], "y": coords[0]})
            elif len(coords) == 3:
                d.update({"x": coords[2], "y": coords[1], "z": coords[0]})
            else:
                raise ValueError("Coordinates should be 2D or 3D")
            rows.append(d)

    df = pd.DataFrame(rows)

    return df, masks

and then

track_graph = model.track(imgs, masks...)
df, masks = graph_to_trackpy(track_graph, masks)
print(df)

That should give you something along these lines.

PS: Maybe it's actually useful to just include this helper function in the package.

@po60nani (Author)
Thank you for the quick response! I'm looking for a function that generates a mask from this data frame. Since my PSFs are almost fixed in size, the mask creation is straightforward. However, I want to ensure compatibility with the other methods in your package.

@maweigert (Contributor)
The function above returns the dataframe and the masks. Isn't that what you need?

@po60nani (Author)
The function you suggested, which takes graph, masks_original, and frame_attribute, is helpful for final results. However, I’m specifically looking for a function that generates a mask to be used as input for track_graph = model.track(imgs, masks, mode="greedy"). At this stage, we only have localization information available and none of the additional inputs your function requires.

@maweigert (Contributor)
Ah, you mean from the input side. From the locations you could create a label mask that has a single pixel label at the coordinates of the particle and feed that to the model.
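A minimal sketch of that idea (assuming a DataFrame `df` with `frame`, `x`, `y` columns castable to int and a video of shape `(T, H, W)`; `points_to_label_mask` is a hypothetical helper, not part of TrackAstra):

```python
import numpy as np
import pandas as pd

def points_to_label_mask(df: pd.DataFrame, shape: tuple) -> np.ndarray:
    """Place a single-pixel label at each (frame, y, x) detection.

    Labels are enumerated per frame, so they are unique within a frame;
    the linking across frames is what the tracking model itself produces.
    """
    masks = np.zeros(shape, dtype=np.uint16)
    for t, group in df.groupby("frame"):
        ys = group["y"].astype(int).to_numpy()
        xs = group["x"].astype(int).to_numpy()
        masks[int(t), ys, xs] = np.arange(1, len(group) + 1)
    return masks

df = pd.DataFrame({"frame": [0, 0, 1], "x": [2.2, 4.1, 1.0], "y": [3.0, 3.4, 2.8]})
masks = points_to_label_mask(df, (2, 5, 5))
```

Coordinates are truncated to integer pixel positions here; two detections that land on the same pixel within a frame would overwrite each other.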

@po60nani (Author)
Exactly! I’m using the following function, but I encountered an error when calling track_graph = model.track(video, masks, mode="greedy"):

import numpy as np

def create_psf_mask(data, video_shape):
    """
    Create a 2D mask for each frame based on PSF size (sqrt(2) * sigma).

    Parameters:
        data (DataFrame): DataFrame with columns ['x', 'y', 'sigma', 'frame']
        video_shape (tuple): Shape of the video as (frames, height, width)

    Returns:
        np.ndarray: int16 array of shape video_shape, nonzero inside each PSF disk.
    """
    frames = data['frame'].unique()
    masks = np.zeros(video_shape, dtype=np.int16)

    for frame in frames:
        frame_data = data[data['frame'] == frame]

        for _, row in frame_data.iterrows():
            x, y, sigma = int(row['x']), int(row['y']), row['sigma']
            radius = int(np.sqrt(2) * sigma)

            for i in range(max(0, x - radius), min(video_shape[2], x + radius + 1)):
                for j in range(max(0, y - radius), min(video_shape[1], y + radius + 1)):
                    if np.sqrt((i - x) ** 2 + (j - y) ** 2) <= radius:
                        masks[frame, j, i] = int(i + j)  # mask value; note: i + j is neither unique per particle nor guaranteed nonzero

    return masks

This results in the following error:

IndexError: min(): Expected reduction dim 1 to have non-zero size.

It appears that the error occurs when calling model.track() with the generated masks. Any insights into what might be causing this would be greatly appreciated!

@maweigert (Contributor)
Hard to debug. Can you share the mask? E.g. does every frame of the mask have at least one label?

@po60nani (Author)
Below is the CSV file I used to create the mask with the following code:

df = pd.read_csv(r'.\df_PSFs.csv')

video_shape = (19900, 126, 128) 
masks = create_psf_mask(df, video_shape)

To answer your question, no, this dataset includes some frames without any particles.

df_PSFs.csv

@maweigert (Contributor)
1. Your mask creation function is not really working properly; try something like this:

   masks = np.zeros((19900, 126, 128), np.uint16)
   masks[tuple(df[['frame','y','x']].astype(int).to_numpy().T)] = df['particle'].astype(int).to_numpy()

2. There are lots of empty frames (e.g. the first 632).

3. Your particles are not really moving?
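To find those empty frames programmatically before tracking, a small sketch (assuming `masks` of shape `(T, H, W)`; `empty_frames` is a hypothetical helper, not part of TrackAstra):

```python
import numpy as np

def empty_frames(masks: np.ndarray) -> np.ndarray:
    """Return indices of frames that contain no nonzero labels."""
    return np.flatnonzero(~masks.reshape(len(masks), -1).any(axis=1))

# Example: frame 1 has no labels.
m = np.zeros((3, 4, 4), dtype=np.uint16)
m[0, 1, 1] = 1
m[2, 2, 2] = 1
print(empty_frames(m))  # -> [1]
```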

@po60nani (Author)
I updated my code based on your suggestion, but using the particle ID as mask values may not be ideal, as I expected TrackAstra to handle this aspect automatically (e.g., linking particles across frames). While I have particle IDs from Trackpy in this dataset to validate final results, they aren’t always available.

Regarding your second point, I do indeed have many empty frames. I removed these from the dataset and applied the suggested modifications to the mask function, but I’m still encountering the following error:

File "...\trackastra\model\model_api.py", line 140, in _track_from_predictions
    candidate_graph = build_graph(
                      ^^^^^^^^^^^^
  File "...\trackastra\tracking\tracking.py", line 141, in build_graph
    pj = np.stack(pj)
         ^^^^^^^^^^^^
  File "...\numpy\core\shape_base.py", line 445, in stack
    raise ValueError('need at least one array to stack')
ValueError: need at least one array to stack

As for your third question, the particles in this example are stationary. My goal here is to test the package’s ability to perform time-based tracking without particle movement.
