Add a KBMOD results filter for matching "known objects" #741

Draft
wants to merge 6 commits into
base: main

Collaborator

@wilsonbb wilsonbb commented Nov 12, 2024

Adds a filter that matches KBMOD results against "known objects" and filters them, as part of addressing #528. Known objects are defined by a user-provided astropy table: a catalog of the objects we expect to find in the KBMOD data. The catalog can hold either cached information about real known objects from a service such as astroquery, or the synthetic fakes a user has inserted into the data.

Such a catalog must have columns for each object's:

  • Name (either a synthetically generated name or a real target name as defined in https://astroquery.readthedocs.io/en/latest/imcce/imcce.html)
  • RA and Dec for each observation (which can be reflex-corrected to a given guess distance if the user wants to compare against reflex-corrected results)
  • MJD of each observation
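
As a rough illustration of the catalog layout described above, the sketch below uses plain Python rows rather than an astropy table. The column names `Name`, `RA`, `Dec`, and `mjd` and the helper `group_by_object` are assumptions made for this example; the PR does not state the exact column names it expects.

```python
from collections import defaultdict

# Assumed column names; the PR only says the catalog needs a name,
# a per-observation RA/Dec, and a per-observation MJD.
REQUIRED_COLUMNS = ("Name", "RA", "Dec", "mjd")


def group_by_object(rows):
    """Validate catalog rows and group (mjd, RA, Dec) tuples by object name."""
    grouped = defaultdict(list)
    for i, row in enumerate(rows):
        missing = [col for col in REQUIRED_COLUMNS if col not in row]
        if missing:
            raise ValueError(f"row {i} is missing columns: {missing}")
        grouped[row["Name"]].append((row["mjd"], row["RA"], row["Dec"]))
    return dict(grouped)


# One row per (object, observation); coordinates are in degrees.
catalog_rows = [
    {"Name": "fake_0001", "RA": 10.0000, "Dec": -5.0000, "mjd": 58000.00},
    {"Name": "fake_0001", "RA": 10.0005, "Dec": -5.0002, "mjd": 58000.01},
    {"Name": "obj_A", "RA": 11.2000, "Dec": -4.9000, "mjd": 58000.00},
]
observations = group_by_object(catalog_rows)
```

The same rows could be loaded into an astropy table before being handed to the filter; the one-row-per-observation shape is the point of the sketch, not the container.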

On klone/hyak, loading and filtering were tested with an approximately 750 MB catalog of cached astroquery results from cone searches around boresights in the DEEP search; together they took about 30 seconds. There is room for optimization, but this is unlikely to be a scaling bottleneck for now.

The filter can be called either in the KBMOD search workflow in src/kbmod/run_search.py, where it can be called multiple times for different data sources, or from any post-processing step that has a saved KBMOD Results object and its saved WCS.

The filter matches each result observation to potentially multiple objects, and the user can set thresholds for how close a match must be both spatially and temporally. Observations that match known objects are then marked invalid in the Results table's "obs_valid" column. When remove_match_obs=True, the filter also applies Results table filtering to remove results that no longer have enough observations remaining.
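
The matching step can be pictured with a minimal standalone sketch. This is not the PR's implementation: the function names are hypothetical, the separation threshold is assumed to be in arcseconds, and the time threshold is assumed to be in seconds (following the `_s` suffix of `known_obj_sep_time_thresh_s`).

```python
import math


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds; inputs are in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for very small separations.
    h = (
        math.sin((dec2 - dec1) / 2) ** 2
        + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2
    )
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def match_observations(result_obs, catalog, sep_thresh_arcsec, time_thresh_s):
    """Map each matching object name to one boolean per result observation.

    result_obs: list of (mjd, ra_deg, dec_deg) for one result.
    catalog: dict of name -> list of (mjd, ra_deg, dec_deg).
    """
    matches = {}
    for name, cat_obs in catalog.items():
        flags = [
            any(
                abs(mjd - c_mjd) * 86400.0 <= time_thresh_s
                and angular_sep_arcsec(ra, dec, c_ra, c_dec) <= sep_thresh_arcsec
                for c_mjd, c_ra, c_dec in cat_obs
            )
            for mjd, ra, dec in result_obs
        ]
        if any(flags):  # keep only objects that matched at least once
            matches[name] = flags
    return matches


result_obs = [(58000.0, 10.0, -5.0), (58000.01, 10.0005, -5.0002), (58000.02, 10.5, -5.5)]
catalog = {
    "obj_A": [(58000.0, 10.0001, -5.0), (58000.01, 10.0005, -5.0002)],
    "obj_B": [(58001.0, 10.5, -5.5)],  # right position, wrong night
}
matches = match_observations(result_obs, catalog, sep_thresh_arcsec=1.0, time_thresh_s=600)

# An observation stays valid only if no known object matched it.
obs_valid = [not any(flags[i] for flags in matches.values()) for i in range(len(result_obs))]
```

Here `obj_B` never matches because its only catalog entry is about a day away from every result observation, which is the role the temporal threshold plays.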

The function returns which known objects matched which observations for each KBMOD result (regardless of how many observations matched and of the value of remove_match_obs). This preserves as much matching information as possible for cases such as when multiple known objects intersect different parts of a result trajectory. While this PR does not provide a convenient list of the expected objects that were not recovered in the KBMOD results, as requested in #528, the caller of the filter has all of the information needed to construct one.
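
One way a caller might build that missing-object list is sketched below. The shape of the match data (one dict per result, mapping object name to per-observation booleans) is an assumption for illustration, and `split_recovered` is a hypothetical helper, not part of this PR.

```python
def split_recovered(expected_names, matches_per_result):
    """Return (recovered, missing) sets of expected object names.

    matches_per_result: one dict per result, mapping a known object's name
    to a list of booleans (one per observation). This layout is assumed.
    """
    seen = set()
    for matches in matches_per_result:
        for name, flags in matches.items():
            if any(flags):  # at least one observation matched this object
                seen.add(name)
    expected = set(expected_names)
    return expected & seen, expected - seen


expected_names = ["fake_1", "fake_2", "fake_3"]
matches_per_result = [
    {"fake_1": [True, True, False]},
    {"fake_1": [False, False, True], "fake_3": [False, False, False]},
    {},  # a result that matched nothing
]
recovered, missing = split_recovered(expected_names, matches_per_result)
```

A stricter notion of "recovered" (for example, a minimum fraction of matched observations) would replace the `any(flags)` test.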

An example workflow for filtering out known objects, identifying recovered fakes, and then processing potentially real results is provided below:

from astropy.table import Table

from kbmod.filters.known_object_filters import KnownObjsMatcher
from kbmod.results import Results

res = Results.read_table("/path/to/results")

# Remove observations that match real known objects from the results.
# NOTE: `obstimes` (the observation times of this KBMOD run) is assumed to be defined by the caller.
real_obj_filter_params = {
    "filter_type": "real_obj_matches",
    "known_obj_thresh": 0.5,
    "known_obj_sep_thresh": 1.0,
    "known_obj_sep_time_thresh_s": 600,
}
real_obj_table = Table.read("/path/to/real_obj_table/")
real_obj_matcher = KnownObjsMatcher(real_obj_table, obstimes, real_obj_filter_params)
res = real_obj_matcher.apply_known_obj_valid_obs_filter(
    res,
    wcs=res["wcs"][0],
    update_obs_valid=True,
)

# Identify all recovered fakes
fake_filter_params = {
    "filter_type": "fake_matches",
    "known_obj_thresh": 0.5,
    "known_obj_sep_thresh": 1.0,
    "known_obj_sep_time_thresh_s": 600,
    "recovered_fake_matches_obs_ratio": 0.5,
}
fake_table = Table.read("/path/to/fake_table/")
fake_matcher = KnownObjsMatcher(fake_table, obstimes, fake_filter_params)

# Here we match observations to our fakes. Note that this does not update the "obs_valid"
# column of the Results table.
res = fake_matcher.match_known_obj_filters(
    res,
    wcs=res["wcs"][0],
)

# Apply a threshold for the fraction of a fake's catalog observations we had to recover
# for the fake to count as found. (We already filter near the obstimes of this
# KBMOD run, so fakes on distant nights shouldn't matter.) The cutoff ratio comes from
# fake_filter_params["recovered_fake_matches_obs_ratio"].
res = fake_matcher.apply_known_obj_match_obs_ratio(res)

# Now we should have the column "recovered_fake_matches_obs_ratio" in the Results table,
# and we can use it to generate a list of recovered fakes. This assumes each entry in the
# column holds the names of the fakes recovered for that result.
recovered_fakes = set()
for matched_names in res["recovered_fake_matches_obs_ratio"]:
    recovered_fakes.update(matched_names)
print(f"Recovered fakes: {recovered_fakes}")

# Now we can filter out all recovered fakes to continue processing potential results with ML
res = fake_matcher.filter_known_obj(res, "recovered_fake_matches_obs_ratio", "recovered_fake_matches_obs_ratio")
res = ml_magic_yay(res)  # placeholder for downstream ML processing
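
The observation-ratio cutoff used in the workflow above can be illustrated in isolation. The sketch below is a guess at the idea rather than the PR's code: `matched_obs_ratio` and `names_above_ratio` are hypothetical helpers, and the ratio is assumed to be the number of matched observations divided by the number of catalog observations for that object near the run's obstimes.

```python
def matched_obs_ratio(flags, n_catalog_obs):
    """Fraction of an object's catalog observations matched by one result."""
    if n_catalog_obs <= 0:
        return 0.0
    return sum(flags) / n_catalog_obs


def names_above_ratio(matches, catalog_obs_counts, min_ratio):
    """Names whose matched fraction meets the cutoff (e.g. 0.5)."""
    return {
        name
        for name, flags in matches.items()
        if matched_obs_ratio(flags, catalog_obs_counts[name]) >= min_ratio
    }


# Per-observation match flags for one result (assumed layout).
matches = {
    "fake_1": [True, True, False, True],
    "fake_2": [True, False, False, False],
}
catalog_obs_counts = {"fake_1": 4, "fake_2": 5}
recovered = names_above_ratio(matches, catalog_obs_counts, min_ratio=0.5)
```

With a cutoff of 0.5, `fake_1` (3 of 4 observations matched) counts as recovered while `fake_2` (1 of 5) does not.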
