DM-38850: Make trailedAssociatorTask #173

bsmartradio · 2023-06-29T18:28:20Z

Make trailedAssociatorTask, which filters out trailed sources whose trail lengths exceed 0.416 arcseconds/second. Trailed sources are currently dropped once they are filtered out. More complex filtering will be added at a later date.
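
As a rough illustration of the filtering described above (not the code in this PR), the cut can be sketched with a plain pandas DataFrame. The column name `trailLength`, the helper name `split_trailed_sources`, and the sample values are assumptions made up for this sketch; the 0.416 arcseconds/second rate is taken from the description and is multiplied by the exposure time to get a length cutoff.

```python
import pandas as pd

# Default rate from the PR description, in arcseconds per second.
MAX_TRAIL_RATE = 0.416


def split_trailed_sources(dia_sources, exposure_time, max_trail_rate=MAX_TRAIL_RATE):
    """Split sources into (kept, trailed) by trail length.

    ``trailLength`` is a hypothetical column name used only for this sketch.
    """
    cutoff = max_trail_rate * exposure_time  # length cutoff in arcseconds
    trailed_mask = dia_sources["trailLength"] > cutoff
    return dia_sources[~trailed_mask], dia_sources[trailed_mask]


# Made-up sample catalog: two short trails and two long ones.
sources = pd.DataFrame({"diaSourceId": [1, 2, 3, 4],
                        "trailLength": [1.0, 12.0, 13.0, 50.0]})
kept, trailed = split_trailed_sources(sources, exposure_time=30.0)
```

Returning both tables keeps the dropped sources available for inspection, even though the Task currently discards them.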

parejkoj · 2023-07-07T21:43:43Z

The pull request should have the ticket number in it: https://developer.lsst.io/work/flow.html#make-a-pull-request

parejkoj

Have you run this? I think it would have failed a linter, at least. There were a lot of typos and leftover print/import statements that were clearly from debugging; please try to have those cleaned up before you send it for review.

There are no tests specific to the new Task itself: please add a file to test the new Task output.

Is this really worth having a separate Task, instead of just adding a config, branch, and method to AssociationTask? It's just ~3 lines of code; do we expect that we will be significantly expanding functionality compared to what you've implemented here, and are there any other places where we need this as a Task?

python/lsst/ap/association/association.py

tests/test_association_task.py

bsmartradio · 2023-08-23T23:22:26Z

Here is the jenkins run using this branch.
https://rubin-ci.slac.stanford.edu/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/95/pipeline/

bsmartradio · 2023-09-11T22:53:21Z

@parejkoj Are you happy with the changes? If so I can rebase onto main and merge the changes.

python/lsst/ap/association/association.py

python/lsst/ap/association/trailedSourceFilter.py

parejkoj · 2023-09-12T21:41:51Z

python/lsst/ap/association/trailedSourceFilter.py

+            - ``trailed_dia_sources`` : DIASources that have trailed more
+            than 0.416 arcseconds/second*exposure_time. (`pandas.DataFrame`)
+        """
+


No newlines after function docstrings. https://developer.lsst.io/python/numpydoc.html#docstrings-of-methods-and-functions-should-not-be-preceded-or-followed-by-a-blank-line

Ah yes, I fixed it in one place but not another. I've fixed them all now.

python/lsst/ap/association/association.py

parejkoj · 2023-09-12T21:47:23Z

python/lsst/ap/association/association.py

+            if len(diaTrailedResult.trailedDiaSources) > 0:
+                print("Trailed sources cleaned.")
+            else:
+                print("No trailed sources to clean.")


It looks like you removed print statements, but didn't add a log.info: I think we absolutely want to have a log statement about how many sources were removed.
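
A minimal sketch of the kind of log statement being asked for, using only the standard `logging` module. The logger name, the helper `drop_trailed`, and the message wording are hypothetical; inside a Task the call would presumably go through `self.log` instead.

```python
import logging

import pandas as pd

log = logging.getLogger("trailedSourceFilter.sketch")  # hypothetical logger name


def drop_trailed(dia_sources, trailed_mask):
    """Drop the masked rows and log how many were removed."""
    n_trailed = int(trailed_mask.sum())
    # Lazy %-style formatting: the message is only built if the level is enabled.
    log.info("%i DIASources exceed the maximum trail length; dropping them.", n_trailed)
    return dia_sources[~trailed_mask]


# Made-up inputs: three sources, two of them flagged as trailed.
sources = pd.DataFrame({"diaSourceId": [1, 2, 3]})
mask = pd.Series([False, True, True])
remaining = drop_trailed(sources, mask)
```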

python/lsst/ap/association/trailedSourceFilter.py

parejkoj · 2023-09-12T21:55:01Z

python/lsst/ap/association/trailedSourceFilter.py

+"""A simple implementation of source association task for ap_verify.
+"""
+
+__all__ = ["TrailedSourceFilterTask", "TrailedSourceFilterConfig"]


KT has gone the other way on my dev guide suggestion, so let's not touch anything else right now while we sort that out.

lsst-dm/dm_dev_guide#632

python/lsst/ap/association/trailedSourceFilter.py

parejkoj · 2023-09-12T22:20:59Z

python/lsst/ap/association/trailedSourceFilter.py

+        result : `lsst.pipe.base.Struct`
+            Results struct with components.
+
+            - ``"dia_sources"`` : DiaSource table that is free from unwanted


This is how they should be documented: https://developer.lsst.io/python/numpydoc.html#struct-types

Please file a ticket to go through all of ap_association and fix the Struct docstrings, if you know some are wrong.

timj · 2023-09-12T22:42:10Z

python/lsst/ap/association/trailedSourceFilter.py

+            Boolean mask for DIASources which are greater than the
+            cutoff length.
+        """
+        diffIm_time = diffIm.getInfo().getVisitInfo().getExposureTime()


Difference images come from visits. Given we can take 2 snaps and there is a gap between the snaps, is the relevant time here the exposure time of a snap or the duration of the visit including the gap (which is greater than the exposure time)?

(Also, if diffIm is an exposure, then the docstring for diffIm is wrong.)

Is there a policy and/or actual code that defines what the "exposure time" of a snap-combined image is? The docs for VisitInfo just say:

get exposure duration (shutter open time); (sec)

Since we can't yet run CharacterizeImageTask or CalibrateImageTask on multi-snap visits, I'm pretty sure that this case is untestable either way.

At the moment it's defined to be the sum of the exposure time of the snaps. Unlike ObservationInfo, VisitInfo doesn't record the start and end time, only the midpoint (which for two snaps might be a time where no data are being taken).
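
To make the difference concrete, here is a small worked example of how the length cutoff changes depending on which time is used. The snap durations and the gap are invented numbers for illustration only; the 0.416 arcseconds/second rate is the default discussed in this PR.

```python
MAX_TRAIL_RATE = 0.416  # arcseconds per second, default from this PR

# Hypothetical two-snap visit; these durations are assumptions, not survey values.
snap_exposures = [15.0, 15.0]  # seconds of shutter-open time per snap
gap_between_snaps = 2.0  # seconds with the shutter closed

# Exposure time as described above: the sum of the snap exposure times.
exposure_time = sum(snap_exposures)
# Wall-clock duration of the visit, including the gap.
visit_duration = exposure_time + gap_between_snaps

# Trail-length cutoffs (arcseconds) under each choice of time.
cutoff_from_exposure = MAX_TRAIL_RATE * exposure_time
cutoff_from_duration = MAX_TRAIL_RATE * visit_duration
```

With these made-up numbers the two cutoffs differ by the rate times the gap, so the choice only matters for sources whose trail length falls in that narrow band.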

parejkoj

It looks like you marked some comments resolved without making the requested changes (e.g. excessive newlines, only needing the exposure time not the full exposure), and didn't incorporate some others (e.g. referring to the default value in docstrings, making the check_dia_source_trail method private)? I've flagged some of them with eyes, but please check for others that were missed.

tests/test_trailedSourceFilter.py

tests/test_association_task.py

parejkoj · 2023-09-12T23:37:07Z

tests/test_association_task.py

+        for test_obj_id, expected_obj_id in zip(
+                results.matchedDiaSources["diaObjectId"].to_numpy(),
+                [1, 2, 3, 4]):
+            self.assertEqual(test_obj_id, expected_obj_id)
+        for test_obj_id, expected_obj_id in zip(
+                results.unAssocDiaSources["diaObjectId"].to_numpy(),
+                [0]):
+            self.assertEqual(test_obj_id, expected_obj_id)


Why can't you write these like self.assertEqual(results.matchedDiaSources["diaObjectId"], [1,2,3,4])? Or at worst use np.testing.assert_array_equal? The loops make it hard to follow what is being tested.

This results in ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() for the arrays; whoever originally wrote the tests I followed likely did it this way to get around that problem. I looked up ways around this, and the loop popped up as one of the suggested ways of comparing values in an array for unit testing. The other option seems to be self.assertTrue(np.array_equal(results.matchedDiaSources["diaObjectId"].values, [1,2], equal_nan=True)), which I've swapped to since it seems a tad more explicit in what it's doing.

Use np.testing.assert_array_equal then, instead. That will give explicit information about which values mismatch. These cannot be NaN, so there's no need for NaN-safe comparisons.

On that note, would it add clarity if I changed the other unit tests to follow this format?

That would definitely be helpful, yes! Please do it on a separate commit, though.

Swapped to np.testing and also made a separate commit updating the prior unit tests
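
For reference, the behavior discussed in this thread can be reproduced outside the test suite. This is a standalone sketch with a made-up `matched` table standing in for `results.matchedDiaSources`; it is not taken from the PR's tests.

```python
import unittest

import numpy as np
import pandas as pd

# Made-up stand-in for results.matchedDiaSources.
matched = pd.DataFrame({"diaObjectId": [1, 2, 3, 4]})
ids = matched["diaObjectId"].to_numpy()

# assertEqual evaluates `not (ids == [1, 2, 3, 4])`, which is ambiguous for arrays.
case = unittest.TestCase()
try:
    case.assertEqual(ids, [1, 2, 3, 4])
    assert_equal_error = None
except ValueError as err:
    assert_equal_error = str(err)

# Passes silently when every element matches.
np.testing.assert_array_equal(ids, [1, 2, 3, 4])

# On a mismatch, the AssertionError message reports the differing values.
try:
    np.testing.assert_array_equal(ids, [1, 2, 3, 5])
    mismatch_report = None
except AssertionError as err:
    mismatch_report = str(err)
```

Compared with wrapping `np.array_equal` in `assertTrue`, the `np.testing` call fails with a report of the mismatching elements rather than a bare "False is not true".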

tests/test_trailedSourceFilter.py

parejkoj · 2023-09-12T23:42:12Z

tests/test_trailedSourceFilter.py

+
+        results = trailedSourceFilterTask.run(self.diaSources, self.exposure)
+
+        self.assertEqual(len(results.diaSources),


Each of these tests should also have an assert on the contents of the other array returned in the struct (the sources that were filtered). Better to test on which ones were included than just the length: I think this is the first 3 in the first list, and the last two in the second? Similarly in test_run_short_max_trail.

Added tests similar to self.assertTrue(np.array_equal(results.matchedDiaSources["diaObjectId"].values, [1,2], equal_nan=True)) to check that the output arrays are what is expected.

Updated to np.testing.

tests/test_trailedSourceFilter.py

parejkoj

A few more small comments, and I unresolved and put eyes on a few others that it looks like you missed. Clean these up, and you're good to go.

parejkoj · 2023-09-26T22:03:39Z

python/lsst/ap/association/association.py

+            diaTrailedResult = self.trailedSourceFilter.run(diaSources, exposure_time)
+            matchResult = self.associate_sources(diaObjects, diaTrailedResult.diaSources)
+
+            self.log.warning("%i DIASources exceed maxTrailLength, dropping "


Why a warning? Nothing went wrong, since the intent was to remove sources. log.info is probably fine.

I saw that the other log statement for a dropped source was a warning. However, you are right: dropping the source is the intended behavior, so it should be a log.info.

parejkoj · 2023-09-26T22:04:43Z

python/lsst/ap/association/diaPipe.py

-        assocResults = self.associator.run(diaSourceTable,
-                                           loaderResult.diaObjects)
+        assocResults = self.associator.run(diaSourceTable, loaderResult.diaObjects,
+                                           exposure_time=diffIm.getInfo().getVisitInfo().getExposureTime())


Use the properties:

Suggested change

-                                           exposure_time=diffIm.getInfo().getVisitInfo().getExposureTime())
+                                           exposure_time=diffIm.visitInfo.exposureTime)

Swapped to using properties.

parejkoj · 2023-09-26T22:08:20Z

python/lsst/ap/association/trailedSourceFilter.py

+    """Config class for TrailedSourceFilterTask.
+    """
+
+    maxTrailLength = pexConfig.Field(


I guess I missed this earlier: if we're going with snake_case throughout, we should make the configs also snake_case.

I've swapped to snake case in just trailedSourceFilter.py and left it camel case in association.py.

parejkoj · 2023-09-26T22:08:44Z

python/lsst/ap/association/trailedSourceFilter.py

+        Creates a mask for sources with lengths greater than 0.416
+        arcseconds/second multiplied by the exposure time.


Again, don't mention the default value, since it's configurable.

Suggested change

-        Creates a mask for sources with lengths greater than 0.416
-        arcseconds/second multiplied by the exposure time.
+        Return a mask of sources with lengths greater than ``config.maxTrailLength`` multiplied by the exposure time.

Swapped the wording to what you've suggested.

bsmartradio · 2023-09-28T22:02:51Z

Jenkins run with this passing: https://rubin-ci.slac.stanford.edu/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/354/pipeline

bsmartradio force-pushed the tickets/DM-38850 branch 2 times, most recently from 56c586d to 62f1e4b Compare June 29, 2023 18:38

bsmartradio requested a review from parejkoj June 29, 2023 23:16

parejkoj requested changes Jul 7, 2023

bsmartradio changed the title ~~Make trailedAssociatorTask~~ DM-38850: Make trailedAssociatorTask Jul 21, 2023

bsmartradio force-pushed the tickets/DM-38850 branch 3 times, most recently from 43b2a7d to d7c4173 Compare August 21, 2023 18:39

bsmartradio force-pushed the tickets/DM-38850 branch 6 times, most recently from 3dedc44 to 2c8f5ae Compare August 23, 2023 21:37

bsmartradio force-pushed the tickets/DM-38850 branch from 9e17d3b to d555112 Compare August 24, 2023 18:17

parejkoj reviewed Sep 12, 2023

timj reviewed Sep 12, 2023

parejkoj requested changes Sep 12, 2023

bsmartradio force-pushed the tickets/DM-38850 branch 7 times, most recently from 1af91a5 to 19a1c29 Compare September 19, 2023 20:43

bsmartradio added 2 commits September 19, 2023 13:55

Make trailedAssociatorTask

e0a704b

Update test_association_task.py array unit tests

58a77e4

Update unit tests in test_association_task.py to use np.testing.assert_array_equal for testing array equality.

bsmartradio force-pushed the tickets/DM-38850 branch from 41bc3f9 to 20baae2 Compare September 19, 2023 20:55

parejkoj approved these changes Sep 26, 2023

bsmartradio force-pushed the tickets/DM-38850 branch from 5f3e408 to a907931 Compare September 26, 2023 23:40

Align Struct doc strings with developer guide

a5f9b31

Plus several edits from review

bsmartradio force-pushed the tickets/DM-38850 branch from a907931 to a5f9b31 Compare September 28, 2023 22:01

bsmartradio merged commit 9b2d7c2 into main Sep 28, 2023

bsmartradio deleted the tickets/DM-38850 branch September 28, 2023 22:02

