DM-45218: Refactoring diaPipe #237

abudlong · 2024-09-16T22:22:19Z

No description provided.

isullivan

Looks good overall. I have a few requests about organization and commit history that I would like to take another look at after you address these comments.

Also, please check that all dataset type definitions in the docstrings use back-ticks ` instead of single quotes '. Our documentation build system uses the back-ticks to make links and references, so the formatting is important.
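
One way to find the remaining single-quoted types is a small regex scan over the docstring text. This is only a sketch: the helper name `findSingleQuotedTypes` and the pattern are illustrative assumptions, not part of the package or the documentation build.

```python
import re

# Matches numpydoc parameter lines whose type is wrapped in single quotes,
# e.g. "associatedDiaSources : 'pandas.DataFrame'".
SINGLE_QUOTED_TYPE = re.compile(r"^[ \t]*(\w+)[ \t]*:[ \t]*'([^']+)'", re.MULTILINE)


def findSingleQuotedTypes(text):
    """Return (name, type) pairs for parameters typed with single quotes."""
    return SINGLE_QUOTED_TYPE.findall(text)


docstring = """
    preloadedDiaForcedSources : `pandas.DataFrame`
        Catalog of previously detected forced DiaSources, from the APDB.
    associatedDiaSources : 'pandas.DataFrame'
        Associated DiaSources with DiaObjects.
"""

# Only the single-quoted entry is reported; the back-ticked one is ignored.
print(findSingleQuotedTypes(docstring))
```

Running it over the text of each changed file should list every parameter that still needs back-ticks.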

isullivan · 2024-09-16T23:49:51Z

python/lsst/ap/association/utils.py

+    diaForcedSources = pd.DataFrame(columns=[
+        "diaForcedSourceId", "diaObjectID", "ccdVisitID", "psfFlux", "psfFluxErr",
+        "ra", "dec", "midpointMjdTai", "band",
+    ])
+    diaForcedSources = convertTableToSdmSchema(schema, diaForcedSources,
+                                               tableName="DiaForcedSource",
+                                               )


Looking at this more carefully, all of this could be reduced to:
diaForcedSources = convertTableToSdmSchema(schema, pd.DataFrame(), tableName="DiaForcedSource")
since that function will add the necessary columns and will delete any additional columns.
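
To illustrate why the empty DataFrame is enough, here is a rough pandas-only sketch of the "add missing columns, drop extra columns" behavior. `conformToColumns` and `FORCED_SOURCE_COLUMNS` are hypothetical stand-ins; the real `convertTableToSdmSchema` takes its column definitions from the schema, and this sketch ignores data types entirely.

```python
import pandas as pd

# Hypothetical stand-in for the column list that the APDB schema would supply.
FORCED_SOURCE_COLUMNS = ["diaForcedSourceId", "diaObjectId", "psfFlux", "psfFluxErr"]


def conformToColumns(table, columns):
    """Return a copy of ``table`` with exactly ``columns``.

    Columns missing from ``table`` are added (filled with NaN) and columns
    not listed in ``columns`` are dropped.
    """
    return table.reindex(columns=columns)


# An empty input still comes back with every expected column.
empty = conformToColumns(pd.DataFrame(), FORCED_SOURCE_COLUMNS)
print(list(empty.columns), len(empty))

# Extra columns are dropped and missing ones are added.
partial = pd.DataFrame({"psfFlux": [1.5], "notInSchema": [0]})
print(list(conformToColumns(partial, FORCED_SOURCE_COLUMNS).columns))
```

Under that assumption, listing the columns by hand before the call adds nothing.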

isullivan · 2024-09-16T23:54:55Z

tests/utils_tests.py

-        objectIds = diaObjectIds[rng.randint(len(diaObjectIds), size=nSources)]
+        objectIds = diaObjectIds[rng.integers(len(diaObjectIds), size=nSources)]


Move this change to its own commit (it's a real bug fix, not just a refactor). It should be before the refactor commit in the commit history.
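
For context, a minimal sketch of why this counts as a bug fix, assuming `rng` is a `numpy.random.Generator` created with `numpy.random.default_rng` (the quoted diff does not show how `rng` is built). The seed and array values here are made up.

```python
import numpy as np

rng = np.random.default_rng(1234)      # numpy.random.Generator
legacy = np.random.RandomState(1234)   # legacy interface

diaObjectIds = np.arange(10, 15)
nSources = 8

# Generator provides integers(); the upper bound is exclusive by default,
# so every draw is a valid index into diaObjectIds.
objectIds = diaObjectIds[rng.integers(len(diaObjectIds), size=nSources)]
print(objectIds.shape)

# randint() only exists on the legacy RandomState interface.
print(hasattr(rng, "randint"), hasattr(legacy, "randint"))
```

With a Generator, the old `rng.randint` call would raise an AttributeError, which is why the fix belongs in its own commit ahead of the refactor.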

isullivan · 2024-09-16T23:55:14Z

tests/utils_tests.py

@@ -240,6 +268,7 @@ def makeExposure(flipX=False, flipY=False):
    exposure.setDetector(detector)
    exposure.info.setVisitInfo(visit)
    exposure.setFilter(afwImage.FilterLabel(band='g'))
+    exposure.setPhotoCalib(afwImage.PhotoCalib(1., 0., exposure.getBBox()))


As above, move this change to its own commit (it's a real bug fix, not just a refactor). It should be before the refactor commit in the commit history.

isullivan · 2024-09-17T16:48:29Z

python/lsst/ap/association/diaPipe.py

+        return (associatedDiaSources, createResults.newDiaObjects)
+
+    @timeMethod
+    def associationCatalogMerge(self, preloadedDiaSources, associatedDiaSources, diaObjects, newDiaObjects):


LSST code style is to name methods with an active verb when possible. Here I would use mergeAssociatedCatalogs.

isullivan · 2024-09-17T16:52:57Z

python/lsst/ap/association/diaPipe.py

+            # alertPackager needs correct columns
+            diaForcedSources = makeEmptyForcedSourceTable(self.schema)
+
+        # Write results to Alert Production Data Base (APDB)


"Database" is one word, so this should read "Alert Production Database (APDB)".

isullivan · 2024-09-17T18:16:16Z

python/lsst/ap/association/diaPipe.py

+            Previously detected DiaSources, loaded from the APDB.
+        preloadedDiaForcedSources : `pandas.DataFrame`
+            Catalog of previously detected forced DiaSources, from the APDB.
+        associatedDiaSources : 'pandas.DataFrame'


Rename as above.

isullivan · 2024-09-17T18:17:21Z

python/lsst/ap/association/diaPipe.py

+        preloadedDiaForcedSources : `pandas.DataFrame`
+            Catalog of previously detected forced DiaSources, from the APDB.
+        associatedDiaSources : 'pandas.DataFrame'
+            Associated DiaSources with DiaObjects.


How about:
Updated DiaSource catalog with associated diaObjectIds.

isullivan · 2024-09-17T18:21:50Z

tests/test_diaPipe.py

+        self.apdb = daxApdb.Apdb.from_config(apdb_config)
+        self.schema = readSchemaFromApdb(self.apdb)


Neither self.apdb nor self.schema is used, so remove them.

isullivan · 2024-09-17T18:23:36Z

tests/test_diaPipe.py

+    def _testRun(self, doPackageAlerts=False, doSolarSystemAssociation=False, doRunForcedMeasurement=False):
        """Test the normal workflow of each ap_pipe step.
        """
        config = self._makeDefaultConfig(
            config_file=self.config_file.name,
            doPackageAlerts=doPackageAlerts,
-            doSolarSystemAssociation=doSolarSystemAssociation)
+            doSolarSystemAssociation=doSolarSystemAssociation,
+            doRunForcedMeasurement=doRunForcedMeasurement)


Remove doRunForcedMeasurement for now

isullivan · 2024-09-17T18:24:35Z

tests/utils_tests.py

+def makeSolarSystemSources(nSources, diaObjectIds, exposure, rng, randomizeObjects=False):
+    """Make a test set of solar system sources.
+
+    Parameters
+    ----------
+    nSources : `int`
+        Number of sources to create.
+    diaObjectIds : `numpy.ndarray`
+        Integer Ids of diaobjects to "associate" with the DiaSources.
+    exposure : `lsst.afw.image.Exposure`
+        Exposure to create sources over.
+    randomizeObjects : `bool`, optional
+        If True, randomly draw from `diaObjectIds` to generate the ids in the
+        output catalog, otherwise just iterate through them, repeating as
+        necessary to get nSources objectIds.
+
+    Returns
+    -------
+    solarSystemSources : `pandas.DataFrame`
+        Solar system sources generated across the exposure.
+    """
+    solarSystemSources = makeDiaSources(nSources, diaObjectIds, exposure, rng, randomizeObjects=False)
+    solarSystemSources["ssObjectId"] = rng.integers(0, high=2**63-1, size=nSources)
+    solarSystemSources["Err(arcsec)"] = rng.uniform(0.2, 0.4, size=nSources)
+
+    return solarSystemSources
+
+


Add this function in its own commit.

isullivan

Looks good! Just a couple of minor cleanups this time around.

isullivan · 2024-09-19T22:22:28Z

python/lsst/ap/association/diaPipe.py

+        # columns "ra" and "dec" are required for spatial sharding in Cassandra
+        diaForcedSources.rename(columns={"coord_ra": "ra", "coord_dec": "dec"}, inplace=True)


These lines were removed on main after you made your branch, and should be removed from the refactor.

isullivan · 2024-09-19T22:26:48Z

python/lsst/ap/association/diaPipe.py

@@ -451,11 +451,138 @@ def run(self,
        # Accept either legacySolarSystemTable or optional solarSystemObjectTable.
        if legacySolarSystemTable is not None and solarSystemObjectTable is None:
            solarSystemObjectTable = legacySolarSystemTable
+


Not your change, but solarSystemObjectTable should be added to the docstring above (I can't comment on unchanged lines, so I had to put it here). You can just copy what you have in associateDiaSources

isullivan · 2024-09-19T22:36:38Z

python/lsst/ap/association/diaPipe.py

@@ -493,6 +620,38 @@ def run(self,
                                        createResults.nNewDiaObjects,
                                        nTotalSsObjects,
                                        nAssociatedSsObjects)
+        return (associatedDiaSources, createResults.newDiaObjects)


It would be helpful to add a log message just before the return:

self.log.info("%i updated and %i unassociated diaObjects. Creating %i new diaObjects",
              assocResults.nUpdatedDiaObjects,
              assocResults.nUnassociatedDiaObjects,
              createResults.nNewDiaObjects)
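
A standalone sketch of the same message using the standard library `logging` module, in case it helps to try the format string outside the task. The `SimpleNamespace` objects and their counts are made-up stand-ins for `assocResults` and `createResults`, and the logger name is arbitrary.

```python
import logging
from types import SimpleNamespace

logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
log = logging.getLogger("diaPipe.sketch")

# Made-up stand-ins for the association and creation results.
assocResults = SimpleNamespace(nUpdatedDiaObjects=12, nUnassociatedDiaObjects=3)
createResults = SimpleNamespace(nNewDiaObjects=5)

LOG_FORMAT = "%i updated and %i unassociated diaObjects. Creating %i new diaObjects"
counts = (assocResults.nUpdatedDiaObjects,
          assocResults.nUnassociatedDiaObjects,
          createResults.nNewDiaObjects)

# Passing the values as arguments lets logging defer the string formatting
# until the record is actually emitted.
log.info(LOG_FORMAT, *counts)
```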

isullivan · 2024-09-19T22:43:02Z

python/lsst/ap/association/diaPipe.py

+        return diaForcedSources
+
+    @timeMethod
+    def writeToApdb(self, associatedDiaSources, diaForcedSources, updatedDiaObjects):


Re-order parameters to match the order supplied to apdb.store:

def writeToApdb(self, updatedDiaObjects, associatedDiaSources, diaForcedSources):

Make sure to also update the order of the parameters in run where self.writeToApdb is called.
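
A small sketch of what goes wrong if the call site is not updated along with the signature. The function body is a hypothetical stand-in that only records its arguments, and the keyword-argument call is one optional safeguard, not something this review requires.

```python
def writeToApdb(updatedDiaObjects, associatedDiaSources, diaForcedSources):
    """Hypothetical stand-in that only reports what each parameter received."""
    return {
        "objects": updatedDiaObjects,
        "sources": associatedDiaSources,
        "forcedSources": diaForcedSources,
    }


# A positional call still written for the old order
# (sources, forced sources, objects) binds every argument to the wrong name.
stale = writeToApdb("sources", "forcedSources", "objects")
print(stale["objects"])

# Keyword arguments bind correctly regardless of the signature order.
safe = writeToApdb(associatedDiaSources="sources",
                   diaForcedSources="forcedSources",
                   updatedDiaObjects="objects")
print(safe["objects"])
```

Because all three arguments are DataFrames, a stale positional call would not fail at the call itself, so it is worth checking the call in run carefully.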

isullivan · 2024-09-19T22:44:58Z

python/lsst/ap/association/utils.py

+
+
+def makeEmptyForcedSourceTable(schema):
+    """Return a dataframe with the correct columns for diaForcedSources table.
+
+    Returns
+    -------
+    diaForcedSources : `pandas.DataFrame`
+        Empty dataframe.
+    """
+    diaForcedSources = convertTableToSdmSchema(schema, pd.DataFrame(), tableName="DiaForcedSource")
+    return diaForcedSources


Edit: leave this as-is.

abudlong force-pushed the tickets/DM-45218 branch from dc15d6a to a712fa2 Compare September 16, 2024 22:36

isullivan requested changes Sep 17, 2024

abudlong force-pushed the tickets/DM-45218 branch 3 times, most recently from fea6e8a to 5dfb07c Compare September 19, 2024 20:18

abudlong added 4 commits September 19, 2024 15:25

Cleanup code comment.

f0a9b9b

Update random integer function

1b3192d

Attach real photoCalib to test exposure

f5a9124

Add solar system test sources

330a15a

abudlong force-pushed the tickets/DM-45218 branch from 5dfb07c to 94431bc Compare September 19, 2024 22:25

isullivan approved these changes Sep 19, 2024

Refactoring diaPipe.py

f1c5103

abudlong force-pushed the tickets/DM-45218 branch from e02711f to f1c5103 Compare September 20, 2024 22:30

abudlong merged commit 71bdef3 into main Sep 20, 2024
2 checks passed

abudlong deleted the tickets/DM-45218 branch September 20, 2024 22:35
