Implement higher fidelity Gaia modelling #36

sefffal · 2024-08-04T17:50:26Z

This issue outlines my plan for higher fidelity modelling of Gaia data, either as part of the HGCA modelling or by itself.

Currently our HGCA likelihood models the Gaia and Hipparcos catalog measurements by averaging the instantaneous proper motion and position of the star 25 times during each of the Hipparcos and Gaia missions.
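As a rough illustration of the averaging described above, here is a minimal Python/NumPy sketch. It is not the Octofitter implementation: `mission_average`, the toy `position` and `velocity` functions, and the time window are all hypothetical stand-ins for the real orbit model and mission dates.

```python
import numpy as np

def mission_average(position_fn, velocity_fn, t_start, t_end, n_samples=25):
    """Average instantaneous position and proper motion over a mission window."""
    epochs = np.linspace(t_start, t_end, n_samples)
    positions = np.array([position_fn(t) for t in epochs])    # shape (n_samples, 2)
    velocities = np.array([velocity_fn(t) for t in epochs])   # shape (n_samples, 2)
    return positions.mean(axis=0), velocities.mean(axis=0)

# Toy star (assumed values): linear motion plus a slow circular perturbation.
pm = np.array([3.0, -2.0])   # mas/yr
period = 12.0                # yr
amp = 0.5                    # mas

def position(t):
    phase = 2.0 * np.pi * t / period
    return pm * t + amp * np.array([np.cos(phase), np.sin(phase)])

def velocity(t):
    phase = 2.0 * np.pi * t / period
    return pm + amp * (2.0 * np.pi / period) * np.array([-np.sin(phase), np.cos(phase)])

# Times are in years relative to an arbitrary mid-mission epoch.
mean_pos, mean_pm = mission_average(position, velocity, -1.5, 1.5)
```

The averaged proper motion differs from `pm` because the perturbation does not average to zero over the window, which is the signal this kind of likelihood is sensitive to.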

Recently, I merged full Hipparcos IAD modelling. I'd like to develop higher fidelity Gaia modelling to match.

After reviewing Orvara and Orbitize, I see that they model the Gaia catalog values by:

using scan epochs and angles estimated by GOST
calculating the perturbation from any planets only (not the full sky path)
fitting a straight line in the least-squares sense to get the delta position and proper motion
comparing these delta positions and proper motions to the HGCA

These choices are certainly defensible and are better than what we currently do in Octofitter. Still, I find the approach a little unsatisfying and think we could improve on it further in the following ways:

use our AbsoluteVisual type to calculate the full sky path, accounting for spherical coordinates, perspective acceleration, changing parallax, light travel time, and perturbations from planets given each parameter draw
after getting the model sky path with all of these effects included, fit a simplified sky path model to it (the same simplified model that Gaia uses)
take the best-fitting position and PM and compare them to the Gaia catalog values.

A few open questions remain:

gradients of the nested model fit: I think these can be calculated using the implicit function theorem, e.g. via ImplicitDifferentiation.jl
how to compare the model fit to the Gaia catalog values. We could just take the new best-fit values and compare them to the catalog using the catalog uncertainty. But I think we could do a lot better if we computed uncertainties too, and then compared the catalog distribution to the model fit distribution. This would have to use some kind of distribution divergence metric, but I don't know much about this.
how to compare to the Hipparcos values. We could use the HGCA cross-calibration parameters to nudge the Hipparcos raw data into the Gaia reference frame ourselves, and otherwise ignore the HGCA. But this doesn't capture the error inflation benefits of the catalog.
we could add a "jitter" uncertainty to the Hipparcos epoch to let the uncertainties grow on the fly, just like I think Fabo Feng does
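On the first open question above (gradients through the nested fit), here is a minimal sketch of the implicit function theorem idea, written in Python/NumPy for illustration rather than with ImplicitDifferentiation.jl. It assumes the inner fit is a linear least-squares problem. `model_path` is a hypothetical stand-in for the full sky-path model, with a single outer parameter `theta`.

```python
import numpy as np

def inner_fit(A, y):
    """Least-squares solution x* minimising ||A x - y||^2."""
    return np.linalg.lstsq(A, y, rcond=None)[0]

def model_path(theta, t):
    """Hypothetical 1-D sky path: a straight line plus a perturbation of amplitude theta."""
    return 0.3 * t + theta * np.sin(2.0 * np.pi * t / 3.0)

def model_path_dtheta(theta, t):
    """Analytic derivative of model_path with respect to theta."""
    return np.sin(2.0 * np.pi * t / 3.0)

def fit_gradient_ift(A, dy_dtheta):
    """d x*/d theta from the optimality condition F(x, theta) = A^T (A x - y(theta)) = 0.

    dF/dx = A^T A and dF/dtheta = -A^T dy/dtheta, so the implicit function
    theorem gives dx*/dtheta = (A^T A)^-1 A^T dy/dtheta.
    """
    return np.linalg.solve(A.T @ A, A.T @ dy_dtheta)

t = np.linspace(-1.5, 1.5, 40)              # toy scan epochs in years
A = np.column_stack([np.ones_like(t), t])   # inner model: offset + proper motion
theta = 0.8

grad_ift = fit_gradient_ift(A, model_path_dtheta(theta, t))

# Cross-check against central finite differences through the inner fit.
eps = 1e-6
grad_fd = (inner_fit(A, model_path(theta + eps, t))
           - inner_fit(A, model_path(theta - eps, t))) / (2.0 * eps)
```

The point is that the gradient of the fitted offset and proper motion with respect to `theta` comes from one linear solve, without differentiating through the solver itself.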

Other thoughts:

if we structure this well, we'll be able to also model DR2 separately from DR3. I think this could help constrain velocity changes between the DR2 time range and the longer DR3 time range. Uncertainties will obviously be correlated though due to shared data.

sefffal · 2024-08-07T23:43:06Z

Okay so this plan is showing surprisingly promising results.

Procedure:

get GOST scan angles and dates
simulate a true sky path (including planet perturbations)
find the best fitting Gaia 5-parameter fit
calculate uncertainties and covariance on the Gaia fit using the inverse Hessian (almost certainly what they're doing)
compare the "posterior" from the Gaia fit to the catalog means, uncertainties, and correlations using the KL divergence
profit!
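A minimal sketch of the fitting and covariance steps in the procedure above, in Python/NumPy. Everything here is an assumption for illustration and is not the actual Octofitter or Gaia code: the along-scan convention (`sin(psi)` multiplying the RA term), the toy parallax factors, the random scan epochs and angles standing in for GOST output, and a uniform per-scan uncertainty `sigma`.

```python
import numpy as np

def design_matrix(t, psi, plx_factor_ra, plx_factor_dec, t_ref):
    """Along-scan design matrix for parameters [d_ra, d_dec, plx, pm_ra, pm_dec]."""
    dt = t - t_ref
    s, c = np.sin(psi), np.cos(psi)
    return np.column_stack([
        s,                                        # position offset in RA*cos(Dec)
        c,                                        # position offset in Dec
        plx_factor_ra * s + plx_factor_dec * c,   # parallax
        dt * s,                                   # proper motion in RA*cos(Dec)
        dt * c,                                   # proper motion in Dec
    ])

def fit_five_parameters(A, along_scan, sigma):
    """Linear least squares with a uniform per-scan uncertainty sigma."""
    hessian = A.T @ A / sigma**2                  # Hessian of chi^2 / 2
    cov = np.linalg.inv(hessian)                  # parameter covariance
    best = cov @ (A.T @ along_scan) / sigma**2
    return best, cov

# Toy scan law (assumed): random epochs and scan angles.
rng = np.random.default_rng(1)
n_scans = 40
t = np.sort(rng.uniform(2014.6, 2017.4, n_scans))
psi = rng.uniform(0.0, 2.0 * np.pi, n_scans)
plx_ra = np.sin(2.0 * np.pi * t)                  # toy parallax factors, not an ephemeris
plx_dec = 0.5 * np.cos(2.0 * np.pi * t)

A = design_matrix(t, psi, plx_ra, plx_dec, t_ref=2016.0)
truth = np.array([0.2, -0.1, 29.1, -136.3, 1.8])  # made-up parameter values
along_scan = A @ truth                            # noise-free simulated measurements

best, cov = fit_five_parameters(A, along_scan, sigma=0.1)
uncertainties = np.sqrt(np.diag(cov))
correlations = cov / np.outer(uncertainties, uncertainties)
```

In the real procedure `along_scan` would come from projecting the simulated sky path onto each scan direction, and `uncertainties` and `correlations` are the quantities to compare against the catalog.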

Right now to compare the Gaia simple fit + uncertainties to the catalog means and uncertainties, I'm using the KL-divergence and just treating it like a likelihood.
I don't know if that's valid! I thought we'd need to calibrate the score against simulated data or something. But it seems to work to a crazy good degree?

                          plx          ra            dec          pmra          pmdec        unc_mas
Catalog values:           29.145325    158.30708     40.425554    -136.29064    1.7735165    1.0
Posterior values:         29.134312    158.30198     40.442729    -136.29526    1.7646462    3.0593668e-5
Catalog uncertainties:    0.14073011   0.080025345   0.10712508   0.11222193    0.13798834
Posterior uncertainties:  0.13947858   0.079581708   0.10309836   0.11191418    0.13681167   0.16279464

edit: it's a bit odd that the RA and Dec are ~0.005 or so degrees off, and yet their uncertainties and even covariances are basically spot on. Maybe something to do with reference epochs. Edit 2: never mind, fixed.
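For reference, a sketch of the KL-divergence comparison mentioned above, assuming both the model fit and the catalog solution are treated as multivariate Gaussians. The direction of the divergence (model fit as P, catalog as Q) and the use of `-KL` as a log-likelihood term are my assumptions about one way to set this up. The numbers are toy values, not the ones in the table.

```python
import numpy as np

def gaussian_kl(mu_p, cov_p, mu_q, cov_q):
    """KL(P || Q) for multivariate normals P = N(mu_p, cov_p) and Q = N(mu_q, cov_q)."""
    mu_p = np.asarray(mu_p, dtype=float)
    mu_q = np.asarray(mu_q, dtype=float)
    cov_p = np.atleast_2d(cov_p)
    cov_q = np.atleast_2d(cov_q)
    k = mu_p.size
    diff = mu_q - mu_p
    trace_term = np.trace(np.linalg.solve(cov_q, cov_p))   # tr(cov_q^-1 cov_p)
    maha_term = diff @ np.linalg.solve(cov_q, diff)        # Mahalanobis distance squared
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (trace_term + maha_term - k + logdet_q - logdet_p)

# Toy two-parameter example (made-up means, uncertainties, and correlations).
model_mu = np.array([-136.295, 1.765])
model_cov = np.array([[0.0125, 0.0020],
                      [0.0020, 0.0187]])
catalog_mu = np.array([-136.291, 1.774])
catalog_cov = np.array([[0.0126, 0.0021],
                        [0.0021, 0.0190]])

kl = gaussian_kl(model_mu, model_cov, catalog_mu, catalog_cov)
log_like = -kl   # treating the divergence like a (negative) log-likelihood
```

The divergence is zero only when the means and covariances agree, so this penalises mismatched uncertainties and correlations as well as offsets in the means.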

sefffal · 2024-08-08T01:09:01Z

I realized something pretty funny. Including the parameter means, uncertainties, and correlations, each target with a five-parameter solution has 20 reported variables: 5 means, 5 uncertainties, and 10 pairwise correlations (+1 more for excess astrometric noise).

For many targets, that's about the same number of degrees of freedom as the total number of scans!

That means the system of equations should be fairly well-determined and we should almost be able to back out each individual scan measurement(?).

sefffal · 2024-08-13T00:01:43Z

Moderate success fitting DR2 and DR3 to back out IAD. It seems to work best when there are fewer visibility windows than reported parameters, so that everything is well constrained.

The next issue is accounting for missed / skipped / extra scans. GOST isn't matching the actual reported number of visibility windows very well for either DR2 or DR3.
