-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Looking to contribute] Where can I find the parameter update loop? #245
Comments
Hi @Joshuaalbert! I can certainly try to help you. The first thing to understand is where the solvers are actually called - each gain type calls a different underlying implementation. That happens here (apologies in advance, the calling function needs polishing): QuartiCal/quartical/calibration/solver.py Lines 145 to 148 in ad31f54
If you want to utilize QuartiCal as a test bed for different solvers/approaches, you are more than welcome to implement an entirely new solver which can slot in at that location - the only real requirement is the way in which the arguments enter the function. Obviously, an entirely new solver may be overkill. If you want to modify one of the existing ones, that code lives in QuartiCal/quartical/gains/phase/kernel.py Line 54 in ad31f54
And the call which computes the gradients (well, sort of - QC doesn't ever explicitly form up the Jacobian or J^H J) is here: QuartiCal/quartical/gains/phase/kernel.py Lines 109 to 114 in ad31f54
Hopefully that is enough to get your started! I have neglected to mention anything about the Dask layer and passing additional arguments into the solver, so let me know if those block you. One final thing to be aware of is #244 - I am in the middle of making a fairly large number of changes that will probably make things a little clearer/easier. Just wanted to give you fair warning as the code may shift under you a little. |
Thanks for the points. I'm going to take a look at modifying the phase solver. I'll try to make it a modular modification so that it will be easy to morph to your refactor.
|
A pattern that I'd like to explore is:
Do you think this could be seen as a
|
@JSKenyon You mention in the documentation that effectively setting an This two-pass approach is a fast approximation of actually regularised optimisation. So, it's not ideal, but it's far more efficient than a general approach. Questions:
|
Hi @Joshuaalbert - apologies for the slow reply. I have been a combination of on leave/ill for the last week or so.
This is something we have considered in the past but it has some problems from a computational perspective, depending on whether you require access to all the gain solutions before smoothing. It is not impossible but I suspect it will end up being unnecessarily complicated.
Absolutely, and in fact I am working on improving this functionality to correctly interpolate parameterized terms. I think that this may be the ideal location to incorporate smoothing functionality as you will have access to all the gains. Smoothing is just a more intelligent form of interpolation in this context.
You could definitely try doing what you suggest. The only issue I see is that the smooth term would actually be updating an ML gain (if I am following you correctly). The map in question is https://github.com/ratt-ru/QuartiCal/blob/main/quartical/gains/__init__.py.
It depends - each gain term has a a QuartiCal/quartical/gains/phase/kernel.py Line 29 in ad31f54
Note that this relies on the argument in question having entered the dask layer with the same name. Unfortunately, this is where things get complicated as the dask code is involved. It may be worth setting up a Zoom to discuss it at some point. Finally, I think that the best way of doing this would not involve a smoothing term in the chain - chains are currently quite uniform i.e. one term behaves like another. I worry that a smoothing term in the chain may become complicated. Instead, it may make more sense to aim for a more general smoothing solution such that any gain could be loaded with something like Let me know if you think it would be easier to discuss this in a meeting. |
@Joshuaalbert I am very curious to see what your experiments reveal. It would be nice to contrast this to the visibility space approach that @rcyndie is working on. I smooth bandpass solutions after the fact and this seems to work pretty well (this is actually why the JHJ terms are written out as one of the gain output products, soon also for parameterised terms). For bandpass it is a bit difficult to do this inside the solver layer because neighbouring frequency chunks don't speak to each other, but you don't have that problem for the direction axis. One thing that might be missing at the moment are the direction coordinates which would be needed if you want to smooth using GPR with some form of stationary smoothing kernel. I know @JSKenyon is working on that. It may be beneficial to have a meeting to discuss this as we have been thinking along similar lines. If you do get around to setting up a meeting please can I be included? |
When I do smoothing in a densely filled aperture with a stationary kernel (not the fancy tomographic kernel), the gains do get closer to ground truth. However, in https://www.aanda.org/articles/aa/full_html/2020/03/aa37424-19/aa37424-19.html you will see that we actually proved that using the tomographic kernel beats using a simple kernel (in both dense and sparse aperture sampling). At Lofar with low frequencies we got outliers in gains that threw things for a bit of a loop. Even when outliers are identified and flagged there are still phase wraps to deal with. Any phase wrap will cause problems. So this solution really only works at higher frequencies. Certainly L-band. |
@landmanbester Update on this. Smoothing in the presence of tiny systematics leads to considerable deterioration of the solutions. And decoherence due to the ionosphere changing is enough at 700MHz to render smoothing unless you know for certain your temporal solution intervals are short enough. I'm currently looking at alternative approaches to smoothing on solutions to achieve science goals. Decoherence due to varying ionosphereThese are from simulated visibilties with frozen flow, with nominal ionosphere properties. At 10s, there is decoherence happening, and dropping to 6s removes that. Smoothing with decoherence leads to large errors. |
Just to make sure I am following (I missed your previous post somehow, sorry) are each of the dots a direction that has been solved for and do you have homogeneous SNR in each direction? If not you can consider using JHJ as a proxy for the weights when doing the smoothing. I have found that this works well but that the overall scale might be off so using JHJ/sigma**2 where you treat sigma as a hyperparameter might be necessary. If there are considerable systematics in the data (like residual RFI) I find that using a Student's-t likelihood during smoothing helps but runs the risk of suppressing real gain variations if the parameters are initialised too far off their nominal values. Definitely initialising with JHJ as weights seems to work better than starting from unity. Have you considered smoothing in time as well to deal with a varying ionosphere? I mean by making the time solution intervals quite short and then smoothing after the fact.
We have been thinking about ways to deal with this but it is hard, especially for noisy data. I think we do not have sufficient experience with real data to know if our current solution will work in realistic cases, or when it would work. Do you see phase wraps in the spatial axes or is it sufficient to have a zeroth order term to capture them? I mean is it usually possible to write the phase as phi(t, nu, l, m) = phi0(t, nu) + delta(l, m) where all the wrapping can be captured by the DI term phi0(t, nu)? Sorry @JSKenyon, we have definitely strayed from the original issue. These are all just things I have been thinking about. Might be good to touch base at some point |
@landmanbester each dot is an antenna actually. It just looks like a FoV because it's the DSA2k and it's a well-filled aperature (a little bit more than 1-square km actually within a 18km x 10km oval). We're doing some highly controlled experiments with simulated dsa2k data and are learning a fair amount about smoothing, as it was inintially a targeted approach for speeding up calibration while meeting science goals. I'd be happy to have a call to share what is being learned and get you involved. |
Describe the problem that the feature should address
I'd like to contribute a small extension to quartical.
What I'd like to do is a small modification to the parameter update loop, where you compute the gradients. Can you point me there? I'll experiment with adding a regularisation term, that enforces smoothness over aperature of phases. I'm exploring calibration schemes for DSA2000 and would like to see if Quartical could be the test bed, so I'd need to have a working understanding of how it's designed. @JSKenyon would you be able to assist this understanding?
The text was updated successfully, but these errors were encountered: