
Subaru-pfsObject data loader fails with the latest datamodel #1201

Open

monodera opened this issue Dec 4, 2024 · 0 comments · May be fixed by #1202

monodera commented Dec 4, 2024

The Subaru-pfsObject data loader fails with the latest datamodel as follows.

from specutils import Spectrum1D

filename = "pfsObject-00009-00000-0,0-0000000000000067-002-0x60241444a9af118c.fits"

spec = Spectrum1D.read(filename, format="Subaru-pfsObject")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[4], line 1
----> 1 spec = Spectrum1D.read(filename, format="Subaru-pfsObject")

File ~/tmp/test_specutils/.venv/lib/python3.11/site-packages/astropy/nddata/mixins/ndio.py:59, in NDDataRead.__call__(self, *args, **kwargs)
     58 def __call__(self, *args, **kwargs):
---> 59     return self.registry.read(self._cls, *args, **kwargs)

File ~/tmp/test_specutils/.venv/lib/python3.11/site-packages/astropy/io/registry/core.py:221, in UnifiedInputRegistry.read(self, cls, format, cache, *args, **kwargs)
    218         kwargs.update({"filename": path})
    220 reader = self.get_reader(format, cls)
--> 221 data = reader(*args, **kwargs)
    223 if not isinstance(data, cls):
    224     # User has read with a subclass where only the parent class is
    225     # registered.  This returns the parent class, so try coercing
    226     # to desired subclass.
    227     try:

File ~/tmp/test_specutils/.venv/lib/python3.11/site-packages/specutils/io/default_loaders/subaru_pfs_spec.py:68, in pfs_spec_loader(file_obj, **kwargs)
     65 with read_fileobj_or_hdulist(file_obj, **kwargs) as hdulist:
     66     header = hdulist[0].header
     67     meta = {'header': header,
---> 68             'tract': m['tract'],
     69             'patch': m['patch'],
     70             'catId': m['catId'],
     71             'objId': m['objId'],
     72             'nVisit': m['nVisit'],
     73             'pfsVisitHash': m['pfsVisitHash']}
     75     # spectrum is in HDU 2
     76     data = hdulist[2].data['flux']

TypeError: 'NoneType' object is not subscriptable

This is due to significant updates to the datamodel (https://github.com/Subaru-PFS/datamodel/blob/244bdeacf0e062e13b75d8d541e962b52c22bffb/datamodel.txt#L866); the relevant section reads:

Combined spectra.

In SDSS we used "spPlate" files, named by the plugplate and MJD of observation, but this is not suitable for
PFS, where we:
  1.  Will split observations of the same object over multiple nights
  2.  Will potentially reconfigure the PFI between observations.

I don't think it makes sense to put multiple spectra together based on sky coordinates as we may go back and
add more observations later, so I think we're forced to separate files for every object.  That's a lot of
files, but maybe not too bad?  We could use a directory structure based on HSC's (tract, patch) -- note that
these are well defined even if we are not using HSC data to target.  An alternative would be to use a
healpix or HTM id.

Because we may later obtain more data on a given object, or decide that some data we have already taken is
bad, or process a number of subsets of the available data, there may be more than one set of visits used
to produce a pfsObject file for a given object.  We therefore include both the number of visits (nVisit)
and a SHA-1 hash of the visits, pfsVisitHash.  We use both as nVisit may be ambiguous, while pfsVisitHash
isn't human-friendly; in particular it doesn't sort in a helpful way.  It seems improbable that we will
ever have more than 1000 visits, but as the pfsVisitHash is unambiguous it seemed safer to allow for
larger values of nVisit while recording them only modulo 1000.

     "pfsObject-%05d-%05d-%s-%016x-%03d-0x%016x.fits"
         % (catId, tract, patch, objId, nVisit % 1000, pfsVisitHash)

The path would be
   catId/tract/patch/pfsObject-*.fits
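
For illustration, this convention can be parsed with a regex of my own devising (the pattern and group names below are not part of the datamodel); the current loader's filename pattern presumably predates the catId field, which would explain why its match object m is None:

import re

# Hypothetical pattern for the naming convention shown above:
#   pfsObject-%05d-%05d-%s-%016x-%03d-0x%016x.fits
#   (catId, tract, patch, objId, nVisit % 1000, pfsVisitHash)
PFS_OBJECT_RE = re.compile(
    r"pfsObject-(?P<catId>\d{5})-(?P<tract>\d{5})-(?P<patch>\d+,\d+)"
    r"-(?P<objId>[0-9a-f]{16})-(?P<nVisit>\d{3})-0x(?P<pfsVisitHash>[0-9a-f]{16})"
    r"\.fits"
)

m = PFS_OBJECT_RE.search(
    "pfsObject-00009-00000-0,0-0000000000000067-002-0x60241444a9af118c.fits")
print(m["catId"], m["tract"], m["patch"], m["objId"], m["nVisit"])
# -> 00009 00000 0,0 0000000000000067 002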

The file will have several HDUs:

HDU #0 PDU
HDU #1 FLUX        Flux in units of nJy                           [FLOAT]        NROW
HDU #2 MASK        Pixel mask                                     [32-bit INT]   NROW
HDU #3 TARGET      Binary table                                   [FITS BINARY TABLE] NFILTER
          Columns for:
          filterName                                  [STRING]
          fiberFlux                                   [FLOAT]
HDU #4 SKY         Sky flux in same units as FLUX                 [FLOAT]        NROW
HDU #5 COVAR       Near-diagonal part of FLUX's covariance        [FLOAT]        NROW*3
HDU #6 COVAR2      Low-resolution non-sparse estimate covariance  [FLOAT]        NCOARSE*NCOARSE
HDU #7 OBSERVATIONS    Binary table                               [FITS BINARY TABLE] NOBS
          Columns for:
          visit                                       [32-bit INT]
          arm                                         [STRING]
          spectrograph                                [32-bit INT]
          pfsDesignId                                 [64-bit INT]
          fiberId                                     [32-bit INT]
          nominal PFI position (millimeters)          [FLOAT]*2
          actual PFI position (millimeters)           [FLOAT]*2
HDU #8 FLUXTABLE   Binary table                                   [FITS BINARY TABLE] NOBS*NROW
          Columns for:
          wavelength in units of nm (vacuum)          [64-bit FLOAT]
          intensity in units of nJy                   [FLOAT]
          intensity error in same units as intensity  [FLOAT]
          mask                                        [32-bit INT]
HDU #9 NOTES       Reduction notes                                [FITS BINARY TABLE] NNOTES
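
For orientation, a short sketch of walking this layout with astropy.io.fits, assuming every extension carries the EXTNAME shown above:

from astropy.io import fits

filename = "pfsObject-00009-00000-0,0-0000000000000067-002-0x60241444a9af118c.fits"

with fits.open(filename) as hdul:
    hdul.info()                        # list the HDUs and their shapes
    flux = hdul["FLUX"].data           # flux in nJy, one value per pixel
    mask = hdul["MASK"].data           # 32-bit mask plane, same shape as flux
    sky = hdul["SKY"].data             # sky flux in the same units as FLUX
    obs = hdul["OBSERVATIONS"].data    # one row per contributing visit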

The wavelengths are specified via the WCS cards in the header (e.g. CRPIX1,
CRVAL1) for the FLUX, MASK, SKY, COVAR extensions and explicitly in the table
for the FLUXTABLE.  We chose these two representations for the data due to the
difficulty in resampling marginally sampled data onto a regular grid,  while
recognising the convenience of such a grid when rebinning, performing PCAs, or
stacking spectra.  For highest precision the data in the FLUXTABLE is likely to
be used.
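
The two wavelength representations can then be recovered along these lines; the linear WCS keyword set (CRVAL1/CRPIX1/CDELT1) and the FLUXTABLE column names are my assumptions and should be checked against an actual file:

import numpy as np
from astropy.io import fits

filename = "pfsObject-00009-00000-0,0-0000000000000067-002-0x60241444a9af118c.fits"

with fits.open(filename) as hdul:
    # Regular grid: rebuild wavelengths from the WCS cards on FLUX.
    # A robust loader would go through astropy.wcs.WCS instead.
    hdr = hdul["FLUX"].header
    pixels = np.arange(hdr["NAXIS1"]) + 1          # FITS pixels are 1-based
    wave_grid = hdr["CRVAL1"] + (pixels - hdr["CRPIX1"]) * hdr["CDELT1"]

    # Highest precision: explicit per-sample wavelengths in FLUXTABLE.
    tab = hdul["FLUXTABLE"].data
    wave, flux, err = tab["wavelength"], tab["flux"], tab["error"]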

The TARGET HDU must contain at least the keywords
    catId       Catalog identifier         INT
    tract       Tract identifier           INT
    patch       Patch identifier           STRING
    objId       Object identifier          INT
    ra          Right Ascension (degrees)  DOUBLE
    dec         Declination (degrees)      DOUBLE
    targetType  Target type enum           INT

(N.b. the keywords are case-insensitive).  Other HDUs should specify INHERIT=T.
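
These keywords offer the loader an alternative to parsing the filename: the meta dict could be filled from the TARGET HDU header instead, e.g. (a sketch; astropy's header lookup is itself case-insensitive):

from astropy.io import fits

filename = "pfsObject-00009-00000-0,0-0000000000000067-002-0x60241444a9af118c.fits"

with fits.open(filename) as hdul:
    hdr = hdul["TARGET"].header
    meta = {key: hdr[key]
            for key in ("catId", "tract", "patch", "objId",
                        "ra", "dec", "targetType")}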

See pfsArm for the definition of the COVAR data.

What resolution should we use for HDU #1?  The instrument has a dispersion per pixel which is roughly constant
(in the blue arm Jim-sensei calculates that it varies from 0.70 to 0.65 (going red) A/pix; in the red, 0.88 to
0.82, and in the IR, 0.84 to 0.77).  We propose that we sample at 0.8 A/pixel.

The second covariance table (COVAR2) is the full covariance at low spectral resolution, maybe 10x10. It's
really only 0.5*NCOARSE*(NCOARSE + 1) numbers, but it doesn't seem worth the trouble to save a few bytes.
This covariance is needed to model the spectrophotometric errors.
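
A sketch of reading it back; whether the array arrives flat or already square is not spelled out above, so both cases are handled:

import numpy as np
from astropy.io import fits

filename = "pfsObject-00009-00000-0,0-0000000000000067-002-0x60241444a9af118c.fits"

with fits.open(filename) as hdul:
    covar2 = np.asarray(hdul["COVAR2"].data)
if covar2.ndim == 1:                      # stored flat: recover the square
    ncoarse = int(round(np.sqrt(covar2.size)))
    covar2 = covar2.reshape(ncoarse, ncoarse)
# covar2 is symmetric, so only NCOARSE*(NCOARSE + 1)/2 values are independent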

The reduction notes (NOTES) HDU is a FITS table HDU with a single row and a variety of columns.
The values record operations performed and measurements made during reduction for the spectrum.

Note that we don't keep the SDSS "AND" and "OR" masks -- if needs be we could set two mask bits to capture
the same information, but in practice SDSS's OR masks were not very useful.

For data taken with the medium resolution spectrograph, HDU #1 is expected to be at the resolution of
the medium arm, and to omit the data from the blue and IR arms.

Then, the loader should read the wavelength, flux, and flux error from the 8th extension (FLUXTABLE), while the current loader reads them from the 2nd extension.
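
A minimal sketch of the kind of change involved, assuming the FLUXTABLE columns are named wavelength, flux, and error (the actual fix belongs in the PR):

import astropy.units as u
from astropy.io import fits
from astropy.nddata import StdDevUncertainty
from specutils import Spectrum1D

filename = "pfsObject-00009-00000-0,0-0000000000000067-002-0x60241444a9af118c.fits"

with fits.open(filename) as hdulist:
    meta = {"header": hdulist[0].header}
    tab = hdulist["FLUXTABLE"].data            # HDU #8 in the new datamodel
    wave = tab["wavelength"] * u.nm            # vacuum wavelengths
    flux = tab["flux"] * u.nJy
    err = StdDevUncertainty(tab["error"] * u.nJy)

spec = Spectrum1D(spectral_axis=wave, flux=flux, uncertainty=err, meta=meta)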

I will make a PR to fix the issue.

@pllim pllim changed the title Subaur-pfsObject data loader fails with the latest datamodel Subaru-pfsObject data loader fails with the latest datamodel Dec 20, 2024