Skip to content

Commit

Permalink
add: new SDSS V datatype loaders (#1107)
Browse files Browse the repository at this point in the history
* add: sdss_v.py

working on loaders

* add: test.py for testing things, fix ordered method for apVisit

* feat: apStar/apVisit functionality, new helper funcs

new helper funcs:
 - _fetch_metadata to perform grab of common metadata
 - _fetch_flux_unit to get flux unit from the given HDU and convert it to an astropy unit object

* add: astra_readers.py

pulled from https://raw.githubusercontent.com/sdss/astra/298a73ce600db428cf0a2ed8a707a56c2182ae57/python/astra/tools/spectrum/readers.py

* feat: BOSS specFull loaders

can fully load boss files now

other changes:
- commented out abstract @ things to make it so ipython autoreloads

notes:
- flux unit where for BOSS files?
- what the heck are spAll, zAll, and zList HDUs?
- does the InverseVariance need a unit?

* feat: mwmVisit, mwmStar loaders

added mwmVisit and mwmStar loaders.

updated demonstration notebook accordingly and output to PDF

del: test.pdf and test.ipynb

del: secret SDSS-V data

* feat: specLite and other BOSS REDUX loader functionality

able to now load all BOSS spec directly with the same underlying code.

required refactoring methods into BOSS_spec loaders

* chore: identifier + documentation

* add: test_implementation.py

going to now write implementation test

add

* feat: partial implementation of loaders

mwm confirmed working

still todo:
- add HDU not-specified message
- merge the mwm types into a single 2 loaders
- confirm all other loaders work and add to __all__

* feat: functioning loaders (+refactoring & chore: docs)

- refactored BOSS spec methods and mwm spec methods into single functions for simplicity
- all loaders WORKING!! (except apStar multi)
- all the documentation + type hinting (excluding outputs)
- changed variable names to standard types used in specutils
- TODO: the apStar multi-loader is confusing, so it remains unimplemented for now.
- CHECK: do I need to clean the files of zero vals?
- TODO: BUNIT pulls for spec and mwm files
- TODO: check with data team what mwm files are needed

* add: test_implementation jupyter notebook

- currently non-functional because of zero values in x-axis

- deleted test_implementation.ipynb for policy reasons

* fix: jdaviz nan and zero flux bug

- jdaviz hates nan and zero flux, so they have to be removed
- TODO: open issue on jdaviz repo about nan and zero flux bug

the bug originates in the x_min x_max values used for the redshift slider for line_lists (somehow) on nan and zero flux values in the Spectrum object.

* feat: all multiloaders functional

apStar loader not yet tested because file is of length 1 (no visits)
mwm loaders will SKIP any DATASUM=0 because Spectrum1D cannot be instantiated with zero data

* fix: Astropy units warning + warning format -> print

* ignore: demonstration and test files

* fix: jdaviz specviz 1D in 2D array handling

fixes a jdaviz issue regarding a 1D flux in a 2D object, where it gets confused and explodes

i will put an issue in for it

this fix is different from the previous as it keeps all zero and NaN flux points

* fix: header method -> .get() + other minor fixes

* feat: unit tests on dummy data (excl. MWM)

need someone to help me write a BinTableHDU for mwm files...

* feat: unit tests with assertions

still need to write mwm dummy file for the tests

there's also a foobar variable check for the metadata

* chore: docustrings

* fix: header fetch method -> specutils standard

now obtains header from PrimaryHDU in the HDUList, any data that was previously accessed through it has been removed too

* del: mpl_preamble.py

* feat: individual identifiers + unit tests update

* fix: .gitignore list

keeping .jukit incase anyone else uses vim-jukit during dev

* add: bitmasks to Spectrum object outputs

Spectrum1D intializer converts any 0 to valid values. I'm assuming that zeroes in the bitmask means that its valid, as per manga.py

* fix: spec mask, AND_MASK -> OR_MASK

fix as per @Sean-Morrison 's suggestion in astropy pull req [#1107](#1107)

could be reverted in future, in which case this commit can just be deleted

* fix: spec file identify OBSERVAT column

OBSERVAT column not in everything so i changed it, also adding another LOGLAM check to the coadd HDU check.

* fix: hdu spec -> find 1st hdu with data

instead of specifying a hdu on Spectrum1D loaders for spec and mwm types, it will not find the first HDU with data, or in the case of spec, just use the coadd.

this means that it works directly with jdaviz for those two datatypes correctly now.

there are no user facing methods, and I don't want to break anything, but it should be noted that these datafiles can contain several spectra, which inherently limits this.

in theory, I could put everything as a Spectrum1D nD flux object, but I'm pretty sure that breaks sometimes for jdaviz.

* add: mwm dummy file tests + mask fixes

- force masks to be boolean prior to entering initializer
- add mwm file tests based on dummy file (credit to @andycasey for those dummy file generators)
- add more mwm file tests for failures
- added checks to see if file is empty for mwm files based on datasum (failsafe)

* Fix codestyle errors

---------

Co-authored-by: Ricky O'Steen <[email protected]>
  • Loading branch information
rileythai and rosteen authored Feb 13, 2024
1 parent c646007 commit 0b7098d
Show file tree
Hide file tree
Showing 4 changed files with 1,401 additions and 49 deletions.
5 changes: 5 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -66,4 +66,9 @@ pip-wheel-metadata/
# Mac OSX
.DS_Store
.vscode

# Other
.pytest_cache
.jukit
venv
.venv
116 changes: 67 additions & 49 deletions specutils/io/default_loaders/sdss.py
Original file line number Diff line number Diff line change
Expand Up @@ -47,10 +47,10 @@ def _sdss_wcs_to_log_wcs(old_wcs):
This function does the conversion from the SDSS WCS to FITS WCS.
"""
w0 = old_wcs.wcs.crval[0]
w1 = old_wcs.wcs.cd[0,0]
crval = 10 ** w0
w1 = old_wcs.wcs.cd[0, 0]
crval = 10**w0
cdelt = crval * w1 * np.log(10)
cunit = Unit('Angstrom')
cunit = Unit("Angstrom")
ctype = "WAVE-LOG"

w = WCS(naxis=1)
Expand All @@ -70,11 +70,11 @@ def spec_identify(origin, *args, **kwargs):
"""
# Test if fits has extension of type BinTable and check for spec-specific keys
with read_fileobj_or_hdulist(*args, **kwargs) as hdulist:
return (hdulist[0].header.get('TELESCOP') == 'SDSS 2.5-M' and
hdulist[0].header.get('FIBERID', 0) > 0 and
len(hdulist) > 1 and
(isinstance(hdulist[1], fits.BinTableHDU) and
hdulist[1].header.get('TTYPE3').lower() == 'ivar'))
return (hdulist[0].header.get("TELESCOP") == "SDSS 2.5-M"
and hdulist[0].header.get("FIBERID", 0) > 0
and len(hdulist) > 1
and (isinstance(hdulist[1], fits.BinTableHDU)
and hdulist[1].header.get("TTYPE3").lower() == "ivar"))


def spSpec_identify(origin, *args, **kwargs):
Expand All @@ -85,10 +85,10 @@ def spSpec_identify(origin, *args, **kwargs):
# Test telescope keyword and check if primary HDU contains data
# consistent with spSpec format
with read_fileobj_or_hdulist(*args, **kwargs) as hdulist:
return (hdulist[0].header.get('TELESCOP') == 'SDSS 2.5-M' and
hdulist[0].header.get('FIBERID', 0) > 0 and
isinstance(hdulist[0].data, np.ndarray) and
hdulist[0].data.shape[0] == 5)
return (hdulist[0].header.get("TELESCOP") == "SDSS 2.5-M"
and hdulist[0].header.get("FIBERID", 0) > 0
and isinstance(hdulist[0].data, np.ndarray)
and hdulist[0].data.shape[0] == 5)


def spPlate_identify(origin, *args, **kwargs):
Expand All @@ -99,14 +99,16 @@ def spPlate_identify(origin, *args, **kwargs):
# Test telescope keyword and check if primary HDU contains data
# consistent with spSpec format
with read_fileobj_or_hdulist(*args, **kwargs) as hdulist:
return (hdulist[0].header.get('TELESCOP') == 'SDSS 2.5-M' and
hdulist[0].header.get('FIBERID', 0) <= 0 and
isinstance(hdulist[0].data, np.ndarray) and
hdulist[0].data.shape[0] > 5)
return (hdulist[0].header.get("TELESCOP") == "SDSS 2.5-M"
and hdulist[0].header.get("FIBERID", 0) <= 0
and isinstance(hdulist[0].data, np.ndarray)
and hdulist[0].data.shape[0] > 5)


@data_loader(
label="SDSS-III/IV spec", identifier=spec_identify, extensions=['fits'],
label="SDSS-III/IV spec",
identifier=spec_identify,
extensions=["fits"],
priority=10,
)
def spec_loader(file_obj, **kwargs):
Expand All @@ -127,30 +129,37 @@ def spec_loader(file_obj, **kwargs):
"""
with read_fileobj_or_hdulist(file_obj, **kwargs) as hdulist:
header = hdulist[0].header
meta = {'header': header}
meta = {"header": header}

bunit = header.get('BUNIT', '1e-17 erg / (Angstrom cm2 s)')
if 'Ang' in bunit and 'strom' not in bunit:
bunit = bunit.replace('Ang', 'Angstrom')
bunit = header.get("BUNIT", "1e-17 erg / (Angstrom cm2 s)")
if "Ang" in bunit and "strom" not in bunit:
bunit = bunit.replace("Ang", "Angstrom")
flux_unit = Unit(bunit)

# spectrum is in HDU 1
flux = hdulist[1].data['flux'] * flux_unit
flux = hdulist[1].data["flux"] * flux_unit

uncertainty = InverseVariance(hdulist[1].data['ivar'] / flux_unit**2)
uncertainty = InverseVariance(hdulist[1].data["ivar"] / flux_unit**2)

dispersion = 10**hdulist[1].data['loglam']
dispersion_unit = Unit('Angstrom')
dispersion = 10**hdulist[1].data["loglam"]
dispersion_unit = Unit("Angstrom")

mask = hdulist[1].data['and_mask'] != 0
mask = hdulist[1].data["and_mask"] != 0

return Spectrum1D(flux=flux, spectral_axis=dispersion * dispersion_unit,
uncertainty=uncertainty, meta=meta, mask=mask)
return Spectrum1D(
flux=flux,
spectral_axis=dispersion * dispersion_unit,
uncertainty=uncertainty,
meta=meta,
mask=mask,
)


@data_loader(
label="SDSS-I/II spSpec", identifier=spSpec_identify,
extensions=['fit', 'fits'], priority=10,
label="SDSS-I/II spSpec",
identifier=spSpec_identify,
extensions=["fit", "fits"],
priority=10,
)
def spSpec_loader(file_obj, **kwargs):
"""
Expand All @@ -176,32 +185,37 @@ def spSpec_loader(file_obj, **kwargs):
with read_fileobj_or_hdulist(file_obj, **kwargs) as hdulist:
header = hdulist[0].header
# name = header.get('NAME')
meta = {'header': header}
meta = {"header": header}
wcs = WCS(header).dropaxis(1)

bunit = header.get('BUNIT', '1e-17 erg / (Angstrom cm2 s)')
bunit = header.get("BUNIT", "1e-17 erg / (Angstrom cm2 s)")
# fix mutilated flux unit
bunit = bunit.replace('/cm/s/Ang', '/ (Angstrom cm2 s)')
if 'Ang' in bunit and 'strom' not in bunit:
bunit = bunit.replace('Ang', 'Angstrom')
bunit = bunit.replace("/cm/s/Ang", "/ (Angstrom cm2 s)")
if "Ang" in bunit and "strom" not in bunit:
bunit = bunit.replace("Ang", "Angstrom")
flux_unit = Unit(bunit)
flux = hdulist[0].data[0, :] * flux_unit

uncertainty = StdDevUncertainty(hdulist[0].data[2, :] * flux_unit)

# Fix the WCS if it is claimed to be linear
if header.get('DC-Flag', 1) == 1:
if header.get("DC-Flag", 1) == 1:
fixed_wcs = _sdss_wcs_to_log_wcs(wcs)
else:
fixed_wcs = wcs

mask = hdulist[0].data[3, :] != 0

return Spectrum1D(flux=flux, wcs=fixed_wcs,
uncertainty=uncertainty, meta=meta, mask=mask)
return Spectrum1D(flux=flux,
wcs=fixed_wcs,
uncertainty=uncertainty,
meta=meta,
mask=mask)


@data_loader(label="SDSS spPlate", identifier=spPlate_identify, extensions=['fits'])
@data_loader(label="SDSS spPlate",
identifier=spPlate_identify,
extensions=["fits"])
def spPlate_loader(file_obj, limit=None, **kwargs):
"""
Loader for SDSS spPlate files, reading flux spectra from all fibres into single array.
Expand All @@ -223,26 +237,30 @@ def spPlate_loader(file_obj, limit=None, **kwargs):
"""
with read_fileobj_or_hdulist(file_obj, **kwargs) as hdulist:
header = hdulist[0].header
meta = {'header': header}
meta = {"header": header}
wcs = WCS(header).dropaxis(1)
if limit is None:
limit = header['NAXIS2']
limit = header["NAXIS2"]

bunit = header.get('BUNIT', '1e-17 erg / (Angstrom cm2 s)')
if 'Ang' in bunit and 'strom' not in bunit:
bunit = bunit.replace('Ang', 'Angstrom')
bunit = header.get("BUNIT", "1e-17 erg / (Angstrom cm2 s)")
if "Ang" in bunit and "strom" not in bunit:
bunit = bunit.replace("Ang", "Angstrom")
flux_unit = Unit(bunit)
flux = hdulist[0].data[0:limit, :] * flux_unit
uncertainty = InverseVariance(hdulist[1].data[0:limit, :] / flux_unit**2)
uncertainty = InverseVariance(hdulist[1].data[0:limit, :] /
flux_unit**2)

# Fix the WCS if it is claimed to be linear
if header.get('DC-Flag', 1) == 1:
if header.get("DC-Flag", 1) == 1:
fixed_wcs = _sdss_wcs_to_log_wcs(wcs)
else:
fixed_wcs = wcs

mask = hdulist[2].data[0:limit, :] != 0
meta['plugmap'] = Table.read(hdulist[5])[0:limit]
meta["plugmap"] = Table.read(hdulist[5])[0:limit]

return Spectrum1D(flux=flux, wcs=fixed_wcs,
uncertainty=uncertainty, meta=meta, mask=mask)
return Spectrum1D(flux=flux,
wcs=fixed_wcs,
uncertainty=uncertainty,
meta=meta,
mask=mask)
Loading

0 comments on commit 0b7098d

Please sign in to comment.