astrodata #181
Editor in Chief checks
Hi there! Thank you for submitting your package for pyOpenSci review. Below are the basic checks that your package needs to pass to begin our review. If some of these are missing, we will ask you to work on them before the review process begins. Please check our Python packaging guide for more information on the elements below.
NOTE: We prefer that you have development instructions in your documentation too.
Editor comments
Nothing to say from the EiC initial perspective; it's a strong submission! |
Editor response to review:
Editor comments
👋 Hi @aaryapatil and @mwcraig! Thank you for volunteering to review
Please fill out our pre-review survey
Before beginning your review, please fill out our pre-review survey. This helps us improve all aspects of our review and better understand our community. No personal data will be shared from this survey - it will only be used in an aggregated format by our Executive Director to improve our processes and programs.
The following resources will help you complete your review:
Please get in touch with any questions or concerns!
Your review is due:
Reviewers: @aaryapatil @mwcraig |
@teald : @aaryapatil @mwcraig have agreed to review your package! Thanks! Links to the review guide and the template to get them started are in the post above; and of course they might open issues (or PRs!) in your repo if they have suggestions. In the meantime, feel free to reach out to me with any questions you might have. |
@aaryapatil @mwcraig : I remember you both told me that you would need a little longer than the "three weeks" that we usually allocate to the first round of review and with the holiday coming up in the US, I understand that you are probably busy with other things this week. However, if you can make a realistic estimate when you can find time to review, it would be great if you could post here, so that the package authors know where we are in the process. Thank you so much for agreeing to review this package! |
@hamogu Thank you for the ping. I am working on the review now but won't have much time this week. Realistically, I should have my review done by next week. Hope that is okay! UPDATE: The rest of the review will be completed by July 16. |
@hamogu -- thanks for the reminder; I should be able to get it done by the end of the day on Monday, July 15. |
Package Review
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
Documentation
The package includes all the following forms of documentation:
Readme file requirements
The README should include, from top to bottom:
NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be wider than it is high. (Note that a badge for pyOpenSci peer-review will be provided upon acceptance.)
Usability
Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
Functionality
For packages also submitting to JOSS
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted. The package contains a
Final approval (post-review)
Estimated hours spent reviewing:
Review Comments
@teald this is ready for a look
Thanks for submitting this! 🎉 I am really excited about the potential of this package to fill some holes in the ecosystem. There are some details below in the "folded" sections that should be straightforward to address. Here are the high-level comments (background: I'm one of the maintainers for
Issues/PRs opened as part of review:
test error traceback
_____________________________ ERROR at setup of test_append_array_to_root_with_arbitrary_name ______________________________
filename = 'N20160524S0119.fits', path = '/Users/mattcraig/development/astronomy/astrodata/.tox/py311/_test_cache'
sub_path = 'raw_files', env_var = 'ASTRODATA_TEST', cache = True, fail_on_error = True
def download_from_archive(
filename,
path=None,
sub_path="raw_files",
env_var="ASTRODATA_TEST",
cache=True,
fail_on_error=True,
):
"""Download file from the archive and store it in the local cache.
Parameters
----------
filename : str
The filename, e.g. N20160524S0119.fits
path : str or os.PathLike or None
Path to the cache directory. If None, the environment variable
ASTRODATA_TEST is used. otherwise, the file is saved to:
os.path.join(path, sub_path, filename)
sub_path : str
By default the file is stored at the root of the cache directory, but
using ``path`` allows to specify a sub-directory.
env_var: str
Environment variable containing the path to the cache directory.
cache : bool
If False, the file is downloaded and replaced in the cache directory.
fail_on_error : bool
If True, raise an error if the download fails. If False, return None.
Returns
-------
str
Name of the cached file with the path added to it.
"""
# Handle None sub_path
if sub_path is None:
sub_path = ""
warnings.warn(
"sub_path is None, so the file will be saved to the root of the "
"cache directory. To suppress this warning, set sub_path to a "
"valid path (e.g., empty string)."
)
# Check that the environment variable is a valid name.
if not isinstance(env_var, str) or not env_var.isidentifier():
raise ValueError(f"Environment variable name is not valid: {env_var}")
# Find cache path and make sure it exists
root_cache_path = os.getenv(env_var)
if root_cache_path is None:
if path is not None:
root_cache_path = os.path.expanduser(path)
else:
root_cache_path = os.path.join(os.getcwd(), "_test_cache")
warnings.warn(
f"Environment variable not set: {env_var}, writing "
f"to {root_cache_path}. To suppress this warning, set "
f"the environment variable {env_var} to the desired path "
f"for the testing cache."
)
# This is cleaned up once the program finishes.
os.environ[env_var] = str(root_cache_path)
root_cache_path = os.path.expanduser(root_cache_path)
if path is None:
path = root_cache_path
cache_path = os.path.join(os.path.expanduser(path), sub_path)
if not os.path.exists(cache_path):
os.makedirs(cache_path)
# Now check if the local file exists and download if not
try:
local_path = os.path.join(cache_path, filename)
url = GEMINI_ARCHIVE_URL + filename
if cache and os.path.exists(local_path):
# Use the cached file
return local_path
> tmp_path = download_file(url, cache=False)
astrodata/testing.py:479:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
.tox/py311/lib/python3.11/site-packages/astropy/utils/data.py:1529: in download_file
f_name = _download_file_from_source(
.tox/py311/lib/python3.11/site-packages/astropy/utils/data.py:1313: in _download_file_from_source
with _try_url_open(
.tox/py311/lib/python3.11/site-packages/astropy/utils/data.py:1227: in _try_url_open
return urlopener.open(req, timeout=timeout)
../../../mambaforge/envs/astrodata-review-dev/lib/python3.11/urllib/request.py:519: in open
response = self._open(req, data)
../../../mambaforge/envs/astrodata-review-dev/lib/python3.11/urllib/request.py:536: in _open
result = self._call_chain(self.handle_open, protocol, protocol +
../../../mambaforge/envs/astrodata-review-dev/lib/python3.11/urllib/request.py:496: in _call_chain
result = func(*args)
../../../mambaforge/envs/astrodata-review-dev/lib/python3.11/urllib/request.py:1391: in https_open
return self.do_open(http.client.HTTPSConnection, req,
../../../mambaforge/envs/astrodata-review-dev/lib/python3.11/urllib/request.py:1352: in do_open
r = h.getresponse()
../../../mambaforge/envs/astrodata-review-dev/lib/python3.11/http/client.py:1395: in getresponse
response.begin()
../../../mambaforge/envs/astrodata-review-dev/lib/python3.11/http/client.py:325: in begin
version, status, reason = self._read_status()
../../../mambaforge/envs/astrodata-review-dev/lib/python3.11/http/client.py:286: in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
../../../mambaforge/envs/astrodata-review-dev/lib/python3.11/socket.py:706: in readinto
return self._sock.recv_into(b)
../../../mambaforge/envs/astrodata-review-dev/lib/python3.11/ssl.py:1314: in recv_into
return self.read(nbytes, buffer)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <ssl.SSLSocket [closed] fd=-1, family=2, type=1, proto=0>, len = 8192, buffer = <memory at 0x139cdcdc0>
def read(self, len=1024, buffer=None):
"""Read up to LEN bytes and return them.
Return zero-length string on EOF."""
self._checkClosed()
if self._sslobj is None:
raise ValueError("Read on closed or unwrapped SSL socket.")
try:
if buffer is not None:
> return self._sslobj.read(len, buffer)
E TimeoutError: The read operation timed out
../../../mambaforge/envs/astrodata-review-dev/lib/python3.11/ssl.py:1166: TimeoutError
The above exception was the direct cause of the following exception:
@pytest.fixture
def testfile2():
"""
Pixels Extensions
Index Content Type Dimensions Format
[ 0] science NDAstroData (4608, 1056) uint16
[ 1] science NDAstroData (4608, 1056) uint16
[ 2] science NDAstroData (4608, 1056) uint16
[ 3] science NDAstroData (4608, 1056) uint16
[ 4] science NDAstroData (4608, 1056) uint16
[ 5] science NDAstroData (4608, 1056) uint16
"""
> return download_from_archive("N20160524S0119.fits")
tests/test_object_construction.py:42:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
filename = 'N20160524S0119.fits', path = '/Users/mattcraig/development/astronomy/astrodata/.tox/py311/_test_cache'
sub_path = 'raw_files', env_var = 'ASTRODATA_TEST', cache = True, fail_on_error = True
def download_from_archive(
filename,
path=None,
sub_path="raw_files",
env_var="ASTRODATA_TEST",
cache=True,
fail_on_error=True,
):
"""Download file from the archive and store it in the local cache.
Parameters
----------
filename : str
The filename, e.g. N20160524S0119.fits
path : str or os.PathLike or None
Path to the cache directory. If None, the environment variable
ASTRODATA_TEST is used. otherwise, the file is saved to:
os.path.join(path, sub_path, filename)
sub_path : str
By default the file is stored at the root of the cache directory, but
using ``path`` allows to specify a sub-directory.
env_var: str
Environment variable containing the path to the cache directory.
cache : bool
If False, the file is downloaded and replaced in the cache directory.
fail_on_error : bool
If True, raise an error if the download fails. If False, return None.
Returns
-------
str
Name of the cached file with the path added to it.
"""
# Handle None sub_path
if sub_path is None:
sub_path = ""
warnings.warn(
"sub_path is None, so the file will be saved to the root of the "
"cache directory. To suppress this warning, set sub_path to a "
"valid path (e.g., empty string)."
)
# Check that the environment variable is a valid name.
if not isinstance(env_var, str) or not env_var.isidentifier():
raise ValueError(f"Environment variable name is not valid: {env_var}")
# Find cache path and make sure it exists
root_cache_path = os.getenv(env_var)
if root_cache_path is None:
if path is not None:
root_cache_path = os.path.expanduser(path)
else:
root_cache_path = os.path.join(os.getcwd(), "_test_cache")
warnings.warn(
f"Environment variable not set: {env_var}, writing "
f"to {root_cache_path}. To suppress this warning, set "
f"the environment variable {env_var} to the desired path "
f"for the testing cache."
)
# This is cleaned up once the program finishes.
os.environ[env_var] = str(root_cache_path)
root_cache_path = os.path.expanduser(root_cache_path)
if path is None:
path = root_cache_path
cache_path = os.path.join(os.path.expanduser(path), sub_path)
if not os.path.exists(cache_path):
os.makedirs(cache_path)
# Now check if the local file exists and download if not
try:
local_path = os.path.join(cache_path, filename)
url = GEMINI_ARCHIVE_URL + filename
if cache and os.path.exists(local_path):
# Use the cached file
return local_path
tmp_path = download_file(url, cache=False)
shutil.move(tmp_path, local_path)
# `download_file` ignores Access Control List - fixing it
os.chmod(local_path, 0o664)
except Exception as err:
if not fail_on_error:
log.debug(f"Failed to download {filename} from the archive")
log.debug(f" - Error: {err}")
return None
> raise IOError(
f"Failed to download {filename} from the archive ({url})"
) from err
E OSError: Failed to download N20160524S0119.fits from the archive (https://archive.gemini.edu/file/N20160524S0119.fits)
astrodata/testing.py:492: OSError |
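For readers following the traceback: as far as can be read from the `download_from_archive` source quoted above, an explicit `path` takes priority, then the directory named by the environment variable, then `_test_cache` under the current working directory. The sketch below restates only that lookup order without side effects. The helper name `resolve_cache_dir` and its `environ` and `cwd` parameters are hypothetical and are not part of astrodata. The real function also emits warnings, sets the environment variable, and creates the directory.

```python
import os


def resolve_cache_dir(path=None, sub_path="raw_files",
                      env_var="ASTRODATA_TEST", environ=None, cwd=None):
    """Return the directory a test file would be cached in.

    Hypothetical helper that mirrors the lookup order in the quoted source;
    it does not create directories, warn, or modify the environment.
    """
    environ = os.environ if environ is None else environ
    cwd = os.getcwd() if cwd is None else cwd

    # A None sub_path means "root of the cache directory".
    if sub_path is None:
        sub_path = ""

    if path is not None:
        # An explicit path is used for the cache location.
        root = path
    elif environ.get(env_var) is not None:
        # Otherwise fall back to the environment variable.
        root = environ[env_var]
    else:
        # Last resort: a _test_cache directory under the working directory.
        root = os.path.join(cwd, "_test_cache")

    return os.path.join(os.path.expanduser(root), sub_path)
```

Under this reading, setting `ASTRODATA_TEST` before running the tests keeps downloaded files in one place across tox environments, so a slow archive only has to be reached once per file.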
Package Review
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
Documentation
The package includes all the following forms of documentation:
Readme file requirements
The README should include, from top to bottom:
NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be wider than it is high. (Note that a badge for pyOpenSci peer-review will be provided upon acceptance.)
Usability
Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
Functionality
For packages also submitting to JOSS
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted. The package contains a
Final approval (post-review)
Estimated hours spent reviewing: 10
Review Comments
Thank you for submitting this package @teald. I have completed my first pass, so you can go through it now. The package is already in great shape! If there are any questions, let me know. I will also be in touch in case something else comes up. I am looking forward to further using
Pull requests opened as part of review:
Poetry test run
py312: SKIP ⚠ in 0.02 seconds
py310: SKIP ⚠ in 0.02 seconds
py311: OK ✔ in 2 minutes 36.38 seconds
py310: SKIP (0.02 seconds)
py311: OK (156.38=setup[12.41]+cmd[0.00,4.36,139.60] seconds)
py312: SKIP (0.02 seconds)
coverage: SKIP (0.02 seconds)
congratulations :) (156.44 seconds) |
Thank you both for your reviews! This feedback has been incredibly useful. I will continue working on these points and will respond to more general comments later this week or next week.
Update August 6: things have been taking a bit longer than anticipated. I'm hoping to have this wrapped up & a response ~this upcoming Monday (Aug. 12).
Update August 15: there have been some setbacks, so I will just respond when this is finished. Hopefully soon 🤞 without further issues. Thanks for your patience!
Update September 6: We are waiting on feedback from a collaborator/contributor on the reviewer response, and should have it ready next week :) |
Response
Hi all, I think I've addressed everything covered by the review. Thank you both for your PRs, issues, and notes here. They've helped immensely so far! I'll itemize (for ease of reference) specific response points and how I've fixed them, then summarize other changes to the code.
Specific points (review checklist)
These are in order of reading the responses in the checklist above, with the ~sections in (parentheses).
Issues
All issues raised as part of the initial review have been addressed:
Issues raised as part of this review
Pull Requests
All PRs have been merged into the main code:
List of Pull Requests made by reviewers
Thank you for your contributions!
List of Pull Requests related to this review.
Other changes of note
Responses to reviewer comments
mwcraig
1. What is your conception of how this should interact with the rest of the ecosystem (thinking mostly about nddata and ccdproc here)?
Since
2. Does it make sense to upstream any of this (like the arithmetic handling or allowing for any WCS, not just an astropy.wcs) to astropy.nddata? Or to ccdproc?
There are plans to upstream some of the work done in the astrodata.wcs module to gwcs, specifically regarding the conversion between gwcs objects and their representation as FITS keywords, and then use gwcs throughout astrodata. But that work hasn't been started yet and it's not clear when the resources will be available. I think that's probably the better avenue for more generic WCS support than
As mentioned above, I think there are some good opportunities for taking some of the handling done by
We could also consider how
3. Does it make sense for ccdproc to depend on astrodata or try to integrate usage of astrodata into it? ccdproc has never had a good way of handling MEF files, which is faintly ridiculous (I'm the maintainer of ccdproc so I'm looking in the mirror rather than throwing stones here).
It depends on the goals of ccdproc moving forward. For MEFs it's feasible to break images up from a list into individual images ccdproc can then process.
4. My take is that astrodata provides a way to abstract images and metadata from the underlying way they are stored, which is something that none of the current tools that I'm aware of provide. It may very well not make sense to upstream any of this.
I think the biggest consideration is whether resources required to make these changes are worth the benefits themselves. There are natural places where
Right now, though, the work required to share data between, e.g.,
Code example working with astrodata and ccdproc
from astrodata import AstroData, create
from astropy.nddata import CCDData, NDData
from astropy.io import fits
import astropy.units as u
import numpy as np
import ccdproc
# Create a simple FITS file object with data and a header:
hdu = fits.PrimaryHDU(data=np.ones((100, 100)))
hdu.header["INSTRUME"] = "random_inst"
hdu.header["MODE"] = "random_mode"
hdu.header["UNIT"] = "adu"
hdu.header["EXPTIME"] = 5.0
# Create an AstroData object from the FITS file object:
ad = create(hdu)
# Access the underlying data and create a CCDData object:
ccd_image = CCDData(data=ad[0].data, unit=ad[0].hdr["UNIT"], meta=ad[0].hdr)
# Create a dark frame with the same shape as the data:
hdu_dark = fits.PrimaryHDU(data=np.random.random((100, 100)) * 10)
hdu_dark.header["INSTRUME"] = "random_inst"
hdu_dark.header["MODE"] = "random_dark_mode"
hdu_dark.header["UNIT"] = "adu"
hdu_dark.header["EXPTIME"] = 10.0
# Create an AstroData object from the FITS file object:
ad_dark = create(hdu_dark)
# Access the underlying data and create a CCDData object:
dark = CCDData(
    ad_dark[0].data,
    unit=ad_dark[0].hdr["UNIT"],
    meta=ad_dark[0].hdr,
)
# Subtract the dark frame from the data:
ccd_dark_subtracted = ccdproc.subtract_dark(
    ccd_image,
    dark,
    dark_exposure=dark.header["EXPTIME"] * u.s,
    data_exposure=ccd_image.header["EXPTIME"] * u.s,
)
This is a pretty minimal example, of course, but I think cases where the two interact can be managed primarily through accessing underlying data and metadata directly, rather than creating outright support for
5. Would it be possible to provide a small example of how to develop a processing tool with astrodata that goes beyond just adding properties and tags? In other words, once I have done those things what does astrodata do for me? I'm not suggesting a full reduction pipeline here (DRAGONS does that) but something that shows a step or two of processing files using would be helpful.
This is still being worked on. I'd like this example to be relatively complete in overview of how
Development is being tracked in this issue. For now, the User Manual and Programmer's manuals have a bit more in-depth explanation, though I agree a more concrete, featureful example would be ideal.
aaryapatil
Thank you for your review and feedback! Let me know if you have any further questions or comments. |
I see a very detailed reply from the package author. Thanks to @teald for explaining in such detail what work you have done to address the review! @aaryapatil and @mwcraig: Please have a look and, if you agree that it addresses all your concerns, please tick the box "The author has responded to my review and made changes to my satisfaction. I recommend approving this package." by editing the message of your review. I would also appreciate a small note here, because GH will notify me if you post a new comment, but not if you edit an existing comment. If there is more to be done, please reply here so that we all know. |
Submitting Author: D.J. Teal (@teald)
All current maintainers: @teald, @chris-simpson, @jehturner
Package Name: astrodata
One-Line Description of Package: Common interface for astronomical data products.
Repository Link: https://github.com/GeminiDRSoftware/astrodata
Version submitted: 2.9.2
Editor: @hamogu
Reviewer 1: @aaryapatil
Reviewer 2: @mwcraig
Archive: TBD
JOSS DOI: TBD
Version accepted: TBD
Date accepted (month/day/year): TBD
Code of Conduct & Commitment to Maintain Package
Description
astrodata is a package meant to facilitate developing common interfaces for astronomical data formats. Often, specific instruments and models will have different ways of storing their data, including metadata. astrodata offers a single interface that uses metadata to resolve these disparate file formats, while enabling common operations and values to share the same interface. The abstraction is meant to be conceptually simple and meaningful to scientists.
Previously, astrodata was a core module within the DRAGONS package, and has been used for the various instruments that exist at the Gemini Observatory. It has proved useful in consolidating differences in metadata and data formatting between instruments that produce FITS files, which is a common pattern in astronomical data handling. Alongside automating interface selection based on these differences, it also comes with helpful operators and methods out-of-the-box, by extending astropy's NDData class.
Scope
Please indicate which category or categories.
Check out our package scope page to learn more about our
scope. (If you are unsure of which category you fit, we suggest you make a pre-submission inquiry):
Domain Specific
Community Partnerships
If your package is associated with an
existing community please check below:
For all submissions, explain how and why the package falls under the categories you indicated above. In your explanation, please address the following points (briefly, 1-2 sentences for each):
Who is the target audience and what are scientific applications of this package?
Astronomers & astronomical software developers
Are there other Python packages that accomplish the same thing? If so, how does yours differ?
There are no specific packages we are aware of. There is some overlap with the gwcs package and our wcs module, but we are planning to collaborate with that package for future development and to reconsolidate those overlaps. Astrodata offers an interface based on astropy.nddata.NDData, but allowing more than one instance to be mapped to the same file (e.g., to multiple sets of FITS extensions).
If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted:
N/A
Technical checks
For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:
Publication Options
JOSS Checks
paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
Note: JOSS accepts our review as theirs. You will NOT need to go through another full review. JOSS will only review your paper.md file. Be sure to link to this pyOpenSci issue when a JOSS issue is opened for your package. Also be sure to tell the JOSS editor that this is a pyOpenSci reviewed package once you reach this step.
Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?
This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.
Confirm each of the following by checking the box.
Please fill out our survey
submission and improve our peer review process. We will also ask our reviewers
and editors to fill this out.
P.S. Have feedback/comments about our review process? Leave a comment here
Editor and Review Templates
The editor template can be found here.
The review template can be found here.
Footnotes
Please fill out a pre-submission inquiry before submitting a data visualization package. ↩