Release/0.8.0 (#102)
* Feature/update add collection test (#94)

* update add collection test to get the url for json history

* update changelog

* /version 0.7.0a24

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Feature/update add collection test (#95)

* update add collection test to get the url for json history

* update changelog

* update test to test for nan

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Bump jinja2 from 3.1.2 to 3.1.3 (#99)

Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.2 to 3.1.3.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.2...3.1.3)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Using cmr-umm-updater default branch (develop)

* use develop


* Update CONTRIBUTING.md


* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update uat_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update ops_associations.txt with new collections

* Update ops_associations.txt with new collections

* Issue #96: ensure the created dimension is sorted (#101)

* implement sorting of the output queue according to the order of the input queue to satisfy issue #96

* Update CHANGELOG.md with issue-96 fix

* release 0.8.0


---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: sliu008 <[email protected]>
Co-authored-by: concise bot <[email protected]>
Co-authored-by: jonathansmolenski <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: ank1m <[email protected]>
Co-authored-by: James Wood <[email protected]>
8 people authored Mar 1, 2024
1 parent f86cc5b commit c9157ec
Showing 10 changed files with 124 additions and 139 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/add-collection-test.yml
@@ -46,7 +46,7 @@ jobs:
           pip3 install netCDF4
           pip3 install git+https://github.com/nasa/harmony-py.git
           pip3 install git+https://github.com/podaac/cmr-umm-updater.git
-          pip3 install git+https://github.com/podaac/cmr-association-diff.git@6193079a14e36f4c9526aa426015c2b6be41f0e2
+          pip3 install git+https://github.com/podaac/cmr-association-diff.git
           pip3 install python-dateutil --upgrade
       - name: Run CMR Association diff scripts
         run: |
4 changes: 2 additions & 2 deletions .github/workflows/build-pipeline.yml
@@ -141,7 +141,7 @@ jobs:
           git tag -a "${{ env.software_version }}" -m "Version ${{ env.software_version }}"
           git push origin "${{ env.software_version }}"
       - name: Publish UMM-S with new version
-        uses: podaac/cmr-umm-updater@feature/umm_version
+        uses: podaac/cmr-umm-updater@develop
         if: |
           github.ref == 'refs/heads/main' ||
           startsWith(github.ref, 'refs/heads/release')
@@ -160,7 +160,7 @@
           LAUNCHPAD_TOKEN_UAT: ${{secrets.LAUNCHPAD_TOKEN_UAT}}
           LAUNCHPAD_TOKEN_OPS: ${{secrets.LAUNCHPAD_TOKEN_OPS}}
       - name: Publish L2ss Concise Chain UMM-S with new version
-        uses: podaac/cmr-umm-updater@feature/umm_version
+        uses: podaac/cmr-umm-updater@develop
         if: |
           github.ref == 'refs/heads/main' ||
           startsWith(github.ref, 'refs/heads/release')
16 changes: 14 additions & 2 deletions CHANGELOG.md
@@ -8,7 +8,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Added
 ### Changed
-### Deprecated
+### Deprecated
 ### Removed
 ### Fixed
 
+
+## [0.8.0]
+
+### Added
+### Changed
+- [issues/96](https://github.com/podaac/concise/issues/96):
+  - Preserve the order of the input files so the output file matches order
+### Deprecated
+### Removed
+### Fixed
+
@@ -24,7 +35,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Updated jupyter notebook
 - Update notebook test to use python code directly instead of using jupyter notebook
 - Updated python libraries
-- Update history json to have url in history
+- Update history json to have url in history
+- Update add collection test to use url in json history
 ### Deprecated
 ### Removed
 ### Fixed
10 changes: 1 addition & 9 deletions CONTRIBUTING.md
@@ -77,14 +77,6 @@ If any performance improvements are being made, include graphs and charts.
 - `feature/issue-#`
   - Work for enhancements and new features should be done in a branch with this naming convention
   - The issue number should match the associated Github issue number
-- `bugfix/issue-#`
-  - Work for bug fixes should be done in a branch with this naming convention
-  - The issue number should match the associated Github issue number
-- `hotfix/issue-#` or `hotfix/short-fix-description`
-  - Rare/special case to address a special anomaly.
-  - The issue number should match the associated Github issue number,
-    unless no such issue exists. If not, use a short description of the
-    issue e.g. `hotfix/fix-request-url`
 
 ### Changelog
 
@@ -200,4 +192,4 @@ All functions should contain a docstring, though short or trivial
 function may contain a 1-line docstring.
 
 If adding a new module, ensure it has been added to [index.rst](docs/index.rst)
-for inclusion in auto-generated Sphinx docs.
+for inclusion in auto-generated Sphinx docs.
45 changes: 18 additions & 27 deletions add_collection_test.py
@@ -6,6 +6,7 @@
 import numpy as np
 import netCDF4 as nc
 import requests
+import json
 from harmony import BBox, Client, Collection, Request, Environment
 import argparse
 from utils import FileHandler
@@ -135,22 +136,29 @@ def verify_variables(merged_group, origin_group, subset_index, both_merged):
         merged_data = np.resize(merged_var[subset_index], origin_var.shape)
         origin_data = origin_var
 
+        equal_nan = True
+        if merged_data.dtype.kind == 'S':
+            equal_nan = False
+
         # verify variable data
         if isinstance(origin_data, str):
             unittest.TestCase().assertEqual(merged_data, origin_data)
         else:
-            unittest.TestCase().assertTrue(np.array_equal(merged_data, origin_data, equal_nan=True))
+            unittest.TestCase().assertTrue(np.array_equal(merged_data, origin_data, equal_nan=equal_nan))
 
 
-def verify_groups(merged_group, origin_group, subset_index, both_merged=False):
+def verify_groups(merged_group, origin_group, subset_index, file=None, both_merged=False):
+    if file:
+        print("verifying groups ....." + file)
+
     verify_dims(merged_group, origin_group, both_merged)
     verify_attrs(merged_group, origin_group, both_merged)
     verify_variables(merged_group, origin_group, subset_index, both_merged)
 
     for child_group in origin_group.groups:
         merged_subgroup = merged_group[child_group]
         origin_subgroup = origin_group[child_group]
-        verify_groups(merged_subgroup, origin_subgroup, subset_index, both_merged)
+        verify_groups(merged_subgroup, origin_subgroup, subset_index, both_merged=both_merged)
 
 
 # GET TOKEN FROM CMR
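
A note on the `equal_nan` guard introduced above: `np.array_equal(..., equal_nan=True)` routes the comparison through `np.isnan`, which is undefined for byte-string (`'S'` dtype) arrays, so the test now disables NaN-equality when comparing string data. A minimal sketch of the failure mode it avoids (behavior varies by NumPy version; recent releases short-circuit dtypes that cannot hold NaN):

```python
import numpy as np

# Float arrays: equal_nan=True lets NaN compare equal to NaN.
floats = np.array([1.0, np.nan])
assert np.array_equal(floats, floats, equal_nan=True)

# Byte-string arrays: np.isnan is undefined for 'S' dtype, so many NumPy
# versions raise TypeError here, hence the dtype.kind == 'S' check above.
strings = np.array([b"a", b"b"], dtype="S1")
try:
    np.array_equal(strings, strings, equal_nan=True)
except TypeError:
    pass  # expected on NumPy versions without the non-float short-circuit
assert np.array_equal(strings, strings)  # plain comparison is always safe
```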
@@ -173,7 +181,7 @@ def download_file(url, local_path, headers):
         with open(local_path, 'wb') as file:
             for chunk in response.iter_content(chunk_size=8192):
                 file.write(chunk)
-        print("Original File downloaded successfully.")
+        print("Original File downloaded successfully. " + local_path)
     else:
         print(f"Failed to download the file. Status code: {response.status_code}")
 
@@ -217,6 +225,7 @@ def test(collection_id, venue):
     print('\nDone downloading.')
 
     filename = file_names[0]
+
     # Handle time dimension and variables dropping
     merge_dataset = nc.Dataset(filename, 'r')
 
@@ -233,34 +242,16 @@
     }
 
     original_files = merge_dataset.variables['subset_files']
+    history_json = json.loads(merge_dataset.history_json)
     assert len(original_files) == max_results
 
-    for file in original_files:
-
-        # if the file name end in an alphabet so we know there is some extension
-        if file[-1].isalpha():
-            file_name = file.rsplit(".", 1)[0]
-        else:
-            file_name = file
-
-        print(file_name)
-        cmr_query = f"{cmr_base_url}{file_name}&collection_concept_id={collection_id}"
-        print(cmr_query)
-
-        response = requests.get(cmr_query, headers=headers)
-
-        result = response.json()
-        links = result.get('items')[0].get('umm').get('RelatedUrls')
-        for link in links:
-            if link.get('Type') == 'GET DATA':
-                data_url = link.get('URL')
-                parsed_url = urlparse(data_url)
-                local_file_name = os.path.basename(parsed_url.path)
-                download_file(data_url, local_file_name, headers)
+    for url in history_json[0].get("derived_from"):
+        local_file_name = os.path.basename(url)
+        download_file(url, local_file_name, headers)
 
     for i, file in enumerate(original_files):
         origin_dataset = nc.Dataset(file)
-        verify_groups(merge_dataset, origin_dataset, i)
+        verify_groups(merge_dataset, origin_dataset, i, file=file)
 
 
 def run():
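The test now discovers its source granules straight from the merged file's `history_json` global attribute rather than querying CMR `RelatedUrls` and guessing filenames from extensions. A minimal sketch of the new read path, assuming a hypothetical local merged granule `merged.nc4` and the `history_json` schema used above (a JSON array whose first entry carries a `derived_from` list of source URLs):

```python
import json
import os

import netCDF4 as nc

merged = nc.Dataset("merged.nc4", "r")  # hypothetical local merged granule
history = json.loads(merged.history_json)

# Each derived_from entry is the URL of one input granule; its basename
# doubles as the local download target, mirroring the loop in test().
for url in history[0].get("derived_from"):
    local_name = os.path.basename(url)
    print(f"would fetch {url} -> {local_name}")  # i.e. download_file(url, local_name, headers)
```

This drops one CMR round trip per input file and removes the fragile "does the name end in a letter" extension check from the deleted block.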
29 changes: 29 additions & 0 deletions cmr/ops_associations.txt
@@ -70,3 +70,32 @@ C2274919541-POCLOUD
 C2205620319-POCLOUD
 C2183155461-POCLOUD
 C2208421887-POCLOUD
+C2628595723-POCLOUD
+C2746966926-POCLOUD
+C2628600898-POCLOUD
+C2746966928-POCLOUD
+C2746966927-POCLOUD
+C2754895884-POCLOUD
+C2746966657-POCLOUD
+C2628598809-POCLOUD
+C2799465529-POCLOUD
+C2799465526-POCLOUD
+C2799465507-POCLOUD
+C2799465497-POCLOUD
+C2799465538-POCLOUD
+C2799465544-POCLOUD
+C2799465542-POCLOUD
+C2799465428-POCLOUD
+C2799438350-POCLOUD
+C2799438351-POCLOUD
+C2799438353-POCLOUD
+C2296989388-POCLOUD
+C2205553958-POCLOUD
+C2706513160-POCLOUD
+C2147480877-POCLOUD
+C2147478146-POCLOUD
+C2730520815-POCLOUD
+C2799465509-POCLOUD
+C2799465518-POCLOUD
+C2799465522-POCLOUD
+C2068529568-POCLOUD
37 changes: 37 additions & 0 deletions cmr/uat_associations.txt
@@ -80,3 +80,40 @@ C1238621102-POCLOUD
 C1240739713-POCLOUD
 C1243175554-POCLOUD
 C1245295750-POCLOUD
+C1256783381-POCLOUD
+C1259115177-POCLOUD
+C1256783388-POCLOUD
+C1259115167-POCLOUD
+C1259115178-POCLOUD
+C1256783382-POCLOUD
+C1259115166-POCLOUD
+C1261072655-POCLOUD
+C1261072658-POCLOUD
+C1261072648-POCLOUD
+C1261072646-POCLOUD
+C1261072656-POCLOUD
+C1261072645-POCLOUD
+C1261072659-POCLOUD
+C1261072654-POCLOUD
+C1254854453-LARC_CLOUD
+C1254855648-LARC_CLOUD
+C1254854962-LARC_CLOUD
+C1247485682-LARC_CLOUD
+C1247485690-LARC_CLOUD
+C1247485685-LARC_CLOUD
+C1242274079-POCLOUD
+C1240739526-POCLOUD
+C1261072651-POCLOUD
+C1261072650-POCLOUD
+C1242274070-POCLOUD
+C1240739691-POCLOUD
+C1257081729-POCLOUD
+C1261072661-POCLOUD
+C1261072652-POCLOUD
+C1261072662-POCLOUD
+C1261072660-POCLOUD
+C1261645986-LARC_CLOUD
+C1258237266-POCLOUD
+C1259966654-POCLOUD
+C1258237267-POCLOUD
+C1240739686-POCLOUD
10 changes: 5 additions & 5 deletions podaac/merger/harmony/download_worker.py
@@ -44,8 +44,8 @@ def multi_core_download(urls, destination_dir, access_token, cfg, process_count=
     url_queue = manager.Queue(len(urls))
     path_list = manager.list()
 
-    for url in urls:
-        url_queue.put(url)
+    for iurl, url in enumerate(urls):
+        url_queue.put((iurl, url))
 
     # Spawn worker processes
     processes = []
@@ -64,7 +64,7 @@
 
     path_list = deepcopy(path_list) # ensure GC can cleanup multiprocessing
 
-    return [Path(path) for path in path_list]
+    return [Path(path) for ipath, path in sorted(path_list)]
 
 
 def _download_worker(url_queue, path_list, destination_dir, access_token, cfg):
@@ -91,7 +91,7 @@
 
     while not url_queue.empty():
         try:
-            url = url_queue.get_nowait()
+            iurl, url = url_queue.get_nowait()
         except queue.Empty:
             break
 
@@ -105,4 +105,4 @@
         else:
             logger.warning('Origin filename could not be assertained - %s', url)
 
-        path_list.append(str(path))
+        path_list.append((iurl, str(path)))
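
The issue #96 fix is a tag-and-sort pattern: every URL is enqueued with its input index, workers append `(index, path)` pairs in whatever order they happen to finish, and a final `sorted()` restores input order. A self-contained sketch of the pattern (the worker body and file names are stand-ins, not the real download logic):

```python
import queue
from multiprocessing import Manager, Process


def _worker(url_queue, path_list):
    """Drain the queue, appending (index, result) pairs in completion order."""
    while not url_queue.empty():
        try:
            iurl, url = url_queue.get_nowait()
        except queue.Empty:
            break
        path_list.append((iurl, f"/tmp/file_{iurl}.nc"))  # stand-in for a download


if __name__ == "__main__":
    manager = Manager()
    url_queue = manager.Queue()
    path_list = manager.list()

    for iurl, url in enumerate(["url_a", "url_b", "url_c"]):
        url_queue.put((iurl, url))

    processes = [Process(target=_worker, args=(url_queue, path_list)) for _ in range(2)]
    for proc in processes:
        proc.start()
    for proc in processes:
        proc.join()

    # Tuples sort by their first element, so ordering by the index tag
    # recovers input order no matter which worker finished first.
    print([path for _, path in sorted(path_list)])
```

Because tuples compare index-first, `sorted(path_list)` puts the paths back in exactly the order the URLs went in, which is what `multi_core_download` now returns.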