Preliminary steps for Saim on Pleiades #9

Open
5 of 7 tasks
rabernat opened this issue Nov 18, 2022 · 11 comments

@rabernat
Collaborator

rabernat commented Nov 18, 2022

Here are the steps we need to complete before we can start our pipeline:

  • Ensure that @rsaim can log in with SSH passthrough and run commands
  • Get X11 forwarding working
  • Make VSCode Remote Connection work with pfe
  • Get singularity container working with Pangeo docker image
  • Export a single netCDF file with proper chunking and compression options
  • Generate kerchunk index for this file and verify that it works
  • Export the file to OSN

Let's use this issue to discuss the details of these steps.

@dhruvbalwada
Member

dhruvbalwada commented Nov 18, 2022

Instructions for Singularity and the Pangeo Docker image on Pleiades:
https://hackmd.io/RSgC2eaQS0CXdSfEthKSLg

OSN link: https://www.openstoragenetwork.org/

Links to Pleiades docs:
General: https://www.nas.nasa.gov/hecc/support/kb/
Setting up ssh passthrough: https://www.nas.nasa.gov/hecc/support/kb/setting-up-public-key-ssh-passthrough-94/

@rsaim rsaim self-assigned this Nov 18, 2022
@rsaim
Contributor

rsaim commented Nov 18, 2022

  • Get X11 forwarding working

The steps described at X11-Forwarding weren't enough for me to get X11 forwarding working on my Mac. However, following the steps in the article "How to install X Window System XQuartz on macOS for ssh X11 forwarding" worked.

@rabernat
Collaborator Author

Why do you need X11 forwarding?

@rsaim
Contributor

rsaim commented Nov 18, 2022

To launch GUI applications like Geany on pfe. Such applications increase productivity; otherwise, we're limited to command-line tools like Vim and Emacs.

@rabernat
Collaborator Author

FWIW, I have worked on this computer for 12 years, and I have never used any of those applications.

You could investigate this if you want to use VS Code: https://code.visualstudio.com/docs/remote/ssh

@rsaim
Contributor

rsaim commented Nov 18, 2022

Thank you for pointing that out, @rabernat. I was able to set up VSCode with a remote connection to ple.

Get singularity container working with Pangeo docker image

> sraza2 @ pfe22 ~ 12:37:22
$ singularity shell --bind /nobackup:/nobackup --bind /nobackupp1:/nobackupp1 --bind /nobackupp17:/nobackupp17 --bind /nobackupp19:/nobackupp19 --bind /home3/sraza2:/home3/sraza2  /home3/sraza2/notebook_pangeo.sif
Singularity> hostname
pfe22

Could someone please check if the bindings in the command above look good? @dhruvbalwada

Get-interactive-node says that I need to get an interactive node. However, I was able to run the container without one. Are there any advantages to launching the singularity container on an interactive node?

@dhruvbalwada
Member

When you land on pfe, you are on a login node that is not very powerful and is not meant for computation. In fact, if you do anything too big (like adding 2 numbers) too often on a login node, eventually the admins will kick you off. So always get an interactive node before doing anything more than looking through your files.

I think the bindings look fine. If there is a problem, you will run into it further down the line (when running your scripts), and then we can come back and fix the bindings. They were figured out by trial and error.
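Since the bind list was found by trial and error and may change, it can help to generate the command from a list of paths instead of editing one long line. A minimal sketch, assuming every path is bound to the same location inside the container (as in the command above); the helper name `build_singularity_shell` is hypothetical:

```python
import shlex


def build_singularity_shell(image, bind_paths):
    """Return an argv list for `singularity shell`, binding each path to itself."""
    argv = ["singularity", "shell"]
    for path in bind_paths:
        argv += ["--bind", f"{path}:{path}"]
    argv.append(image)
    return argv


# Paths and image taken from the command posted earlier in this thread.
binds = ["/nobackup", "/nobackupp1", "/nobackupp17", "/nobackupp19", "/home3/sraza2"]
argv = build_singularity_shell("/home3/sraza2/notebook_pangeo.sif", binds)

# Print a copy-pasteable shell command.
print(shlex.join(argv))
```

Adding or dropping a filesystem then only means editing the `binds` list.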

@rsaim
Contributor

rsaim commented Nov 20, 2022

Export a single netCDF file with proper chunking and compression options

> sraza2 @ r445i6n6 ~ 14:43:52
$ time python3 /home3/sraza2/pleiades_llc_recipes/python_cli_data_export/extract_llc.py
llc4320_Eta_k0_iter_92160.nc

real    0m9.571s
user    0m5.507s
sys     0m3.232s
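To make "proper chunking and compression options" concrete, the arithmetic for one candidate chunk shape can be sketched as below. Everything specific here is an assumption rather than something stated in the thread: the dimension order `(time, face, j, i)`, the `float32` data type, the 720 x 720 chunk shape, and compression level 4. The `encoding` dictionary follows the form that xarray's `Dataset.to_netcdf(..., encoding=...)` accepts for netCDF-4 output.

```python
from math import ceil, prod

# Assumed shape of Eta after subsetting: (time, face, j, i).
shape = (1, 1, 2160, 2160)
chunks = (1, 1, 720, 720)  # hypothetical chunk shape, not taken from the thread
itemsize = 4               # bytes per value, assuming float32

# Uncompressed size of one chunk and the number of chunks in the variable.
chunk_bytes = prod(chunks) * itemsize
n_chunks = prod(ceil(s / c) for s, c in zip(shape, chunks))

# Per-variable encoding in the form xarray passes to the netCDF-4 backend.
encoding = {
    "Eta": {"zlib": True, "complevel": 4, "shuffle": True, "chunksizes": chunks}
}

print(f"{n_chunks} chunks, {chunk_bytes / 2**20:.2f} MiB each before compression")
```

Chunking and compression are only available in the netCDF-4 (HDF5-based) format, so the export would need to write that format for these options to take effect.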

@rsaim
Contributor

rsaim commented Nov 21, 2022

@rabernat -

[] Generate kerchunk index for this file and verify that it works

Does the following code suffice for the action item above?

from xmitgcm import llcreader

model_name = 'llc4320'
variable = ['Eta']
klevel = [0]
iters = [92160]  # model iteration numbers to read
out_dir = '~/temp'
facen = [1]
istart = 1080
iend = 3240
jstart = 0
jend = 2160
fdepth = 'n'
verbose = 'n'

# Build the output file name, e.g. ~/temp/llc4320_Eta_k0_iter_92160.nc
var_names = '-'.join(variable)
k_names = '-'.join(str(klev) for klev in klevel)
fname = f'{out_dir}/{model_name}_{var_names}_k{k_names}_iter_{iters[0]}.nc'
print(fname)

# Open the LLC4320 model output on Pleiades, then select one face and an i/j window
model = llcreader.PleiadesLLC4320Model()
ds = model.get_dataset(varnames=variable, iters=iters, read_grid=False)
ds_selected = ds.sel(face=facen).isel(
    i=slice(istart, iend), i_g=slice(istart, iend),
    j=slice(jstart, jend), j_g=slice(jstart, jend),
)


In [16]: from kerchunk import netCDF3

In [27]: h = netCDF3.netcdf_file(os.path.expanduser(fname))

In [28]: h.variables
Out[28]: 
{'face': <scipy.io._netcdf.netcdf_variable at 0x2aaaae28afa0>,
 'i': <scipy.io._netcdf.netcdf_variable at 0x2aaaae82e0d0>,
 'i_g': <scipy.io._netcdf.netcdf_variable at 0x2aabc2c02730>,
 'j': <scipy.io._netcdf.netcdf_variable at 0x2aabccd2a760>,
 'j_g': <scipy.io._netcdf.netcdf_variable at 0x2aaaaeb130d0>,
 'k': <scipy.io._netcdf.netcdf_variable at 0x2aaaaeb13e50>,
 'k_u': <scipy.io._netcdf.netcdf_variable at 0x2aabc31dd6d0>,
 'k_l': <scipy.io._netcdf.netcdf_variable at 0x2aabc31ddc70>,
 'k_p1': <scipy.io._netcdf.netcdf_variable at 0x2aabc31dd2b0>,
 'niter': <scipy.io._netcdf.netcdf_variable at 0x2aabc31dd610>,
 'time': <scipy.io._netcdf.netcdf_variable at 0x2aabc31dd6a0>,
 'Eta': <scipy.io._netcdf.netcdf_variable at 0x2aabc2bc0c70>}

In [29]: h.dimensions
Out[29]: 
{'face': 1,
 'i': 2160,
 'i_g': 2160,
 'j': 2160,
 'j_g': 2160,
 'k': 90,
 'k_u': 90,
 'k_l': 90,
 'k_p1': 91,
 'time': 1}

[] Export the file to OSN

Could you please guide me on this action item?

@cspencerjones
Contributor

I want to make a kerchunk file that will work on OSN. If I generate a kerchunk file using the files on Pleiades, then the JSON file contains references to paths on Pleiades. The kerchunk file works on Pleiades, but I don't expect it to work on OSN. How do I make a kerchunk file that will work on OSN? Do I just search-and-replace the Pleiades path? @rabernat - is this something we already know how to do, or do we need to ask someone?

I am currently making the `kerchunk` file like this:

import fsspec
import kerchunk.hdf
import ujson
from kerchunk.combine import MultiZarrToZarr

# `arr` (not shown) holds the file names in the directory below
urls = ["/nobackup/csjone15/pleiades_llc_recipes/python_cli_data_export/surf_extract/surf_fields/" + p for p in arr]
so = dict(
    anon=True, default_fill_cache=False, default_cache_type='first'
)

# Build one set of kerchunk references per file
singles = []
for u in urls:
    print(u)
    with fsspec.open(u, **so) as inf:
        h5chunks = kerchunk.hdf.SingleHdf5ToZarr(inf, u, inline_threshold=100)
        singles.append(h5chunks.translate())

# Now combine the singles along the time dimension
mzz = MultiZarrToZarr(
    singles,
    remote_options={'anon': True},
    concat_dims=["time"]
)

out = mzz.translate()
fs2 = fsspec.filesystem('')

# Write the combined references to a local JSON file
with fs2.open('surf_fields_test.json', 'wb') as f:
    f.write(ujson.dumps(out).encode())

@cspencerjones
Contributor

Maybe this is the answer: fsspec/kerchunk#309? I will give it a try...
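For the search-and-replace idea raised above, a minimal sketch follows. It relies on the kerchunk version 1 reference layout, where each entry under `"refs"` is either an inlined string or a list of the form `[url, offset, length]`. The helper name `rewrite_reference_prefix`, the example paths, and the `s3://example-bucket/` prefix are hypothetical placeholders, not the real OSN location. It also assumes the netCDF files are uploaded unchanged, so the byte offsets stay valid.

```python
import json


def rewrite_reference_prefix(refs_doc, old_prefix, new_prefix):
    """Return a copy of a kerchunk reference dict with chunk URLs re-pointed."""
    new_refs = {}
    for key, value in refs_doc["refs"].items():
        # Chunk references are lists like [url, offset, length];
        # inlined metadata and small arrays are plain strings and stay untouched.
        if (
            isinstance(value, list)
            and value
            and isinstance(value[0], str)
            and value[0].startswith(old_prefix)
        ):
            value = [new_prefix + value[0][len(old_prefix):], *value[1:]]
        new_refs[key] = value
    return {**refs_doc, "refs": new_refs}


# Tiny hand-written reference set standing in for the output of mzz.translate().
example = {
    "version": 1,
    "refs": {
        ".zgroup": '{"zarr_format": 2}',
        "Eta/0.0.0": ["/nobackup/user/surf_fields/file_0.nc", 8192, 4096],
    },
}

out = rewrite_reference_prefix(
    example, "/nobackup/user/surf_fields/", "s3://example-bucket/surf_fields/"
)
print(json.dumps(out, indent=2))
```

Walking the parsed structure rather than running a text replace on the JSON file avoids touching inlined data that happens to contain the same string. Reading the rewritten references would still need the remote protocol and endpoint supplied when opening them, for example through fsspec's `remote_protocol` and `remote_options` arguments.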
