STY: cdaweb methods streamline #72

jklenzing opened this issue Jun 2, 2021 · 3 comments

@jklenzing
Member

Some of the target parsing in the cdaweb methods for recognizing file names could be improved.

Alternatively, better incorporation of the official API may replace these routines entirely.

@jklenzing jklenzing added the style label Jun 3, 2021
@jklenzing jklenzing added this to the 0.1.0 Release milestone Sep 2, 2022
@jklenzing
Member Author

It looks like using the get_original_filenames function in cdasws will handle this in a much more streamlined way. It won't eliminate the problem, as not all datasets are supported here.

@jklenzing
Member Author

cdasws was incorporated in #145, solving this for most datasets.

To close, I think that the following lines need to be simplified.

    try:
        for top_url in url_list:
            for level in range(n_layers + 1):
                for directory in remote_dirs[level]:
                    temp_url = '/'.join((top_url.strip('/'), directory))
                    soup = BeautifulSoup(requests.get(temp_url).content,
                                         "lxml")
                    links = soup.find_all('a', href=True)
                    for link in links:
                        # If there is room to go down, look for directories
                        if link['href'].count('/') == 1:
                            remote_dirs[level + 1].append(link['href'])
                        else:
                            # If at the endpoint, add matching files to list
                            add_file = True
                            for target in targets:
                                if link['href'].count(target) == 0:
                                    add_file = False
                            if add_file:
                                full_files.append(link['href'])
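
As one possible first step, the innermost target check in the excerpt above could be collapsed into a single `all()` expression. This is only a sketch: the helper names `matches_all_targets` and `filter_links` are hypothetical and do not exist in the package, and the function works on plain href strings rather than BeautifulSoup link objects.

```python
def matches_all_targets(href, targets):
    """Return True if every target substring appears in href.

    Equivalent to the add_file flag in the excerpt: a link is kept
    only when no target has a count of zero.
    """
    return all(target in href for target in targets)


def filter_links(hrefs, targets):
    """Keep only the hrefs that contain every target substring.

    Hypothetical helper; `hrefs` is a list of strings such as the
    link['href'] values gathered from a directory listing.
    """
    return [href for href in hrefs if matches_all_targets(href, targets)]
```

For example, `filter_links(['a_v01.cdf', 'a_v01.txt'], ['a_', '.cdf'])` keeps only the first entry. Directory handling is left out here on purpose, since the excerpt treats directories in a separate branch.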

Probably need to incorporate a directory structure as part of supported_tags so that the layers loop can be removed.
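
A rough sketch of what that could look like, assuming each entry in supported_tags carried a remote directory template with date fields. The template format, the field names `year`, `month`, and `day`, and the function `expand_remote_dirs` are all assumptions made for illustration, not part of the existing code. With the directory layout known up front, the directories for a date range can be computed directly instead of being discovered level by level.

```python
import datetime as dt


def expand_remote_dirs(template, start, stop):
    """Return the unique remote directories covering start..stop inclusive.

    `template` is a hypothetical format string, e.g.
    '/data/{year:04d}/{month:02d}/'. Unused fields are simply ignored
    by str.format, so a year-only template also works.
    """
    dirs = []
    day = start
    while day <= stop:
        path = template.format(year=day.year, month=day.month, day=day.day)
        # Preserve chronological order while dropping repeats
        if path not in dirs:
            dirs.append(path)
        day += dt.timedelta(days=1)
    return dirs
```

Each returned directory could then be requested once and its links filtered against the targets, which would remove the need for the `n_layers` loop in this sketch. Datasets whose layout does not follow a date-based pattern would still need some other handling.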

@jklenzing jklenzing self-assigned this Oct 2, 2023
@jklenzing
Member Author

Work on after merge of #200
