Submodule Import Problems #34

Open
tinhb opened this issue Jun 10, 2023 · 4 comments

@tinhb

tinhb commented Jun 10, 2023

Suppose I have a string of HTML content and would like to extract certain information from it:

pyp "xml.etree.ElementTree.fromstring('<html><head><title>Title</title></head></html>').find('head/title').text"

pyp does import xml automatically, but this still raises AttributeError: module 'xml' has no attribute 'etree', because importing the xml package does not load its xml.etree.ElementTree submodule.
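To illustrate the behaviour described above outside of pyp, here is a minimal sketch in plain Python. It starts a fresh interpreter for each import statement so that nothing imported earlier can interfere. The helper name `has_etree_after` is made up for this example, and the sketch assumes the interpreter's startup does not already import xml.etree.

```python
import subprocess
import sys


def has_etree_after(import_stmt):
    """Run import_stmt in a fresh interpreter; report whether xml.etree is bound."""
    # has_etree_after is a hypothetical helper, not part of pyp.
    code = f"{import_stmt}; import sys; print(hasattr(sys.modules['xml'], 'etree'))"
    result = subprocess.run(
        [sys.executable, "-I", "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


# Importing only the top-level package leaves the submodule unloaded
# (expected False in a clean interpreter).
print(has_etree_after("import xml"))
# Importing the submodule loads and binds every level of the dotted path.
print(has_etree_after("import xml.etree.ElementTree"))
```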

I can use the -b parameter to import the submodule explicitly:

pyp -b "import xml.etree.ElementTree" "xml.etree.ElementTree.fromstring('<html><head><title>Title</title></head></html>').find('head/title').text"
# Title

However, if I add the same import line to the PYP_CONFIG_PATH config file, the same AttributeError still occurs.

cat $PYP_CONFIG_PATH
# import xml.etree.ElementTree
pyp "xml.etree.ElementTree.fromstring('<html><head><title>Title</title></head></html>').find('head/title').text"
# AttributeError: module 'xml' has no attribute 'etree'

So, the question is:
What is the correct way to have xml.etree.ElementTree imported automatically?

@hauntsaninja
Owner

Thanks for the issue!

Yeah, it's hard to know statically what to import when you see an expression like "xml.etree.ElementTree". Not sure I see a way to get that to work out of the box without special casing.

Hm, adding that line to your config should work... we should treat the config element "import xml.etree.ElementTree" as defining "xml" and so statically include it in the code we execute. It looks like that's not happening, so it's falling back to the generic "import missing things" code. This is a bug, I can fix it.
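As a rough illustration of what "defining xml" means, the sketch below uses the standard ast module to list the names an import statement binds. This is not pyp's actual implementation, and `names_bound_by` is a hypothetical helper written only for this example.

```python
import ast


def names_bound_by(source):
    """Return the set of names that the import statements in source bind."""
    # Hypothetical helper for illustration; pyp's own analysis may differ.
    bound = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b.c" binds only "a", unless "as" renames it.
                bound.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                # "from a.b import c" binds "c", unless "as" renames it.
                bound.add(alias.asname or alias.name)
    return bound


print(names_bound_by("import xml.etree.ElementTree"))        # {'xml'}
print(names_bound_by("import xml.etree.ElementTree as ET"))  # {'ET'}
print(names_bound_by("from xml.etree import ElementTree"))   # {'ElementTree'}
```

So a config line of the form import xml.etree.ElementTree only ever introduces the name xml, which is why it has to be matched against a use of xml in the expression.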

In the meantime, a workaround could be something like adding from xml.etree import ElementTree to your config and using ElementTree. Similarly, adding import xml.etree.ElementTree as ET to your config and using ET would also work.
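For reference, here is how the two workaround imports behave in plain Python, using the HTML string from the original report. Both forms bind a name directly to the submodule, so no attribute lookup on xml is needed. The variable names are only for this example.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementTree

# The same HTML string as in the original report.
html = "<html><head><title>Title</title></head></html>"

# Both names refer to the same submodule object.
title_via_alias = ET.fromstring(html).find("head/title").text
title_via_from = ElementTree.fromstring(html).find("head/title").text

print(title_via_alias)  # Title
print(title_via_from)   # Title
```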

@hauntsaninja
Owner

The commit I just pushed, a3f2ebc, makes your config example work. I'll see if I can think of improvements that would make your initial version work as well (and that are compatible with pyp's mostly static analysis).

@tinhb
Author

tinhb commented Jun 10, 2023

Wow, thank you for the quick fix.

I searched the internet for Python module resolution and found importlib.util.find_spec.
Not sure if it can be used.

Say I want to use another call, such as xml.dom.minidom.parse(...): would it be possible to resolve the module level by level?

from importlib.util import find_spec


def spec_of(target):
    spec = find_spec(target)
    return (spec, spec.submodule_search_locations)

# spec_of('xml.dom.minidom.parse')[1]     # Exception, __path__ not found on xml.dom.minidom (parent?)
# ModuleNotFoundError: __path__ attribute not found on 'xml.dom.minidom' while trying to find 'xml.dom.minidom.parse'

spec_of('xml.dom.minidom')[1] is not None # False, no submodule? (pure guess)
spec_of('xml.dom')[1] is not None         # True, has submodule? (ditto)
spec_of('xml')[1] is not None             # True, has submodule? (ditto)
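One way the level-by-level idea could be sketched is to try each dotted prefix in turn and keep the longest one that find_spec can resolve. `longest_module_prefix` is a hypothetical helper, not something pyp provides. Note that find_spec imports parent packages as a side effect, so this is not a purely static check.

```python
from importlib.util import find_spec


def longest_module_prefix(dotted):
    """Return the longest prefix of dotted that names an importable module, or None."""
    # Hypothetical helper for illustration only.
    parts = dotted.split(".")
    best = None
    for i in range(1, len(parts) + 1):
        candidate = ".".join(parts[:i])
        try:
            # find_spec imports the parent package to inspect its __path__.
            spec = find_spec(candidate)
        except (ImportError, ValueError):
            # The parent is not a package (no __path__) or has no usable spec.
            break
        if spec is None:
            break
        best = candidate
    return best


print(longest_module_prefix("xml.dom.minidom.parse"))             # xml.dom.minidom
print(longest_module_prefix("xml.etree.ElementTree.fromstring"))  # xml.etree.ElementTree
```

Whatever remains after the returned prefix (here parse or fromstring) would then be treated as an attribute lookup on the imported module.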

@hauntsaninja
Owner

Yeah, let me think about how to better make some of this stuff work. Not straightforward given the current implementation and static constraints.

In the meantime, imports like import xml.etree.ElementTree as ET / from xml.etree import ElementTree will work without issue.

@hauntsaninja hauntsaninja reopened this Jun 12, 2023