Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for specifying explicit Import file formats #1164

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

aravindh-krishnamoorthy
Copy link

@aravindh-krishnamoorthy aravindh-krishnamoorthy commented Nov 12, 2024

Note

This PR is still a WIP. Some functionality, documentation, and tests are not yet implemented. However, community comments and suggestions are welcome.

Summary

Add support for:

  • Import[file, "fmt"], where "fmt" is a valid format from $ImportFormats. If not, "fmt" is treated as the "element" argument.
  • Import[file, {"fmt", element...}], where, again, "fmt" is a valid format. If not, "fmt" is again treated as an "element" argument.
  • The functionality is already implemented, but lacks documentation and needs a bugfix.

Work progress

  • Implementation for Import[file, "fmt"]
  • Implementation for Import[file, {"fmt", element...}]
  • Bugfix for Import[file, {"fmt", element...}] - only check the first element for file format*. (not needed as well with old implementation).
  • Documentation updates.
  • Test updates.
    • Import["file.svg"] -> Error Import::fmtnosup for "SVG".
    • Import["file.svg", "XML"] -> Imported as XML.
    • Import["file.svg", {"XML"}] -> Imported as XML.
    • Import["file.svg", {"XML", "XML"}] -> Error Import::noelem for the second "XML".

*Otherwise, it's not possible to get elements with the same names as file formats.

General rant by the author

Using an ambiguous argument, which could mean "fmt" or "element" does not seem to be a good design choice by Wolfram language developers...

return self.eval_elements(
filename, ListExpression(element), evaluation, options
)
if element.get_string_value() in IMPORTERS.keys():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it would be a good idea to move the implementation of these methods to mathics.eval.files_io

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the comments, @mmatera. Would it make sense to move all related (to import/export) eval_xxx functions there?

First, I will implement the 2nd checkbox above in the PR description, and then, soon after receiving your response, will move the implementation to mathics.eval.files_io.

Copy link
Member

@rocky rocky Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are methods that start with eval in mathics.builtins and there are functions that start with eval_ in mathics.eval.

The methods have a funny docstring that indicates the function signature from Mathics3's perspective. This indicates to the Mathics3 interpreter when these methods get invoked. See https://mathics-development-guide.readthedocs.io/en/latest/extending/developing-code/extending/tutorial/1-builtin.html for an example.

Those kinds of evaluation methods can't be moved anywhere and have to be in mathics.builtins.

In the past everything was shoved into the class that implements a Mathics3 builtin function. These classes were big, harder to test in isolation, and were hard to understand.

We have been breaking these down. In particular, other than parameter checking and parameter conversion, any code in a Mathics3 Builtin Function (implemented as a Python class) that has any substance should be added as a (Python) function inside mathics.eval using the corresponding method name from the class. Using the same or similar name is intended to simplify understanding the correspondence.

In the future, if we are to be able to support instruction-like execution, we will need this kind of code in functions not as method objects of classes.

Does this answer your question and make sense?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the explanation and the context, @rocky. Indeed, this helps. Since the changes for this PR are small, I'll add them to the existing methods in mathics.builtins. From the next PR on, I'll implement substantial code as helper functions in mathics.eval.

@rocky
Copy link
Member

rocky commented Nov 13, 2024

@aravindh-krishnamoorthy Why was it you were confused about whether Import had not been implemented for that aspect you wanted? Is there something we could have done, or should do, to make it more likely not to confuse others in the future?

@aravindh-krishnamoorthy
Copy link
Author

@aravindh-krishnamoorthy Why was it you were confused about whether Import had not been implemented for that aspect you wanted? Is there something we could have done, or should do, to make it more likely not to confuse others in the future?

Thank you for your prompt response. The reason I started implementing this is because I had to import an "SVG" file as an "XML." Instead of trying it out (which would be better in retrospect), I looked at the documentation Sec. 29.3.9 (as of now) Import, which does not show Import[file, "fmt"] and Import[file, {"fmt", elements}]. So, I decided to implement them.

While implementing the second one, I saw that this was already (quite elegantly imho) done in Import._import:

# Determine file type
for el in elements:
if el in IMPORTERS.keys():
filetype = el
elements.remove(el)
break
else:
filetype = determine_filetype()

So, I redid this PR to a rather lackluster documentation update. Once I add the tests mentioned above, I think this PR will be ready for final review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants