Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Python support for newspaper4k #4658

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

marcus-crane
Copy link

@marcus-crane marcus-crane commented Nov 7, 2024

Hi there,

This PR adds support for importing the Python module newspaper4k which is loaded under the newspaper namespace.

I should note that newspaper4k is a fork of the popular newspaper3k. At first I was going to add support for both but I realise this might end up in Windmill might pull in either 3k or 4k depending on list ordering or just a plain race condition.

As neat as magic loading in of libraries is, it seems a bit too magical for cases like this where you have to modify Windmill to support certain libraries. It seems like an aliasing redirect using the comment format could be neat ie; newspaper4k as newspaper; import newspaper

Anyway, thanks!


Important

Maps newspaper import to newspaper4k in lib.rs to avoid conflicts with newspaper3k.

  • Behavior:
    • Maps newspaper import to newspaper4k in PYTHON_IMPORTS_REPLACEMENT in lib.rs.
    • Ensures newspaper4k is used when newspaper is imported, avoiding conflicts with newspaper3k.

This description was created by Ellipsis for 004fd1f. It will automatically update as commits are pushed.

Copy link
Contributor

github-actions bot commented Nov 7, 2024

CLA Assistant Lite bot All contributors have signed the CLA ✍️ ✅

@marcus-crane
Copy link
Author

I have read the CLA Document and I hereby sign the CLA

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Reviewed everything up to 004fd1f in 47 seconds

More details
  • Looked at 12 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. backend/parsers/windmill-parser-py-imports/src/lib.rs:68
  • Draft comment:
    Consider changing the parameter type of replace_import to &str to avoid unnecessary cloning.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The replace_import function should take a reference to a string instead of a string to avoid unnecessary cloning.

Workflow ID: wflow_Jb6UyaZqLtMPz0ME


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

github-actions bot added a commit that referenced this pull request Nov 7, 2024
@marcus-crane
Copy link
Author

For anyone reading this in the future, I didn't have any luck trying to do this within the script editor due to Windmill trying to resolve import newspaper to a Python 2 module which would break my script.

I had a look at using an explicit requirements.txt file to get around inferred dependencies but couldn't get that to work for whatever reason.

In the meantime, I ended up just hacking around the whole inferred dependencies issue like so:

#extra_requirements:
#newspaper4k
import importlib

newspaper = importlib.import_module('newspaper')

This let me import a module with a different name without triggering Windmill's inferred dependency functionality.

@rubenfiszel
Copy link
Contributor

@marcus-crane

you can also just do:

#requirements:
#newspaper4k
import newspaper

I'm a bit on the fence of merging this bc i'm not sure if newspaper4k is more popular than newspaper3k at the moment

@marcus-crane
Copy link
Author

marcus-crane commented Nov 7, 2024

@marcus-crane

you can also just do:


#requirements:

#newspaper4k

import newspaper

I'm a bit on the fence of merging this bc i'm not sure if newspaper4k is more popular than newspaper3k at the moment

That doesn't work for me because newspaper is also a real package so Windmill tried to import both newspaper and newspaper4k. newspaper is a Python 2 package so it causes pip to bail out unless I'm missing something here 🤔

EDIT: Oh, actually I was using extra_requirements rather than requirements so I'll give that a try

@marcus-crane
Copy link
Author

Ah yes, that does work! I had misunderstood that requirements does override the inferred installation unlike extra_requirements.

As far as this PR, I don't personally need it anymore but given there's newspaper3k and newspaper4k both actively used, it does kind of a raise a shortcoming with the single overrides for inferred dependencies 😓

Perhaps the docs for requirements could be updated to make it a bit clearer that it can be used for cases where the package name doesn't match the import name 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants