itertools and feedback #1

Open
mattsb42 opened this issue Dec 24, 2019 · 1 comment

@mattsb42

Hey, I just wanted to say props for taking action and putting something out there to solve the problem you saw. And congrats on your first PyPI package. :)

I'd also like to point out a way to do this with the standard library and offer some feedback on the package.

Doing the same thing with the standard library

One of the oft-forgotten corners of the standard library is a wonderful module called itertools. This module contains all sorts of helpers for combining iterators in different ways. One of these is itertools.chain[1], which just takes several iterables and flattens them into one. That's not quite what we want, though. What we want is itertools.chain.from_iterable[2]. This is an alternate constructor for itertools.chain that takes a single argument: an iterable that yields iterables. These resulting iterables are then chained together.

As an example, I've put together a comparison of using the pattern that bbp does vs using these two itertools helpers:

import itertools

KEY = "foo"
DATA = [
    {"foo": [1, 2, 3]},
    {"foo": [4, 5, 6]},
    {"foo": [7, 8, 9]},
    # throw in some weird entries
    {"baz": ["a", "b", "c"]},
    {},
]


def paginator():
    """generator that yields data in the same shape as a boto3 paginator"""
    for each in DATA:
        yield each


def bbp_like(pager):
    """emulate the bbp behavior"""
    for page in pager:
        if KEY in page:
            for element in page[KEY]:
                yield element


# use list() to expand the generator so that we can compare the results
using_bbp_like = list(bbp_like(paginator()))

using_itertools = list(itertools.chain.from_iterable(
    (page.get(KEY, []) for page in paginator())
))

using_chain_directly = list(itertools.chain(
    *(page.get(KEY, []) for page in paginator())
))

assert using_bbp_like == using_itertools == using_chain_directly
# verify that we didn't pick up baz
assert all(isinstance(i, int) for i in using_itertools)

print(using_itertools)
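
One practical difference between the two itertools forms above: unpacking with `*` consumes the whole outer generator before `itertools.chain` is even called, while `chain.from_iterable` pulls one page at a time. With a real paginator each page is a network call, so this can matter. A minimal sketch, using a made-up `pages()` generator that records when each page is produced:

```python
import itertools

fetched = []  # records which pages have been produced so far


def pages():
    """Stand-in for a paginator: each page is 'fetched' only when requested."""
    for n in range(3):
        fetched.append(n)
        yield [n * 10, n * 10 + 1]


# from_iterable is lazy: asking for one element fetches only the first page
lazy = itertools.chain.from_iterable(pages())
first = next(lazy)
fetched_after_lazy = list(fetched)

# star-unpacking is eager: every page is fetched before chain() is called
fetched.clear()
eager = itertools.chain(*pages())
fetched_after_eager = list(fetched)

print(first, fetched_after_lazy, fetched_after_eager)
```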

Feedback

Use

Managing boto3 clients can get complicated. There are a lot of factors that someone might encounter that will require them to build a custom client (special credentials or other configuration, thread safety[3], client reuse to minimize run time, or any of a variety of other issues). Rather than trying to accommodate all of these possible scenarios, as well as having to pass through arguments to the paginator, I would recommend simply taking a paginator and lookup key as input. If you want to get fancy, you could even inspect the paginator's result keys to guess the lookup key:

>>> # some have more than one field; maybe default to the most likely?
>>> s3 = boto3.client("s3")
>>> paginator = s3.get_paginator("list_object_versions")
>>> paginator.result_keys
[{'type': 'field', 'children': [], 'value': 'Versions'}, {'type': 'field', 'children': [], 'value': 'DeleteMarkers'}, {'type': 'field', 'children': [], 'value': 'CommonPrefixes'}]
>>> # some just have one field; easy default :)
>>> kms = boto3.client("kms")
>>> paginator = kms.get_paginator("list_keys")
>>> paginator.result_keys
[{'type': 'field', 'children': [], 'value': 'Keys'}]
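
A rough sketch of what that interface could look like. The function names `guess_lookup_key` and `iter_items` are made up for illustration, and I'm assuming that each entry in `result_keys` is a compiled JMESPath expression whose original string is available as `.expression`; that is worth checking against the botocore version you target. It also only handles result keys that are plain top-level field names. The demo uses stand-in objects rather than a real boto3 paginator so that it runs without credentials:

```python
import itertools
from types import SimpleNamespace


def guess_lookup_key(paginator):
    """Return the single result key of a paginator, or raise if ambiguous.

    Assumes each entry in paginator.result_keys exposes .expression.
    """
    keys = [result_key.expression for result_key in paginator.result_keys]
    if len(keys) != 1:
        raise ValueError(
            "cannot guess lookup key from {!r}; pass one explicitly".format(keys)
        )
    return keys[0]


def iter_items(pages, key):
    """Lazily yield every element under `key` across all pages."""
    return itertools.chain.from_iterable(page.get(key, []) for page in pages)


# Stand-ins for a real paginator and its pages (hypothetical data)
fake_paginator = SimpleNamespace(result_keys=[SimpleNamespace(expression="Keys")])
fake_pages = [{"Keys": ["a", "b"]}, {}, {"Keys": ["c"]}]

key = guess_lookup_key(fake_paginator)
items = list(iter_items(fake_pages, key))
print(key, items)
```

With a real client, the caller would build the paginator however they need to and pass in `paginator.paginate(...)` as `pages`, which keeps all of the client configuration out of the helper.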

Package Metadata

I see you've already fixed the url in setup.py to point to the GitHub repository rather than your website. You might want to publish that change to PyPI; actually tracking down this repo ended up being rather complicated since you also don't link to this repo from the bbp page on your website.

Setuptools is notoriously underused in most projects (including most of my own). There is an argument that you might be interested in for this case: project_urls[4]. This lets you define multiple URLs that will be rendered on PyPI. For example, you can see that attrs defines multiple URLs[5] that are then rendered on PyPI[6] in addition to the "Homepage" link that comes from the url parameter.
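
As a sketch of how that might look here: the package name and every URL below are placeholders rather than the project's real values, and the labels are arbitrary strings that PyPI displays as given. I've written the arguments as a dict so the shape is easy to see; in setup.py they would be passed to `setup()`.

```python
# Keyword arguments for setuptools.setup(); all values are placeholders.
SETUP_KWARGS = {
    "name": "bbp",
    # rendered as the "Homepage" link on PyPI
    "url": "https://example.com/bbp",
    # each entry is rendered as an additional labelled link on PyPI
    "project_urls": {
        "Source": "https://example.com/bbp/source",
        "Bug Tracker": "https://example.com/bbp/issues",
    },
}

# In setup.py this would be used as:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```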

Licensing

How you license your project is entirely up to you, and I do not intend this as saying that you were wrong to pick any given license. My only intention is to provide some food for thought if this is not something you have encountered yet.

From an ideological perspective, I appreciate what the FSF is trying to do with the GPL, and I actually personally agree with those goals. However, in practice some of those goals (especially as expressed in GPLv3) can make it...difficult...for a business to use any such licensed software[7][8][9]. Depending on your perspective this might be a good thing or a bad thing.

In case you haven't seen it yet, GitHub offers a great resource[10] for helping you find the right OSS license for you.

[1] https://docs.python.org/3/library/itertools.html#itertools.chain
[2] https://docs.python.org/3/library/itertools.html#itertools.chain.from_iterable
[3] https://boto3.amazonaws.com/v1/documentation/api/latest/guide/resources.html#multithreading-multiprocessing
[4] https://packaging.python.org/guides/distributing-packages-using-setuptools/#project-urls
[5] https://github.com/python-attrs/attrs/blob/b78720245a7944e5a091a075b7e9784fc93be05f/setup.py#L14-L18
[6] https://pypi.org/project/attrs/
[7] https://opensource.com/article/17/2/decline-gpl
[8] https://pdfs.semanticscholar.org/b028/b6ee54c8a44a363481e9059491658ae87dc0.pdf
[9] https://www.synopsys.com/blogs/software-security/whos-afraid-gpl3/
[10] https://choosealicense.com/

@mdavis-xyz
Owner

Thanks for this feedback. I did read it the day you wrote it, but haven't had a chance to implement any of it yet. I will eventually.
