Proposal for plugin analytics #12

Draft
wants to merge 6 commits into
base: main

uschmidt83
Member

@uschmidt83 uschmidt83 commented Mar 25, 2022

This pull request adds privacy-friendly and opt-in functionality to transmit anonymous plugin usage data.

Concretely, anonymous usage data is transferred to and aggregated by Plausible Analytics (using the plausible-events package), with all recorded data being publicly visible at https://plausible.io/stardist-napari.

Of course, sharing your data is a sensitive topic and nobody likes to be monitored. I'll try my best to convince you below that the proposed implementation preserves your privacy and doesn't record any personally identifiable information. I welcome any constructive feedback, especially if you disagree.

Why?

As developers, it would be very useful to see how many people use the plugin and which features and parameters are important to them. This would allow us to prioritize our time more effectively, e.g. to improve highly-used features first, or refactor rarely-used features. Furthermore, by knowing the versions of the plugin and important dependencies in use, we can better estimate if/when to move to newer versions of napari and other dependencies.

Apart from a recently received napari Plugin Accelerator Grant, our plugin and the underlying StarDist package are non-commercial open-source software, developed and supported largely on a voluntary basis. Hence, it would also be quite motivating to see (in numbers) that our work is actually appreciated and used.

What?

In this PR, I propose to record two types of events, namely plugin Launch (opening the plugin in napari), and plugin Run 2D / Run 3D (executing the plugin with a 2D / 3D model):

  • During plugin Launch, an identifier of the platform (operating system, etc.) and the versions of some dependencies are transmitted to the analytics server.

    launch_props = {
        "platform": platform.platform().strip(),
        "python": platform.python_version(),
        "napari": napari.__version__,
        "magicgui": _magicgui_version,
        # "tensorflow": tensorflow.__version__,
        # "keras": keras.__version__,
        "csbdeep": csbdeep.__version__,
        "stardist": stardist.__version__,
        "stardist-napari": __version__,
    }
    PE.event("Launch", launch_props)

  • During plugin Run 2D / Run 3D, the shape (but not the name) of the image and all plugin parameters are transmitted to the analytics server. However, to preserve anonymity, we do not transmit the names of custom (user-trained) models. Furthermore, we round up the (non-channel) dimensions of the input image to the next power of two, since we're only interested in approximate image sizes.

    def _model_name():
        # only disclose model names of "public" registered/pre-trained models
        model_type, model_name = model_selected
        if model_type in models_reg:
            return (
                model_name
                if (model_name in models_reg_public.get(model_type, {}))
                else "Custom (registered)"
            )
        else:
            return "Custom (folder)"

    def _shape_pow2(shape, axes):
        return tuple(
            s if a == "C" else int(2 ** np.ceil(np.log2(s)))
            for s, a in zip(shape, axes)
        )

    run_props = {
        "model": _model_name(),
        "image_shape": _shape_pow2(x.shape, axes),
        "image_axes": axes,
        "image_norm": (perc_low, perc_high) if norm_image else False,
        "image_scale": input_scale,
        "image_tiles": n_tiles,
        "thresh_prob": prob_thresh,
        "thresh_nms": nms_thresh,
        "output_type": {t.value: t.name for t in Output}[output_type],
        "output_cnn": cnn_output,
        "norm_axes": norm_axes,
    }
    if "T" in axes:
        run_props["timelapse"] = {t.value: t.name for t in TimelapseLabels}[
            timelapse_opts
        ]
    run_event = {StarDist2D: "Run 2D", StarDist3D: "Run 3D"}[type(model)]
    PE.event(run_event, run_props)
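
To make the two anonymization steps in the Run events concrete, here is a small standalone sketch. It mirrors the logic of `_shape_pow2` and `_model_name`, but uses `math` in place of NumPy and takes the model registries as plain arguments so that it runs on its own. The registry contents and model names below are made up for illustration and are not taken from the plugin.

```python
import math


def shape_pow2(shape, axes):
    """Round every non-channel dimension up to the next power of two."""
    return tuple(
        s if a == "C" else int(2 ** math.ceil(math.log2(s)))
        for s, a in zip(shape, axes)
    )


def anonymized_model_name(model_type, model_name, registered, public):
    """Return the model name only if it is a public pre-trained model."""
    if model_type in registered:
        if model_name in public.get(model_type, {}):
            return model_name
        return "Custom (registered)"
    return "Custom (folder)"


# Hypothetical registries, for illustration only.
registered = {"2D": {"public_model", "lab_internal_model"}}
public = {"2D": {"public_model"}}

print(shape_pow2((3, 1000, 1024), "CYX"))  # channel axis is left untouched
print(anonymized_model_name("2D", "public_model", registered, public))
print(anonymized_model_name("2D", "lab_internal_model", registered, public))
print(anonymized_model_name("folder", "/some/path", registered, public))
```

With this rounding, a 1000 x 1024 image and a 1024 x 1024 image are reported with the same shape, so the exact size of an image cannot be read back from the event.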

How?

Opt-in

Plugin analytics are opt-in, i.e. nothing is recorded without the user explicitly enabling sharing of usage data. You can change your decision at any time.
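
As a rough illustration of what "opt-in" means in code, the sketch below drops every event until consent has been switched on, and stops again as soon as it is switched off. The class name `UsageReporter` and the `send` callable are hypothetical and only stand in for whatever the plugin actually uses; this is not the implementation in this PR.

```python
class UsageReporter:
    """Forward events to `send` only while the user has opted in."""

    def __init__(self, send):
        self._send = send      # hypothetical callable(event_name, props)
        self.enabled = False   # opt-in: nothing is sent by default

    def set_consent(self, enabled):
        # The user can switch sharing on or off at any time.
        self.enabled = bool(enabled)

    def event(self, name, props=None):
        if not self.enabled:
            return False       # dropped, nothing leaves the machine
        self._send(name, dict(props or {}))
        return True


sent = []
reporter = UsageReporter(send=lambda name, props: sent.append((name, props)))

reporter.event("Launch", {"python": "3.9"})  # ignored: no consent yet
reporter.set_consent(True)
reporter.event("Launch", {"python": "3.9"})  # forwarded
reporter.set_consent(False)
reporter.event("Run 2D", {})                 # ignored again
print(sent)
```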

Transparent

All recorded data will be publicly visible at https://plausible.io/stardist-napari.

Privacy-friendly analytics platform

I have chosen Plausible as the analytics platform, which calls itself a "simple and privacy-friendly Google Analytics alternative". Although Plausible is primarily an open source web analytics software (which can be self-hosted), it does support recording of custom events via an API, which we use here. Furthermore, Plausible describes itself as an independent, self-funded, and debt-free EU-based company, which uses a straightforward subscription-based business model (i.e., I pay for using this service).

Your public IP address is obtained via ipify.org and transmitted to Plausible for the purposes of unique user counting and approximate geolocation (city granularity). However, ipify and Plausible do not store your IP address.
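
For readers who want to see what such a custom event could look like on the wire, here is a hedged sketch based on Plausible's public events API (a JSON `POST` to `/api/event`, with the client IP passed in the `X-Forwarded-For` header). It only builds the request and does not send it. The helper name, the placeholder `url` value, the string conversion of property values, the user agent, and the example version numbers are assumptions for illustration; the `plausible-events` package may do this differently.

```python
import json
import urllib.request

PLAUSIBLE_ENDPOINT = "https://plausible.io/api/event"


def build_event_request(name, props, domain, public_ip, user_agent):
    """Build (but do not send) a custom-event request for Plausible."""
    payload = {
        "name": name,
        "domain": domain,
        # The API expects a URL even for non-web events; this is a placeholder.
        "url": f"app://{domain}/",
        # Assumption: property values are flattened to strings before sending.
        "props": {key: str(value) for key, value in props.items()},
    }
    headers = {
        "Content-Type": "application/json",
        "User-Agent": user_agent,
        # The client IP is passed along so the server can count unique users.
        "X-Forwarded-For": public_ip,
    }
    return urllib.request.Request(
        PLAUSIBLE_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# In real use the IP would come from a lookup service such as
# https://api.ipify.org; a documentation-range address stands in for it here.
request = build_event_request(
    name="Launch",
    props={"python": "3.9.7", "napari": "0.4.15"},
    domain="stardist-napari",
    public_ip="203.0.113.7",
    user_agent="stardist-napari-sketch",
)
print(request.full_url)
print(json.loads(request.data))
```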

@psobolewskiPhD
Contributor

As a user, this is a really solid and respectful proposal.
I can see how it benefits you—and it could benefit napari and napari devs in general—but you've clearly given privacy a lot of thought so it harms me not.
I really want to highlight that I strongly think that with the napari ecosystem still developing, this type of data can be extremely valuable, but your approach makes it public, so the value is shared with the ecosystem. This is a really commendable approach.
I would/will opt in as soon as this is available.

@jni

jni commented Mar 30, 2022

I love this Uwe! ❤️ I agree with all of @neuromusic's comments in the Zulip thread. It would be great to make this generally used in napari and usable by all napari plugins.

As @neuromusic has pointed out, single/few points of control in the source code are important. I haven't looked at the implementation details but one idea would be to make a single decorator, which when added to a function records the function call and approximate/anonymised parameters.
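
A minimal sketch of what such a decorator could look like is given below. All names (`record_call`, `report`, `anonymize`) are hypothetical and nothing here exists in the plugin or in `plausible-events`. The sketch only inspects keyword arguments, and it leaves the decision about what is safe to record to an explicit `anonymize` function supplied by the plugin author.

```python
import functools


def record_call(event_name, report, anonymize):
    """Report each call of the decorated function as one analytics event.

    `report(event_name, props)` and `anonymize(kwargs) -> props` are
    hypothetical callables supplied by the plugin author.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            try:
                report(event_name, anonymize(kwargs))
            except Exception:
                pass  # analytics must never break the plugin itself
            return result
        return wrapper
    return decorator


events = []


def keep_thresholds_only(kwargs):
    # Whitelist: only parameters known to be harmless are passed on.
    allowed = {"prob_thresh", "nms_thresh"}
    return {k: v for k, v in kwargs.items() if k in allowed}


@record_call(
    "Run 2D",
    report=lambda name, props: events.append((name, props)),
    anonymize=keep_thresholds_only,
)
def predict(image_path, prob_thresh=0.5, nms_thresh=0.4):
    return f"labels for {image_path}"


predict("/home/alice/secret.tif", prob_thresh=0.6)
print(events)  # the file path never reaches the recorded event
```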

Another idea from @tlambert03's napari-error-reporter package is that a user's opt-in choice should be invalidated by new versions of plausible-events, so that users can be sure that they don't opt in to one thing and then get a different thing after they update their conda environment.
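
One way to picture this version-bound consent is sketched here: the stored choice remembers the version it was given for and is treated as missing once the installed version differs. The function names and the stored dictionary layout are assumptions made for illustration, not how napari-error-reporter or plausible-events actually store consent.

```python
def give_consent(current_version):
    # Remember which version the user agreed to.
    return {"opted_in": True, "version": current_version}


def consent_is_valid(stored, current_version):
    """Consent counts only if it was given for the installed version."""
    return bool(
        stored
        and stored.get("opted_in") is True
        and stored.get("version") == current_version
    )


stored = give_consent("0.1.0")
print(consent_is_valid(stored, "0.1.0"))  # same version: consent still applies
print(consent_is_valid(stored, "0.2.0"))  # after an update: ask the user again
print(consent_is_valid(None, "0.2.0"))    # never asked: treated as opted out
```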

Anyway, love this, and I'm super happy for stardist to be the guinea pig here and then napari/other napari plugins following suit. 😂 Thank you!

@uschmidt83
Member Author

> It would be great to make this generally used in napari and usable by all napari plugins.

I guess we should get a broader set of responses from other plugin devs, maybe most aren't interested?

> As @neuromusic has pointed out, single/few points of control in the source code are important.

I agree.

> I haven't looked at the implementation details but one idea would be to make a single decorator, which when added to a function records the function call and approximate/anonymised parameters.

I doubt that it's going to be that easy/automated. I think plugin devs really have to think about what to record, and how to ensure that the data is properly anonymized.

> Another idea from @tlambert03's napari-error-reporter package is that a user's opt-in choice should be invalidated by new versions of plausible-events, so that users can be sure that they don't opt in to one thing and then get a different thing after they update their conda environment.

I hadn't thought about that. Right now, the plausible-events package is merely a simple tool that doesn't collect any data on its own.

> Anyway, love this, and I'm super happy for stardist to be the guinea pig here and then napari/other napari plugins following suit. 😂 Thank you!

I don't know if I have the time for this plugin to be the guinea pig ;) Also not sure how to proceed... merge and release this, or wait for a broader discussion on this?

@jni

jni commented Apr 1, 2022

> I don't know if I have the time for this plugin to be the guinea pig ;)

Oh LOL I thought the whole point of this PR was for that to happen? 😂

> Also not sure how to proceed... merge and release this, or wait for a broader discussion on this?

This depends on your appetite for risk, and I guess also on @neuromusic's. 😂 It's a bit unfair, but it's easier for you @uschmidt83 to propose and deploy such a scheme than it is for CZI. But anyway, the fact that we all like it (and I have in general been extremely risk averse about these things) is a good sign!

> I guess we should get a broader set of responses from other plugin devs, maybe most aren't interested?

I'd be surprised, I'd say it's more likely that most aren't aware of the possibility. Generally we are all a bunch of academics and usage statistics are a good thing.

- Move code to separate file
- Update 'plausible-events' dependency (package is now on PyPI)
- Add 'Consent' event (for opt-in / opt-out)
- Add more package versions to 'Launch' event