Add google analytics or similar logging for static OBO website, and ancillary static websites #126

Open
cmungall opened this issue Sep 28, 2015 · 18 comments
Labels
attn: Outreach WG (Issues pertinent to outreach activities, such as user interactions and documentation)
attn: Technical WG (Issues pertinent to technical activities, such as maintenance of website, PURLs, and tools)
website (Issues related to the OBO Foundry website)

Comments

@cmungall
Contributor

No description provided.

@cmungall added the "website" label Sep 28, 2015
@alanruttenberg
Member

I'm not keen on doing so, because of tracking and privacy issues. Alternatively, make it opt-in on an ontology-by-ontology basis.

@cmungall
Contributor Author

Any update on thinking on this post-GDPR? E.g., https://support.google.com/analytics/answer/9019185?hl=en

@kltm
Contributor

kltm commented Apr 13, 2020

What is it that people want information on? If one wants to see how people move through pages, something like GA or Matomo is needed. If one wants to see downloads and ontology usage, plain ol' server logs are what one wants.

@wdduncan
Member

We operate on an "open" principle. Does this affect this principle? If so, how?

@cmungall
Contributor Author

cmungall commented Apr 13, 2020 via email

@nlharris added the "attn: Outreach WG" label Apr 13, 2020
@cmungall
Contributor Author

Copying @alanruttenberg's helpful response from email:

3 recent articles on alternatives

https://mscholz.dev/post/logging/

https://plausible.io/ - paid service, cost depends on traffic. Anyone know our current level of traffic?

https://medium.com/better-marketing/why-i-switched-from-google-analytics-to-fathom-287017424069 - also paid, but maybe their lite version would work for us: https://github.com/usefathom/fathom/blob/master/README.md

Other suggestions from various posts - I looked at HN search results for google analytics alternative within the last year.

https://goaccess.io/
https://count.ly/ (there is a self-hosted version)
https://simpleanalytics.com/?ref=iceland-blog (service, cost depends on traffic)
https://matomo.org/matomo-on-premise/ (can be self hosted)
https://github.com/frequencyanalytics/frequency (self hosted, open source)
https://github.com/parkr/ping (there's a docker image)

I haven't assessed these yet - in part because I don't know the details of our current setup. However, I'd be open to a call in which a few of us walk through these and try to narrow down to a plausible 1 or 2.

For the pay services there's always the option of contacting them and asking if they can give us a break for being a good cause. Not worth doing, though, unless we particularly like one of them.

@cmungall
Contributor Author

To answer @kltm's question: this is not about analyzing the logs for PURL accesses. We have this ticket on the purl repo for that: OBOFoundry/purl.obolibrary.org#63 (there is an analogous conversation about privacy from 2015 there too...)

This ticket is about the obofoundry.org site, which is hosted from this repo via jekyll/GH pages. We want to have a handle on how many people view the mega ontology table, how many people look at the individual ontology metadata pages, and how many people look at the principles.

Given the site is hosted in GH pages, we may be limited in options, as we are not committing to hosting the site ourselves at this time. Presumably GH keeps logs of visitors. I do not know if they provide tools for analyzing these. If not, then presumably the only option is client-side JS such as is used for GA.

This SO question is about how to do GA in GH pages:
https://stackoverflow.com/questions/43880617/log-visitors-on-github-page

@kltm
Contributor

kltm commented Apr 29, 2020

@cmungall Another approach might be to just deploy the site to S3 and get access to the bucket logs (which are easily turned on in AWS). We do this in a couple of places elsewhere and it works pretty well. As you already build the site statically with jekyll, you'd only have to add a step to push the output to S3 instead of GH.

Also noting that we use goaccess.io for GO (quite a nice log analyzer) and have some experience with Matomo (previously Piwik), a nice GA alternative but you have a little more overhead with the hosting.

@jamesaoverton
Member

The idea of moving to S3 has also come up in the context of the Dashboard. I think it's worth serious consideration. Although running our own build would be one more moving part, I see several benefits:

  • simplify the system, e.g. by avoiding the (brilliant) _config.yml hack
  • one-step builds ensure registry/ files are always up to date
  • fully integrate the dashboard
  • simpler, unified logging solution
  • maybe eventually migrate to Python (we don't use Ruby anywhere else)

Using GitHub Pages was a really great solution, but I think we're growing past it.

@alanruttenberg
Member

alanruttenberg commented Apr 29, 2020 via email

@nlharris added the "attn: Technical WG" label Oct 13, 2020
@nlharris
Contributor

see also #46

@cmungall
Contributor Author

I am currently using a GA tag from a different project (linkml) in RO:

https://github.com/oborel/obo-relations/blob/master/mkdocs.yml

This is suboptimal

@jamesaoverton
Member

My preference would be not to use Google Analytics, but right now I don't have the time or resources to provide a good alternative.

@alanruttenberg
Member

alanruttenberg commented Jan 11, 2022 via email

@kltm
Contributor

kltm commented Jan 11, 2022

Now that there is no proxy server running and we are directly on GH pages, all the logging solutions are off the table.
GA replacements such as https://plausible.io/ and https://matomo.org/ exist, but they require either payment to use their cloud version or additional setup and servers to run on one's own.
A good decision point would be understanding what additional resources could be applied. If the answer is "0", GA is likely the way to go. If the answer is "a bunch of money", alternate providers could be easily considered and used in their cloud form. If the answer is "time and money", the field is wide open.

@alanruttenberg
Member

alanruttenberg commented Jan 11, 2022 via email

@kltm
Contributor

kltm commented Jan 11, 2022

From above (like #126 (comment) and #126 (comment)), if only ontology download information is desired and users are transiting the PURLs, this information could be mined from non-JS/non-browser sources, like PURL server logs.
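As a rough illustration of mining download counts from server logs, here is a minimal Python sketch. It assumes the logs are in the common Apache/nginx "combined" format and that ontology downloads go through paths shaped like `/obo/<id>.owl`; both are assumptions, and the function name `count_ontology_requests` is made up for this example. The actual PURL server logs may look different.

```python
import re
from collections import Counter

# Assumed "combined" access log layout; the real PURL server format may differ.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

# Assumed PURL shape for ontology files, e.g. /obo/go.owl or /obo/ro.obo.
ONTOLOGY_PATH = re.compile(r'^/obo/(?P<ont>[A-Za-z0-9_-]+)\.(?:owl|obo|json)$')


def count_ontology_requests(lines):
    """Count GET requests per ontology ID, skipping 4xx/5xx responses."""
    counts = Counter()
    for line in lines:
        entry = LOG_LINE.match(line)
        if entry is None or entry.group("method") != "GET":
            continue  # unparseable line or not a download-style request
        if int(entry.group("status")) >= 400:
            continue  # ignore client and server errors
        path = ONTOLOGY_PATH.match(entry.group("path"))
        if path:
            # Normalize case so /obo/GO.owl and /obo/go.owl are counted together.
            counts[path.group("ont").lower()] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical log lines, for illustration only.
    sample = [
        '203.0.113.5 - - [11/Jan/2022:10:00:00 +0000] '
        '"GET /obo/go.owl HTTP/1.1" 302 512 "-" "curl/7.79"',
        '203.0.113.6 - - [11/Jan/2022:10:00:01 +0000] '
        '"GET /obo/ro.obo HTTP/1.1" 200 2048 "-" "robot"',
    ]
    for ontology, hits in count_ontology_requests(sample).most_common():
        print(ontology, hits)
```

Note that this counts requests, not distinct users, and it needs no client-side JavaScript, so it avoids most of the tracking concerns raised earlier in the thread.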

@cmungall changed the title from "Add google analytics" to "Add google analytics or similar logging for static OBO website, and ancillary static websites" Jan 11, 2022
@cmungall
Contributor Author

I updated the title to make it clear that this is just about the static pages on https://obofoundry.org/

We have tickets in the purl tracker about logging class and ontology PURL accesses (which may be the more meaningful statistic)

@alanruttenberg

There was another thread on this, where I also noted my preference that we not use google, and offered a number of alternatives.

You did indeed, in an email thread I believe; this was very helpful. I copied your suggestions to a comment in this issue (#126 (comment)), and @kltm also provided some above. But my understanding is that this would require either coordinating payments or switching to a different infrastructure (we use GH pages for static hosting).

I can offer grunt labor if someone else makes decisions and points me at a task that would help.

Most welcome, but it's not clear to me what that labor would be.

My understanding of this thread is that for our current static site, the options are GA or no logging. We lack consensus on using GA, so I'm happy to close this, and reopen if we ever switch our infrastructure.

I think static sites for ancillary projects, whether ontologies (COB, RO) or tools (ROBOT, ODK), should be influenced by the central OBO decision. But it seems that if, say, we want to use GA for the RO static sites, then we wouldn't use an OBO-wide GA token, as such a thing will never exist; we would mint a new one instead.

8 participants