Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Export html #9

Open
wants to merge 27 commits into
base: master
Choose a base branch
from
Open

Conversation

dschick
Copy link

@dschick dschick commented Feb 17, 2019

there are still some problems with caching - this might be related to the use of docker:
-once an export is created its content does not change, although e.g. new entries have been added at any recreation

I am not sure if all logbook names are encoded properly to html filenames.
Images are not included properly, yet.
Need to do some css formating of the output html file.

@dschick
Copy link
Author

dschick commented Feb 18, 2019

FollowUps are also not included yet.

@johanfforsberg
Copy link
Contributor

Looks promising! You might as well change the PDF export to be based on the same HTML too.

@dschick
Copy link
Author

dschick commented Feb 18, 2019

thanks, yeah some refractoring seems to be obvious here.
However, I am a bit stuck with the caching which is I guess a nginx issue.
As I said, the html is not recreated after changing the content of a logbook.
Do you have any idea how to solve that?

as_attachment=True,
attachment_filename=("{logbook.name}.html"
.format(logbook=logbook)))
attachment_filename=(slugify("{logbook.name}.html"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think this the slugifying is really necessary? Shouldn't UTF-8 be OK in filenames nowadays? I'm asking because it is another dependency and I always like to think twice before adding those.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was also not sure about that either.
If you think filenames with '`´" etc. are okay, than I can remove it, no problem. I have not tested this for now myself.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll do some testing.

@johanfforsberg
Copy link
Contributor

However, I am a bit stuck with the caching which is I guess a nginx issue.
As I said, the html is not recreated after changing the content of a logbook.
Do you have any idea how to solve that?

Hm, I just tried your branch with docker compose and as far as I can tell it works as intended. Can you describe how to reproduce the problem, e.g. exactly what commands are you running?

@dschick
Copy link
Author

dschick commented Feb 18, 2019

okay, if you export a logbook to HTML and after that, e.g. change or add an entry to that logbook; do again the HTML export --> I get two time the exact same export result, altough the underlaying logbook has changed.

@johanfforsberg
Copy link
Contributor

I tried this and I see the changes I expected.

Are you sure you're looking in the correct output files? When I tested with "wget" it saved each file after the first with a new suffix ".1", ".2" and so on tacked on, instead of overwriting the previous. So the first HTML file was never changed.

@johanfforsberg
Copy link
Contributor

I did some poking around and I think the cache issues can be solved by doing something like this in entries.py:

            response = send_file(html, mimetype="text/html",
                                 as_attachment=True,
                                 attachment_filename=("{logbook.name}.html"
                                                      .format(logbook=logbook)))
            response.cache_control.no_cache = True
            return response

Still thinking about the slugify part. I don't think you need to set a filename for the tempfile, you can just let the module generate a random filename for you there. But for the downloaded filename I agree that "international" characters, as well as spaces and punctuation are usually annoying. And python-slugify seems pretty lightweight. I leave it up to you to decide since you requested the feature anyway (unless the MAXIV guys have an opinion on this).

A suggestion: it might be nice to also tack on a timestamp to the filename, and/or include it in the HTML itself. Always nice to have a record of when the export was done!

@dschick
Copy link
Author

dschick commented Feb 20, 2019

I will try the response.cache_control.no_cache = True command and add it in case it also works for me.
I agree that tagging the export (in the document and/or in the filenam) with a timestamp is a good idea.

Would you prefer to use python re module instead of slugify?

@johanfforsberg
Copy link
Contributor

Slugify is OK.

@johanfforsberg
Copy link
Contributor

When thinking about your suggestion about displaying all entries in a logbook I realised that there might be some potential performance problems here too. If a logbook is very large, will the export take a long time or use lots of memory? It could be a problem mainly because flask is synchronous, so it might block other requests.

Maybe we should add arguments (e.g. "n" and "offset" like the entries API) so that the logbook can be generated in "chunks", e.g. of 100 entries per file. That way it would be possible to export even a huge logbook without eating all system resources. It would require several http calls but that's OK I think?

Anyway, perhaps this is a premature optimisation right now, but it's worth testing a bit to see what performance we get.

@dschick
Copy link
Author

dschick commented Feb 22, 2019

obviously you are think in a bit larger scale than me :)
okay, lets first do a proper html (later also pdf) export an then see how demanding it is for the server.
I agree, that chunking the export with a maximum number of entries per export should be generally no problem.
I guess one needs to think of a proper user interface for that. So one might need some kind of intermediate export page or have a drop-down list selecting the chunk to export. The dropdown could looks like this for example:

  • 1-100
  • 101-200
  • 200-281

@johanfforsberg
Copy link
Contributor

Yes, well the thing is that this project was initially started to replace the use of "elog" at maxiv, which had already been in use for years and contained some quite large logbooks. That's why there is an elog import script. So we were already from the start roughly aware of the scale we had to handle, which is why most parts of elogy already works like this, e.g. the entries list only loads 100 at a time. It's in the back of my mind so to speak :)

The user interface for this will need some thought, I agree. Maybe we should make a separate export page for it so the main interface does not get too crowded. But let's focus on the functionality first. I guess we could also provide a script that downloads and zips the files, or something.

@johanfforsberg
Copy link
Contributor

Any updates on this @dschick , still interested? Otherwise I volunteer to finish this one up :)

@dschick
Copy link
Author

dschick commented Apr 10, 2019

hej @johanfforsberg , a bit of help would be great, e.g. for implementing the pictures. I am currently and most likely till middle of May pretty busy.

Tack so mycket

@johanfforsberg
Copy link
Contributor

OK, I'm having a look and it's clearly a bit trickier than I expected. Parsing HTML is never fun, but fortunately we're already pretty much doing the same thing (but in the opposite direction) when reading posted entries, so I'll re-purpose some of that code. Hope to have something testable soon.

@johanfforsberg
Copy link
Contributor

@dschick since it looks like you're working on this again, I just want to check if you saw my PR #16 where I did some further work on this. I think it's mostly working but needs testing and polish. Feel free to check it out and do whatever you like with it.

@dschick
Copy link
Author

dschick commented Jul 24, 2019

thanks @johanfforsberg for pointing that out. Indeed I also finished some primitive html export.
I thought the project has moved to gitlab though?

@johanfforsberg
Copy link
Contributor

Yes it seems development has moved, but I'm not involved with that. I guess it's necessary to make a new "merge request" at gitlab in order to get it merged.

@dschick
Copy link
Author

dschick commented Jul 24, 2019

now I am a bit confused.
you started the elogy under your personal account - than it was forked by MaxIV-KitsControls and then they moved it to GitLab.
So shouldn't we move all the PRs all to Gitlab?

@johanfforsberg
Copy link
Contributor

johanfforsberg commented Jul 25, 2019

To clarify, I was working at MAXIV when I started developing elogy as a personal project, and I never got around to moving it to the MAXIV organisation properly since I left before it was taken into production. So it's really a MAXIV project now. I'm not using it anymore, I just sometimes hack on it for fun :)

Someone already migrated the PR:s a few months ago, but I don't really know how that works. E.g. it looks like your recent changes to your branch aren't visible in the gitlab MR (https://gitlab.com/MaxIV/Elogy/merge_requests/9), maybe because they refer to github. Also, #16 was made after the move (but before I became aware of it). Since our branches overlap I think it would just be confusing to move mine there anyway, better if you just take whatever you want from it and incorporate into your branch.

Anyway, I guess we should move any further disussion to gitlab, I just replied here since I got a github notification. I think it would be best if this repo was archived with a note about the move, but I'll leave it to the MAXIV guys.

@dschick
Copy link
Author

dschick commented Jul 25, 2019

alright, now a few things are clear!

I will try to put together #9 and #16 and then do a merge request to gitlab and I will do all further discussions on gitlab as well.

Best

Daniel

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants