[Do not merge] 1st Overview Unet Model #9

Draft
wants to merge 12 commits into
base: master
from


hanfried
Collaborator

@hanfried hanfried commented Sep 8, 2019

I processed the diverse dataset (~1000 images) from the Dropbox.
The diff will be unreadable, and GitHub does not like to show large notebooks, so I added a copy as 1stOverview.html on my web space.
It's not production-ready: right now the notebook expects a local Dropbox folder (and the models are saved via data version control on my own web host), but I think we can change that easily when the hackathon starts.

I implemented a straightforward Unet model with fastai (that's much faster for me, and we get all the data visualizations for free). tensorflow.keras might have the advantage that we could scale it across more than one GPU more easily (training is already painful on my not-so-bad local GTX 1080), but I'm afraid that if we start doing this, we won't do anything else for the rest of the week. Still, I would be fine with rewriting it for Keras if you prefer that. Prototyping is always easier in fastai...

The biggest technical problem right now, IMHO, is the low batch size (so batch normalization does not work very well), and so training is really slow. I think the best approach here is to discuss it with our mentor.
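To make the batch size concern concrete, here is a minimal sketch (plain Python, not taken from the notebook) of how noisy per-batch mean estimates become as the batch shrinks. It assumes unit-variance Gaussian activations, and the function name `batch_mean_spread` is made up for illustration. Batch normalization normalizes with such per-batch statistics, so a larger spread means noisier normalization.

```python
import random
import statistics


def batch_mean_spread(batch_size, n_batches=2000, seed=0):
    """Spread (standard deviation) of per-batch means.

    Assumption: activations are drawn from a unit-variance Gaussian.
    In theory the spread shrinks like 1 / sqrt(batch_size).
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(0.0, 1.0) for _ in range(batch_size))
        for _ in range(n_batches)
    ]
    return statistics.pstdev(means)


# Compare a few batch sizes; smaller batches give noisier statistics.
for bs in (2, 8, 32):
    print(bs, round(batch_mean_spread(bs), 3))
```

With a batch size of 2, the estimated mean fluctuates roughly four times as much as with a batch size of 32, which is one reason small batches and batch normalization do not combine well.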

On the content side, I grabbed all the labels directly from the XML file (without using the API; it was easier for me this way). But I think the labels are too detailed now, and the network is still very much in the process of learning to distinguish different semantic kinds of paragraphs (which are still paragraphs). I'm looking forward to talking with you domain experts about this. I think this makes the learning process much harder (in the end, the network will now also have to learn to read a bit :-o).
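One possible way to reduce the label detail would be to map the fine-grained labels onto a few coarse classes before training. The sketch below only illustrates the idea: the label names, the coarse classes, and the helpers `coarsen` and `coarsen_mask` are all hypothetical and are not the ones used in the notebook or the XML files.

```python
# Hypothetical fine-to-coarse mapping; the real label names come from the
# XML annotations and will differ.
COARSE_CLASS = {
    "paragraph": "text",
    "heading": "text",
    "footnote": "text",
    "caption": "text",
    "image": "graphic",
    "table": "table",
}


def coarsen(label, default="other"):
    """Map one fine-grained label to its coarse class."""
    return COARSE_CLASS.get(label, default)


def coarsen_mask(mask):
    """Apply the mapping to a 2D mask given as a list of rows of labels."""
    return [[coarsen(label) for label in row] for row in mask]
```

Which labels should be merged is exactly the question for the domain experts; the mapping table is the only part that would need to change.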

I'll let it run a bit longer, just to see whether it still progresses, but close to 100 epochs is already too long to work with interactively.
But just look at the results: I think we have a good starting point for the hackathon and for what we might achieve with a segmentation model that tries to classify each pixel (right here on downscaled images).
I guess we will be able to improve it and might still add some classical image processing, but I'd say it has a lot of potential...

What is of course still missing is closing the gap to tesseract, so we still have something to do next week. I'll only look at the tesseract API this evening, but won't implement anything.

hanfried and others added 9 commits September 7, 2019 15:37
Not really working.
Checking for contains/within is too tight (I guess the containers are
touching or slightly overlapping).
So, here is a TODO for better logic.
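One candidate for the "better logic" in this TODO is to replace the strict contains/within test with an overlap ratio, so that containers that touch or slightly overlap still count as nested. This is a sketch under assumptions, not the code from this PR: boxes are taken to be axis-aligned `(x0, y0, x1, y1)` tuples, and the names `mostly_within` and `min_ratio` are made up for illustration.

```python
def area(box):
    """Area of an axis-aligned box (x0, y0, x1, y1); degenerate boxes give 0."""
    x0, y0, x1, y1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def intersection(a, b):
    """Intersection box of a and b; it has zero area if they do not overlap."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


def mostly_within(inner, outer, min_ratio=0.9):
    """True if at least min_ratio of inner's area lies inside outer."""
    inner_area = area(inner)
    if inner_area == 0:
        return False
    return area(intersection(inner, outer)) / inner_area >= min_ratio


# A region that sticks out slightly on the left still counts as inside.
print(mostly_within((-2, 10, 50, 60), (0, 0, 100, 100)))
```

The threshold of 0.9 is an arbitrary starting value and would need tuning against the actual annotations.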
Have CUDA memory problems, so will need to check it on a better GPU.
Right now it's pointing to my own webserver,
which is not the desired final state.
@hanfried hanfried requested review from kba, wrznr and bertsky September 8, 2019 13:52
@wrznr wrznr changed the title 1st Overview Unet Model [Do not merge] 1st Overview Unet Model Dec 6, 2019
@hanfried hanfried marked this pull request as draft July 8, 2020 08:24