Idea #1

Open
mrojew opened this issue Dec 1, 2022 · 3 comments

Comments

@mrojew
Contributor

mrojew commented Dec 1, 2022

@tgorochowik

  1. A neural network that detects the color format at some fixed resolution:
    • We think we could simply search over the width dimension. What step should we use when increasing the width value in that case?
  2. After recognizing the color format, we pass the data to a second network that searches for the correct resolution.

But we are not so sure about Neural Network capabilities in color recognition

The other option is to build a single neural network that does everything, but the complexity of that solution may be too high.

Let us know what you think.

@tgorochowik
Member

Yes, those ideas sound great! To create an input for the neural network, you do need a "standard" format for it.
Below are some comments and further ideas:

It might be easier to start by recognizing the resolution first, or more precisely the line length (in bytes) rather than the resolution itself.

For example: always treat the data as 800x600, GRAY8 and analyze the distortions to find the correct width.
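To make "analyze the distortions" a bit more concrete, here is a rough non-neural baseline, not the network itself. It assumes that in a natural image each byte is similar to the byte one line below it, so the right line length should give the smallest row-to-row mismatch. The function names and the synthetic test frame are made up for illustration.

```python
def row_mismatch(buf, line_length):
    """Mean absolute difference between each byte and the byte one line below."""
    pairs = len(buf) - line_length
    if pairs <= 0:
        raise ValueError("line_length must be smaller than the buffer")
    total = sum(abs(buf[i] - buf[i + line_length]) for i in range(pairs))
    return total / pairs


def estimate_line_length(buf, candidates):
    """Return the candidate line length (in bytes) with the lowest mismatch."""
    return min(candidates, key=lambda n: row_mismatch(buf, n))


def make_rgb24_test_frame(width=20, height=16):
    """Synthetic raw RGB24 frame: smooth gradients, 3 bytes per pixel."""
    data = bytearray()
    for y in range(height):
        for x in range(width):
            for ch in range(3):
                # Values stay below 256, so there is no wrap-around.
                data.append(4 * x + 50 * ch + y)
    return bytes(data)


frame = make_rgb24_test_frame()          # true line length: 20 * 3 = 60 bytes
estimate = estimate_line_length(frame, range(1, 100))
print(estimate)  # 60
```

Note that multiples of the true line length also score well, so the candidate range should be kept tight, or the smallest good candidate preferred. Real data is noisier than this gradient, which is where a trained network might do better.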

I suggest using GRAY8 because it uses just a single byte per pixel, so it can look relatively correct for a lot of formats if you adjust the width.
For example, if the image is 1920x1080, RGB24, and you interpret it as GRAY8 but set the resolution to (3*1920)x1080, it will look somewhat correct: correct enough for algorithms to detect that resolution. You can play with raviewer to check it.

The 3* comes from the fact that RGB24 uses 3 bytes per pixel (BPP), whereas GRAY8 uses just one.

So the 3*1920 would not really be the width of the resolution, but rather the total line length in bytes.
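As a small illustration of the arithmetic, the sketch below splits a raw 1920x1080 RGB24 buffer into GRAY8 rows. It uses an all-zero buffer as a stand-in for real data, and `split_into_rows` is a hypothetical helper, not something from raviewer.

```python
def split_into_rows(buf, line_length):
    """Split a raw buffer into rows of line_length bytes.

    Viewed as GRAY8, every byte is one pixel, so line_length is also
    the displayed width.
    """
    return [buf[i:i + line_length] for i in range(0, len(buf), line_length)]


width, height, bpp = 1920, 1080, 3        # RGB24: 3 bytes per pixel
frame = bytes(width * height * bpp)       # all-zero stand-in for a real frame

line_length = width * bpp                 # 5760 bytes per image line
aligned = split_into_rows(frame, line_length)
misaligned = split_into_rows(frame, width)

print(line_length)      # 5760
print(len(aligned))     # 1080
print(len(misaligned))  # 3240
```

With the line length set to 5760 the GRAY8 view has exactly one row per image line. With 1920 each image line is spread over three rows, which is the kind of distortion the first stage would need to detect.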

With that known, you can pass this data (with the "correct" width!) to another NN, where the format would be analyzed.
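Once the line length in bytes is known, each candidate format implies a pixel width, which narrows down what the second stage has to consider. The sketch below is only an illustration: GRAY8 and RGB24 come from this thread, while YUY2 and RGBA32 are extra candidates I am assuming here, and `implied_widths` is a hypothetical helper.

```python
# Bytes per pixel for a few candidate formats.
# GRAY8 and RGB24 are from the discussion above; YUY2 (packed 4:2:2,
# 2 bytes per pixel on average) and RGBA32 are assumed extra candidates.
BYTES_PER_PIXEL = {
    "GRAY8": 1,
    "YUY2": 2,
    "RGB24": 3,
    "RGBA32": 4,
}


def implied_widths(line_length):
    """Map each format to the pixel width it implies for this line length.

    Formats whose pixel size does not divide the line length evenly are
    left out, since they cannot fill the line exactly.
    """
    return {
        name: line_length // bpp
        for name, bpp in BYTES_PER_PIXEL.items()
        if line_length % bpp == 0
    }


widths = implied_widths(5760)
print(widths)  # {'GRAY8': 5760, 'YUY2': 2880, 'RGB24': 1920, 'RGBA32': 1440}
```

This ignores any per-line padding or alignment, which would need separate handling if the data has it.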

Here I think it could be beneficial to try analyzing this as something like RGB24 or some YUV format, as the distortions for those would be more pronounced and probably "easier to recognize" for neural networks. However, you will need to experiment with this a little to see what gives the best results.

But we are not so sure about Neural Network capabilities in color recognition

Let's not think about this as "color recognition", but rather specific image distortion recognition (especially for formats using different BPP values).

@mrojew
Contributor Author

mrojew commented Dec 1, 2022

Just to make sure: is our data compressed in any way?

@tgorochowik
Member

No, the data is not compressed; it is always "raw". Please take a look at the resources directory in raviewer, which contains some examples.
