About PSNR and l2 #11

Open

neouyghur opened this issue Nov 26, 2019 · 10 comments

@neouyghur
Hi, I am checking the results provided in Table 1, and I find that the PSNR values do not correspond to the l2 values. For example, an l2 of 0.2465 corresponds to a PSNR of 25.60, while an l2 of 0.0627 corresponds to 24.83, and the l2 error of Scene Text Eraser is very high. Could you check this, or could you provide your model for testing? Thanks.

@neouyghur

neouyghur commented Nov 26, 2019

I plotted your PSNR and l2 scores in a figure. It clearly shows that your results are not consistent. Could you explain why?
@naoto0804 did you get the same PSNR score? Thanks.

[figure: plot of the reported PSNR and l2 scores (EnsNet)]

@zhangshuaitao

@neouyghur Both the PSNR and l2 scores are averages over all test images.

@neouyghur

@zhangshuaitao I am comparing my method with yours, following the same protocol; however, my MSE and PSNR curves share the same trend. Besides, as we know, the PSNR score is calculated from the MSE score.
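For reference, a minimal sketch of that relationship (my own illustration, assuming images normalized to [0, 1] so MAX = 1.0; not code from this repo):

```python
import numpy as np

def psnr_from_mse(mse, max_val=1.0):
    # PSNR = 10 * log10(MAX^2 / MSE). For a single image, a lower MSE
    # always means a higher PSNR; averaging each metric separately over
    # a dataset can weaken, but rarely reverses, that relationship.
    return 10.0 * np.log10(max_val ** 2 / mse)

print(psnr_from_mse(0.2465))  # ~6.1 dB
print(psnr_from_mse(0.0627))  # ~12.0 dB
```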

@neouyghur

@zhangshuaitao Is your l2 score RMSE or MSE? Thanks.

@zhangshuaitao

@neouyghur The l2 score is MSE. We use the compare_mse, compare_ssim, and compare_psnr functions from the skimage.measure module.
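For anyone trying to reproduce this, a minimal sketch using that (older) skimage.measure API; these functions later moved to skimage.metrics, and the file names here are hypothetical:

```python
from skimage import io
from skimage.measure import compare_mse, compare_psnr, compare_ssim

# hypothetical file names; both images must have the same shape and dtype
gt = io.imread('gt.png')
out = io.imread('output.png')

mse = compare_mse(gt, out)                       # the l2 score in the table
psnr = compare_psnr(gt, out)                     # data_range inferred from dtype
ssim = compare_ssim(gt, out, multichannel=True)  # multichannel for RGB images
print(mse, psnr, ssim)
```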

@naoto0804

naoto0804 commented Nov 28, 2019

@zhangshuaitao
First of all, thanks again for releasing the code and answering a lot of questions patiently.

However, what you say above seems to be inconsistent with the README.md: "To evalution the model performace over a dataset, you can find the evaluation metrics in this website PythonCode.zip."
Which is correct?

@naoto0804

naoto0804 commented Nov 28, 2019

I would really appreciate it if you could provide the whole evaluation pipeline.

It might be hard to follow exactly the same evaluation protocol, since some parameters of each function are unknown (e.g., compare_ssim has some optional parameters; how did you set them? What is the range of values in the images, 0.0–1.0 or 0–255?).
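For example, these are the kinds of optional knobs I mean (a sketch only, not the authors' settings; dummy data just to make it runnable):

```python
import numpy as np
from skimage.measure import compare_ssim

# dummy float images in [0, 1], standing in for ground truth / prediction
a = np.random.rand(64, 64, 3)
b = np.random.rand(64, 64, 3)

# default uniform windowing
ssim_default = compare_ssim(a, b, multichannel=True, data_range=1.0)

# Gaussian weighting as in Wang et al.'s original SSIM implementation
ssim_gaussian = compare_ssim(a, b, multichannel=True, data_range=1.0,
                             gaussian_weights=True, sigma=1.5,
                             use_sample_covariance=False)
print(ssim_default, ssim_gaussian)
```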

@zhangshuaitao

@naoto0804 Sorry for not explaining it clearly. We use AGE, pEPs, and pCEPs from PythonCode.zip, and the compare_mse, compare_ssim, and compare_psnr functions from the skimage.measure module. The default parameters for those functions are fine.

@naoto0804

naoto0804 commented Nov 29, 2019

@zhangshuaitao Thank you so much for making it much clearer.

To make sure I followed your instructions exactly, I've computed all the metrics between the original input images and the ground-truth images in the test subset of the synthetic dataset. This is because I want to isolate differences in the evaluation phase before reproducing the training phase.

The result is as follows; do you think it's reasonable? If possible, could you compute it on your dataset with your evaluation code? (I suspect there are still bugs in my implementation, since these values are much better than those of the baseline methods in Table 1.)
mse 0.006965
ssim 0.933875
psnr 23.996012
AGE 5.851178
pEPs 0.064378
pCEPs 0.050264
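For concreteness, a sketch of such an input-vs-ground-truth check (the directory layout, PNG extension, and [0, 1] normalization are assumptions here, not the actual script):

```python
import glob
import numpy as np
from skimage import io
from skimage.measure import compare_mse, compare_psnr, compare_ssim

# assumed layout: paired files under test/input/ and test/gt/
in_paths = sorted(glob.glob('test/input/*.png'))
gt_paths = sorted(glob.glob('test/gt/*.png'))

scores = {'mse': [], 'ssim': [], 'psnr': []}
for in_path, gt_path in zip(in_paths, gt_paths):
    x = io.imread(in_path).astype(np.float64) / 255.0
    y = io.imread(gt_path).astype(np.float64) / 255.0
    scores['mse'].append(compare_mse(y, x))
    scores['psnr'].append(compare_psnr(y, x, data_range=1.0))
    scores['ssim'].append(compare_ssim(y, x, multichannel=True, data_range=1.0))

# average each metric over all test images, as described above
for name, vals in scores.items():
    print('%s %.6f' % (name, np.mean(vals)))
```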

@neouyghur

neouyghur commented Nov 29, 2019

@naoto0804 @zhangshuaitao I think your result is reasonable, since only a small part of each scene is text. However, I feel they didn't fully train the baselines; with more training, the baselines should get better results than those reported.
