Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add updated tutorial: Basic-1D-Fitting #14

Merged
merged 8 commits into from
Aug 23, 2024

Conversation

smk78
Copy link
Collaborator

@smk78 smk78 commented Jan 20, 2024

This PR provides an updated version of the Basic-1D-Fitting-in-SasView-v5 tutorial.

Addresses #11
Addresses #12

  • All screenshots have been regenerated due to GUI changes
  • Additional screenshots have been incorporated to show functionality added to the GUI
  • Example 2 has been updated to use a finite salt concentration
  • Text/formatting changes throughout

When merged, the .pdf file will then need moving to Sasview/sasview/src/sas/sasview/media in an appropriate branch, and copying to Sasview/sasview.github.io/downloads. It is provided here simply as a convenience for the reviewer(s).

@Susyana Susyana self-requested a review January 21, 2024 17:25
@Susyana
Copy link

Susyana commented Jan 21, 2024

The print-screens have been updated and the PDF now includes a description for example 2 that addresses the previous concern regarding the salt concentration not being set to zero (page 17 includes a comment alerting the reader for the reason why it is not zero).

@smk78, the tutorial points the users to the \test\1d_data directory, it should be \example_data\1d_data.

Copy link

@Susyana Susyana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my previous comment, copied below for convenience.

The print-screens have been updated and the PDF now includes a description for example 2 that addresses the previous concern regarding the salt concentration not being set to zero (page 17 includes a comment alerting the reader for the reason why it is not zero).

@smk78, the tutorial points the users to the \test\1d_data directory, it should be \example_data\1d_data.

@smk78
Copy link
Collaborator Author

smk78 commented Jan 21, 2024

@Susyana wrote:

the tutorial points the users to the \test\1d_data directory, it should be \example_data\1d_data.

This is an updated tutorial for v5.x, so the example data still resides in \test; it has been renamed \example_data in v6.x

@smk78
Copy link
Collaborator Author

smk78 commented Jan 22, 2024

Having thought about this some more, I've now also added to this PR the same tutorial but with the new path to the example data that will be present from v6.0.0.

@butlerpd
Copy link
Member

Ouch .. had not thought about that! Does this mean all tutorials need changing JUST because the example data they all use is in a different directory? That will be painful/awkward?

@smk78
Copy link
Collaborator Author

smk78 commented Jan 25, 2024

Ouch .. had not thought about that! Does this mean all tutorials need changing JUST because the example data they all use is in a different directory? That will be painful/awkward?

Not all tutorials.

There are 7 tutorials. This PR & #15 rectify 2 of them. Another 2 (Getting Started & Creating Custom Fitting Models) don't reference \test. But the remaining 3 (P(r) Inversion, Correlation Functions & Subtracting a Model Calculation) will need updating. Of those, the last two will need updating for v6.x for recent changes by @lucas-wilkins & @caitwolf anyway.

I will create issues to track what needs doing.

Copy link
Member

@butlerpd butlerpd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I note that the new v 6.0 file still says it is for 5.0.6 on page 3 in the orange box (same problem as noted in PR #15)

@smk78 smk78 self-assigned this Feb 13, 2024
@smk78 smk78 requested a review from butlerpd March 27, 2024 17:49
@butlerpd
Copy link
Member

butlerpd commented Jul 5, 2024

Sorry for not getting back to this sooner @smk78.

My main complaint now is on page 21-22 (in both versions) where we discuss why we need to include polydispersity for the relatively monodisperse Ludox but not for the SDS micelles. The statement:

In Example 2 we managed to obtain a good solution without having to specifically address polydispersity,
most likely because the micelles were quite polydisperse in size (i.e. had a broad size distribution). In the present dataset, the fringes in the scattering curve suggest there is some polydispersity, but not too much.

seems wrong. In fact a "broad distribution" would make the smearing of the fringes far worse I think? The real cause I think is that for SDS the micelle size is only ~4nm diameter and the fringes are at high Q beyond the range of the fit whereas for Ludox, with a 30 nm diameter, they are well within the fitted Q range? In fact you can see that the model originally shows fringes with the default diameter of 10nm and move to where the data becomes very noisy in the background when you adjust the diameter to 2.6 nm.

The rest are a more nigly, though a few will likely have to be addressed in the near future (but cannot be addressed now?)

  • Chi^2 of much less than 1 in the polymer fit suggests to me that the data is either overfit or that the uncertainties are overestimated on the data points. We just said a perfect fit within errors should lead to a Chi^2 of ~1. 0.7 is well below that no? Of course Xi^2 is not necessarily the best model in some cases but the residuals also seem to show much less than 32% outside the +1/-1 lines? That said, it is not unusual for the uncertainties to be incorrect - though usually they way underestimate (particularly for X-ray)?
  • Version 5 issue: In both versions you talk about clicking the Send To button. In version 5.x that is Send data to. Note that in version 6, it is Send to (the T is not capitalized). As I said: tiny niggle not worth changing except that you may have some other changes?
  • Version 6 issues:
    • on page 3 of both versions we have a orange highlighted box that points out what version this is for which ones it might apply to and the fact that for prior versions yet there are other tutorials. for 6.0 you just changed the first line to read that this is for version 6.0.0 but then say it is generally applicable to any version 5.x (directly left over from 5.x which is for 5.0.6 BUT is applicable to any version 5.x). and finish that "However, there is a separate tutorial using the old program interface released with SasView 4.x" All of the above is true but in fact, you have just updated "a separate tutorial" for version 5.x. So wouldn't it make sense to also point that out? I.e. this is for 6.0.0 and should be valid for all 6.x versions. there are separate tutorials for older versions 5.x and 4.x?
    • And of course both the icon and splash screen look like they will be different. That should definitely change but not sure I'd change it till we get the final PR for that accepted and merged ...

@smk78
Copy link
Collaborator Author

smk78 commented Jul 24, 2024

@butlerpd , thank you for your comments. These have now been addressed.

My main complaint now is (in both versions) where we discuss why we need to include polydispersity for the relatively monodisperse Ludox but not for the SDS micelles. The statement: "In Example 2 we managed to obtain a good solution without having to specifically address polydispersity, most likely because the micelles were quite polydisperse in size (i.e. had a broad size distribution). In the present dataset, the fringes in the scattering curve suggest there is some polydispersity, but not too much" seems wrong. In fact a "broad distribution" would make the smearing of the fringes far worse I think? The real cause I think is that for SDS the micelle size is only ~4nm diameter and the fringes are at high Q beyond the range of the fit whereas for Ludox, with a 30 nm diameter, they are well within the fitted Q range? In fact you can see that the model originally shows fringes with the default diameter of 10nm and move to where the data becomes very noisy in the background when you adjust the diameter to 2.6 nm.

Yes, that was some pretty lame wording! It has been changed (now on p23) to read: "In Example 2 we managed to obtain a good solution without having to specifically address polydispersity, most likely because the size of the micelles meant any fringes in the scattering curve were beyond the range of the data."

Chi^2 of much less than 1 in the polymer fit suggests to me that the data is either overfit or that the uncertainties are overestimated on the data points. We just said a perfect fit within errors should lead to a Chi^2 of ~1. 0.7 is well below that no? Of course Xi^2 is not necessarily the best model in some cases but the residuals also seem to show much less than 32% outside the +1/-1 lines? That said, it is not unusual for the uncertainties to be incorrect - though usually they way underestimate (particularly for X-ray)?

Agreed. The wording (on p13) has been simplified to: "What do you think of the solution? It looks good, right? The model calculation (‘theory’ curve) is running through the measured datapoints, the normalised residuals are all within +/-2 standard deviations, and the Reduced Chi2 is close to 1 (more on this shortly!)." and a new "Aside" box added to p15: "Aside: You may have noticed that all of the Reduced Chi2 values in the summary table above are slightly less than 1. Whilst values >1 are indicative of a poor fit, Reduced Chi2 values <1 are usually an indication that the data is being ‘over-fit’; for example, because the uncertainties on the data have been over-estimated (and the fact that most of the Residuals in the earlier screen shot lie within the ±1 band is also suspicious). It is not unusual for uncertainties to be incorrect, but it is more common for them to be under-estimated, not over-estimated."

Version 5 issue: In both versions you talk about clicking the Send To button. In version 5.x that is Send data to. Note that in version 6, it is Send to (the T is not capitalized). As I said: tiny niggle not worth changing except that you may have some other changes?

These have been update in both versions.

Version 6 issues:

  • on page 3 of both versions we have a orange highlighted box that points out what version this is for which ones it might apply to and the fact that for prior versions yet there are other tutorials. for 6.0 you just changed the first line to read that this is for version 6.0.0 but then say it is generally applicable to any version 5.x (directly left over from 5.x which is for 5.0.6 BUT is applicable to any version 5.x). and finish that "However, there is a separate tutorial using the old program interface released with SasView 4.x" All of the above is true but in fact, you have just updated "a separate tutorial" for version 5.x. So wouldn't it make sense to also point that out? I.e. this is for 6.0.0 and should be valid for all 6.x versions. there are separate tutorials for older versions 5.x and 4.x?

These have been updated in both versions.

  • And of course both the icon and splash screen look like they will be different. That should definitely change but not sure I'd change it till we get the final PR for that accepted and merged ...

The new branding has been applied to the version 6 document.

Copy link
Member

@butlerpd butlerpd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm am approving this. I do have a few comments about the version 6 to consider, but honestly there is not much to do about them till we have a final release so probably not worth waiting for.

  • the images are using the beta1 version which has some significant changes from version 2. Most notably
    • The log explorer is now closed by default not open. This also means that the section talking about "appears in the log explorer" means nothing since we have not discussed how to open the explorer (it will pop open if an error is thrown).
    • The file explorer is now the default system explorer when selecting files to load. This looks quite different that what is in beta1 and before.
    • Less important is the fact that all the icons are the old ball icon and the version name next to the ball in the top left corner of the SasView window sasy 6.0.0b1

As I say, given all the work going on, I would not want to guarantee there will not be any another changes before the final release so doing something now seems silly.

One other tiny niggle 2/1.01=19.8 so closer to 20 then the 19 suggested on page 26

@butlerpd
Copy link
Member

Not something that should affect merging necessarily so thought I'd make a separate comment.

I have gone through example 2 (SDS) using the beta2 version and I get a different answer. In fact my fit (following the instruction to the letter -- I think?) I get a final fit for a sphere (no ellipsoid needed) that fits the data better both visually and in the X^2 which is 0.86 not 1.6 as in the tutorial. In fact you cannot get more perfect a fit. Moreover the vol fraction I get is 1.7 which is quite close to the 1.8 I would expect once I subtract the cmc concentration from 2% (2-0.2 = 1.8/1.01 = 1.78). Not sure why the difference at this point. Is there a difference with previous versions (I hesitate to even ask....)

Also I note in both cases the high q of the sphere fit does show the first half of the monodisperse oscillation but the data there is in the background and so noisy that it doesn't matter.

Finally, to the extent we add other things to consider, I would think that besides the literature suggestion of an ellipsoid, polydispersity which would also affect the low Q somewhat, and or the fact that if the headgroups have water, the core should still be dry so maybe a core shell would be more appropriate? etc.. But I think it is nice to show that it is not always possible to distinguish a polydisperse sphere from an ellipsoid with a small axis ratio.

@smk78 smk78 marked this pull request as draft August 16, 2024 16:53
@smk78
Copy link
Collaborator Author

smk78 commented Aug 16, 2024

@butlerpd commented:

I'm am approving this. I do have a few comments about the version 6 to consider, but honestly there is not much to do about them till we have a final release so probably not worth waiting for.

Even so, I'm making so edits, so have temporarily converted this PR to a Draft.

  • the images are using the beta1 version which has some significant changes from version 2.

I'm replacing the 6.0.0b1 screenshots with 6.0.0b2 screenshots, which also deals with...

  • Less important is the fact that all the icons are the old ball icon and the version name next to the ball in the top left corner of the SasView window sasy 6.0.0b1

So the new icon appears and I'm scrubbing the 'b2' to make it look like 6.0.0 :-)

  • Most notably the log explorer is now closed by default not open.

So the new screenshots deal with that too.

  • This also means that the section talking about "appears in the log explorer" means nothing since we have not discussed how to open the explorer (it will pop open if an error is thrown).

Have added a tip to address this.

  • The file explorer is now the default system explorer when selecting files to load. This looks quite different that what is in beta1 and before.

All the same, I hadn't noticed! Again, the new screenshots deal with this.

One other tiny niggle 2/1.01=19.8 so closer to 20 then the 19 suggested on page 26

Have changed the value from 19 to 19.8!

@smk78
Copy link
Collaborator Author

smk78 commented Aug 16, 2024

@butlerpd commented:

I have gone through example 2 (SDS) using the beta2 version and I get a different answer. In fact my fit (following the instruction to the letter -- I think?) I get a final fit for a sphere (no ellipsoid needed) that fits the data better both visually and in the X^2 which is 0.86 not 1.6 as in the tutorial. In fact you cannot get more perfect a fit. Moreover the vol fraction I get is 1.7 which is quite close to the 1.8 I would expect once I subtract the cmc concentration from 2% (2-0.2 = 1.8/1.01 = 1.78). Not sure why the difference at this point. Is there a difference with previous versions (I hesitate to even ask....)

Interestingly, when I re-run the fits in b2 to generate the new screenshots I get fits that are ~same as you describe here, and not what I had in the tutorial. Very odd. But glad we now agree!

I've also elaborated on the expected volume fraction point.

Also I note in both cases the high q of the sphere fit does show the first half of the monodisperse oscillation but the data there is in the background and so noisy that it doesn't matter.

I've added instructions to move the high-Q limit in to exclude the noise.

Finally, to the extent we add other things to consider, I would think that besides the literature suggestion of an ellipsoid, polydispersity which would also affect the low Q somewhat, and or the fact that if the headgroups have water, the core should still be dry so maybe a core shell would be more appropriate? etc.. But I think it is nice to show that it is not always possible to distinguish a polydisperse sphere from an ellipsoid with a small axis ratio.

I have added the following wording to the end of Example 2:
"One could also explore the extent to which size polydispersity (see Example 3) impacts both models, or whether more nuanced models, for example, the core_shell_sphere or the core_shell_ellipsoid, allowing for a difference in SLD between the head and tail regions of the micelles, are better descriptions of the data."

@smk78
Copy link
Collaborator Author

smk78 commented Aug 20, 2024

Have also exapanded Example 3 to:

  • highlight that it produced a graph of the size distribution;
  • highlight where additional data (eg, S(Q)) created by the fit process lives;
  • show how you can plot the separate contributions of P(Q) and S(Q);
  • append to an existing graph.

@smk78 smk78 marked this pull request as ready for review August 20, 2024 16:39
@butlerpd
Copy link
Member

Since you added so much new I took a quick look at v. 6. Very nice on the addition to example 3. I like it!

I did note one error which I do not remember being there before?! Maybe I was just blind? But you mention the residual lines which I don't remember you doing before? the dotted black lines are +/1 sigma = 68% confidence but the red lines are +/-3 or the 98% confidence level which, as you say in the glossary, "Residuals larger than +/-3 indicate significant problems" hence the solid red color. In other words not +/-2 mentioned in the text. Of course that is clear in the bottom plot of page 12.

I would not rewrite for this one typo.

Also not worth another rewrite, but now that the fit of the SDS gives a chi^2 of 1 and the correct volume fraction, I would have changed the introduction to the next step (ellipsoid fitting) to start by pointing out this is essentially a perfect fit given the data we have. However literature suggest ellipsoids. Is that possible? we fit to the ellipsoid and see that it is indeed equally good so consistent. The lesson here would be that the it is not always possible to distinguish between different models in SAS and that Occam's razor tells us to be very careful to not attempt to overfit our data... or some such. In fact there are rules of thumb about how much eccentricity you need in order to have a chance at distinguishing... Essentially what you say but I think would come across more clearly if guided in this order?

BTW ... is it true that 2% SDS without salt is shown to be elliptical? It has been a very long time since I've looked at SDS, but my recollection was you had to add salt to get some elongation into ellipsoids?

Finally, and again I would nor rewrite just for this, in version 5.0, because you didn't replace all the figures, the fit to SDS actually (again for reasons we don't understand) show a poorer fit the that values you updated. The resolution of the images is quite low because they pick up such a large screen area but if you squint you will see the vol. faction is closet to 15% than 17% and the fit at low Q is not as good as it is in version 6 (and should be for the values we get now).

Are we ready to merge?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants