MCR Fails When All-zero Spectral Channels Present #27

AndrewHerzing · 2019-06-12T21:58:05Z

@CCampJr : @jat255 and I have been testing our Hyperspy MCR implementation. We noticed that if one or more spectral channels in the dataset contains all zeros then MCR returns empty factors. Is this behavior to be expected? If so, is there an accepted way of working around the limitation?

CCampJr · 2019-06-13T02:51:39Z

@AndrewHerzing @jat255

In general, all 0's is a no-no (in all sorts of optimization methods) because there is no instability to cause the algorithm to move/evolve. Better is to seed with small random values.

I could foresee adding a check-for-constants/0's and give a warning to the log

jat255 · 2019-06-13T02:57:15Z

Do you see any negative consequences of adding an infinitesimally-small value to channels that are uniformly-zero before proceeding with the MCR algorithm? I suppose it could preferentially bias the solution positive or negative?

charlesll · 2019-06-13T08:25:25Z

@jat255 why not just drop those channels? They are virtually information-less for the algorithm, so no need to include them I think.

charlesll · 2019-06-13T08:25:47Z

@CCampJr we could write an example on this subject!

CCampJr · 2019-06-13T13:51:12Z

@AndrewHerzing @jat255 I may have misinterpreted "spectral channel". Do you mean portions of spectra or do you mean, e.g., you seeded in 3 initial guess spectra and 2 of them were just 0's?

@charlesll I agree!

jat255 · 2019-06-13T13:57:54Z

@CCampJr I mean portions of the spectra. i.e. if you have an x y spectrum image map of spectra, if one of the channels is zero-valued at every x and y position, then you'll get the error described.

@charlesll I don't think we can "drop" channels (at least in the HyperSpy sense). If for instance, you have a dead pixel on your detector or something, removing the channel entirely would break the axis calibration. Admittedly, having all zero values is probably a pretty rare occurrence, but it could feasibly happen in very sparse maps, or if your detector has some incorrect offset or something (which was the case that caused me to notice the problem).

I don't know MCR intimately, but my hunch is that the best solution will just to make the zero channels actually be infinitesimally small (but definite), so that the algorithm can proceed.

CCampJr · 2019-06-13T14:30:22Z

@jat255 So by channels you mean a particular energy bin on the CCD? So across the entire image, with (e.g.) 1024 energy/frequencies channels/bins, it is identically zero across the whole image?

Just checking because I'll test this real quick to see what's up. Making it a very small random value would work, but I'm a bit surprised at the problem now that I understand what's going on.

CCampJr · 2019-06-13T14:40:11Z

@jat255 So I did not have this problem. See the images below. I set the 50th and 100th spectral bin to be identically 0 at every image pixel. Note the "max spectrum..." image is the maximum spectral intensity across the entire HSI image (I'm using the Demo dataset in the Jupyter NB)

jat255 · 2019-06-13T14:46:28Z

I just saw you responded, but I'll send this comment through anyway...

Yes, zero across the whole image. Here's an example of what I mean (plotted within HyperSpy). This is two signals made of three gaussians with added noise. In one of them, I set the 25th energy channel to zero. This is a "pixel-first" view of the data (i.e. a map of spectra):

And this is "energy-first" (i.e. a series of images):

When I run our "decomposition" routine using pyMCR, I get a valid result for the "no-zeros" case, but not for the one with one channel forced to zero:

jat255 · 2019-06-13T14:47:10Z

I don't believe HyperSpy is doing anything unusual, but I will try to reproduce using only pyMCR and not HyperSpy to make sure I see the same behavior... Give me a few to work it out.

CCampJr · 2019-06-13T14:53:26Z

Here's a more intense example (I randomly killed half of the channels). Just for info I used:

# Just a copy of my hyperspectral image data
hsi2 = 1*hsi

# Kill half the spectral pixels across the image
locs_to_kill = np.random.randint(0,200,100)
hsi2[...,locs_to_kill] = 0

initial_spectra2 = 1*initial_spectra
initial_spectra2[...,locs_to_kill] = 0

mcrar = McrAR(max_iter=100, st_regr='OLS', c_regr='OLS', 
              c_constraints=[ConstraintNonneg(), ConstraintNorm()],
              st_constraints=[ConstraintNonneg()])

mcrar.fit(hsi2.reshape((-1, wn.size)), ST=initial_spectra2, verbose=True)
print('\nFinal MSE: {:.7e}'.format(mcrar.err[-1]))

CCampJr · 2019-06-13T14:54:54Z

@jat255

I don't believe HyperSpy is doing anything unusual, but I will try to reproduce using only pyMCR and not HyperSpy to make sure I see the same behavior... Give me a few to work it out.

Maybe it's a dumb question, but are your spectra/images integer type? Weird things happen if you put in an image with integer type and then work in float space.

jat255 · 2019-06-13T15:16:39Z

Unfortunately, no they're floats. It's required by the SVD that we run prior to feeding those in as initial conditions to pyMCR

AndrewHerzing · 2019-06-13T15:28:40Z

We should clarify that in our implementation of MCR, we first perform an SVD decomposition using numpy.linalg.svd. The resulting factors are then rotated by varimax before being given as the input for PyMCR.

CCampJr · 2019-06-13T15:32:00Z

@AndrewHerzing @jat255

How many SVD components are you keeping?
Make sure the input spectra (into MCR) have positive values?
If you all want, you can send me data (numpy binaries would be fine)
- HSI
- SVD-varimax'd initial spectra

AndrewHerzing · 2019-06-13T15:40:04Z

The number of SVD components depends on the data but in the cases we have tested it is around 3 or 4. After Varimax rotation, the code attempts to find the sign of the majority of the signal and then sets that to positive. In other words, if the SVD output is mostly negative after Varimax, the sign would be flipped. I'll let Josh send you the data as it is his that we have been working on the most.

jat255 · 2019-06-13T18:53:49Z

HSI_data.zip

Ok, here's some dummy data that has been exhibiting the issue. There are four data files in that zip. HSI_data_unfolded.npy is a 100x100x40 (x,y,energy) data array unfolded into 2D, so it's actually 10000x40. The HSI_data_SVD.npy is the result of the SVD PCA (as run with s.decomposition(True) in HyperSpy), and then the HSI_SVD_varimax.npy is the same data, but after the varimax rotation. McrAL.pkl is the McrAR object after fitting, which can be opened by:

import pickle
with open('McrAL.pkl', 'rb') as fl: 
    mcr_result = pickle.load(fl)

This was the HyperSpy/numpy code used to generate that data and test the MCR implementation:

import hyperspy.api as hs
import numpy as np
import matplotlib.pylab as plt
import scipy.stats as stats

mu1 = -10
variance1 = 900
sigma1 = np.sqrt(variance1)
x1 = np.linspace(mu1 - 3*sigma1, mu1 + 3*sigma1, 200)
x1 = np.linspace(-100,100)
y1 = 10000*stats.norm.pdf(x1, mu1, sigma1)

mu2 = 5
variance2 = 700
sigma2 = np.sqrt(variance2)
x2 = np.linspace(mu2 - 3*sigma2, mu2 + 3*sigma2, 200)
x2 = np.linspace(-100,100)
y2 = 10000*stats.norm.pdf(x2, mu2, sigma2)

mu3 = 25
variance3 = 1000
sigma3 = np.sqrt(variance3)
x3 = np.linspace(-100,100)
y3 = 10000*stats.norm.pdf(x3, mu3, sigma3)

s = hs.signals.Signal1D(np.zeros([100,100,50], np.int32))
s.metadata.General.title = 'example spectrum image'
s.inav[0:50,:] = y1
s.inav[50:75,:] = y2
s.inav[75:,:] = y3
s.add_poissonian_noise()
s = s.isig[:40]
s2 = s.deepcopy()
s2.metadata.General.title = 'example spectrum image w/ zeros'
s2.isig[29] = 0
s2.change_dtype('float32')
s2.decomposition(True)
s2.decomposition(algorithm='MCR', output_dimension=3)

The code that runs pyMCR in HyperSpy is on an experimental branch here. It looks like the problem arises because the C_opt_ attribute of the McrAR object has NaN values where the spectral channel is zero. This propagates into our factors_out array here, and then since we do a normalization here, we get NaNs throughout the whole thing.

AndrewHerzing · 2019-06-14T13:55:09Z

@CCampJr : @jat255 and I have discussed this and we think the simplest way to proceed is for us to check for any NaN's in the C_opt_ output of the MCR and set them to zero prior to the final normalization. We will return a warning to alert the user that this has been done. Do you think this sounds okay? A more fundamental question is whether the presence of the zero's in the input data will affect the output of the MCR and whether we should be warning users about this prior to performing the fit in the first place. Does anything in your experience suggest that this is something we should be worried about?

CCampJr · 2019-06-17T14:21:59Z

@AndrewHerzing @jat255 Sorry for my delay: was off in the woods hiking Fri-Sun.

I'll be curious to see why things are failing as it's not obvious to me.

jat255 · 2020-05-06T17:41:19Z

I'm just getting back to this now. Looks like we never quite figured out what was going wrong here... For the sake of getting this included in HyperSpy (see hyperspy/hyperspy#2172) I think we'll go with the approach suggested by @AndrewHerzing:

to check for any NaN's in the C_opt_ output of the MCR and set them to zero prior to the final normalization. We will return a warning to alert the user that this has been done.

We can revisit this if necessary to figure it out "for real"

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MCR Fails When All-zero Spectral Channels Present #27

MCR Fails When All-zero Spectral Channels Present #27

AndrewHerzing commented Jun 12, 2019

CCampJr commented Jun 13, 2019 •

edited

Loading

jat255 commented Jun 13, 2019

charlesll commented Jun 13, 2019

charlesll commented Jun 13, 2019

CCampJr commented Jun 13, 2019

jat255 commented Jun 13, 2019 •

edited

Loading

CCampJr commented Jun 13, 2019

CCampJr commented Jun 13, 2019 •

edited

Loading

jat255 commented Jun 13, 2019

jat255 commented Jun 13, 2019

CCampJr commented Jun 13, 2019

CCampJr commented Jun 13, 2019

jat255 commented Jun 13, 2019

AndrewHerzing commented Jun 13, 2019

CCampJr commented Jun 13, 2019

AndrewHerzing commented Jun 13, 2019

jat255 commented Jun 13, 2019 •

edited

Loading

AndrewHerzing commented Jun 14, 2019

CCampJr commented Jun 17, 2019

jat255 commented May 6, 2020

MCR Fails When All-zero Spectral Channels Present #27

MCR Fails When All-zero Spectral Channels Present #27

Comments

AndrewHerzing commented Jun 12, 2019

CCampJr commented Jun 13, 2019 • edited Loading

jat255 commented Jun 13, 2019

charlesll commented Jun 13, 2019

charlesll commented Jun 13, 2019

CCampJr commented Jun 13, 2019

jat255 commented Jun 13, 2019 • edited Loading

CCampJr commented Jun 13, 2019

CCampJr commented Jun 13, 2019 • edited Loading

jat255 commented Jun 13, 2019

jat255 commented Jun 13, 2019

CCampJr commented Jun 13, 2019

CCampJr commented Jun 13, 2019

jat255 commented Jun 13, 2019

AndrewHerzing commented Jun 13, 2019

CCampJr commented Jun 13, 2019

AndrewHerzing commented Jun 13, 2019

jat255 commented Jun 13, 2019 • edited Loading

AndrewHerzing commented Jun 14, 2019

CCampJr commented Jun 17, 2019

jat255 commented May 6, 2020

CCampJr commented Jun 13, 2019 •

edited

Loading

jat255 commented Jun 13, 2019 •

edited

Loading

CCampJr commented Jun 13, 2019 •

edited

Loading

jat255 commented Jun 13, 2019 •

edited

Loading